Benchmarking Agents
This tutorial shows how to benchmark agents in Colosseum: we define a small-scale experiment configuration, take the default episodic and continuous benchmarks, randomly sample agent configurations for each setting, and run the resulting experiment instances.
# Imports; the module paths below follow the Colosseum package layout
from colosseum.agent.agents.episodic import PSRLEpisodic, QLearningEpisodic
from colosseum.agent.agents.infinite_horizon import QLearningContinuous, UCRL2Continuous
from colosseum.agent.utils import sample_agent_gin_configs_file
from colosseum.benchmark.benchmark import ColosseumDefaultBenchmark
from colosseum.benchmark.utils import instantiate_and_get_exp_instances_from_agents_and_benchmarks
from colosseum.experiment import ExperimentConfig
from colosseum.experiment.experiment_instances import run_experiment_instances

# Seed used when randomly sampling the agents' configurations
seed = 42

# Define a small-scale experiment config
experiment_config = ExperimentConfig(
    n_seeds=1,  # number of random seeds each agent/environment interaction is repeated for
    n_steps=5_000,  # total number of interaction steps
    max_interaction_time_s=1 * 30,  # time limit (in seconds) for each agent/environment interaction
    log_performance_indicators_every=1000,  # how often (in steps) the performance indicators are logged
)
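# For a full benchmarking run you would raise these limits; the values below
# are purely illustrative and not an official benchmark configuration:
# experiment_config = ExperimentConfig(
#     n_seeds=10,
#     n_steps=500_000,
#     max_interaction_time_s=10 * 60,
#     log_performance_indicators_every=10_000,
# )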
# Take the default Colosseum benchmarks for the episodic ergodic and the continuous communicating settings
b_e = ColosseumDefaultBenchmark.EPISODIC_QUICK_TEST.get_benchmark()
b_e.experiment_config = experiment_config
b_c = ColosseumDefaultBenchmark.CONTINUOUS_QUICK_TEST.get_benchmark()
b_c.experiment_config = experiment_config
# Randomly sample some episodic agents
agents_configs_e = {
    PSRLEpisodic: sample_agent_gin_configs_file(PSRLEpisodic, n=1, seed=seed),
    QLearningEpisodic: sample_agent_gin_configs_file(QLearningEpisodic, n=1, seed=seed),
}

# Randomly sample some continuous agents
agents_configs_c = {
    QLearningContinuous: sample_agent_gin_configs_file(QLearningContinuous, n=1, seed=seed),
    UCRL2Continuous: sample_agent_gin_configs_file(UCRL2Continuous, n=1, seed=seed),
}
# Obtain the experiment instances for the agent configurations and their benchmarks
agents_and_benchmarks = [
    (agents_configs_e, b_e),
    (agents_configs_c, b_c),
]
experiment_instances = instantiate_and_get_exp_instances_from_agents_and_benchmarks(agents_and_benchmarks)
# Run the experiment instances
# If multiprocessing is enabled, Colosseum takes advantage of it (see the note below)
run_experiment_instances(experiment_instances)
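By default, the experiment instances above run sequentially. Colosseum can distribute them across the available cores when multiprocessing is enabled before the run. A minimal sketch, assuming the `enable_multiprocessing` helper exposed by the `colosseum.config` module of the installed version:

from colosseum import config

# Opt in to multiprocessing; Colosseum will then spread the experiment
# instances across the available cores.
config.enable_multiprocessing()

# Run as before
run_experiment_instances(experiment_instances)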