experience.py notes:

Methods:

  • __init__(self, env_info, algo_info, device, aux_tensor_dict=None)

    • self.horizon_length = algo_info['horizon_length']

      • In play_steps() we take env steps for horizon_length and write each step into the experience buffer. If I added HER observations, would the horizon length need to be doubled? How does the buffer update work? (See the sketch after this list.)

    • self.obs_base_shape = (self.horizon_length, self.num_agents * self.num_actors)

    • self.state_base_shape = (self.horizon_length, self.num_actors)
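To make the buffer update concrete, here is a minimal sketch of how a horizon-indexed buffer and a play_steps()-style rollout loop might fit together. This is an assumption-laden illustration, not the actual experience.py implementation: the class, the tensor_dict/update_data names, and the play_steps signature are all hypothetical, chosen only to show how the (horizon_length, num_agents * num_actors) base shape is used as the leading dimensions of each stored tensor and how each env step overwrites one row of the buffer.

```python
import torch

class MiniExperienceBuffer:
    """Simplified sketch of an on-policy experience buffer (hypothetical,
    not the real experience.py class); names are assumptions."""

    def __init__(self, horizon_length, num_actors, num_agents, obs_dim, act_dim, device):
        self.horizon_length = horizon_length
        # one slot per (step, vectorized env) pair, matching obs_base_shape above
        obs_base_shape = (horizon_length, num_agents * num_actors)
        self.tensor_dict = {
            'obses':   torch.zeros(obs_base_shape + (obs_dim,), device=device),
            'actions': torch.zeros(obs_base_shape + (act_dim,), device=device),
            'rewards': torch.zeros(obs_base_shape + (1,), device=device),
            'dones':   torch.zeros(obs_base_shape, dtype=torch.uint8, device=device),
        }

    def update_data(self, name, step_index, value):
        # each env step overwrites row `step_index`; the buffer is refilled
        # from index 0 on every rollout, so its size never grows
        self.tensor_dict[name][step_index] = value


def play_steps(buffer, env, policy, obs):
    """Hypothetical rollout loop: fills exactly horizon_length rows per call."""
    for n in range(buffer.horizon_length):
        actions = policy(obs)
        next_obs, rewards, dones, _ = env.step(actions)
        buffer.update_data('obses', n, obs)
        buffer.update_data('actions', n, actions)
        buffer.update_data('rewards', n, rewards)
        buffer.update_data('dones', n, dones)
        obs = next_obs
    return obs
```

Under this pattern, extra data such as HER goals would more naturally be stored as additional keys in tensor_dict (with the same horizon_length leading dimension) rather than by doubling horizon_length, but that should be verified against the real buffer's update path.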