tmrl.actor module
- class tmrl.actor.ActorModule(observation_space, action_space)[source]
Bases: ABC
Implement this interface for the RolloutWorker(s) to interact with your policy.
Note
If overridden, the __init__() definition must take at least the following two arguments (as args or kwargs): observation_space and action_space. When overriding __init__, don’t forget to call super().__init__ in the subclass.
- Parameters:
observation_space (gymnasium.spaces.Space) – observation space (here for your convenience)
action_space (gymnasium.spaces.Space) – action space (here for your convenience)
- abstract act(obs, test=False)[source]
Must compute an action from an observation.
- Parameters:
obs (object) – the observation
test (bool) – True at test time, False otherwise
- Returns:
the computed action
- Return type:
numpy.array
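As a minimal sketch of the contract described above, the following subclass accepts observation_space and action_space in __init__, forwards them to super().__init__, and implements act. To stay dependency-free, ActorModuleStandIn is a hypothetical stand-in that mirrors only the documented signature (it is not tmrl's class), the action space is assumed to be a plain (low, high, size) tuple rather than a gymnasium.spaces.Space, and the action is a Python list where a real implementation would return a numpy.array.

```python
from abc import ABC, abstractmethod
import random


class ActorModuleStandIn(ABC):
    """Hypothetical stand-in mirroring the documented ActorModule signature."""

    def __init__(self, observation_space, action_space):
        self.observation_space = observation_space
        self.action_space = action_space

    @abstractmethod
    def act(self, obs, test=False):
        """Must compute an action from an observation."""


class RandomActor(ActorModuleStandIn):
    """Toy policy: uniform random actions in training, zeros at test time."""

    def __init__(self, observation_space, action_space, seed=0):
        # The two space arguments are required; extra arguments are optional.
        super().__init__(observation_space, action_space)
        self.rng = random.Random(seed)

    def act(self, obs, test=False):
        low, high, size = self.action_space  # assumed (low, high, size) tuple
        if test:
            # Deterministic behaviour at test time.
            return [0.0] * size
        return [self.rng.uniform(low, high) for _ in range(size)]


actor = RandomActor(observation_space=None, action_space=(-1.0, 1.0, 3))
train_action = actor.act(obs=[0.0, 0.0])
test_action = actor.act(obs=[0.0, 0.0], test=True)
```

With the real class, the same structure would apply: subclass ActorModule, keep the two space arguments in the constructor, and branch on test where exploration should be disabled.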
- load(path, device)[source]
Load and return an instance of your ActorModule from the hard drive.
This method loads your ActorModule from the binary file saved by your implementation of save.
If not implemented, load defaults to returning the output of pickle.load(…). By default, the device argument is ignored (but you may want to use it in your implementation).
You need to override this method if your ActorModule is not picklable.
Note
You can use this function to load attributes and return self, or you can return a new instance.
- Parameters:
path (pathlib.Path) – a filepath to load your ActorModule from
device – device to load relevant attributes to (e.g., “cpu” or “cuda:0”)
- Returns:
An instance of your ActorModule
- Return type:
ActorModule
- save(path)[source]
Save your ActorModule on the hard drive.
If not implemented, save defaults to pickle.dump(obj=self, …).
You need to override this method if your ActorModule is not picklable.
Note
Everything needs to be saved into a single binary file. tmrl reads this file and transfers its content over the network.
- Parameters:
path (pathlib.Path) – a filepath to save your ActorModule to
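The following sketch illustrates one way to override save and load when the default pickle-based behaviour cannot be used. LockedActor is a hypothetical class (it does not inherit from tmrl's ActorModule, to keep the example dependency-free), and JSON is an arbitrary choice of serialization made for this illustration. It holds a threading.Lock, which cannot be pickled, so only the serializable state is written, into a single binary file.

```python
import json
import tempfile
import threading
from pathlib import Path


class LockedActor:
    """Hypothetical actor holding a non-picklable attribute (a lock)."""

    def __init__(self, observation_space, action_space):
        self.observation_space = observation_space
        self.action_space = action_space
        self.weights = [0.0, 0.0]
        self._lock = threading.Lock()  # locks cannot be pickled

    def save(self, path):
        # Write only the serializable state, into one binary file.
        payload = json.dumps({"weights": self.weights}).encode("utf-8")
        Path(path).write_bytes(payload)

    def load(self, path, device):
        # `device` is unused in this sketch; a tensor-based actor might use it.
        state = json.loads(Path(path).read_bytes().decode("utf-8"))
        self.weights = state["weights"]
        return self  # loading attributes into self, then returning self


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "actor.bin"
    source = LockedActor(None, None)
    source.weights = [0.5, -1.25]
    source.save(path)
    restored = LockedActor(None, None).load(path, device="cpu")
```

Here load fills in the attributes of an existing instance and returns it; returning a freshly constructed instance would satisfy the documented contract equally well.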
- to_device(device)[source]
Set the ActorModule’s relevant attributes to the designated device.
By default, this method is a no-op and returns self.
- Parameters:
device – the device where to move relevant attributes (e.g., “cpu” or “cuda:0”)
- Returns:
an ActorModule whose relevant attributes are moved to device (can be self)
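A rough sketch of a to_device override is shown below. DeviceAwareActor is a hypothetical class that only records the target device as a string; an actual implementation would move its tensors or other device-bound attributes at the marked point.

```python
class DeviceAwareActor:
    """Hypothetical actor that records which device its attributes target."""

    def __init__(self, observation_space, action_space):
        self.observation_space = observation_space
        self.action_space = action_space
        self.device = "cpu"

    def to_device(self, device):
        # A tensor-based actor would move its tensors here; this sketch only
        # records the target so that the control flow is visible.
        self.device = device
        return self  # returning self is allowed by the documented contract


actor = DeviceAwareActor(None, None)
moved = actor.to_device("cuda:0")
```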
- class tmrl.actor.TorchActorModule(observation_space, action_space, device='cpu')[source]
Bases: ActorModule, Module, ABC
Partial implementation of ActorModule as a torch.nn.Module.
You can implement this instead of ActorModule when using PyTorch. TorchActorModule is a subclass of torch.nn.Module and may implement forward(). Typically, your implementation of act() can call forward() with gradients turned off.
When using TorchActorModule, the act method receives observations collated on device, with an additional dimension corresponding to the batch size.
Note
If overridden, the __init__() definition must take at least the following two arguments (as args or kwargs): observation_space and action_space. When overriding __init__, don’t forget to call super().__init__ in the subclass.
- Parameters:
observation_space (gymnasium.spaces.Space) – observation space (here for your convenience)
action_space (gymnasium.spaces.Space) – action space (here for your convenience)
device – device where your model should live and where observations for act will be collated
- load(path, device)[source]
Load and return an instance of your ActorModule from the hard drive.
This method loads your ActorModule from the binary file saved by your implementation of save.
If not implemented, load defaults to returning the output of pickle.load(…). By default, the device argument is ignored (but you may want to use it in your implementation).
You need to override this method if your ActorModule is not picklable.
Note
You can use this function to load attributes and return self, or you can return a new instance.
- Parameters:
path (pathlib.Path) – a filepath to load your ActorModule from
device – device to load relevant attributes to (e.g., “cpu” or “cuda:0”)
- Returns:
An instance of your ActorModule
- Return type:
ActorModule
- save(path)[source]
Save your ActorModule on the hard drive.
If not implemented, save defaults to pickle.dump(obj=self, …).
You need to override this method if your ActorModule is not picklable.
Note
Everything needs to be saved into a single binary file. tmrl reads this file and transfers its content over the network.
- Parameters:
path (pathlib.Path) – a filepath to save your ActorModule to
- to(device)[source]
Move and/or cast the parameters and buffers.
This can be called as
- to(device=None, dtype=None, non_blocking=False)[source]
- to(dtype, non_blocking=False)[source]
- to(tensor, non_blocking=False)[source]
- to(memory_format=torch.channels_last)[source]
Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.
See below for examples.
Note
This method modifies the module in-place.
- Parameters:
device (torch.device) – the desired device of the parameters and buffers in this module
dtype (torch.dtype) – the desired floating point or complex dtype of the parameters and buffers in this module
tensor (torch.Tensor) – Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module
memory_format (torch.memory_format) – the desired memory format for 4D parameters and buffers in this module (keyword only argument)
- Returns:
self
- Return type:
Module
Examples:
>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)
>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
- to_device(device)[source]
Set the ActorModule’s relevant attributes to the designated device.
By default, this method is a no-op and returns self.
- Parameters:
device – the device where to move relevant attributes (e.g., “cpu” or “cuda:0”)
- Returns:
an ActorModule whose relevant attributes are moved to device (can be self)