Server#
- class fl_sim.nodes.Server(model: Module, dataset: FedDataset, config: ServerConfig, client_config: ClientConfig, lazy: bool = False)[source]#
Bases: Node, CitationMixin
The class to simulate the server node.
The server node is responsible for communicating with the clients, performing the aggregation of the local model parameters (and/or gradients), and updating the global model parameters.
- Parameters:
model (torch.nn.Module) – The model to be trained (optimized).
dataset (FedDataset) – The dataset to be used for training.
config (ServerConfig) – The configs for the server.
client_config (ClientConfig) – The configs for the clients.
lazy (bool, default False) – Whether to use lazy initialization for the client nodes. This is useful when one wants to do centralized training for verification.
TODO
Run client training in parallel.
Use the attribute _is_convergent to control the termination of the training. This could perhaps be achieved by comparing a subset of the items in self._cached_models.
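To make the communicate/aggregate/update cycle described above concrete, here is a minimal, self-contained sketch in plain PyTorch. It is an illustration only: it does not use fl_sim APIs, and the helper local_update is a hypothetical stand-in for one client's local training.

```python
import copy

import torch
import torch.nn as nn

global_model = nn.Linear(10, 2)  # toy global model

def local_update(model: nn.Module) -> nn.Module:
    """Hypothetical stand-in for one client's local training."""
    client_model = copy.deepcopy(model)
    # ... local SGD steps would happen here ...
    return client_model

# One communication round: broadcast, local training, then averaging.
client_models = [local_update(global_model) for _ in range(4)]
with torch.no_grad():
    for name, param in global_model.named_parameters():
        stacked = torch.stack(
            [dict(m.named_parameters())[name] for m in client_models]
        )
        param.copy_(stacked.mean(dim=0))
```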
- add_parameters(params: Iterable[Parameter], ratio: float) → None [source]#
Update the server’s parameters with the given parameters.
- Parameters:
params (Iterable[torch.nn.parameter.Parameter]) – The parameters to be added.
ratio (float) – The ratio (weight) with which the given parameters are added to the server's parameters.
- Return type:
None
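A minimal sketch of the intended semantics, assuming each server parameter is incremented in place by ratio times the corresponding incoming parameter (plain PyTorch, not the library's actual implementation):

```python
import torch

# Toy stand-ins; in fl_sim these would be the server's and a client's parameters.
server_params = [torch.nn.Parameter(torch.zeros(3))]
incoming = [torch.nn.Parameter(torch.ones(3))]
ratio = 0.5

with torch.no_grad():
    for sp, inc in zip(server_params, incoming):
        sp.add_(inc, alpha=ratio)  # sp += ratio * inc

print(server_params[0].data)  # tensor([0.5000, 0.5000, 0.5000])
```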
- aggregate_client_metrics(ignore: Sequence[str] | None = None) → None [source]#
Aggregate the metrics transmitted from the clients.
- Parameters:
ignore (Sequence[str], optional) – The metrics to ignore.
- Return type:
None
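For illustration, a plausible aggregation is a per-key average over the clients' metric dicts, skipping the ignored keys. This is an assumption about the semantics, shown in plain Python:

```python
# Hypothetical metrics as transmitted from two clients (illustrative values).
client_metrics = [
    {"acc": 0.90, "loss": 0.35, "num_samples": 120},
    {"acc": 0.80, "loss": 0.50, "num_samples": 80},
]
ignore = {"num_samples"}

keys = [k for k in client_metrics[0] if k not in ignore]
aggregated = {
    k: sum(m[k] for m in client_metrics) / len(client_metrics) for k in keys
}
print(aggregated)  # {'acc': 0.85..., 'loss': 0.425}
```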
- avg_parameters(size_aware: bool = False, inertia: float = 0.0) → None [source]#
Update the server’s parameters via averaging the parameters received from the clients.
- Parameters:
size_aware (bool, default False) – Whether to use size-aware averaging, i.e. a weighted average of the parameters in which each client's weight is its number of training samples. From the viewpoint of optimization theory, it is recommended to leave this set to False.
inertia (float, default 0.0) – The weight of the previous parameters, should be in the range [0, 1).
- Return type:
None
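The documented update rule can be sketched as new = inertia * old + (1 - inertia) * avg, where avg is either a uniform or a sample-size-weighted mean of the client parameters. Again a plain-PyTorch illustration of the semantics, not the library code:

```python
import torch

old = torch.zeros(3)  # current server parameter (one tensor, for brevity)
client_params = [torch.ones(3), 3 * torch.ones(3)]
sizes = [100, 300]  # clients' numbers of training samples (illustrative)
inertia, size_aware = 0.2, True

if size_aware:
    weights = torch.tensor(sizes, dtype=torch.float) / sum(sizes)
else:
    weights = torch.full((len(client_params),), 1.0 / len(client_params))

avg = sum(w * p for w, p in zip(weights, client_params))
new = inertia * old + (1 - inertia) * avg
print(new)  # tensor([2., 2., 2.]) for these numbers
```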
- abstract property config_cls: Dict[str, type]#
Classes of the client node config and the server node config, keyed by “client” and “server”.
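A concrete subclass is expected to provide this mapping. The following sketch (class names, and the import paths for ServerConfig and ClientConfig, are assumptions for illustration) shows the shape of such an override:

```python
from typing import Dict

from fl_sim.nodes import ClientConfig, Server, ServerConfig  # paths assumed


class MyServerConfig(ServerConfig):  # hypothetical algorithm-specific config
    ...


class MyClientConfig(ClientConfig):  # hypothetical algorithm-specific config
    ...


class MyServer(Server):
    @property
    def config_cls(self) -> Dict[str, type]:
        return {"server": MyServerConfig, "client": MyClientConfig}
```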
- evaluate_centralized(dataloader: DataLoader) → Dict[str, float] [source]#
Evaluate the model on the given dataloader on the server node.
- Parameters:
dataloader (DataLoader) – The dataloader for evaluation.
- Returns:
metrics – The metrics of the model on the given dataloader.
- Return type:
Dict[str, float]
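Typical usage, as a hedged sketch: server stands for an already-constructed Server instance and test_set for a torch Dataset; both are assumptions for illustration.

```python
from torch.utils.data import DataLoader

# `server` and `test_set` are assumed to exist (see the note above).
test_loader = DataLoader(test_set, batch_size=64, shuffle=False)
metrics = server.evaluate_centralized(test_loader)
print(metrics)  # e.g. {"acc": 0.91, "loss": 0.32} (illustrative values)
```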
- get_cached_metrics(client_idx: int | None = None) → List[Dict[str, float]] [source]#
Get the cached metrics of the given client, or the cached aggregated metrics stored on the server.
- Parameters:
client_idx (int, optional) – The index of the client. If None, the aggregated metrics cached on the server are returned.
- Return type:
List[Dict[str, float]]
- get_client_data(client_idx: int) → Tuple[Tensor, Tensor] [source]#
Get all the data of the given client.
This method is a helper function for fast access to the data of the given client.
- Parameters:
client_idx (int) – The index of the client.
- Returns:
Input data and labels of the given client.
- Return type:
Tuple[Tensor, Tensor]
- get_client_model(client_idx: int) → Module [source]#
Get the model of the given client.
This method is a helper function for fast access to the model of the given client.
- Parameters:
client_idx (int) – The index of the client.
- Returns:
The model of the given client.
- Return type:
torch.nn.Module
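Usage of the two helper accessors above, as a hedged sketch (server is an assumed, already-constructed Server instance):

```python
# `server` is assumed to exist; index 0 is an arbitrary client for illustration.
inputs, labels = server.get_client_data(client_idx=0)
print(inputs.shape, labels.shape)

client_model = server.get_client_model(client_idx=0)
print(sum(p.numel() for p in client_model.parameters()))  # parameter count
```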
- train(mode: str = 'federated', extra_configs: dict | None = None) → None [source]#
The main training loop.
- Parameters:
mode ({"federated", "centralized", "local"}, optional) – The mode of training, by default “federated”, case-insensitive.
extra_configs (dict, optional) – The extra configs for the training mode.
- Return type:
None
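A hedged usage sketch (server is an assumed, already-constructed Server instance; the mode strings are those documented above):

```python
# `server` is assumed to exist (see the note above).
server.train(mode="federated")    # the default federated training loop
server.train(mode="centralized")  # baseline training on the server only
server.train(mode="local")        # local training mode, as listed above
```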
- train_centralized(extra_configs: dict | None = None) → None [source]#
Centralized training, conducted only on the server node.
This is used as a baseline for comparison.
- Parameters:
extra_configs (dict, optional) – The extra configs for centralized training.
- Return type:
None
- train_federated(extra_configs: dict | None = None) → None [source]#
Federated (distributed) training, conducted on the clients and the server.
- Parameters:
extra_configs (dict, optional) – The extra configs for federated training.
- Return type:
None
TODO
Run client training in parallel.