RNN_StackOverFlow#
- class fl_sim.models.RNN_StackOverFlow(vocab_size: int = 10000, num_oov_buckets: int = 1, embedding_size: int = 96, latent_size: int = 670, num_layers: int = 1)[source]#
Bases: Module, CLFMixin, SizeMixin, DiffMixin
Creates an RNN model using LSTM layers for StackOverFlow (next-word prediction task).
This replicates the model structure in the paper [Reddi et al.[1]].
Modified from FedML.
- Parameters:
vocab_size (int, default 10000) – The number of different words that can appear in the input.
num_oov_buckets (int, default 1) – The number of out-of-vocabulary buckets.
embedding_size (int, default 96) – The size of each embedding vector.
latent_size (int, default 670) – The number of features in the hidden state h.
num_layers (int, default 1) – The number of recurrent layers (torch.nn.LSTM).
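The internal layer stack is not spelled out on this page; the following is a minimal sketch, assuming the FedML-style architecture the class is modified from: an embedding over an extended vocabulary of vocab_size + 3 + num_oov_buckets entries (pad, beginning-of-sentence, end-of-sentence, and OOV tokens), a multi-layer LSTM with latent_size hidden units, and two linear layers producing per-token logits. The class name, the batch_first layout, and the extended-vocabulary formula are illustrative assumptions, not guaranteed to match fl_sim exactly.

```python
import torch
import torch.nn as nn


class NextWordLSTMSketch(nn.Module):
    """Hypothetical stand-in for the layer stack described above (not fl_sim's code)."""

    def __init__(self, vocab_size: int = 10000, num_oov_buckets: int = 1,
                 embedding_size: int = 96, latent_size: int = 670, num_layers: int = 1):
        super().__init__()
        # Assumed extended vocabulary: pad + bos + eos + OOV buckets (FedML convention).
        self.extended_vocab_size = vocab_size + 3 + num_oov_buckets
        self.word_embeddings = nn.Embedding(self.extended_vocab_size, embedding_size, padding_idx=0)
        self.lstm = nn.LSTM(embedding_size, latent_size, num_layers=num_layers, batch_first=True)
        self.fc1 = nn.Linear(latent_size, embedding_size)
        self.fc2 = nn.Linear(embedding_size, self.extended_vocab_size)

    def forward(self, input_seq: torch.Tensor, hidden_state=None) -> torch.Tensor:
        # input_seq: (batch_size, seq_len), dtype torch.long
        embeds = self.word_embeddings(input_seq)                   # (batch, seq, embed)
        lstm_out, hidden_state = self.lstm(embeds, hidden_state)   # (batch, seq, latent)
        logits = self.fc2(self.fc1(lstm_out))                      # (batch, seq, extended_vocab)
        # (batch, extended_vocab, seq): the layout expected by nn.CrossEntropyLoss
        return logits.transpose(1, 2)
```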
References
- forward(input_seq: Tensor, hidden_state: Tensor | None = None) → Tensor[source]#
Forward pass.
- Parameters:
input_seq (torch.Tensor) – Shape (batch_size, seq_len), dtype torch.long.
hidden_state (torch.Tensor, optional) – Shape (num_layers, batch_size, latent_size), dtype torch.float32.
- Returns:
Shape (batch_size, extended_vocab_size, seq_len), dtype torch.float32.
- Return type:
torch.Tensor