RNN_StackOverFlow#
- class fl_sim.models.RNN_StackOverFlow(vocab_size: int = 10000, num_oov_buckets: int = 1, embedding_size: int = 96, latent_size: int = 670, num_layers: int = 1)[source]#
Bases: Module, CLFMixin, SizeMixin, DiffMixin
Creates an RNN model using LSTM layers for StackOverFlow (next-word prediction task).
This replicates the model structure in the paper [Reddi et al.[1]].
Modified from FedML.
- Parameters:
vocab_size (int, default 10000) – The number of different words that can appear in the input.
num_oov_buckets (int, default 1) – The number of out-of-vocabulary buckets.
embedding_size (int, default 96) – The size of each embedding vector.
latent_size (int, default 670) – The number of features in the hidden state h.
num_layers (int, default 1) – The number of recurrent layers (torch.nn.LSTM).
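The internal layer stack is not spelled out on this page; the following is a minimal sketch, assuming the FedML-style architecture the class is modified from: an embedding over an extended vocabulary of vocab_size + 3 + num_oov_buckets entries (pad, beginning-of-sentence, end-of-sentence, and OOV tokens), a multi-layer LSTM with latent_size hidden units, and two linear layers producing per-token logits. The class name, the batch_first layout, and the extended-vocabulary formula are illustrative assumptions, not guaranteed to match fl_sim exactly.

```python
import torch
import torch.nn as nn


class NextWordLSTMSketch(nn.Module):
    """Hypothetical stand-in for the layer stack described above (not fl_sim's code)."""

    def __init__(self, vocab_size: int = 10000, num_oov_buckets: int = 1,
                 embedding_size: int = 96, latent_size: int = 670, num_layers: int = 1):
        super().__init__()
        # Assumed extended vocabulary: pad + bos + eos + OOV buckets (FedML convention).
        self.extended_vocab_size = vocab_size + 3 + num_oov_buckets
        self.word_embeddings = nn.Embedding(self.extended_vocab_size, embedding_size, padding_idx=0)
        self.lstm = nn.LSTM(embedding_size, latent_size, num_layers=num_layers, batch_first=True)
        self.fc1 = nn.Linear(latent_size, embedding_size)
        self.fc2 = nn.Linear(embedding_size, self.extended_vocab_size)

    def forward(self, input_seq: torch.Tensor, hidden_state=None) -> torch.Tensor:
        # input_seq: (batch_size, seq_len), dtype torch.long
        embeds = self.word_embeddings(input_seq)                   # (batch, seq, embed)
        lstm_out, hidden_state = self.lstm(embeds, hidden_state)   # (batch, seq, latent)
        logits = self.fc2(self.fc1(lstm_out))                      # (batch, seq, extended_vocab)
        # (batch, extended_vocab, seq): the layout expected by nn.CrossEntropyLoss
        return logits.transpose(1, 2)
```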
References
- forward(input_seq: Tensor, hidden_state: Tensor | None = None) → Tensor[source]#
Forward pass.
- Parameters:
input_seq (torch.Tensor) – Shape (batch_size, seq_len), dtype torch.long.
hidden_state (torch.Tensor, optional) – Shape (num_layers, batch_size, latent_size), dtype torch.float32.
- Returns:
Shape (batch_size, extended_vocab_size, seq_len), dtype torch.float32.
- Return type:
torch.Tensor