olm.nn.feedforward.classic_ffn

Classes

ClassicFFN(*args, **kwargs) — Standard Multi-Layer Perceptron (MLP) used in Transformer blocks.

class olm.nn.feedforward.classic_ffn.ClassicFFN(*args: Any, **kwargs: Any)

Bases: FeedForwardBase

Standard Multi-Layer Perceptron (MLP) used in Transformer blocks.

Implements a position-wise feed-forward network consisting of two linear transformations with a non-linear activation function in between.

Structure: Input -> Linear(embed_dim -> hidden_dim) -> Activation -> Dropout -> Linear(hidden_dim -> embed_dim) -> Dropout

hidden_dim

Dimension of the inner hidden layer.

  • Type: int

up_proj

Projection from embedding dim to hidden dim.

act

Activation function.

  • Type: nn.Module

down_proj

Projection from hidden dim to embedding dim.

dropout

Dropout layer.

  • Type: nn.Dropout

forward(x)

Forward pass of the feedforward network.

  • Parameters: x (torch.Tensor) – Input tensor of shape (batch, seq_len, embed_dim).
  • Returns: Output tensor of shape (batch, seq_len, embed_dim).
  • Return type: torch.Tensor
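The structure described above can be sketched as a standalone PyTorch module. This is a minimal illustrative equivalent, not the olm implementation: the class name `ClassicFFNSketch` and the constructor signature (`embed_dim`, `hidden_dim`, `p_drop`) are assumptions, since the actual `ClassicFFN` takes `*args, **kwargs`.

```python
import torch
import torch.nn as nn


class ClassicFFNSketch(nn.Module):
    """Sketch of the documented structure:
    Linear(embed_dim -> hidden_dim) -> Activation -> Dropout
    -> Linear(hidden_dim -> embed_dim) -> Dropout.
    """

    def __init__(self, embed_dim: int, hidden_dim: int, p_drop: float = 0.1):
        super().__init__()
        self.up_proj = nn.Linear(embed_dim, hidden_dim)
        self.act = nn.GELU()
        self.down_proj = nn.Linear(hidden_dim, embed_dim)
        self.dropout = nn.Dropout(p_drop)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, seq_len, embed_dim) -> (batch, seq_len, hidden_dim)
        h = self.dropout(self.act(self.up_proj(x)))
        # (batch, seq_len, hidden_dim) -> (batch, seq_len, embed_dim)
        return self.dropout(self.down_proj(h))


x = torch.randn(2, 8, 64)
ffn = ClassicFFNSketch(embed_dim=64, hidden_dim=256)
assert ffn(x).shape == x.shape  # output shape matches input shape
```

Note that the input and output dimensions are identical, so the block can be dropped into a residual connection without any reshaping.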

class olm.nn.feedforward.classic_ffn.FeedForwardBase(*args: Any, **kwargs: Any)

Bases: Module, ABC

Abstract base class for feedforward networks in a transformer block.

Defines the interface for FFNs/MLPs. Subclasses must implement the forward method.

embed_dim

The input and output dimension.

  • Type: int

abstractmethod forward(x: torch.Tensor) → torch.Tensor

Forward pass of the feedforward network.

  • Parameters: x (torch.Tensor) – Input tensor of shape (batch, seq_len, embed_dim).
  • Returns: Output tensor of shape (batch, seq_len, embed_dim).
  • Return type: torch.Tensor
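A subclass only has to implement `forward` while preserving the (batch, seq_len, embed_dim) contract. The sketch below mirrors that interface with a stand-in base class (`FeedForwardBaseSketch` and `IdentityFFN` are hypothetical names, not part of olm):

```python
from abc import ABC, abstractmethod

import torch
import torch.nn as nn


class FeedForwardBaseSketch(nn.Module, ABC):
    """Stand-in for FeedForwardBase: stores embed_dim and requires forward()."""

    def __init__(self, embed_dim: int):
        super().__init__()
        self.embed_dim = embed_dim

    @abstractmethod
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """Map (batch, seq_len, embed_dim) -> (batch, seq_len, embed_dim)."""


class IdentityFFN(FeedForwardBaseSketch):
    """Trivial subclass: satisfies the interface without transforming x."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x


x = torch.randn(1, 4, 32)
assert IdentityFFN(32)(x).shape == x.shape
```

Because `forward` is abstract, instantiating the base class directly raises a `TypeError`; only concrete subclasses can be constructed.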

class olm.nn.feedforward.classic_ffn.GELU(*args: Any, **kwargs: Any)

Bases: ActivationBase

GELU activation wrapper.

forward(x: torch.Tensor) → torch.Tensor

Apply activation to x.

class olm.nn.feedforward.classic_ffn.Linear(*args: Any, **kwargs: Any)

Bases: Linear

forward(x)