olm.nn.norms¶
class olm.nn.norms.LayerNorm(*args: Any, **kwargs: Any)¶
Bases: NormBase
Layer Normalization layer.
Implements Layer Normalization as described in “Layer Normalization” (https://arxiv.org/abs/1607.06450). Normalizes the input across the feature dimension.
- Parameters:
- d_model (int) – The dimension of the model to normalize.
- eps (float, optional) – Small constant for numerical stability. Defaults to 1e-5.
- device (torch.device, optional) – Target device.
- dtype (torch.dtype, optional) – Target data type.
gamma¶
Learnable scale parameter.
- Type: nn.Parameter
beta¶
Learnable shift parameter.
- Type: nn.Parameter
forward(x: torch.Tensor) → torch.Tensor¶
Forward pass of LayerNorm.
- Parameters: x (torch.Tensor) – Input tensor of shape (batch_size, sequence_length, d_model).
- Returns: Normalized output tensor of the same shape.
- Return type: torch.Tensor
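The computation behind this forward pass can be sketched functionally. This is a minimal illustration of the Layer Normalization math described above (mean-centering and variance-scaling over the last dimension, then an affine transform), not the olm.nn.norms.LayerNorm implementation itself:

```python
import torch

def layer_norm(x: torch.Tensor, gamma: torch.Tensor, beta: torch.Tensor,
               eps: float = 1e-5) -> torch.Tensor:
    # Normalize over the last (feature) dimension, then scale and shift.
    mean = x.mean(dim=-1, keepdim=True)
    var = x.var(dim=-1, keepdim=True, unbiased=False)
    return (x - mean) / torch.sqrt(var + eps) * gamma + beta

x = torch.randn(2, 4, 8)   # (batch_size, sequence_length, d_model)
gamma = torch.ones(8)      # learnable scale, typically initialized to ones
beta = torch.zeros(8)      # learnable shift, typically initialized to zeros
y = layer_norm(x, gamma, beta)
```

With the default initialization shown, each output vector along d_model has (approximately) zero mean and unit variance, and the output shape matches the input shape.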
class olm.nn.norms.RMSNorm(*args: Any, **kwargs: Any)¶
Bases: NormBase
RMSNorm (Root Mean Square Layer Normalization) layer.
Implements RMSNorm as described in “Root Mean Square Layer Normalization” (https://arxiv.org/abs/1910.07467). A simplified variant of LayerNorm that drops mean-centering and the shift parameter, keeping only the re-scaling invariance property.
- Parameters:
- d_model (int) – The dimension of the model to normalize.
- eps (float, optional) – Small constant for numerical stability. Defaults to 1e-5.
- device (torch.device, optional) – Target device.
- dtype (torch.dtype, optional) – Target data type.
weight¶
Learnable scale parameter.
- Type: nn.Parameter
forward(x: torch.Tensor) → torch.Tensor¶
Forward pass of RMSNorm.
- Parameters: x (torch.Tensor) – Input tensor of shape (batch_size, sequence_length, d_model).
- Returns: Normalized output tensor of the same shape.
- Return type: torch.Tensor
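As with LayerNorm above, the forward pass can be sketched functionally. This is a minimal illustration of the RMSNorm math (dividing by the root mean square over the feature dimension, then applying a learnable scale), not the olm.nn.norms.RMSNorm implementation itself:

```python
import torch

def rms_norm(x: torch.Tensor, weight: torch.Tensor,
             eps: float = 1e-5) -> torch.Tensor:
    # Scale by the root mean square over the last (feature) dimension;
    # no mean subtraction and no shift parameter, unlike LayerNorm.
    rms = torch.sqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x / rms * weight

x = torch.randn(2, 4, 8)   # (batch_size, sequence_length, d_model)
weight = torch.ones(8)     # learnable scale, typically initialized to ones
y = rms_norm(x, weight)
```

With weight initialized to ones, each output vector along d_model has a root mean square of approximately 1, and the output shape matches the input shape.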
Modules¶
- base
- layer_norm
- rms_norm