Source: src/olm/nn/activations/geglu.py:1
Classes
GeGLU(*, device: torch.device | None = None, dtype: torch.dtype | None = None) -> None
Bases: olm.nn.activations.base.ActivationBase
Source: src/olm/nn/activations/geglu.py:6
GeGLU activation function.
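Implements the GeGLU variant from "GLU Variants Improve Transformer":
GeGLU(x, W, V) = GELU(xW) * (xV)
This module takes no projection weights W and V of its own. Instead, gate and value are read from the input x:
GeGLU(x) = GELU(gate) * value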
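To make the arithmetic of GeGLU(x) = GELU(gate) * value concrete, here is a minimal dependency-free sketch. It is not the library's implementation. Two points are assumptions: it uses the exact erf-based GELU (the module may use a different approximation), and it treats the first half of the input as the gate and the second half as the value (the source does not state the order). The names `gelu` and `geglu_vector` are hypothetical.

```python
import math


def gelu(z: float) -> float:
    # Exact GELU: z * Phi(z), where Phi is the standard normal CDF.
    # Assumption: the module may use a tanh approximation instead.
    return 0.5 * z * (1.0 + math.erf(z / math.sqrt(2.0)))


def geglu_vector(x: list[float]) -> list[float]:
    # Hypothetical helper: apply GeGLU to one feature vector.
    if len(x) % 2 != 0:
        raise ValueError("feature vector length must be even")
    half = len(x) // 2
    # Assumption: gate is the first half, value is the second half.
    gates, values = x[:half], x[half:]
    return [gelu(g) * v for g, v in zip(gates, values)]


# Four input features produce two output features.
out = geglu_vector([0.0, 1.0, 5.0, 7.0])
```

The first output is zero because GELU(0) = 0, so a zero gate suppresses its value entirely. The second is 7 scaled by GELU(1).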
Parameters
device(torch.device, optional): Target device.
dtype(torch.dtype, optional): Target data type.
Methods
forward(self, x: torch.Tensor) -> torch.Tensor
Source: src/olm/nn/activations/geglu.py:20
Forward pass of GeGLU.
Parameters
x(torch.Tensor): Input tensor. Its last dimension carries both the gate and the value, so its size should be even.
Returns
torch.Tensor: Output tensor whose last dimension is half that of x. All other dimensions are unchanged.
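The shape behaviour described above can be sketched with a standalone PyTorch function. This is an illustration under stated assumptions, not the body of `GeGLU.forward`: the function name `geglu_reference` is hypothetical, and the gate-first, value-second split order is assumed because the source does not specify it.

```python
import torch
import torch.nn.functional as F


def geglu_reference(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical reference function, not the olm implementation.
    if x.shape[-1] % 2 != 0:
        raise ValueError("last dimension of x must be even")
    # Assumption: the first half is the gate, the second half is the value.
    gate, value = x.chunk(2, dim=-1)
    return F.gelu(gate) * value


# A batch of 2 sequences, 4 tokens each, 8 features per token.
x = torch.randn(2, 4, 8)
y = geglu_reference(x)  # last dimension is halved: (2, 4, 4)
```

In practice the preceding linear layer is usually sized to emit twice the desired hidden width, so that the halving here yields the intended feature count.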