OLM API Reference

`olm.models.facebook`

Source: src/olm/models/facebook/__init__.py:1

Classes

OPT125M()

Bases: olm.models.facebook.opt.OPTModel

Source: src/olm/models/facebook/opt.py:131

OPT 125M model definition: a preconfigured OPTModel for the 125M-parameter OPT variant. The constructor takes no arguments.

Methods

forward(self, x: torch.Tensor) -> torch.Tensor (inherited from Block)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

  • x: Input tensor.

Returns

Output tensor after all blocks have been applied.
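The sequential behaviour described above can be pictured with a small sketch. It uses plain Python callables rather than torch modules, and `SequentialBlocks` is a hypothetical stand-in for illustration, not the actual `Block` class from olm.

```python
from typing import Callable, Sequence


class SequentialBlocks:
    """Hypothetical stand-in for a container that chains sub-blocks."""

    def __init__(self, blocks: Sequence[Callable[[float], float]]) -> None:
        self.blocks = list(blocks)

    def forward(self, x: float) -> float:
        # Feed the output of each block into the next one, in order.
        for block in self.blocks:
            x = block(x)
        return x


# Order matters: (3 + 1) * 2 differs from 3 * 2 + 1.
add_then_double = SequentialBlocks([lambda v: v + 1, lambda v: v * 2])
double_then_add = SequentialBlocks([lambda v: v * 2, lambda v: v + 1])
print(add_then_double.forward(3))  # 8
print(double_then_add.forward(3))  # 7
```

An empty list of blocks returns the input unchanged in this sketch; whether the real `Block` allows that is not stated in this reference.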

OPTModel(vocab_size, embed_dim, intermediate_size, num_layers, num_heads, dropout=0.1, tie_weights=True)

Bases: olm.nn.structure.block.Block

Source: src/olm/models/facebook/opt.py:69

OPT Model Definition.

Implements a decoder-only Transformer with the following OPT design choices:

  • Pre-normalization with LayerNorm
  • Multi-Head Attention with Causal Masking
  • ReLU activation in Feed-Forward Networks
  • Tied output projection through OutputHead by default
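As an illustration of the causal masking item in the list above, here is a minimal sketch in plain Python. The helper names `causal_mask` and `masked_softmax` are hypothetical and are not part of olm; the real implementation presumably operates on torch tensors.

```python
import math
from typing import List


def causal_mask(seq_len: int) -> List[List[bool]]:
    # mask[i][j] is True when position i may attend to position j (j <= i).
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def masked_softmax(scores: List[List[float]]) -> List[List[float]]:
    """Row-wise softmax over a square score matrix, ignoring future positions."""
    mask = causal_mask(len(scores))
    weights = []
    for row, allowed in zip(scores, mask):
        # Subtract the max of the allowed scores for numerical stability.
        row_max = max(s for s, ok in zip(row, allowed) if ok)
        # Masked (future) positions contribute exactly zero weight.
        exps = [math.exp(s - row_max) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


attn = masked_softmax([[0.0, 5.0, 5.0], [1.0, 1.0, 9.0], [2.0, 2.0, 2.0]])
for row in attn:
    print(row)
```

The first row attends only to itself, however large the scores at later positions are, which is the property that keeps a decoder-only model from looking at tokens it has not generated yet.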

Forward

Accepts token IDs shaped [batch, seq_len] and returns logits shaped [batch, seq_len, vocab_size].
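The shape contract above can be traced with a toy sketch. It skips the decoder layers entirely and reuses the embedding table as the output projection, which is one common reading of weight tying; `tied_logits` is a hypothetical helper written with nested lists, not an olm function.

```python
from typing import List

Matrix = List[List[float]]


def tied_logits(token_ids: List[List[int]], embedding: Matrix) -> List[List[List[float]]]:
    """Map [batch, seq_len] token IDs to [batch, seq_len, vocab_size] logits.

    The decoder layers are omitted; the embedding table doubles as the
    output projection to mimic weight tying.
    """
    logits = []
    for sequence in token_ids:
        seq_logits = []
        for token in sequence:
            hidden = embedding[token]  # vector of length embed_dim
            # One score per vocabulary entry: dot product with each embedding row.
            seq_logits.append(
                [sum(h * w for h, w in zip(hidden, row)) for row in embedding]
            )
        logits.append(seq_logits)
    return logits


# Toy sizes: vocab_size=4, embed_dim=2, batch=2, seq_len=3.
embedding = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
token_ids = [[0, 1, 2], [3, 3, 0]]
logits = tied_logits(token_ids, embedding)
print(len(logits), len(logits[0]), len(logits[0][0]))  # 2 3 4
```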

Parameters

  • vocab_size (int): Vocabulary size.
  • embed_dim (int): Embedding dimension.
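  • intermediate_size (int): Hidden dimension of the feed-forward network (FFN).
  • num_layers (int): Number of decoder layers.
  • num_heads (int): Number of attention heads.
  • dropout (float, optional): Dropout probability. Defaults to 0.1.
  • tie_weights (bool, optional): Whether the output projection is tied. Defaults to True.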
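To give a feel for how these constructor arguments interact, the sketch below estimates a weight count from them. It is a rough estimate under stated assumptions, not olm's own accounting: `estimate_params` is a hypothetical helper, it assumes standard multi-head attention with biased linear layers and a bias-free output head, and it ignores positional embeddings and any final normalization. The sample values are the hyperparameters published for OPT-125M; this reference does not confirm that `OPT125M` uses exactly these.

```python
def estimate_params(
    vocab_size: int,
    embed_dim: int,
    intermediate_size: int,
    num_layers: int,
    num_heads: int,
    tie_weights: bool = True,
) -> int:
    """Rough weight count; positional embeddings and a final norm are ignored."""
    # Assumption: heads split embed_dim evenly, as in standard multi-head attention.
    if embed_dim % num_heads != 0:
        raise ValueError("embed_dim must be divisible by num_heads")
    embedding = vocab_size * embed_dim
    # Q, K, V and output projections, each embed_dim x embed_dim plus a bias.
    attention = 4 * (embed_dim * embed_dim + embed_dim)
    # Two linear layers: embed_dim -> intermediate_size -> embed_dim, with biases.
    ffn = (embed_dim * intermediate_size + intermediate_size) + (
        intermediate_size * embed_dim + embed_dim
    )
    # Two LayerNorms per layer, each with a scale and a shift vector.
    norms = 2 * 2 * embed_dim
    total = embedding + num_layers * (attention + ffn + norms)
    if not tie_weights:
        # An untied output head needs its own vocab_size x embed_dim matrix.
        total += vocab_size * embed_dim
    return total


# Hyperparameters published for OPT-125M (assumed here, not read from olm).
config = dict(vocab_size=50272, embed_dim=768, intermediate_size=3072,
              num_layers=12, num_heads=12)
tied = estimate_params(**config)
untied = estimate_params(**config, tie_weights=False)
print(tied, untied - tied)
```

Under these assumptions, tying saves exactly vocab_size × embed_dim weights, which for a small model is a large share of the total. Dropout adds no parameters, so it does not appear in the estimate.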

Methods

forward(self, x: torch.Tensor) -> torch.Tensor (inherited from Block)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

  • x: Input tensor.

Returns

Output tensor after all blocks have been applied.