OLM API Reference

`olm.models.openai`

Source: src/olm/models/openai/__init__.py:1

Classes

GPT2()

Bases: olm.models.openai.gpt2.GPT2Model

Source: src/olm/models/openai/gpt2.py:63

GPT-2 Small (124M parameters).
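The 124M figure can be reproduced from hyperparameters alone. The sketch below is not OLM code: `gpt2_param_count` and `ASSUMED_CONFIGS` are hypothetical names, the hyperparameters are the published GPT-2 configurations (vocabulary of 50257, context of 1024) rather than values stated in this reference, and the per-block formula assumes the standard GPT-2 layout (biased linear layers, a 4x-wide MLP, two LayerNorms per block, one final LayerNorm).

```python
def gpt2_param_count(vocab_size, embed_dim, num_layers, max_seq_len, tie_weights=True):
    """Count parameters of a GPT-2 style decoder (assumed standard layout)."""
    token_embedding = vocab_size * embed_dim
    position_embedding = max_seq_len * embed_dim
    # Per block: two LayerNorms (4d), fused QKV projection (3d^2 + 3d),
    # attention output projection (d^2 + d), 4x-wide MLP (8d^2 + 5d).
    per_block = 12 * embed_dim ** 2 + 13 * embed_dim
    final_layer_norm = 2 * embed_dim
    # Assumption: an untied head adds one bias-free vocab_size x embed_dim matrix.
    output_head = 0 if tie_weights else vocab_size * embed_dim
    return (token_embedding + position_embedding
            + num_layers * per_block + final_layer_norm + output_head)


# Assumed hyperparameters (published GPT-2 sizes, not read from OLM).
ASSUMED_CONFIGS = {
    "GPT2": dict(embed_dim=768, num_layers=12),
    "GPT2Medium": dict(embed_dim=1024, num_layers=24),
    "GPT2Large": dict(embed_dim=1280, num_layers=36),
    "GPT2XL": dict(embed_dim=1600, num_layers=48),
}

counts = {
    name: gpt2_param_count(vocab_size=50257, max_seq_len=1024, **cfg)
    for name, cfg in ASSUMED_CONFIGS.items()
}
print(counts["GPT2"])  # 124439808
```

Under the same assumptions the other variants come out near 355M, 774M, and 1.56B, which matches the sizes quoted for GPT2Medium, GPT2Large, and GPT2XL.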

Methods

forward(self, x: torch.Tensor) -> torch.Tensor (inherited from Block)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

  • x: Input tensor.

Returns

Output tensor after all blocks have been applied.

GPT2Large()

Bases: olm.models.openai.gpt2.GPT2Model

Source: src/olm/models/openai/gpt2.py:85

GPT-2 Large (774M parameters).

Methods

forward(self, x: torch.Tensor) -> torch.Tensor (inherited from Block)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

  • x: Input tensor.

Returns

Output tensor after all blocks have been applied.

GPT2Medium()

Bases: olm.models.openai.gpt2.GPT2Model

Source: src/olm/models/openai/gpt2.py:74

GPT-2 Medium (355M parameters).

Methods

forward(self, x: torch.Tensor) -> torch.Tensor (inherited from Block)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

  • x: Input tensor.

Returns

Output tensor after all blocks have been applied.

GPT2Model(vocab_size: int, embed_dim: int, num_layers: int, num_heads: int, max_seq_len: int, dropout: float = 0.1, tie_weights: bool = True)

Bases: olm.nn.structure.block.Block

Source: src/olm/models/openai/gpt2.py:34

Base class for GPT-2 models.

Structure

Token embedding + learned positional embedding -> GPT2Block x num_layers -> OutputHead (weights tied to the token embedding when tie_weights is True).
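Weight tying conventionally means the output projection reuses the token-embedding matrix, so no separate head matrix is stored. Whether OLM's OutputHead does exactly this is an assumption; the plain-Python sketch below uses a hypothetical toy table and hypothetical helper names (`embed`, `tied_logits`) only to show the idea.

```python
# Hypothetical toy table: 3 tokens, embedding width 2.
embedding = [
    [1.0, 0.0],  # token 0
    [0.0, 1.0],  # token 1
    [1.0, 1.0],  # token 2
]


def embed(token_id):
    """Input side: look up the row for a token ID."""
    return embedding[token_id]


def tied_logits(hidden):
    """Output side: score the hidden vector against the same rows."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in embedding]


print(tied_logits(embed(2)))  # [1.0, 1.0, 2.0]
```

Because both functions read the same table, a training update to a row changes both how that token is embedded and how it is scored.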

Forward

Accepts token IDs shaped [batch, seq_len] and returns logits shaped [batch, seq_len, vocab_size].
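The shape contract above can be traced stage by stage. This is a shape-only sketch in plain Python, not OLM code: `trace_shapes` is a hypothetical helper, the intermediate shapes follow the usual decoder-only layout, and the length check assumes the learned positional table has exactly max_seq_len rows (how OLM reports an over-long input is not documented here).

```python
def trace_shapes(batch, seq_len, vocab_size, embed_dim, max_seq_len):
    """Return (stage, shape) pairs for one forward pass (assumed layout)."""
    if seq_len > max_seq_len:
        # Assumption: positions beyond the learned table cannot be embedded.
        raise ValueError(f"seq_len {seq_len} exceeds max_seq_len {max_seq_len}")
    return [
        ("token IDs", (batch, seq_len)),
        ("token + positional embeddings", (batch, seq_len, embed_dim)),
        ("after GPT2Block stack", (batch, seq_len, embed_dim)),
        ("logits", (batch, seq_len, vocab_size)),
    ]


# Example sizes are illustrative, not OLM defaults.
for stage, shape in trace_shapes(batch=2, seq_len=16, vocab_size=50257,
                                 embed_dim=768, max_seq_len=1024):
    print(f"{stage}: {shape}")
```

Only the last axis changes through the model: it is absent for token IDs, embed_dim inside the block stack, and vocab_size at the output.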

Methods

forward(self, x: torch.Tensor) -> torch.Tensor (inherited from Block)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

  • x: Input tensor.

Returns

Output tensor after all blocks have been applied.
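"Apply each block in sequence" means the output of one sub-block becomes the input of the next. The following is a framework-free sketch of that behavior, assuming nothing about how Block stores its children; `SequentialBlock` is a hypothetical stand-in and the integer lambdas take the place of tensor-to-tensor layers.

```python
class SequentialBlock:
    """Plain-Python stand-in for a container that chains its sub-blocks."""

    def __init__(self, *blocks):
        self.blocks = list(blocks)

    def forward(self, x):
        # Feed each sub-block the previous sub-block's output.
        for block in self.blocks:
            x = block(x)
        return x

    # Let instances be called like functions, as module objects usually are.
    __call__ = forward


add_one = lambda x: x + 1
double = lambda x: x * 2

print(SequentialBlock(add_one, double)(3))  # 8
print(SequentialBlock(double, add_one)(3))  # 7
```

The two results differ because order matters: in GPT2Model the embeddings must run before the GPT2Block stack, and the stack before the output head.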

GPT2XL()

Bases: olm.models.openai.gpt2.GPT2Model

Source: src/olm/models/openai/gpt2.py:96

GPT-2 XL (1.5B parameters).

Methods

forward(self, x: torch.Tensor) -> torch.Tensor (inherited from Block)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

  • x: Input tensor.

Returns

Output tensor after all blocks have been applied.