`olm.models.openai`

Source: src/olm/models/openai/__init__.py:1

Classes

`GPT2()`

Bases: olm.models.openai.gpt2.GPT2Model

Source: src/olm/models/openai/gpt2.py:63

GPT-2 Small (124M).

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.
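The inherited `forward` is described as applying each block to the input in sequence. The following is a minimal, framework-free sketch of that pattern, not the actual `Block` implementation: `apply_in_sequence` and the two toy callables are hypothetical names introduced only for illustration.

```python
from functools import reduce
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def apply_in_sequence(blocks: Sequence[Callable[[T], T]], x: T) -> T:
    """Feed x through each block in order; each output becomes the next input."""
    # Hypothetical helper: illustrates the documented behaviour only.
    return reduce(lambda acc, block: block(acc), blocks, x)


def double(v: int) -> int:
    return v * 2


def add_one(v: int) -> int:
    return v + 1


# Order matters: (3 * 2) + 1 differs from (3 + 1) * 2.
first = apply_in_sequence([double, add_one], 3)
second = apply_in_sequence([add_one, double], 3)
print(first, second)
```

With an empty list of blocks the input is returned unchanged, which is the usual identity behaviour of a sequential container.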

`GPT2Large()`

Bases: olm.models.openai.gpt2.GPT2Model

Source: src/olm/models/openai/gpt2.py:85

GPT-2 Large (774M).

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.

`GPT2Medium()`

Bases: olm.models.openai.gpt2.GPT2Model

Source: src/olm/models/openai/gpt2.py:74

GPT-2 Medium (355M).

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.

`GPT2Model(vocab_size: int, embed_dim: int, num_layers: int, num_heads: int, max_seq_len: int, dropout: float = 0.1, tie_weights: bool = True)`

Bases: olm.nn.structure.block.Block

Source: src/olm/models/openai/gpt2.py:34

Base class for GPT-2 models.

Structure

Token embedding + learned positional embedding -> GPT2Block x num_layers -> OutputHead (weight-tied when tie_weights is True, the default).
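The parameter counts quoted for the presets (124M, 355M, 774M, 1.5B) can be checked against this structure with a little arithmetic. The sketch below assumes the hyperparameters of the originally published GPT-2 checkpoints (vocabulary 50257, context 1024, and the listed widths and depths) and the original layer layout (biased linear layers, a 4x MLP, LayerNorm with weight and bias). None of these values are stated on this page, so treat them as assumptions; `GPT2Config`, `count_params`, and `PRESETS` are hypothetical names, not part of `olm`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GPT2Config:
    """Hypothetical container mirroring some GPT2Model constructor arguments."""
    vocab_size: int
    embed_dim: int
    num_layers: int
    max_seq_len: int


def count_params(cfg: GPT2Config, tie_weights: bool = True) -> int:
    """Count parameters under the assumed original GPT-2 layer layout."""
    d = cfg.embed_dim
    token_embedding = cfg.vocab_size * d
    position_embedding = cfg.max_seq_len * d
    # Per block: two LayerNorms (4d), QKV projection (3d^2 + 3d),
    # attention output projection (d^2 + d), MLP up (4d^2 + 4d), MLP down (4d^2 + d).
    per_block = 12 * d * d + 13 * d
    final_norm = 2 * d
    # A tied output head reuses the token embedding, so it adds nothing.
    # An untied head is assumed here to be a bias-free vocab_size x d matrix.
    head = 0 if tie_weights else cfg.vocab_size * d
    return (token_embedding + position_embedding
            + cfg.num_layers * per_block + final_norm + head)


# Assumed preset hyperparameters (from the published GPT-2 checkpoints).
PRESETS = {
    "GPT2": GPT2Config(50257, 768, 12, 1024),
    "GPT2Medium": GPT2Config(50257, 1024, 24, 1024),
    "GPT2Large": GPT2Config(50257, 1280, 36, 1024),
    "GPT2XL": GPT2Config(50257, 1600, 48, 1024),
}

for name, cfg in PRESETS.items():
    print(f"{name}: {count_params(cfg):,}")
```

Under these assumptions, untying the head adds `vocab_size * embed_dim` parameters, which is roughly 38.6M for the smallest preset.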

Forward

Accepts token IDs shaped [batch, seq_len] and returns logits shaped [batch, seq_len, vocab_size].
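To make the shape contract concrete, here is a toy, pure-Python sketch of the embedding and tied-head steps only. It deliberately skips the transformer blocks and does not use `olm` or PyTorch; `toy_forward` and all sizes are hypothetical and chosen to keep the example small.

```python
import random


def toy_forward(token_ids, tok_emb, pos_emb):
    """Map [batch][seq_len] token IDs to [batch][seq_len][vocab_size] logits."""
    logits = []
    for seq in token_ids:
        seq_logits = []
        for pos, tok in enumerate(seq):
            # Token embedding plus learned positional embedding.
            hidden = [t + p for t, p in zip(tok_emb[tok], pos_emb[pos])]
            # The GPT2Block stack would transform `hidden` here; omitted in this sketch.
            # Tied head: reuse the token embedding rows as output projection weights.
            seq_logits.append(
                [sum(h * w for h, w in zip(hidden, row)) for row in tok_emb]
            )
        logits.append(seq_logits)
    return logits


random.seed(0)
vocab_size, embed_dim, max_seq_len = 11, 4, 8  # toy sizes, not GPT-2 values
tok_emb = [[random.gauss(0, 0.02) for _ in range(embed_dim)] for _ in range(vocab_size)]
pos_emb = [[random.gauss(0, 0.02) for _ in range(embed_dim)] for _ in range(max_seq_len)]

token_ids = [[1, 5, 2], [7, 0, 3]]  # batch = 2, seq_len = 3
logits = toy_forward(token_ids, tok_emb, pos_emb)
shape = (len(logits), len(logits[0]), len(logits[0][0]))
print(shape)
```

The last dimension equals the vocabulary size because every position is scored against every row of the embedding table.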

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

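x: Input tensor. For GPT-2 models, token IDs shaped [batch, seq_len].

Returns

Output tensor after all blocks have been applied. For GPT-2 models, logits shaped [batch, seq_len, vocab_size].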

`GPT2XL()`

Bases: olm.models.openai.gpt2.GPT2Model

Source: src/olm/models/openai/gpt2.py:96

GPT-2 XL (1.5B).

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.
