`olm.models.alibaba.qwen2`

Source: src/olm/models/alibaba/qwen2.py:1

Classes

`Qwen2Block(embed_dim: int, intermediate_size: int, num_heads: int, num_kv_heads: int, max_seq_len: int, dropout: float, rope_theta: float, rms_norm_eps: float = 1e-06)`

Bases: olm.nn.structure.block.Block

Source: src/olm/models/alibaba/qwen2.py:9

A single Transformer block for Qwen 2.

Structure

x = x + GQA(RMSNorm(x))
x = x + SwiGLU(RMSNorm(x))
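The two residual updates above can be sketched in plain PyTorch. This is a minimal illustration of the pre-norm pattern, not the olm implementation: the class names `RMSNorm`, `SwiGLU`, and `PreNormBlock` are hypothetical stand-ins, `nn.Identity` takes the place of the GQA layer to keep the sketch short, and RoPE and dropout are left out.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    """Root-mean-square norm (illustrative stand-in, not olm's class)."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale each vector by the reciprocal of its root mean square.
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * inv_rms * self.weight


class SwiGLU(nn.Module):
    """Gated feed-forward network: down(silu(gate(x)) * up(x))."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))


class PreNormBlock(nn.Module):
    """Pre-norm residual block; `mixer` is a placeholder for the GQA layer."""

    def __init__(self, dim: int, hidden: int, mixer: nn.Module, eps: float = 1e-6):
        super().__init__()
        self.attn_norm = RMSNorm(dim, eps)
        self.mixer = mixer
        self.ffn_norm = RMSNorm(dim, eps)
        self.ffn = SwiGLU(dim, hidden)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.mixer(self.attn_norm(x))  # x = x + GQA(RMSNorm(x))
        x = x + self.ffn(self.ffn_norm(x))     # x = x + SwiGLU(RMSNorm(x))
        return x


# nn.Identity is an assumption made for brevity; a real block would use attention here.
block = PreNormBlock(dim=16, hidden=32, mixer=nn.Identity())
x = torch.randn(2, 5, 16)
y = block(x)
print(y.shape)
```

Because both sub-layers are added back onto `x`, the block preserves the input shape, which is what allows N such blocks to be stacked.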

Parameters

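embed_dim (int): Model (embedding) dimension.
intermediate_size (int): Hidden dimension of the SwiGLU feed-forward network.
num_heads (int): Number of attention (query) heads.
num_kv_heads (int): Number of key/value heads used by grouped-query attention (GQA).
max_seq_len (int): Maximum sequence length.
dropout (float): Dropout probability.
rope_theta (float): Base for rotary position embeddings (RoPE).
rms_norm_eps (float): Epsilon used by RMSNorm. Defaults to 1e-06.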
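To make the relationship between `embed_dim`, `num_heads`, and `num_kv_heads` concrete, here is a small pure-Python sketch of the head layout these parameters imply under the usual grouped-query attention convention. `AttentionLayout` is a hypothetical helper, not part of olm, and the divisibility checks are assumptions about typical GQA implementations rather than documented olm behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionLayout:
    """Derived head layout for GQA (illustrative helper, not part of olm)."""

    embed_dim: int
    num_heads: int
    num_kv_heads: int

    def __post_init__(self) -> None:
        # Assumed constraints, common to most GQA implementations.
        if self.embed_dim % self.num_heads != 0:
            raise ValueError("embed_dim must be divisible by num_heads")
        if self.num_heads % self.num_kv_heads != 0:
            raise ValueError("num_heads must be divisible by num_kv_heads")

    @property
    def head_dim(self) -> int:
        """Width of each attention head."""
        return self.embed_dim // self.num_heads

    @property
    def group_size(self) -> int:
        """Number of query heads that share one key/value head."""
        return self.num_heads // self.num_kv_heads

    @property
    def kv_proj_dim(self) -> int:
        """Output width of the key (or value) projection."""
        return self.num_kv_heads * self.head_dim

    def kv_head_for(self, query_head: int) -> int:
        """Index of the key/value head that a given query head attends with."""
        return query_head // self.group_size


layout = AttentionLayout(embed_dim=64, num_heads=8, num_kv_heads=2)
mapping = [layout.kv_head_for(h) for h in range(layout.num_heads)]
print(layout.head_dim, layout.group_size, layout.kv_proj_dim)  # 8 4 16
print(mapping)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

With `num_kv_heads` equal to `num_heads` this reduces to standard multi-head attention; smaller values shrink the key/value projections and the per-token KV cache proportionally.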

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.

`Qwen2Model(vocab_size: int, embed_dim: int, intermediate_size: int, num_layers: int, num_heads: int, num_kv_heads: int, max_seq_len: int, rope_theta: float, tie_weights: bool = True, dropout: float = 0.0, rms_norm_eps: float = 1e-06)`

Bases: olm.nn.structure.block.Block

Source: src/olm/models/alibaba/qwen2.py:44

Base class for Qwen 2 / 2.5 models.

Structure

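Embedding -> [Qwen2Block] x N -> RMSNorm -> OutputHead, where N is num_layers and the output head shares the embedding weights when tie_weights is True (the default).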

Forward

Accepts token IDs shaped [batch, seq_len] and returns logits shaped [batch, seq_len, vocab_size].
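The shape contract above, together with the tied output head, can be illustrated with plain PyTorch layers. This is a sketch under stated assumptions: it uses `nn.Embedding` and `nn.Linear` as stand-ins for olm's embedding and output head, skips the transformer blocks and final norm entirely, and picks small arbitrary sizes.

```python
import torch
import torch.nn as nn

# Arbitrary small sizes chosen for illustration only.
vocab_size, embed_dim = 100, 16
batch, seq_len = 2, 5

embedding = nn.Embedding(vocab_size, embed_dim)
output_head = nn.Linear(embed_dim, vocab_size, bias=False)

# Weight tying: both layers store a [vocab_size, embed_dim] matrix,
# so the output head can reuse the embedding parameter directly.
output_head.weight = embedding.weight

token_ids = torch.randint(0, vocab_size, (batch, seq_len))  # [batch, seq_len]
hidden = embedding(token_ids)                               # [batch, seq_len, embed_dim]
logits = output_head(hidden)                                # [batch, seq_len, vocab_size]

print(token_ids.shape, hidden.shape, logits.shape)
```

Tying removes one `vocab_size * embed_dim` matrix from the parameter count, which matters most for the smaller model sizes where the embedding is a large share of the total.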

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.

`Qwen2_5_0_5B()`

Bases: olm.models.alibaba.qwen2.Qwen2Model

Source: src/olm/models/alibaba/qwen2.py:165

Qwen 2.5 0.5B Model.

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.

`Qwen2_5_14B()`

Bases: olm.models.alibaba.qwen2.Qwen2Model

Source: src/olm/models/alibaba/qwen2.py:108

Qwen 2.5 14B Model.

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.

`Qwen2_5_1_5B()`

Bases: olm.models.alibaba.qwen2.Qwen2Model

Source: src/olm/models/alibaba/qwen2.py:151

Qwen 2.5 1.5B Model.

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.

`Qwen2_5_32B()`

Bases: olm.models.alibaba.qwen2.Qwen2Model

Source: src/olm/models/alibaba/qwen2.py:93

Qwen 2.5 32B Model.

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.

`Qwen2_5_3B()`

Bases: olm.models.alibaba.qwen2.Qwen2Model

Source: src/olm/models/alibaba/qwen2.py:137

Qwen 2.5 3B Model.

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.

`Qwen2_5_72B()`

Bases: olm.models.alibaba.qwen2.Qwen2Model

Source: src/olm/models/alibaba/qwen2.py:78

Qwen 2.5 72B Model.

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.

`Qwen2_5_7B()`

Bases: olm.models.alibaba.qwen2.Qwen2Model

Source: src/olm/models/alibaba/qwen2.py:123

Qwen 2.5 7B Model.

Methods

`forward(self, x: torch.Tensor) -> torch.Tensor` (inherited from `Block`)

Source: src/olm/nn/structure/block.py:26

Apply each block to the input in sequence.

Parameters

x: Input tensor.

Returns

Output tensor after all blocks have been applied.
