This reference is generated from the public Python API in src/olm.
Each module page includes signatures, docstrings, and source-defined methods such as forward() where available.
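The generator behind this page is not shown here. As a rough sketch of how such an index could be built with the standard library alone, the following assumes that public names come from `__all__` (falling back to non-underscore names) and that each row is cut off after eight names with a "+N more" suffix, as the tables on this page appear to be. `public_api` and `signature_of` are hypothetical helper names, and the stdlib `json` module stands in for an `olm` module.

```python
import importlib
import inspect


def public_api(module_name, limit=8):
    """Return one table row summarizing a module's public classes and functions."""
    module = importlib.import_module(module_name)
    # Prefer an explicit __all__; otherwise fall back to non-underscore names.
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in dir(module) if not n.startswith("_")]
    # Keep only classes and callables, sorted for a stable listing.
    members = sorted(
        n for n in names
        if inspect.isclass(getattr(module, n, None))
        or inspect.isroutine(getattr(module, n, None))
    )
    shown = members[:limit]
    extra = len(members) - len(shown)
    summary = ", ".join(shown)
    if extra > 0:
        summary += f", +{extra} more"
    return f"| {module_name} | {summary} |"


def signature_of(obj):
    """Return a printable signature, or a placeholder if none is available."""
    try:
        return str(inspect.signature(obj))
    except (TypeError, ValueError):
        return "(...)"


if __name__ == "__main__":
    import json

    # `json` is only a stand-in; any importable module name works here.
    print(public_api("json", limit=3))
    print("dumps" + signature_of(json.dumps))
```

Sorting uses Python's default string order, which places uppercase names before lowercase ones; the rows on this page follow the same order (for example, `GPT2XL` precedes `Gemma2Model`).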

Core

| Module | Public API |
|---|---|
| olm.core.dist | all_gather, all_reduce, barrier, broadcast, cleanup_distributed, get_backend, get_local_rank, get_rank, +6 more |
| olm.core.registry | Registry |

Data

| Module | Public API |
|---|---|
| olm.data.datasets | BaseTextDataset, DataLoader, FineWebEduDataset, HuggingFaceTextDataset, LocalTextDataset |
| olm.data.datasets.base_dataset | BaseTextDataset |
| olm.data.datasets.data_loader | DataLoader |
| olm.data.datasets.fineweb_edu | FineWebEduDataset |
| olm.data.datasets.hf_dataset | FineWebEduDataset, HuggingFaceTextDataset |
| olm.data.datasets.local_dataset | LocalTextDataset |
| olm.data.tokenization.base | TokenizerBase |
| olm.data.tokenization.hf_tokenizer | HFTokenizer |
| olm.data.tokenization.hf_train_custom | HFTokenizerTrainCustom |

Logging

| Module | Public API |
|---|---|
| olm.logging | WandBCallback, create_sweep, get_sweep_config_template |
| olm.logging.wandb_logger | WandBCallback, create_sweep, get_sweep_config_template |

Models

| Module | Public API |
|---|---|
| olm.models | GPT2, GPT2Large, GPT2Medium, GPT2Model, GPT2XL, Gemma2Model, Gemma2_27B, Gemma2_2B, +28 more |
| olm.models.alibaba | Qwen2Model, Qwen2_5_0_5B, Qwen2_5_14B, Qwen2_5_1_5B, Qwen2_5_32B, Qwen2_5_3B, Qwen2_5_72B, Qwen2_5_7B |
| olm.models.alibaba.qwen2 | Qwen2Block, Qwen2Model, Qwen2_5_0_5B, Qwen2_5_14B, Qwen2_5_1_5B, Qwen2_5_32B, Qwen2_5_3B, Qwen2_5_72B, +1 more |
| olm.models.allenai | OLMoModel, OLMo_7B |
| olm.models.allenai.olmo | OLMoBlock, OLMoModel, OLMo_7B |
| olm.models.facebook | OPT125M, OPTModel |
| olm.models.facebook.opt | OPT125M, OPTBlock, OPTModel |
| olm.models.google | Gemma2Model, Gemma2_27B, Gemma2_2B, Gemma2_9B |
| olm.models.google.gemma2 | Gemma2Block, Gemma2Embedding, Gemma2FinalLogitSoftcap, Gemma2Model, Gemma2_27B, Gemma2_2B, Gemma2_9B |
| olm.models.meta | Llama2Model, Llama2_13B, Llama2_70B, Llama2_7B, Llama3Model, Llama3_1_405B, Llama3_1_70B, Llama3_1_8B, +2 more |
| olm.models.meta.llama2 | Llama2Block, Llama2Model, Llama2_13B, Llama2_70B, Llama2_7B |
| olm.models.meta.llama3 | Llama3Block, Llama3Model, Llama3_1_405B, Llama3_1_70B, Llama3_1_8B, Llama3_2_1B, Llama3_2_3B |
| olm.models.microsoft | Phi3Model, Phi3_5_Mini, Phi3_Small, Phi4Model, Phi4_14B |
| olm.models.microsoft.phi3 | Phi3Block, Phi3Model, Phi3_5_Mini, Phi3_Small |
| olm.models.microsoft.phi4 | Phi4Block, Phi4Model, Phi4_14B |
| olm.models.openai | GPT2, GPT2Large, GPT2Medium, GPT2Model, GPT2XL |
| olm.models.openai.gpt2 | GPT2, GPT2Block, GPT2Large, GPT2Medium, GPT2Model, GPT2XL |

Neural Network Components

| Module | Public API |
|---|---|
| olm.nn.activations.base | ActivationBase |
| olm.nn.activations.elu | ELU |
| olm.nn.activations.geglu | GeGLU |
| olm.nn.activations.gelu | GELU |
| olm.nn.activations.glu | GLU |
| olm.nn.activations.identity | Identity |
| olm.nn.activations.leaky_relu | LeakyReLU |
| olm.nn.activations.liglu | LiGLU |
| olm.nn.activations.mish | Mish |
| olm.nn.activations.prelu | PReLU |
| olm.nn.activations.reglu | ReGLU |
| olm.nn.activations.relu | ReLU |
| olm.nn.activations.selu | SELU |
| olm.nn.activations.sigmoid | Sigmoid |
| olm.nn.activations.silu | SiLU, Swish |
| olm.nn.activations.softmax | Softmax |
| olm.nn.activations.softplus | Softplus |
| olm.nn.activations.swiglu | SwiGLU |
| olm.nn.activations.swish | Swish |
| olm.nn.activations.tanh | Tanh |
| olm.nn.attention | AttentionBase, AttentionwithRoPEBase, FlashAttention, FlashAttentionwithRoPE, GroupedQueryAttention, MultiHeadAttention, MultiHeadAttentionwithALiBi, MultiHeadAttentionwithRoPE |
| olm.nn.attention.alibi | MultiHeadAttentionwithALiBi |
| olm.nn.attention.base | AttentionBase, AttentionwithRoPEBase |
| olm.nn.attention.flash | FlashAttention, FlashAttentionwithRoPE |
| olm.nn.attention.gqa | GroupedQueryAttention |
| olm.nn.attention.masks | attention_mask_to_bool |
| olm.nn.attention.mha | MultiHeadAttention, MultiHeadAttentionwithRoPE |
| olm.nn.blocks.LM | LM |
| olm.nn.blocks.linear_projections | QKVProjection |
| olm.nn.blocks.output_head | OutputHead |
| olm.nn.blocks.transformer_block | TransformerBlock |
| olm.nn.embeddings | ALiBiPositionalBias, AbsolutePositionalEmbedding, Embedding, PartialRotaryPositionalEmbedding, PartialScaledRotaryPositionalEmbedding, PositionalEmbeddingBase, RotaryPositionalEmbedding, ScaledRotaryPositionalEmbedding, +1 more |
| olm.nn.embeddings.positional | ALiBiPositionalBias, AbsolutePositionalEmbedding, PartialRotaryPositionalEmbedding, PartialScaledRotaryPositionalEmbedding, PositionalEmbeddingBase, RotaryPositionalEmbedding, ScaledRotaryPositionalEmbedding, SinusoidalPositionalEmbedding |
| olm.nn.embeddings.positional.absolute | AbsolutePositionalEmbedding |
| olm.nn.embeddings.positional.alibi | ALiBiPositionalBias |
| olm.nn.embeddings.positional.base | PositionalEmbeddingBase |
| olm.nn.embeddings.positional.rope | PartialRotaryPositionalEmbedding, PartialScaledRotaryPositionalEmbedding, RotaryPositionalEmbedding, ScaledRotaryPositionalEmbedding |
| olm.nn.embeddings.positional.sinusoidal | SinusoidalPositionalEmbedding |
| olm.nn.embeddings.token_embed | Embedding |
| olm.nn.feedforward | ClassicFFN, ClassicMoEFFN, FeedForwardBase, GeGLUFFN, GeGLUMoEFFN, SwiGLUFFN, SwiGLUMoEFFN |
| olm.nn.feedforward.base | FeedForwardBase |
| olm.nn.feedforward.classic_ffn | ClassicFFN |
| olm.nn.feedforward.classic_moe | ClassicMoEFFN |
| olm.nn.feedforward.geglu_ffn | GeGLUFFN |
| olm.nn.feedforward.geglu_moe | GeGLUMoEFFN |
| olm.nn.feedforward.moe_base | MoEFeedForwardBase, MoERouter |
| olm.nn.feedforward.swiglu_ffn | SwiGLUFFN |
| olm.nn.feedforward.swiglu_moe | SwiGLUMoEFFN |
| olm.nn.norms | LayerNorm, RMSNorm |
| olm.nn.norms.base | NormBase |
| olm.nn.norms.layer_norm | LayerNorm |
| olm.nn.norms.rms_norm | RMSNorm |
| olm.nn.structure.block | Block, load, load_block, load_model |
| olm.nn.structure.combinators | BaseCombinator, Parallel, Repeat, Residual |
| olm.nn.structure.combinators.base | BaseCombinator |
| olm.nn.structure.combinators.parallel | Parallel |
| olm.nn.structure.combinators.repeat | Repeat |
| olm.nn.structure.combinators.residual | Residual |
| olm.nn.torch_nn_wrappers | Linear |

Training

| Module | Public API |
|---|---|
| olm.train | AdamW, CheckpointCallback, CosineAnnealingLR, CrossEntropyLoss, DDPTrainer, DeviceConfig, EarlyStoppingCallback, FSDPTrainer, +25 more |
| olm.train.callbacks | CheckpointCallback, EarlyStoppingCallback, LRMonitorCallback, MetricsLoggerCallback, ThroughputCallback, ValidationCallback |
| olm.train.callbacks.checkpoint_cb | CheckpointCallback |
| olm.train.callbacks.early_stopping_cb | EarlyStoppingCallback |
| olm.train.callbacks.lr_monitor_cb | LRMonitorCallback |
| olm.train.callbacks.metrics_logger_cb | MetricsLoggerCallback |
| olm.train.callbacks.throughput_cb | ThroughputCallback |
| olm.train.callbacks.validation_cb | ValidationCallback |
| olm.train.device | DeviceConfig, TrainerStrategy, detect_devices, determine_strategy, estimate_model_size, parse_device_string, print_strategy_summary |
| olm.train.losses | CrossEntropyLoss, KLLoss, LossBase, MaskedCELoss, ZLoss |
| olm.train.losses.base | LossBase |
| olm.train.losses.cross_entropy | CrossEntropyLoss |
| olm.train.losses.kllloss | KLLoss |
| olm.train.losses.mce | MaskedCELoss |
| olm.train.losses.zloss | ZLoss |
| olm.train.optim | AdamW, Lion, OptimizerBase, ZeROOptimizer |
| olm.train.optim.adamw | AdamW |
| olm.train.optim.base | OptimizerBase |
| olm.train.optim.lion | Lion |
| olm.train.optim.zero | ZeROOptimizer |
| olm.train.schedulers | CosineAnnealingLR, LinearDecayLR, LinearLR, SchedulerBase, WarmupCosineScheduler, WarmupLR |
| olm.train.schedulers.base | SchedulerBase |
| olm.train.schedulers.cosine | CosineAnnealingLR |
| olm.train.schedulers.linear | LinearDecayLR, LinearLR |
| olm.train.schedulers.warmup | WarmupCosineScheduler, WarmupLR |
| olm.train.trainer | CheckpointCallback, DDPTrainer, EarlyStoppingCallback, FSDPTrainer, LRMonitorCallback, MetricsLoggerCallback, ThroughputCallback, Trainer, +4 more |
| olm.train.trainer.auto_trainer | AutoTrainer, auto_trainer |
| olm.train.trainer.ddp_trainer | DDPTrainer |
| olm.train.trainer.fsdp_trainer | FSDPTrainer |
| olm.train.trainer.trainer | Trainer, TrainerCallback |