OLM API Reference

Generated from the public Python API in src/olm. Each module page includes signatures, docstrings, and source-defined methods such as forward() where available.
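
The rows in the tables below appear to list each module's public names in ASCII sort order, collapsing anything past the first eight into "+N more". The following is a minimal sketch of how such a row could be produced, not the generator used for these pages. The helper names `public_names` and `index_row` are hypothetical, the use of `__all__` as the definition of "public" is an assumption, and standard-library modules stand in for `olm.*` so the example runs without the package installed.

```python
import importlib


def public_names(module_name):
    """Return the sorted public names of an importable module (hypothetical helper)."""
    module = importlib.import_module(module_name)
    # Assumption: an explicit __all__ defines the public API when present;
    # otherwise fall back to every attribute that has no leading underscore.
    names = getattr(module, "__all__", None)
    if names is None:
        names = [name for name in vars(module) if not name.startswith("_")]
    # Default string sorting is by code point, so uppercase sorts before lowercase.
    return sorted(names)


def index_row(module_name, limit=8):
    """Format one 'module | names' row, collapsing long lists to '+N more'."""
    names = public_names(module_name)
    shown = ", ".join(names[:limit])
    hidden = len(names) - limit
    if hidden > 0:
        shown += f", +{hidden} more"
    return f"{module_name} | {shown}"


# Stand-in module; with the package installed one could try "olm.core.dist" instead.
print(index_row("json"))
```

Calling `index_row("json", limit=3)` shows the truncation behaviour on a small module. The fallback branch also picks up imported submodules and helpers, which is one reason a real generator would likely rely on `__all__` or filter by where each object is defined.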

Core

| Module | Public API |
| --- | --- |
| olm.core.dist | all_gather, all_reduce, barrier, broadcast, cleanup_distributed, get_backend, get_local_rank, get_rank, +6 more |
| olm.core.registry | Registry |

Data

| Module | Public API |
| --- | --- |
| olm.data.datasets | BaseTextDataset, DataLoader, FineWebEduDataset, HuggingFaceTextDataset, LocalTextDataset |
| olm.data.datasets.base_dataset | BaseTextDataset |
| olm.data.datasets.data_loader | DataLoader |
| olm.data.datasets.fineweb_edu | FineWebEduDataset |
| olm.data.datasets.hf_dataset | FineWebEduDataset, HuggingFaceTextDataset |
| olm.data.datasets.local_dataset | LocalTextDataset |
| olm.data.tokenization.base | TokenizerBase |
| olm.data.tokenization.hf_tokenizer | HFTokenizer |
| olm.data.tokenization.hf_train_custom | HFTokenizerTrainCustom |

Logging

| Module | Public API |
| --- | --- |
| olm.logging | WandBCallback, create_sweep, get_sweep_config_template |
| olm.logging.wandb_logger | WandBCallback, create_sweep, get_sweep_config_template |

Models

| Module | Public API |
| --- | --- |
| olm.models | GPT2, GPT2Large, GPT2Medium, GPT2Model, GPT2XL, Gemma2Model, Gemma2_27B, Gemma2_2B, +28 more |
| olm.models.alibaba | Qwen2Model, Qwen2_5_0_5B, Qwen2_5_14B, Qwen2_5_1_5B, Qwen2_5_32B, Qwen2_5_3B, Qwen2_5_72B, Qwen2_5_7B |
| olm.models.alibaba.qwen2 | Qwen2Block, Qwen2Model, Qwen2_5_0_5B, Qwen2_5_14B, Qwen2_5_1_5B, Qwen2_5_32B, Qwen2_5_3B, Qwen2_5_72B, +1 more |
| olm.models.allenai | OLMoModel, OLMo_7B |
| olm.models.allenai.olmo | OLMoBlock, OLMoModel, OLMo_7B |
| olm.models.facebook | OPT125M, OPTModel |
| olm.models.facebook.opt | OPT125M, OPTBlock, OPTModel |
| olm.models.google | Gemma2Model, Gemma2_27B, Gemma2_2B, Gemma2_9B |
| olm.models.google.gemma2 | Gemma2Block, Gemma2Embedding, Gemma2FinalLogitSoftcap, Gemma2Model, Gemma2_27B, Gemma2_2B, Gemma2_9B |
| olm.models.meta | Llama2Model, Llama2_13B, Llama2_70B, Llama2_7B, Llama3Model, Llama3_1_405B, Llama3_1_70B, Llama3_1_8B, +2 more |
| olm.models.meta.llama2 | Llama2Block, Llama2Model, Llama2_13B, Llama2_70B, Llama2_7B |
| olm.models.meta.llama3 | Llama3Block, Llama3Model, Llama3_1_405B, Llama3_1_70B, Llama3_1_8B, Llama3_2_1B, Llama3_2_3B |
| olm.models.microsoft | Phi3Model, Phi3_5_Mini, Phi3_Small, Phi4Model, Phi4_14B |
| olm.models.microsoft.phi3 | Phi3Block, Phi3Model, Phi3_5_Mini, Phi3_Small |
| olm.models.microsoft.phi4 | Phi4Block, Phi4Model, Phi4_14B |
| olm.models.openai | GPT2, GPT2Large, GPT2Medium, GPT2Model, GPT2XL |
| olm.models.openai.gpt2 | GPT2, GPT2Block, GPT2Large, GPT2Medium, GPT2Model, GPT2XL |

Neural Network Components

| Module | Public API |
| --- | --- |
| olm.nn.activations.base | ActivationBase |
| olm.nn.activations.elu | ELU |
| olm.nn.activations.geglu | GeGLU |
| olm.nn.activations.gelu | GELU |
| olm.nn.activations.glu | GLU |
| olm.nn.activations.identity | Identity |
| olm.nn.activations.leaky_relu | LeakyReLU |
| olm.nn.activations.liglu | LiGLU |
| olm.nn.activations.mish | Mish |
| olm.nn.activations.prelu | PReLU |
| olm.nn.activations.reglu | ReGLU |
| olm.nn.activations.relu | ReLU |
| olm.nn.activations.selu | SELU |
| olm.nn.activations.sigmoid | Sigmoid |
| olm.nn.activations.silu | SiLU, Swish |
| olm.nn.activations.softmax | Softmax |
| olm.nn.activations.softplus | Softplus |
| olm.nn.activations.swiglu | SwiGLU |
| olm.nn.activations.swish | Swish |
| olm.nn.activations.tanh | Tanh |
| olm.nn.attention | AttentionBase, AttentionwithRoPEBase, FlashAttention, FlashAttentionwithRoPE, GroupedQueryAttention, MultiHeadAttention, MultiHeadAttentionwithALiBi, MultiHeadAttentionwithRoPE |
| olm.nn.attention.alibi | MultiHeadAttentionwithALiBi |
| olm.nn.attention.base | AttentionBase, AttentionwithRoPEBase |
| olm.nn.attention.flash | FlashAttention, FlashAttentionwithRoPE |
| olm.nn.attention.gqa | GroupedQueryAttention |
| olm.nn.attention.masks | attention_mask_to_bool |
| olm.nn.attention.mha | MultiHeadAttention, MultiHeadAttentionwithRoPE |
| olm.nn.blocks.LM | LM |
| olm.nn.blocks.linear_projections | QKVProjection |
| olm.nn.blocks.output_head | OutputHead |
| olm.nn.blocks.transformer_block | TransformerBlock |
| olm.nn.embeddings | ALiBiPositionalBias, AbsolutePositionalEmbedding, Embedding, PartialRotaryPositionalEmbedding, PartialScaledRotaryPositionalEmbedding, PositionalEmbeddingBase, RotaryPositionalEmbedding, ScaledRotaryPositionalEmbedding, +1 more |
| olm.nn.embeddings.positional | ALiBiPositionalBias, AbsolutePositionalEmbedding, PartialRotaryPositionalEmbedding, PartialScaledRotaryPositionalEmbedding, PositionalEmbeddingBase, RotaryPositionalEmbedding, ScaledRotaryPositionalEmbedding, SinusoidalPositionalEmbedding |
| olm.nn.embeddings.positional.absolute | AbsolutePositionalEmbedding |
| olm.nn.embeddings.positional.alibi | ALiBiPositionalBias |
| olm.nn.embeddings.positional.base | PositionalEmbeddingBase |
| olm.nn.embeddings.positional.rope | PartialRotaryPositionalEmbedding, PartialScaledRotaryPositionalEmbedding, RotaryPositionalEmbedding, ScaledRotaryPositionalEmbedding |
| olm.nn.embeddings.positional.sinusoidal | SinusoidalPositionalEmbedding |
| olm.nn.embeddings.token_embed | Embedding |
| olm.nn.feedforward | ClassicFFN, ClassicMoEFFN, FeedForwardBase, GeGLUFFN, GeGLUMoEFFN, SwiGLUFFN, SwiGLUMoEFFN |
| olm.nn.feedforward.base | FeedForwardBase |
| olm.nn.feedforward.classic_ffn | ClassicFFN |
| olm.nn.feedforward.classic_moe | ClassicMoEFFN |
| olm.nn.feedforward.geglu_ffn | GeGLUFFN |
| olm.nn.feedforward.geglu_moe | GeGLUMoEFFN |
| olm.nn.feedforward.moe_base | MoEFeedForwardBase, MoERouter |
| olm.nn.feedforward.swiglu_ffn | SwiGLUFFN |
| olm.nn.feedforward.swiglu_moe | SwiGLUMoEFFN |
| olm.nn.norms | LayerNorm, RMSNorm |
| olm.nn.norms.base | NormBase |
| olm.nn.norms.layer_norm | LayerNorm |
| olm.nn.norms.rms_norm | RMSNorm |
| olm.nn.structure.block | Block, load, load_block, load_model |
| olm.nn.structure.combinators | BaseCombinator, Parallel, Repeat, Residual |
| olm.nn.structure.combinators.base | BaseCombinator |
| olm.nn.structure.combinators.parallel | Parallel |
| olm.nn.structure.combinators.repeat | Repeat |
| olm.nn.structure.combinators.residual | Residual |
| olm.nn.torch_nn_wrappers | Linear |

Training

| Module | Public API |
| --- | --- |
| olm.train | AdamW, CheckpointCallback, CosineAnnealingLR, CrossEntropyLoss, DDPTrainer, DeviceConfig, EarlyStoppingCallback, FSDPTrainer, +25 more |
| olm.train.callbacks | CheckpointCallback, EarlyStoppingCallback, LRMonitorCallback, MetricsLoggerCallback, ThroughputCallback, ValidationCallback |
| olm.train.callbacks.checkpoint_cb | CheckpointCallback |
| olm.train.callbacks.early_stopping_cb | EarlyStoppingCallback |
| olm.train.callbacks.lr_monitor_cb | LRMonitorCallback |
| olm.train.callbacks.metrics_logger_cb | MetricsLoggerCallback |
| olm.train.callbacks.throughput_cb | ThroughputCallback |
| olm.train.callbacks.validation_cb | ValidationCallback |
| olm.train.device | DeviceConfig, TrainerStrategy, detect_devices, determine_strategy, estimate_model_size, parse_device_string, print_strategy_summary |
| olm.train.losses | CrossEntropyLoss, KLLoss, LossBase, MaskedCELoss, ZLoss |
| olm.train.losses.base | LossBase |
| olm.train.losses.cross_entropy | CrossEntropyLoss |
| olm.train.losses.kllloss | KLLoss |
| olm.train.losses.mce | MaskedCELoss |
| olm.train.losses.zloss | ZLoss |
| olm.train.optim | AdamW, Lion, OptimizerBase, ZeROOptimizer |
| olm.train.optim.adamw | AdamW |
| olm.train.optim.base | OptimizerBase |
| olm.train.optim.lion | Lion |
| olm.train.optim.zero | ZeROOptimizer |
| olm.train.schedulers | CosineAnnealingLR, LinearDecayLR, LinearLR, SchedulerBase, WarmupCosineScheduler, WarmupLR |
| olm.train.schedulers.base | SchedulerBase |
| olm.train.schedulers.cosine | CosineAnnealingLR |
| olm.train.schedulers.linear | LinearDecayLR, LinearLR |
| olm.train.schedulers.warmup | WarmupCosineScheduler, WarmupLR |
| olm.train.trainer | CheckpointCallback, DDPTrainer, EarlyStoppingCallback, FSDPTrainer, LRMonitorCallback, MetricsLoggerCallback, ThroughputCallback, Trainer, +4 more |
| olm.train.trainer.auto_trainer | AutoTrainer, auto_trainer |
| olm.train.trainer.ddp_trainer | DDPTrainer |
| olm.train.trainer.fsdp_trainer | FSDPTrainer |
| olm.train.trainer.trainer | Trainer, TrainerCallback |