OLM API Reference

`olm.logging`

Source: src/olm/logging/__init__.py:1

Optional experiment logging integrations for OLM.

Functions

create_sweep(sweep_config: Dict[str, Any], project: str, entity: str | None = None) -> str

Source: src/olm/logging/wandb_logger.py:442

Create a wandb sweep for hyperparameter optimization.

Parameters

  • sweep_config: Sweep configuration dictionary.
  • project: WandB project name.
  • entity: WandB entity (team or username). Optional; defaults to None.

Returns

Sweep ID to use with wandb agent.

Example

from olm.logging import create_sweep

sweep_config = {
    "method": "bayes",
    "metric": {"name": "train/loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {
            "distribution": "log_uniform_values",
            "min": 1e-5,
            "max": 1e-3,
        },
        "batch_size": {"values": [8, 16, 32, 64]},
        "weight_decay": {
            "distribution": "uniform",
            "min": 0.0,
            "max": 0.3,
        },
    },
}

sweep_id = create_sweep(sweep_config, project="my-llm-project")
print(f"Run: wandb agent {sweep_id}")
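
A sweep configuration is a plain dictionary, so it can be sanity-checked before it is passed to create_sweep. The following is a minimal sketch, not part of OLM: validate_sweep_config is a hypothetical helper, and it only checks the parameter shapes used in the example above (a "values" list, or a "min"/"max" range with an optional "distribution").

```python
from typing import Any, Dict, List

# Sweep methods listed in this reference for get_sweep_config_template.
ALLOWED_METHODS = {"grid", "random", "bayes"}


def validate_sweep_config(sweep_config: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems: List[str] = []

    method = sweep_config.get("method")
    if method not in ALLOWED_METHODS:
        problems.append(f"unknown method: {method!r}")

    # Bayesian search needs a metric to optimize.
    metric = sweep_config.get("metric") or {}
    if method == "bayes" and not {"name", "goal"} <= set(metric):
        problems.append("bayes sweeps need a metric with 'name' and 'goal'")

    parameters = sweep_config.get("parameters") or {}
    if not parameters:
        problems.append("no parameters to sweep over")

    for name, spec in parameters.items():
        if "values" in spec:
            if not spec["values"]:
                problems.append(f"{name}: 'values' is empty")
        elif "min" in spec and "max" in spec:
            if spec["min"] >= spec["max"]:
                problems.append(f"{name}: 'min' must be smaller than 'max'")
            # A log-uniform range cannot include zero or negative numbers.
            if spec.get("distribution") == "log_uniform_values" and spec["min"] <= 0:
                problems.append(f"{name}: log-uniform 'min' must be positive")
        # Other spec shapes are outside the scope of this sketch and are not checked.

    return problems


example = {
    "method": "bayes",
    "metric": {"name": "train/loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"distribution": "log_uniform_values", "min": 1e-5, "max": 1e-3},
        "batch_size": {"values": [8, 16, 32, 64]},
    },
}
for problem in validate_sweep_config(example):
    print(problem)
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in one pass.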

get_sweep_config_template(method: str = 'bayes') -> Dict[str, Any]

Source: src/olm/logging/wandb_logger.py:487

Get a template sweep configuration.

Parameters

  • method: Sweep method ("grid", "random", "bayes"). Default: "bayes".

Returns

Template sweep configuration dictionary.

Example

from olm.logging import create_sweep, get_sweep_config_template

config = get_sweep_config_template("bayes")
# Customize the config
config["parameters"]["learning_rate"]["min"] = 1e-5
config["parameters"]["learning_rate"]["max"] = 1e-3
# Create sweep
sweep_id = create_sweep(config, project="my-project")
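
The example above edits the template in place. When several sweeps are derived from one template, it can be convenient to apply overrides to a copy instead. The sketch below assumes nothing about OLM beyond the template being a nested dictionary: with_overrides is a hypothetical helper, and the template literal is a stand-in whose keys may differ from what get_sweep_config_template actually returns.

```python
import copy
from typing import Any, Dict


def with_overrides(config: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: return a deep copy of config with dotted-path overrides applied."""
    result = copy.deepcopy(config)
    for dotted_path, value in overrides.items():
        *parents, leaf = dotted_path.split(".")
        node = result
        for key in parents:
            # Create intermediate dictionaries when a path does not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = value
    return result


# Stand-in for get_sweep_config_template("bayes"); the real template may differ.
template = {
    "method": "bayes",
    "metric": {"name": "train/loss", "goal": "minimize"},
    "parameters": {"learning_rate": {"min": 1e-6, "max": 1e-2}},
}

custom = with_overrides(
    template,
    {
        "parameters.learning_rate.min": 1e-5,
        "parameters.learning_rate.max": 1e-3,
    },
)
print(custom["parameters"]["learning_rate"])
```

Because the helper works on a deep copy, the original template can be reused for further sweeps.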

Classes

WandBCallback(project: str, entity: str | None = None, name: str | None = None, tags: List[str] | None = None, notes: str | None = None, config: Dict[str, Any] | None = None, log_frequency: int = 1, log_gradients: bool = False, log_model: bool = False, watch_model: bool = False, watch_freq: int = 1000, log_predictions: bool = False, log_system_metrics: bool = True, alert_thresholds: Dict[str, Dict[str, float]] | None = None, offline: bool = False, resume: str | None = None, group: str | None = None, job_type: str | None = 'train', save_code: bool = True, reinit: bool = True)

Bases: olm.train.trainer.trainer.TrainerCallback

Source: src/olm/logging/wandb_logger.py:23

Callback for Weights & Biases integration with OLM Trainer.

Provides comprehensive experiment tracking including:

  • Training metrics (loss, perplexity, learning rate, throughput)
  • Hyperparameter logging
  • System metrics (GPU memory, CPU usage)
  • Gradient and weight histograms (optional)
  • Model checkpoint artifacts
  • Prediction tables (optional)
  • Alert monitoring (optional)
  • Sweep support for hyperparameter optimization

Parameters

  • project: WandB project name.
  • entity: WandB team/username (defaults to your default entity).
  • name: Run name (auto-generated if None).
  • tags: List of tags for this run.
  • notes: Optional notes/description for this run.
  • config: Hyperparameters and config to log (auto-captured from trainer if None).
  • log_frequency: Log metrics every N steps (default: 1).
  • log_gradients: Enable gradient histogram logging (can slow training).
  • log_model: Save model checkpoints as wandb artifacts.
  • watch_model: Use wandb.watch() for automatic gradient/parameter tracking.
  • watch_freq: Frequency for wandb.watch logging (default: 1000).
  • log_predictions: Enable prediction table logging.
  • log_system_metrics: Log GPU/CPU metrics (default: True).
  • alert_thresholds: Dict of metric thresholds for alerts.
    Example: {"loss": {"min": 0.1, "max": 10.0}}
  • offline: Run in offline mode (for air-gapped environments).
  • resume: Resume from previous run ("allow", "must", "never", or "auto").
  • group: Group name for grouping runs.
  • job_type: Job type (e.g., "train", "eval", "sweep").
  • save_code: Save training code to wandb (default: True).
  • reinit: Allow multiple wandb.init() calls in same process.

Example

from olm.logging import WandBCallback

# Basic usage
wandb_callback = WandBCallback(
    project="my-llm-project",
    name="llama-7b-baseline",
    tags=["llama", "baseline"],
)

# Pass the callback to the OLM Trainer (other Trainer arguments elided)
trainer = Trainer(..., callbacks=[wandb_callback])
trainer.train(epochs=10)

# Advanced: with alerts and gradient logging
wandb_callback = WandBCallback(
    project="my-llm-project",
    log_gradients=True,
    watch_model=True,
    alert_thresholds={
        "loss": {"max": 10.0},  # Alert if loss > 10
        "learning_rate": {"min": 1e-6}  # Alert if LR < 1e-6
    },
)
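
The alert_thresholds mapping pairs each metric name with an optional "min" and "max" bound. The sketch below illustrates the comparison implied by the comments in the example above (alert when a value rises above "max" or falls below "min"). It is not the callback's actual implementation: find_threshold_violations is a hypothetical function, and how WandBCallback raises the alert is not shown here.

```python
from typing import Dict, List


def find_threshold_violations(
    metrics: Dict[str, float],
    alert_thresholds: Dict[str, Dict[str, float]],
) -> List[str]:
    """Hypothetical check: describe every metric that falls outside its bounds."""
    violations: List[str] = []
    for name, bounds in alert_thresholds.items():
        if name not in metrics:
            # No value reported for this metric at this step; nothing to compare.
            continue
        value = metrics[name]
        if "min" in bounds and value < bounds["min"]:
            violations.append(f"{name}={value:g} is below min {bounds['min']:g}")
        if "max" in bounds and value > bounds["max"]:
            violations.append(f"{name}={value:g} is above max {bounds['max']:g}")
    return violations


thresholds = {
    "loss": {"max": 10.0},
    "learning_rate": {"min": 1e-6},
}

for message in find_threshold_violations({"loss": 12.5, "learning_rate": 5e-7}, thresholds):
    print(message)
```

Either bound may be omitted, which is why each one is tested separately.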

Methods

log_predictions(self, step: int, inputs: List[str], predictions: List[str], targets: List[str] | None = None)

Source: src/olm/logging/wandb_logger.py:414

Log predictions to a wandb table.

Parameters

  • step: Current training step.
  • inputs: Input texts.
  • predictions: Model predictions.
  • targets: Target texts (optional).

on_epoch_end(self, trainer, epoch: int) -> None

Source: src/olm/logging/wandb_logger.py:347

Called at the end of each epoch.

on_step_end(self, trainer, step: int, loss: float) -> None

Source: src/olm/logging/wandb_logger.py:317

Called at the end of each optimization step.

on_train_begin(self, trainer) -> None

Source: src/olm/logging/wandb_logger.py:298

Called at the beginning of training.

on_train_end(self, trainer) -> None

Source: src/olm/logging/wandb_logger.py:304

Called at the end of training.
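
The four hooks above can be exercised without wandb by driving a stand-in object that has the same method signatures. The sketch below is illustrative only. RecordingCallback and run_toy_loop are hypothetical, the call order is inferred from the hook descriptions rather than taken from the Trainer source, and the log_frequency gating (record a step only when step % log_frequency == 0) is an assumption about what "every N steps" means.

```python
from typing import List, Tuple


class RecordingCallback:
    """Hypothetical stand-in: same hook signatures, records calls instead of logging."""

    def __init__(self, log_frequency: int = 1) -> None:
        self.log_frequency = log_frequency
        self.events: List[Tuple] = []

    def on_train_begin(self, trainer) -> None:
        self.events.append(("train_begin",))

    def on_step_end(self, trainer, step: int, loss: float) -> None:
        # Assumed gating: keep only every log_frequency-th step.
        if step % self.log_frequency == 0:
            self.events.append(("step_end", step, loss))

    def on_epoch_end(self, trainer, epoch: int) -> None:
        self.events.append(("epoch_end", epoch))

    def on_train_end(self, trainer) -> None:
        self.events.append(("train_end",))


def run_toy_loop(callback, epochs: int, steps_per_epoch: int) -> None:
    """Toy driver standing in for a trainer; it only calls the hooks."""
    trainer = None  # A real trainer object would be passed here.
    callback.on_train_begin(trainer)
    step = 0
    for epoch in range(epochs):
        for _ in range(steps_per_epoch):
            step += 1
            loss = 1.0 / step  # Fake, steadily decreasing loss.
            callback.on_step_end(trainer, step, loss)
        callback.on_epoch_end(trainer, epoch)
    callback.on_train_end(trainer)


recorder = RecordingCallback(log_frequency=2)
run_toy_loop(recorder, epochs=2, steps_per_epoch=2)
for event in recorder.events:
    print(event)
```

A recorder like this can also serve as a lightweight test double when checking that a custom callback is wired into a training loop.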