Install from one official script.
We currently support only one installation method: the official signed installer scripts for Linux/macOS and Windows. Additional package managers may come later.
What to expect
Auto-detects RAM, GPU, VRAM, CPU. Fills the config screen with sensible defaults. Nothing to configure manually.
Pick a TOML-defined model. VRAM estimate shown per model and per context size so you launch right the first time.
The server spawns in place. Logs, GPU stats, throughput, cost, and energy stream in the same terminal surface; no second window needed.
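The per-model, per-context VRAM estimate can be sketched as weights plus KV cache. This is an illustrative formula, not the tool's actual estimator; the function name, parameters, and the fp16 KV-cache assumption are all ours.

```python
def estimate_vram_gb(n_params_b, bits_per_weight, n_layers, d_model, context, kv_bytes=2):
    """Rough VRAM estimate in GiB: quantized weights + KV cache.

    Hypothetical sketch. Ignores activation buffers, framework overhead,
    and grouped-query attention (which shrinks the KV cache).
    """
    weights = n_params_b * 1e9 * bits_per_weight / 8
    # KV cache: two tensors (K and V) per layer, each context x d_model,
    # kv_bytes per element (2 for fp16).
    kv_cache = 2 * n_layers * context * d_model * kv_bytes
    return (weights + kv_cache) / 1024**3

# Example: a 7B model at ~4.5 bits/weight, 32 layers, d_model 4096, 8k context
print(round(estimate_vram_gb(7, 4.5, 32, 4096, 8192), 1))  # ~7.7 GiB
```

Doubling the context roughly doubles the KV-cache term, which is why the estimate is shown per context size.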
| Platform | Support |
|---|---|
| Linux | NVIDIA GPU via NVML; CPU + memory stats; all features available |
| macOS (Apple Silicon) | CPU monitoring; shared memory stats; no GPU access |
| Windows | Basic launch + config; WMI hardware stub; GPU support planned |
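Per-platform monitoring like the above typically dispatches on the OS at startup. A minimal sketch, assuming `platform.system()` as the switch; the backend names are illustrative, not the tool's actual module names:

```python
import platform

def monitoring_backend():
    """Pick a telemetry backend per platform (illustrative names)."""
    system = platform.system()
    if system == "Linux":
        return "nvml+cpu"   # NVIDIA GPU via NVML, plus CPU/memory stats
    if system == "Darwin":
        return "cpu-only"   # CPU monitoring and shared memory; no GPU access
    if system == "Windows":
        return "wmi-stub"   # basic launch + config; GPU support planned
    return "unsupported"
```

Keeping the dispatch in one place means a future Windows GPU backend only changes one branch.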
The Personal tier adds vulcanized logs with raw/vulcanized export modes. The Professional tier adds LlamaMon-parsed JSON/NDJSON event export, built from raw llama.cpp logs, for Datadog-style pipelines, plus ROI tracking, cloud price comparison, and persistent analytics.
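The raw-log-to-NDJSON step can be sketched as a line parser that emits one JSON object per event. The log line format below is an assumption for illustration; real llama.cpp timing output varies by version, and the field names are ours, not LlamaMon's schema:

```python
import json
import re

# Hypothetical pattern for a llama.cpp eval-timing line (format assumed).
EVAL_LINE = re.compile(r"eval time =\s*([\d.]+) ms /\s*(\d+) tokens")

def to_ndjson(raw_lines):
    """Convert raw log lines into NDJSON events (one JSON object per line)."""
    events = []
    for line in raw_lines:
        m = EVAL_LINE.search(line)
        if m:
            ms, tokens = float(m.group(1)), int(m.group(2))
            events.append(json.dumps({
                "event": "eval",
                "duration_ms": ms,
                "tokens": tokens,
                "tokens_per_sec": round(tokens / (ms / 1000), 2),
            }))
    return "\n".join(events)

sample = ["llama_print_timings: eval time =  512.00 ms /   64 tokens"]
print(to_ndjson(sample))
```

NDJSON (one object per line) is what Datadog-style pipelines ingest directly, so no batching wrapper is needed.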