---
title: "CLI Reference"
description: "Command-line interface for running MemoryBench evaluations"
sidebarTitle: "CLI"
---

## Commands

### run

Execute the full benchmark pipeline.

```bash
bun run src/index.ts run -p <provider> -b <benchmark> -j <judge> -r <run-id>
```

| Option | Description |
|--------|-------------|
| `-p, --provider` | Memory provider (`supermemory`, `mem0`, `zep`) |
| `-b, --benchmark` | Benchmark (`locomo`, `longmemeval`, `convomem`) |
| `-j, --judge` | Judge model (default: `gpt-4o`) |
| `-r, --run-id` | Run identifier (auto-generated if omitted) |
| `-m, --answering-model` | Model for answer generation (default: `gpt-4o`) |
| `-l, --limit` | Limit number of questions |
| `-s, --sample` | Sample N questions per category |
| `--sample-type` | Sampling strategy: `consecutive` (default), `random` |
| `--force` | Clear checkpoint and restart |

See [Supported Models](/memorybench/supported-models) for all available judge and answering models.

---

### compare

Run a benchmark across multiple providers in parallel.

```bash
bun run src/index.ts compare -p supermemory,mem0,zep -b locomo -j gpt-4o
```

---

### test

Evaluate a single question for debugging.

```bash
bun run src/index.ts test -r <run-id> -q <question-id>
```

---

### status

Check the progress of a run.

```bash
bun run src/index.ts status -r <run-id>
```

---

### show-failures

Debug failed questions with full context.

```bash
bun run src/index.ts show-failures -r <run-id>
```

---

### list-questions

Browse benchmark questions.

```bash
bun run src/index.ts list-questions -b <benchmark>
```

---

### Random Sampling

Sample N questions per category with optional randomization.

```bash
bun run src/index.ts run -p supermemory -b longmemeval -s 3 --sample-type random
```

---

### serve

Start the web UI.

```bash
bun run src/index.ts serve
```

Opens at [http://localhost:3000](http://localhost:3000).

---

### help

Get help on providers, models, or benchmarks.

```bash
bun run src/index.ts help providers
bun run src/index.ts help models
bun run src/index.ts help benchmarks
```

## Checkpointing

Runs are saved to `data/runs/{runId}/` and automatically resume from the last successful phase. Use `--force` to restart.
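
Before reusing a run ID, it can help to check whether saved state already exists for it. The sketch below assumes that the presence of the `data/runs/{runId}/` directory is a good enough signal that a run will resume; the files inside that directory are not documented here. The helper names (`run_dir`, `has_checkpoint`, `describe_run`) are hypothetical and not part of MemoryBench.

```bash
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical and not part of MemoryBench.
# Assumption: a run has saved state if its directory under data/runs/ exists.

# Print the directory where a run's state is saved.
run_dir() {
  printf 'data/runs/%s\n' "$1"
}

# Succeed if saved state exists for the given run ID.
has_checkpoint() {
  [ -d "$(run_dir "$1")" ]
}

# Print what rerunning with this run ID is expected to do.
describe_run() {
  if has_checkpoint "$1"; then
    echo "resume: $(run_dir "$1") exists (pass --force to restart)"
  else
    echo "fresh: no saved state for run '$1'"
  fi
}
```

With these functions sourced from the repository root, `describe_run my-run` reports whether a following `bun run src/index.ts run -p supermemory -b locomo -r my-run` would likely pick up an existing run or start a new one.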