---
title: "Quick Start"
description: "Run your first benchmark evaluation in 3 steps"
sidebarTitle: "Quick Start"
---
## 1. Run Your First Benchmark
```bash
bun run src/index.ts run -p supermemory -b longmemeval -j gpt-4o -r my-first-run
```
## 2. View Results
### Option A: Web UI
```bash
bun run src/index.ts serve
```
Open [http://localhost:3000](http://localhost:3000) to see results visually.
### Option B: CLI
```bash
# Check run status
bun run src/index.ts status -r my-first-run
# View failed questions for debugging
bun run src/index.ts show-failures -r my-first-run
```
## 3. Compare Providers
Run the same benchmark across multiple providers:
```bash
bun run src/index.ts compare -p supermemory,mem0,zep -b locomo -j gpt-4o
```
Results are saved to `data/runs/{runId}/report.json`.
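If you want to read a report from a script, the path pattern above can be wrapped in a small helper. This is a minimal sketch, not part of MemoryBench: `reportPath` and `loadReport` are hypothetical names, and it assumes the script runs from the repository root so that `data/runs` resolves correctly.

```typescript
import { readFile } from "node:fs/promises";
import { posix } from "node:path";

// Hypothetical helper: build the report path for a run ID,
// following the `data/runs/{runId}/report.json` pattern.
function reportPath(runId: string, root = "data/runs"): string {
  return posix.join(root, runId, "report.json");
}

// Hypothetical helper: load and parse a report. The result is typed as
// `unknown` because the full report schema is not documented here.
async function loadReport(runId: string): Promise<unknown> {
  const raw = await readFile(reportPath(runId), "utf8");
  return JSON.parse(raw);
}

console.log(reportPath("my-first-run"));
// prints "data/runs/my-first-run/report.json"
```

Call `await loadReport("my-first-run")` once that run has finished and written its report.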
## Sample Output
```json
{
  "accuracy": 0.72,
  "accuracyByType": {
    "single-hop": 0.85,
    "multi-hop": 0.65,
    "temporal": 0.70,
    "adversarial": 0.68
  },
  "avgLatency": 1250,
  "totalQuestions": 50
}
```
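To show how a report like this might be consumed, here is a minimal sketch that formats the accuracy figures as text, weakest question type first. The `BenchmarkReport` interface and `summarizeReport` function are assumptions inferred from the sample output above; they are not part of MemoryBench, and a real report may contain more fields.

```typescript
// Shape inferred from the sample output; treat it as an assumption.
interface BenchmarkReport {
  accuracy: number;
  accuracyByType: Record<string, number>;
  avgLatency: number;
  totalQuestions: number;
}

// Format a report as readable lines, sorted so the weakest
// question type comes first.
function summarizeReport(report: BenchmarkReport): string[] {
  const pct = (x: number): string => `${(x * 100).toFixed(1)}%`;
  const lines = [
    `overall: ${pct(report.accuracy)} over ${report.totalQuestions} questions`,
  ];
  const byType = Object.entries(report.accuracyByType).sort(
    (a, b) => a[1] - b[1],
  );
  for (const [type, acc] of byType) {
    lines.push(`${type}: ${pct(acc)}`);
  }
  return lines;
}

// The sample report from this page.
const sample: BenchmarkReport = {
  accuracy: 0.72,
  accuracyByType: {
    "single-hop": 0.85,
    "multi-hop": 0.65,
    "temporal": 0.7,
    "adversarial": 0.68,
  },
  avgLatency: 1250,
  totalQuestions: 50,
};

console.log(summarizeReport(sample).join("\n"));
```

Sorting by accuracy puts the category that most needs attention at the top, which pairs naturally with the `show-failures` command when debugging.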
## What's Next
Head to the [CLI Reference](/memorybench/cli) to explore all the commands, or read [Architecture](/memorybench/architecture) to understand how MemoryBench works under the hood.