# Perf-seed workflow

Three-stage pipeline for running repeatable hub-hydration perf tests against a
local MinIO backend seeded with real module data pulled from production S3.

The pipeline is **pack-on only** - the seeded baseline always comes from a hub
launched with `--hub-hydration-enable-pack=true`. The pack-off variant is no
longer maintained.

## Layout

All path arguments are required (no hardcoded defaults). Pick a perf-seed root
with enough free space (snapshots and preserved CAS dirs can be large) and pass
the matching `--*-dir` flag on each invocation. Stage A's hub data dir should
live on the same volume as the snapshot dir so snapshotting is an O(1) rename
per module instead of a cross-volume byte copy; Stage C's hub data dir should
live on a different volume from the MinIO data dir so hub I/O does not skew the
measured perf run.
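The volume placement described above can be sanity-checked before a long run. The following is a minimal sketch, not part of the pipeline scripts: `nearest_existing` and `same_volume` are hypothetical helper names, and the check assumes that `os.stat().st_dev` differs between the volumes involved, which holds for ordinary local filesystems.

```python
import os
from pathlib import Path


def nearest_existing(path):
    """Walk up from `path` until a directory entry that exists is found."""
    p = Path(path).resolve()
    while not p.exists():
        p = p.parent
    return p


def same_volume(a, b):
    """Return True when both paths resolve to the same device ID.

    Paths that do not exist yet are checked via their nearest existing
    ancestor, since Stage A/C directories may be created on first use.
    """
    dev_a = os.stat(nearest_existing(a)).st_dev
    dev_b = os.stat(nearest_existing(b)).st_dev
    return dev_a == dev_b
```

With this helper, `same_volume(hub_a_dir, snapshot_dir)` should be true for Stage A, and `same_volume(hub_perf_dir, minio_run_dir)` should be false for Stage C.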

Example layout (directory names only; pick volumes/roots and pass via `--*-dir`
flags):

```
<perf-seed-A>/               bulk data + Stage A/B flow (one volume = move-friendly)
  hub-a/                     Stage A hub data dir (transient; snapshot-step rename source)
    servers/<moduleid>/
  s3-snapshot/               Preserved production server-state trees (read-only after Stage A)
    <moduleid>/
  hubs/                      Stage B per-bucket hub data dirs (transient)
    hub-b-zen-seed-packed/
  minio-data/                Stage B MinIO data dir (transient)
  minio-seeded-packed/       Preserved packed MinIO CAS (read-only after Stage B + preserve)
    README.txt
  minio-run/                 Stage C MinIO data dir (wiped + re-copied each run)
  perf-runs/                 Per-run archive: hub.log, logs/, hub.utrace, summary.json
    20260423-141530_zen-seed-packed/

<perf-seed-B>/               separate volume from <perf-seed-A> for measurement isolation
  hub-perf/                  Stage C hub data dir (wiped each run)
```

## Prerequisites

- Debug or release build of zenserver + minio: `xmake -y`
- `pip install boto3`
- AWS CLI v2 with an SSO profile configured (for Stage A only)
- Environment variables (or pass equivalents via CLI flags):
  - `ZEN_PERF_S3_URI` - source S3 bucket, e.g. `s3://your-bucket/optional-prefix/`
  - `ZEN_PERF_AWS_PROFILE` - AWS SSO profile name with read access to that bucket
  - `ZEN_PERF_AWS_REGION` - optional, defaults to `us-east-1`
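
As an illustration of how these variables fit together, the sketch below resolves them into a bucket, prefix, profile, and region. It is an assumption about how a caller might read them, not the scripts' actual code; `resolve_s3_settings` is a hypothetical name, and CLI-flag overrides are left out.

```python
import os
from urllib.parse import urlparse


def resolve_s3_settings(env=None):
    """Read the ZEN_PERF_* variables and split the S3 URI into bucket/prefix."""
    env = os.environ if env is None else env
    uri = env.get("ZEN_PERF_S3_URI")
    profile = env.get("ZEN_PERF_AWS_PROFILE")
    if not uri or not profile:
        raise ValueError("ZEN_PERF_S3_URI and ZEN_PERF_AWS_PROFILE must be set")

    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3://bucket/prefix/ URI, got {uri!r}")

    return {
        "bucket": parsed.netloc,
        "prefix": parsed.path.lstrip("/"),
        "profile": profile,
        # The region is optional and falls back to us-east-1.
        "region": env.get("ZEN_PERF_AWS_REGION") or "us-east-1",
    }
```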

## Stage A - snapshot real S3 data

One-time (or when you want a fresh snapshot from production).

```
export ZEN_PERF_S3_URI=s3://your-bucket/
export ZEN_PERF_AWS_PROFILE=your-sso-profile
python scripts/test_scripts/hub/seed_s3_snapshot.py \
    --snapshot-dir <perf-seed>/s3-snapshot \
    --hub-data-dir <perf-seed>/hub-a
```

Provisions `--module-count` modules from `$ZEN_PERF_S3_URI`, hibernates them,
then **moves** `hub-a/servers/<moduleid>/` to `s3-snapshot/<moduleid>/`. When
`--hub-data-dir` and `--snapshot-dir` share a volume (the recommended layout),
the move is an O(1) rename per module; across volumes it falls back to a byte
copy whose cost scales with the snapshot size. The hub data dir is wiped on the
next run regardless. The script triggers `aws sso login` automatically if the
SSO token is missing or expired.
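
The move step can be pictured with the standard library alone. This is a minimal sketch rather than the code in `seed_s3_snapshot.py`; `snapshot_modules` is a hypothetical name, and refusing to overwrite an existing snapshot entry is an assumption made here for safety. `shutil.move` renames when source and destination are on the same filesystem and otherwise copies and then removes the source.

```python
import shutil
from pathlib import Path


def snapshot_modules(hub_data_dir, snapshot_dir):
    """Move each <hub_data_dir>/servers/<moduleid>/ to <snapshot_dir>/<moduleid>/."""
    servers = Path(hub_data_dir) / "servers"
    dest_root = Path(snapshot_dir)
    dest_root.mkdir(parents=True, exist_ok=True)

    moved = []
    for src in sorted(servers.iterdir()):
        if not src.is_dir():
            continue
        dst = dest_root / src.name
        if dst.exists():
            # Assumption: never clobber a preserved snapshot silently.
            raise FileExistsError(f"snapshot already exists: {dst}")
        # Same volume: a rename. Different volume: copy, then delete source.
        shutil.move(str(src), str(dst))
        moved.append(src.name)
    return moved
```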

Module selection ranks all UUID-shaped folders by their
`incremental-state.cbo` `LastModified` (newest first, a proxy for
most-recently-accessed) and takes the top `--module-count`.
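
The ranking rule can be sketched as a pure function over an object listing. The input shape (dicts with `Key` and `LastModified`, as returned in the `Contents` of boto3's `list_objects_v2`) and the key layout `<prefix>/<moduleid>/incremental-state.cbo` are assumptions, as is the exact meaning of "UUID-shaped"; `pick_modules` is a hypothetical name.

```python
import re

# Assumption: "UUID-shaped" means the hyphenated 8-4-4-4-12 form or 32 hex digits.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}|[0-9a-f]{32}",
    re.IGNORECASE,
)


def pick_modules(objects, module_count):
    """Return up to `module_count` module IDs, newest state file first."""
    newest = {}
    for obj in objects:
        parts = obj["Key"].split("/")
        if len(parts) < 2 or parts[-1] != "incremental-state.cbo":
            continue
        module_id = parts[-2]
        if not UUID_RE.fullmatch(module_id):
            continue
        stamp = obj["LastModified"]
        if module_id not in newest or stamp > newest[module_id]:
            newest[module_id] = stamp
    ranked = sorted(newest, key=newest.get, reverse=True)
    return ranked[:module_count]
```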

Options:
- `--module-count N` (default 1000)
- `--snapshot-dir PATH` (required, e.g. `<perf-seed>/s3-snapshot`)
- `--hub-data-dir PATH` (required, e.g. `<perf-seed>/hub-a`)

## Stage B - seed MinIO from the snapshot

One-time, or when `s3-snapshot/` changes.

`seed_minio.py` seeds the `zen-seed-packed` bucket with pack ON
(`--hub-hydration-enable-pack=true` is hardcoded). The script provisions every
module found under `s3-snapshot/`, hibernates them, overlays the snapshot on
top of the hub's servers dir, then deprovisions all modules - which runs the
dehydrate path and uploads the content into the bucket.

```
python scripts/test_scripts/hub/seed_minio.py --wipe --bucket zen-seed-packed
python scripts/test_scripts/hub/preserve_minio_state.py --dest <perf-seed>/minio-seeded-packed
```

`preserve_minio_state.py` moves the resulting `minio-data/` to the preservation
dir by default (pass `--copy` to keep the source) and writes a README with
provenance.

Options of interest:
- `--bucket NAME` - bucket name (default `zen-seed-packed`).
- `--wipe` - removes the per-bucket hub data dir and the shared `minio-data/`
  dir before starting.
- `--module-count N` - caps the set (0 = every module in the snapshot dir).
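
The `--module-count` semantics (0 means no cap) can be sketched as follows. This is illustrative only: `select_snapshot_modules` is a hypothetical name, and sorting by directory name before capping is an assumption, since the script's actual ordering is not documented here.

```python
from pathlib import Path


def select_snapshot_modules(snapshot_dir, module_count=0):
    """List module directories under the snapshot dir, optionally capped."""
    modules = sorted(p.name for p in Path(snapshot_dir).iterdir() if p.is_dir())
    # 0 means "every module in the snapshot dir".
    return modules if module_count == 0 else modules[:module_count]
```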

## Stage C - run a perf iteration

Repeat as often as you want; each run starts from the preserved baseline.

```
python scripts/test_scripts/hub/run_minio_perf.py --bucket zen-seed-packed --trace
```

Steps:
1. Copies `--minio-seeded` over `--minio-run` so MinIO starts from a known state.
2. Wipes `--hub-data-dir` (unless `--no-wipe-hub`).
3. Starts MinIO and hub.
4. Provisions all modules, waits for them to reach `provisioned`, deprovisions them, and waits until they are gone.
5. Stops everything cleanly.
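
Steps 1 and 2 amount to resetting two directories before anything starts. The sketch below shows one way to do that with the standard library; it is not the code in `run_minio_perf.py`, and `reset_run_dirs` and its parameter names are hypothetical stand-ins for the `--minio-seeded`, `--minio-run`, `--hub-data-dir`, and `--no-wipe-hub` options.

```python
import shutil
from pathlib import Path


def reset_run_dirs(minio_seeded, minio_run, hub_data_dir, wipe_hub=True):
    """Restore the MinIO run dir from the preserved baseline and reset the hub dir."""
    seeded, run, hub = Path(minio_seeded), Path(minio_run), Path(hub_data_dir)
    if not seeded.is_dir():
        raise FileNotFoundError(f"preserved baseline not found: {seeded}")

    # Step 1: wipe the run dir and re-copy the baseline so MinIO starts
    # from a known state. The preserved baseline itself is only read.
    if run.exists():
        shutil.rmtree(run)
    shutil.copytree(seeded, run)

    # Step 2: wipe the hub data dir unless the caller opted out.
    if wipe_hub and hub.exists():
        shutil.rmtree(hub)
    hub.mkdir(parents=True, exist_ok=True)
```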

By default the hub runs with `--hub-enable-dehydration=false`, so MinIO isn't
modified; every iteration exercises the hydrate-only path against the same
baseline CAS.

Pass `--enable-dehydration` to run a full provision -> deprovision cycle that
includes re-upload (dehydrate) at deprovision time. Use this to measure the
dehydrate phase end-to-end against the seeded baseline. Note that the seeded
baseline diverges after an `--enable-dehydration` run; re-copy `--minio-seeded`
or re-run `preserve_minio_state.py` if you want to compare to the pristine state.

After each run the hub log, structured zenserver logs, any utrace file, and a
`summary.json` with the run's timings are copied into
`perf-runs/<timestamp>_<bucket>/` so Stage C runs can be compared
post-hoc. Override the destination with `--archive-dir PATH`.
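
The archive directory name follows the `<timestamp>_<bucket>` pattern shown in the layout (`20260423-141530_zen-seed-packed`). A minimal sketch of building such a path, assuming the timestamp format is `YYYYMMDD-HHMMSS` in local time as that example suggests; `archive_dir_for` is a hypothetical name.

```python
from datetime import datetime
from pathlib import Path


def archive_dir_for(perf_runs_root, bucket, started=None):
    """Build <perf_runs_root>/<timestamp>_<bucket> for one Stage C run."""
    # Assumption: local time, formatted like the example in the layout.
    started = datetime.now() if started is None else started
    stamp = started.strftime("%Y%m%d-%H%M%S")
    return Path(perf_runs_root) / f"{stamp}_{bucket}"
```

Because the timestamp sorts lexicographically, listing `perf-runs/` by name gives the runs in chronological order.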

## Resetting between runs

- **Keep**: `s3-snapshot/`, `minio-seeded-packed/`. These are expensive to rebuild.
- **Discard freely**: `hub-a/`, `hubs/`, `hub-perf/`, `minio-data/`, `minio-run/`.

To force a fresh MinIO seed: delete `minio-seeded-packed/` and re-run Stage B
+ preserve. To force a fresh S3 snapshot: delete `s3-snapshot/` and re-run
Stage A.