# Plutia

Plutia is a deterministic, verifiable PLC mirror for `plc.directory` with signed checkpoints and a proof-serving API.

## Key Capabilities

- Mirror, resolver, and thin modes.
- Pebble-backed state/index storage.
- Compressed append-only operation blocks (`zstd`) in mirror mode.
- Deterministic canonical operation handling and signature-chain verification.
- Signed Merkle checkpoints and DID inclusion proof API.
- Corruption detection and restart-safe ingestion.
- Measured storage in benchmarked runs is about 1.2 KB per operation.
- Benchmarked replay throughput is about 45x higher than naive replay in the same test setup.

## Trust Model

- Plutia mirrors data from `https://plc.directory`.
- Plutia validates operation signature chains and prev-link continuity according to configured verification policy.
- Plutia does **not** alter PLC authority or introduce consensus.
- Checkpoints are mirror commitments about what this mirror observed and verified, not global consensus.
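
The prev-link continuity check mentioned above can be pictured with a minimal sketch. The `Op` type and `checkPrevLinks` function are hypothetical names used only for illustration, not Plutia's internal API, and the sketch assumes a simple linear log. Signature verification is deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Op is a hypothetical, simplified view of a PLC operation:
// CID identifies the operation, Prev names the operation it follows.
type Op struct {
	CID  string
	Prev string // empty for the genesis operation
}

// checkPrevLinks returns an error if any operation does not point
// at the CID of the operation immediately before it.
func checkPrevLinks(ops []Op) error {
	if len(ops) == 0 {
		return errors.New("empty operation log")
	}
	if ops[0].Prev != "" {
		return errors.New("genesis operation must not have a prev link")
	}
	for i := 1; i < len(ops); i++ {
		if ops[i].Prev != ops[i-1].CID {
			return fmt.Errorf("op %d: prev %q does not match %q", i, ops[i].Prev, ops[i-1].CID)
		}
	}
	return nil
}

func main() {
	log := []Op{{CID: "a"}, {CID: "b", Prev: "a"}, {CID: "c", Prev: "b"}}
	fmt.Println(checkPrevLinks(log)) // <nil>

	log[2].Prev = "x" // break the chain
	fmt.Println(checkPrevLinks(log))
}
```

A real verifier would also check each operation's signature against the keys authorized by the preceding state; this sketch covers only the linkage half of that policy.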

## Modes

- `mirror`: stores full verifiable operation history (`data/ops/*.zst`) + state + proofs/checkpoints.
- `resolver`: stores resolved DID state/index only (no op block archive).
- `thin`: on-demand, verifiable DID resolver with persistent TTL/LRU cache; no full replay, no blocklog, no full history archive.

## Quick Start

```bash
task build
task verify:small
task bench
```

## Dev / Smoke Test (Docker Compose)

```bash
VERSION="$(cat VERSION)" \
COMMIT="$(git rev-parse --short HEAD)" \
BUILD_DATE="$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
docker compose -f docker-compose.yaml up --build
```

Compose build args are read from environment variables (`VERSION`, `COMMIT`, `BUILD_DATE`).
Equivalent direct build:

```bash
docker build \
  --build-arg VERSION="$(cat VERSION)" \
  --build-arg COMMIT="$(git rev-parse --short HEAD)" \
  --build-arg BUILD_DATE="$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  -t plutia:local .
```

In another terminal, generate the mirror signing key inside the running container:

```bash
docker compose -f docker-compose.yaml exec plutia /app/plutia keygen --out=/var/lib/plutia/mirror.key
```

Verify checkpoint and proof endpoints:

```bash
curl -sS http://127.0.0.1:8080/checkpoints/latest | jq .
DID="$(curl -sS 'http://127.0.0.1:8080/export?count=1' | head -n 1 | jq -r '.did')"
curl -sS "http://127.0.0.1:8080/did/${DID}/proof" | jq .
```

### CLI Commands

```bash
plutia serve --config=config.default.yaml [--max-ops=0]
plutia replay --config=config.default.yaml [--max-ops=0]
plutia verify --config=config.default.yaml --did=did:plc:example
plutia snapshot --config=config.default.yaml
plutia bench --config=config.default.yaml --max-ops=200000
plutia compare --config=config.default.yaml --remote=https://mirror.example.com
plutia keygen --out=./mirror.key [--force]
plutia version
```

## Versioning and Reproducible Builds

Plutia follows semantic versioning, starting at `v0.1.0`.

`plutia version` prints:

- `Version` (defaults to `dev` if not injected)
- `Commit`
- `BuildDate` (UTC RFC3339)
- `GoVersion`

Build metadata is injected through ldflags:

```bash
go build -trimpath \
  -ldflags "-X main.version=v0.1.0 -X main.commit=$(git rev-parse --short HEAD) -X main.buildDate=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  -o ./bin/plutia ./cmd/plutia
```

Task runner equivalent:

```bash
task build
```
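
For readers unfamiliar with `-X`, the sketch below shows the general pattern: the linker overwrites package-level string variables at build time. The variable names match the ldflags above; the `unknown` defaults for `commit` and `buildDate` and the output layout are assumptions for illustration, not necessarily what `plutia version` prints.

```go
package main

import (
	"fmt"
	"runtime"
)

// These are plain string variables so that the linker can replace them via
// -ldflags "-X main.version=... -X main.commit=... -X main.buildDate=...".
var (
	version   = "dev"     // documented default when nothing is injected
	commit    = "unknown" // assumed default, for illustration only
	buildDate = "unknown" // assumed default, for illustration only
)

// versionInfo formats the four fields listed above.
func versionInfo() string {
	return fmt.Sprintf("Version: %s\nCommit: %s\nBuildDate: %s\nGoVersion: %s",
		version, commit, buildDate, runtime.Version())
}

func main() {
	fmt.Println(versionInfo())
}
```

Building this file without ldflags reports `dev`; building it with the `go build` command above reports the injected values.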

## HTTP API

- `GET /health`
- `GET /metrics` (Prometheus)
- `GET /status` (includes build/version metadata)
- `GET /did/{did}`
- `GET /did/{did}/proof`
- `GET /checkpoints/latest`
- `GET /checkpoints/{sequence}`

## PLC API Compatibility

Plutia includes read-only compatibility endpoints for `plc.directory` API consumers:

- `GET /{did}` (returns `application/did+ld+json`)
- `GET /{did}/log`
- `GET /{did}/log/last`
- `GET /{did}/log/audit`
- `GET /{did}/data`
- `GET /export` (NDJSON, `application/jsonlines`, supports `count` up to `1000`, and `after` RFC3339 filtering based on ingested operation timestamps)

For audit/export compatibility fields, `createdAt` is sourced from the mirror's recorded ingest timestamp for each operation reference.
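
As a client-side illustration of the `/export` parameters, the following sketch builds a request URL and enforces the documented `count` ceiling and RFC3339 format for `after`. The helper name `exportURL` and the lower bound of 1 for `count` are assumptions made for this example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
	"time"
)

const maxExportCount = 1000 // documented upper bound for count

// exportURL builds an /export URL under base. after may be empty;
// otherwise it must be an RFC3339 timestamp.
func exportURL(base string, count int, after string) (string, error) {
	if count < 1 || count > maxExportCount { // lower bound is an assumption
		return "", fmt.Errorf("count must be between 1 and %d, got %d", maxExportCount, count)
	}
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/export"

	q := url.Values{}
	q.Set("count", fmt.Sprint(count))
	if after != "" {
		if _, err := time.Parse(time.RFC3339, after); err != nil {
			return "", fmt.Errorf("after must be RFC3339: %w", err)
		}
		q.Set("after", after)
	}
	u.RawQuery = q.Encode() // keys are emitted in sorted order
	return u.String(), nil
}

func main() {
	u, err := exportURL("http://127.0.0.1:8080", 100, "2024-01-01T00:00:00Z")
	fmt.Println(u, err)
	// http://127.0.0.1:8080/export?after=2024-01-01T00%3A00%3A00Z&count=100 <nil>
}
```

The response is NDJSON, so a consumer would typically read it line by line and decode each line as one JSON object.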

Write behavior is intentionally unsupported:

- `POST /{did}` returns `405 Method Not Allowed` with `Allow: GET`

Verification features are additive extensions and remain available under:

- `GET /did/{did}`
- `GET /did/{did}/proof`
- `GET /checkpoints/latest`
- `GET /checkpoints/{sequence}`

## Metrics and Observability

Prometheus series exposed at `/metrics` include:

- `ingest_ops_total`
- `ingest_ops_per_second`
- `ingest_lag_ops`
- `verify_failures_total`
- `checkpoint_duration_seconds`
- `checkpoint_sequence`
- `disk_bytes_total`
- `did_count`
- `thin_cache_hits_total`
- `thin_cache_misses_total`
- `thin_cache_entries`
- `thin_cache_evictions_total`

Operational hardening includes:

- Per-IP token-bucket rate limits (stricter on proof endpoints).
- Per-request timeout (default `10s`) with cancellation propagation.
- Upstream ingestion retries with exponential backoff and `429` handling.
- Graceful SIGINT/SIGTERM shutdown with flush-before-exit behavior.
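
To make the retry behavior concrete, here is a minimal sketch of a capped exponential backoff schedule. It is not Plutia's implementation: the function name `backoffDelay` and the `500ms` and `10s` values are assumptions chosen for the example, standing in for whatever `http_retry_base_delay` and `http_retry_max_delay` are set to. Jitter is omitted.

```go
package main

import (
	"fmt"
	"time"
)

// backoffDelay returns the wait before retry number attempt (1-based):
// base, 2*base, 4*base, ... capped at max.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	if attempt < 1 {
		attempt = 1
	}
	d := base
	for i := 1; i < attempt; i++ {
		d *= 2
		if d >= max || d <= 0 { // d <= 0 guards against integer overflow
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

func main() {
	// Hypothetical settings: 500ms base delay, 10s ceiling.
	for attempt := 1; attempt <= 6; attempt++ {
		fmt.Println(attempt, backoffDelay(attempt, 500*time.Millisecond, 10*time.Second))
	}
	// 1 500ms
	// 2 1s
	// 3 2s
	// 4 4s
	// 5 8s
	// 6 10s
}
```

A caller would sleep for the returned duration between attempts and stop after the configured maximum number of attempts.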

## Running Your Own Mirror

### System Requirements

- Go 1.25+
- SSD-backed storage recommended
- RAM: 4 GB minimum, 8 GB+ recommended for larger throughput
- CPU: multi-core recommended for parallel verification workers

### Disk Projections

Using benchmarked density (~1.2 KB/op total):

- 5,000,000 ops: ~6 GB
- 10,000,000 ops: ~12 GB

Always keep extra headroom for compaction, checkpoints, and operational buffers.
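
The projections above are straightforward arithmetic, which the following sketch reproduces for other operation counts. It assumes 1,200 bytes per operation and decimal gigabytes; the 25% headroom figure is an arbitrary example, not a recommendation from the benchmarks.

```go
package main

import "fmt"

const bytesPerOp = 1200 // ~1.2 KB/op, the benchmarked density quoted above

// projectedDiskGB estimates storage in decimal gigabytes for ops operations,
// with headroomPercent added on top (for example 25 for +25%).
func projectedDiskGB(ops, headroomPercent int64) float64 {
	raw := ops * bytesPerOp
	total := raw + raw*headroomPercent/100
	return float64(total) / 1e9
}

func main() {
	for _, ops := range []int64{5_000_000, 10_000_000} {
		fmt.Printf("%d ops: ~%.1f GB raw, ~%.1f GB with 25%% headroom\n",
			ops, projectedDiskGB(ops, 0), projectedDiskGB(ops, 25))
	}
	// 5000000 ops: ~6.0 GB raw, ~7.5 GB with 25% headroom
	// 10000000 ops: ~12.0 GB raw, ~15.0 GB with 25% headroom
}
```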

### Example `config.default.yaml`

See [`config.default.yaml`](./config.default.yaml). All supported config keys:

- `mode`
- `data_dir`
- `plc_source`
- `verify`
- `zstd_level`
- `block_size_mb`
- `checkpoint_interval`
- `commit_batch_size`
- `verify_workers`
- `export_page_size`
- `replay_trace`
- `thin_cache_ttl`
- `thin_cache_max_entries`
- `listen_addr`
- `mirror_private_key_path`
- `poll_interval`
- `request_timeout`
- `http_retry_max_attempts`
- `http_retry_base_delay`
- `http_retry_max_delay`
- `rate_limit.resolve_rps`
- `rate_limit.resolve_burst`
- `rate_limit.proof_rps`
- `rate_limit.proof_burst`

Config files are optional. If `--config` points to a missing file, Plutia falls back to internal defaults and then applies any `PLUTIA_*` environment overrides.

### Example `docker-compose.yaml`

```yaml
services:
  plutia:
    image: ghcr.io/fuwn/plutia:0.1.0
    command: ["plutia", "serve", "--config=/etc/plutia/config.yaml"]
    ports:
      - "8080:8080"
    volumes:
      - ./config.default.yaml:/etc/plutia/config.yaml:ro
      - ./data:/var/lib/plutia
    restart: unless-stopped
```

### Upgrade and Backup Guidance

- Stop the process cleanly (`SIGTERM`) to flush pending writes.
- Back up `data/index`, `data/ops`, and `data/checkpoints` together.
- Keep the same `mode` per data directory across restarts.
- Upgrade binaries first in staging, then production using the same on-disk data.

## Which Mode Should I Run?

| Mode | Disk Usage (at current PLC size) | CPU Usage | Ingestion Behavior | Verification Guarantees | Historical Archive? | Checkpoint Proof Support? | Recommended For |
|---|---|---|---|---|---|---|---|
| `mirror` | ~1.17 KB per operation (~100 GB at current PLC head; scales linearly) | High sustained | Continuous full replay + polling | Full chain verification at ingest (`verify=full`), deterministic state, signed checkpoints | Yes | Yes | Independent mirrors, auditors, proof-serving infrastructure |
| `resolver` | ~0.8 KB/op (~60–70 GB at current PLC head) | Medium to high sustained | Continuous replay + polling, state-only storage | Full verification at ingest; stores resolved state without retaining historical block archive | No | No | Operators who want full verification with lower storage overhead |
| `thin` | ~475 bytes per cached DID (scales with active usage, not global PLC size) | Low idle, bursty on requests | No global replay; fetches and verifies per DID on demand | Full signature and prev-link verification for each requested DID chain before caching | No | No | Edge deployments, small VPS instances, low-disk resolvers |

In `mirror` and `resolver` modes, disk usage scales linearly with total PLC operation count; in `thin` mode it scales with the number of cached DIDs.

### Mirror Mode

- Maintains a full operation archive and materialized state.
- Generates signed checkpoints and supports proof serving.
- Preserves full DID audit history.
- Operates with the highest autonomy because it does not depend on upstream for resolved DIDs once ingested.
- Has the largest disk footprint.
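
For intuition about what an inclusion proof lets a client check, here is a generic Merkle proof verifier. It is a sketch only: Plutia's actual leaf encoding, hash function, node ordering, and proof JSON shape are not described in this README, so SHA-256, the `ProofStep` type, and the tree layout below are all assumptions.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// ProofStep is one sibling hash on the path from a leaf to the root.
type ProofStep struct {
	Sibling []byte
	Left    bool // true if the sibling is the left-hand node
}

// hashPair hashes the concatenation of two child nodes.
func hashPair(left, right []byte) []byte {
	h := sha256.New()
	h.Write(left)
	h.Write(right)
	return h.Sum(nil)
}

// verifyInclusion recomputes the root from leaf and proof,
// then compares the result with the expected root.
func verifyInclusion(leaf []byte, proof []ProofStep, root []byte) bool {
	cur := leaf
	for _, step := range proof {
		if step.Left {
			cur = hashPair(step.Sibling, cur)
		} else {
			cur = hashPair(cur, step.Sibling)
		}
	}
	return bytes.Equal(cur, root)
}

func main() {
	// Build a four-leaf tree by hand.
	leaf := func(s string) []byte { h := sha256.Sum256([]byte(s)); return h[:] }
	a, b, c, d := leaf("did:plc:a"), leaf("did:plc:b"), leaf("did:plc:c"), leaf("did:plc:d")
	ab, cd := hashPair(a, b), hashPair(c, d)
	root := hashPair(ab, cd)

	// Proof for leaf c: sibling d on the right, then subtree ab on the left.
	proof := []ProofStep{{Sibling: d}, {Sibling: ab, Left: true}}
	fmt.Println(verifyInclusion(c, proof, root)) // true
	fmt.Println(verifyInclusion(a, proof, root)) // false
}
```

Recomputing the root is only half of the check. A client would also verify the checkpoint signature over that root against the mirror's public key, since the root is a commitment by this mirror rather than global consensus.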

### Resolver Mode

- Replays and verifies the global stream, but does not keep historical op blocks.
- Supports full verification policies while reducing storage versus mirror mode.
- Maintains current resolved state for all ingested DIDs.
- Offers a balanced operational profile for most self-hosted operators.
- Does not support checkpoint inclusion proofs, because it does not retain the historical block references required for Merkle inclusion verification.

### Thin Mode

- Fetches DID logs from upstream on demand and verifies them locally.
- Stores only verified latest DID state plus cache metadata.
- Uses minimal disk and scales with active DID usage, not total PLC history.
- Depends on upstream availability at request time.
- Does not support checkpoint inclusion proofs.
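
The TTL/LRU caching idea behind thin mode can be sketched in a few dozen lines. This is an in-memory illustration under stated assumptions, not Plutia's persistent cache: the `Cache` type, its methods, and the string-valued state are hypothetical, and the sketch is not safe for concurrent use.

```go
package main

import (
	"container/list"
	"fmt"
	"time"
)

// entry is one cached DID; state stands in for the verified DID state.
type entry struct {
	did, state string
	expires    time.Time
}

// Cache combines a TTL with LRU eviction.
type Cache struct {
	ttl   time.Duration
	max   int                      // maximum entries; 0 means unbounded
	now   func() time.Time         // clock, replaceable for testing
	order *list.List               // front = most recently used
	items map[string]*list.Element // DID -> element in order
}

func NewCache(ttl time.Duration, max int) *Cache {
	return &Cache{ttl: ttl, max: max, now: time.Now,
		order: list.New(), items: map[string]*list.Element{}}
}

// Get returns the cached state if it is present and has not expired.
func (c *Cache) Get(did string) (string, bool) {
	el, ok := c.items[did]
	if !ok {
		return "", false
	}
	e := el.Value.(*entry)
	if !c.now().Before(e.expires) { // expired: drop it and report a miss
		c.order.Remove(el)
		delete(c.items, did)
		return "", false
	}
	c.order.MoveToFront(el)
	return e.state, true
}

// Put stores state for did, evicting the least recently used entry when full.
func (c *Cache) Put(did, state string) {
	expires := c.now().Add(c.ttl)
	if el, ok := c.items[did]; ok {
		e := el.Value.(*entry)
		e.state, e.expires = state, expires
		c.order.MoveToFront(el)
		return
	}
	if c.max > 0 && c.order.Len() >= c.max {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).did)
	}
	c.items[did] = c.order.PushFront(&entry{did: did, state: state, expires: expires})
}

func main() {
	clock := time.Unix(0, 0)
	c := NewCache(time.Minute, 2)
	c.now = func() time.Time { return clock } // fake clock for a repeatable demo

	c.Put("did:plc:a", "A")
	c.Put("did:plc:b", "B")
	c.Get("did:plc:a")      // touch a, so b becomes least recently used
	c.Put("did:plc:c", "C") // cache is full: evicts b
	_, okB := c.Get("did:plc:b")

	clock = clock.Add(2 * time.Minute) // move past the TTL
	_, okA := c.Get("did:plc:a")
	fmt.Println(okB, okA) // false false
}
```

On a miss, a thin-mode resolver would fetch the DID log from upstream, verify it, and only then call `Put`, which is why resolution depends on upstream availability for misses and refreshes.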

### Quick Recommendations

- `< 10 GB` available disk: run `thin`.
- `20–80 GB` available disk: run `resolver`.
- `100 GB+` available disk: run `mirror`.

### Security Tradeoffs

- `thin` mode still verifies signature chains and prev linkage, but resolution depends on upstream availability for cache misses or refreshes.
- `mirror` mode is the most self-contained because it stores full verified history locally and can serve proofs from local checkpoints.
- `resolver` mode sits between the two: it verifies globally like mirror mode but does not retain a full historical archive for proof reconstruction.

## Mirror Comparison

Use:

```bash
plutia compare --config=config.default.yaml --remote=https://mirror.example.com
```

The command fetches remote `/checkpoints/latest` and compares:

- checkpoint sequence
- DID Merkle root
- signature presence

Behavior:

- same sequence + different root => divergence warning and non-zero exit
- different sequences => reports which mirror is ahead and exits non-zero
- matching sequence/root/signature presence => success
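
The decision rules above can be summarized as a small pure function. This is an illustrative sketch, not the `plutia compare` source: the `Checkpoint` field names are hypothetical, exit code `1` stands in for "non-zero", and treating differing signature presence as a failure is an assumption, since the README specifies only the matching case.

```go
package main

import "fmt"

// Checkpoint holds the three properties the comparison looks at.
// Field names are illustrative, not Plutia's wire format.
type Checkpoint struct {
	Sequence uint64
	DIDRoot  string
	HasSig   bool
}

// compare returns a human-readable verdict and a process exit code.
func compare(local, remote Checkpoint) (string, int) {
	switch {
	case local.Sequence > remote.Sequence:
		return "local mirror is ahead", 1
	case local.Sequence < remote.Sequence:
		return "remote mirror is ahead", 1
	case local.DIDRoot != remote.DIDRoot:
		return "divergence: same sequence, different DID Merkle root", 1
	case local.HasSig != remote.HasSig:
		return "signature presence differs", 1 // assumed outcome, see above
	default:
		return "checkpoints match", 0
	}
}

func main() {
	local := Checkpoint{Sequence: 42, DIDRoot: "abc", HasSig: true}
	remote := Checkpoint{Sequence: 42, DIDRoot: "def", HasSig: true}
	msg, code := compare(local, remote)
	fmt.Println(msg, code) // divergence: same sequence, different DID Merkle root 1
}
```

Because differing sequences exit non-zero even when neither mirror is wrong, a monitoring script may want to retry after the slower mirror catches up before treating the result as an alert.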

## License

MIT OR Apache-2.0