author: Stefan Boberg <[email protected]> 2026-05-05 15:47:48 +0200
committer: GitHub Enterprise <[email protected]> 2026-05-05 15:47:48 +0200
commit: 01286c6233347d561064fc9e6cf9deaf2087ceb7
tree: bdbfdf01725baa2d2dd3d73727e6506b41421dff /src/zen/trace/trace_model.cpp
parent: hub async s3 client (#1024)
sessions: persist to disk, prune, track client liveness, accept UE_LOGFMT (#1014) (HEAD, main)

Branch started as a sessions-service overhaul (persistence, client liveness, UE_LOGFMT intake) and grew to pick up adjacent infrastructure work: an early-startup log backlog, a hardened `MemoryArena`, the `zen trace serve` viewer gaining a counter view + compact timeline + tabbed callsite panel, defensive fixes in the third-party `tourist` trace parser, a series of allocation reductions across the HTTP and compact-binary hot paths, and a new `zen sessions` CLI command tree.

## Sessions service

**Persistence.** Each session lives on disk under `<DataRoot>/sessions/<id>/` as `info.cb` (metadata) plus `log.bin` (length-prefixed CbObject log records). On startup the service scans that directory and loads prior sessions as ended sessions, preloading the tail of each log so historical views work after a restart. `SessionLog` is noexcept-constructed and falls back to a disabled state on disk errors, so a bad disk can't take down `RegisterSession`. `GetSession` falls back to the ended-sessions list (fixes historical log fetches over HTTP). `LoadTail` counts only successfully-parsed records.

**Pruning.** Periodic cleanup task drops ended sessions once any of three caps is exceeded: age (default 1 year), count (default 1000), or total on-disk footprint (default 50 MiB). Runs 30 s after startup, hourly thereafter. Active sessions never pruned; disk removal and directory stat happen outside the exclusive lock so a slow filesystem can't stall lookups.

**Client liveness.** Sessions carry a `ProcessHandle` for the client-reported pid, captured at registration time so Windows pid recycling can't produce false positives. A 30 s asio timer probes liveness and ends dead sessions through the normal remove path, producing a synthetic `Session ended: process exited (...)` line persisted to `log.bin`. Windows decodes common NTSTATUS exit codes to human names (Ctrl-C, access violation, stack overflow, ...); POSIX stays at plain `process exited`.
Clients auto-fill `ClientPid` only for local targets (unix socket / loopback); the server defensively accepts pids only from `IsLocalMachineRequest()` peers. zenserver also reports its own pid when registering its self-session, so it shows up with a real pid in the dashboard and `zen sessions ls`.

**Synthetic end-of-session line.** `RemoveSession` takes an optional reason; before the session moves to the ended list it appends an Info-level `Session ended[: reason]` entry through the normal log path (released outside `m_Lock`). Current reasons: `client request` (HTTP DELETE), `server shutdown` (self-session), `process exited (...)` (liveness).

**UE_LOGFMT structured entries.** `POST /sessions/{id}/log` now accepts `{level, logger, format, fields}` alongside the existing `{level, logger, message}` shape. New `logtemplate.{h,cpp}` implements UE's `StructuredLog.cpp` template grammar (field paths with `.name` / `[N]`, `{{`/`}}` escapes, `$text` / `$format` / `$locformat` object conventions, bounded recursion). Renders to a displayable message at intake while persisting raw format + fields so a future UI can drill into fields without another schema bump. Hot path is zero-alloc: renders into `ExtendableStringBuilder<256>` using stack-buffered `Oid::ToString` / `IoHash::ToHexString` overloads. UI shows a `{…}` marker with the raw template + JSON-pretty fields on hover.

**Parent sessions.** `SessionInfo` gains `parent_session_id`; hub-managed storage server child processes inherit the hub's session id via `--parent-session=<id>`. `ZEN_SESSIONS_URL` env var becomes a fallback for `--sessions-url` / config when neither is provided. The in-process session log sink is disabled when a remote sessions target is configured (logs flow through `SessionsServiceClient` instead). The sessions UI groups child sessions under their parent (collapsible/expandable, sorts as a unit, supports nesting).

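
To make the UE_LOGFMT template handling described above more concrete, here is a minimal sketch of placeholder substitution with `{{` / `}}` escapes. It is an illustration only, not the `logtemplate.{h,cpp}` implementation: `RenderTemplateSketch` and the flat string map are hypothetical, it allocates a `std::string` (unlike the zero-alloc hot path), and it omits field paths (`.name` / `[N]`), the `$text` / `$format` / `$locformat` conventions, and bounded recursion.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// Hypothetical field container for this sketch: flat name -> already-stringified value.
using FieldMapSketch = std::map<std::string, std::string, std::less<>>;

// Hypothetical helper (not part of the commit): substitutes `{name}` from Fields
// and unescapes `{{` / `}}`. Unknown or unterminated placeholders are kept verbatim.
inline std::string RenderTemplateSketch(std::string_view Format, const FieldMapSketch& Fields)
{
    std::string Out;
    Out.reserve(Format.size());
    for (size_t i = 0; i < Format.size();)
    {
        const char C       = Format[i];
        const bool HasNext = i + 1 < Format.size();
        if (C == '{' && HasNext && Format[i + 1] == '{')
        {
            Out.push_back('{');  // `{{` escape
            i += 2;
        }
        else if (C == '}' && HasNext && Format[i + 1] == '}')
        {
            Out.push_back('}');  // `}}` escape
            i += 2;
        }
        else if (C == '{')
        {
            const size_t Close = Format.find('}', i + 1);
            if (Close == std::string_view::npos)
            {
                Out.append(Format.substr(i));  // unterminated: emit the rest as-is
                break;
            }
            const std::string_view Name = Format.substr(i + 1, Close - i - 1);
            const auto             It   = Fields.find(Name);
            if (It != Fields.end())
            {
                Out.append(It->second);
            }
            else
            {
                Out.append(Format.substr(i, Close - i + 1));  // unknown field: keep placeholder
            }
            i = Close + 1;
        }
        else
        {
            Out.push_back(C);
            ++i;
        }
    }
    return Out;
}
```

With fields `{"name": "zen"}`, the template `hello {name}` renders to `hello zen`, and `{{literal}}` renders to `{literal}`. Keeping unknown placeholders verbatim is a choice made for this sketch; the commit message does not say how the real renderer treats missing fields.
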

**Platform reporting.** `SessionInfo` gains `Platform`, flowed end-to-end: client auto-fills via `GetRuntimePlatformName()`, server persists in `info.cb` (`plat`) and emits on GET. UI renders as a SimpleIcons-style inline SVG (windows / macOS / iOS / linux / wine / android / playstation / xbox / nintendo) with case-insensitive alias resolution (Win32/Win64, PS4/PS5, XSX/XSS, NintendoSwitch, iPhone/iPad, Darwin/OSX). Unknown values fall back to text; sorting runs on the underlying string.

**WebSocket log streaming.** Sessions UI moves from 2 s polling to a WebSocket push model. New `WsSubscriber` has a stable id + helper methods. UI caps the log-line DOM at 5 000 entries with a shared cursor-regression helper, factored out of two call sites. Per-broadcast allocations trimmed on the push path; fixed a stack overrun in the WS log broadcast hex-id buffer.

**Log memory.** `LogEntry::Level` is now `logging::LogLevel` (1 byte) instead of `std::string` (~32 B); this saves ~310 KB per full 10 k-entry deque and eliminates a per-message allocation in the in-proc sink. On-disk format writes an int32 and accepts either int or legacy string on read. `LogEntry` strings now live in a `MemoryArena`; logger names are interned across the deque. `SessionLog::Append` and `WriteSessionInfoFile` drop their `UniqueBuffer` round-trip and write `CbObject::GetView()` straight through `BasicFile` / `SafeWriteFile`. Multi-entry `POST /log` batched under one lock + one push.

**In-proc log timestamps.** `InProcSessionLogSink::TimePointToDateTime` previously preserved only whole seconds, so every in-proc entry rendered at `.000` ms in the dashboard and `zen sessions tail`. It now adds the sub-second part (nanoseconds → 100 ns ticks) to keep ms precision end-to-end.

**UI.** Side "Session Details" panel is gone; its info is inline in the table (appname, mode, platform, id, timestamps, this/log pills, active dot).
Bottom panel is a tabbed `Log | Metadata` view with a right-side "Session Information" panel beside metadata; log-only controls (filter, newest-first, follow, log-level filter, expand/collapse) hide when Metadata is active, polling keeps running across tab switches. Wide-mode toggle fills the viewport edge-to-edge. Log lines show the logger category; timestamps render in 24 h with zero-padded fields regardless of locale. Sessions list defaults to All / 10 per page / created-desc, gains click-to-sort headers on the full dataset, a header filter box, and a pager aligned to the table's right edge. Duplicate auto-injected `<h1>Sessions</h1>` removed.

## `zen sessions` CLI

New command tree on the `zen` client for inspecting the sessions service from the terminal:

- **`zen sessions ls`**: lists sessions (active first, ended next; newest-first within each group) with id, status, app/mode, pid, created, duration, and log count. Supports `--status active|ended|all` (default `all`).
- **`zen sessions status`**: prints the sessions service summary: self id, active / ended counts, and the read/write/delete/list/request/bad-request counters from `/stats/sessions`.
- **`zen sessions tail [session]`**: tails a session's log. With no argument it tails zenserver's own session (resolved via `/sessions/list`'s `self_id`); an explicit 24-hex id targets any session, including ended ones (historical replay). `--lines N` (default 50, 0 = all buffered) trims the initial dump client-side. `--follow` prefers a WebSocket push subscription on `/sessions/ws` for sub-second latency; on upgrade failure (older server, blocked port, unix-socket transport) it falls back to HTTP cursor polling at `--interval-ms` (default 500), with sleeps chunked to 50 ms so Ctrl-C reacts quickly.
Output matches `zen::logging::FullFormatter` (`[YY-MM-DD HH:MM:SS.mmm] [lvl] [logger] message`); on a TTY the level is colored and the logger is bold, with continuation lines indented under the message column using the *visible* prefix width. 404 surfaces as `(session ended)` and connection errors as `(server gone)`; both are clean exits, so stopping the server mid-tail no longer prints a stack trace.
- **`zen sessions ui`**: opens `<host>/dashboard/?page=sessions` in the user's default browser. Rejects unix-socket hosts.

A small `ZenServiceClient::IsUnixSocket()` helper now wraps the unix-socket check used by `ui`, `sessions tail` (WS path), and `sessions ui`.

## Logging

`BacklogSink` captures early-startup log entries in a fixed-capacity ring so late-attached sinks (session sink, file sink) can replay them. Detaches from the broadcast list when disabled; backed by destructor-only cleanup (no `unique_ptr` indirection per entry). Tuned defaults so the backlog covers typical bring-up without unbounded growth.

## `zen trace serve` viewer

- Compact timeline mode for high-density views.
- New `TRACE_INT_VALUE` / `TRACE_FLOAT_VALUE` counter trace points + a counters page in the viewer.
- Callsite tables collapsed into a single tabbed panel.
- Lossless `Oid <-> Guid` bridge for trace session ids; trace `SessionId` plumbed through.
- `tourist` parser hardening: bounds-check `BufferStream::read`, validate `Type::info_size` before `patch()`, convert `parse_important_aux` to a loop (avoids deep recursion), widen `ParserPool` index to `uint32`, bounds-check field offsets in the dispatcher, pin `Types::parse` buffer up-front.

## `MemoryArena`

Configurable chunk size, inline chunk list, oversize requests routed to truly-dedicated chunks (no slack waste, no fragmentation when one allocation is much larger than the chunk).

## Allocation cleanups across hot paths

- `zenhttp::HttpRequestRouter::HandleRequest` and `FormatPackageMessageInternal`: drop heap allocations.
- Compact-binary validation: `eastl::fixed_vector` + `eastl::sort`; eliminate `std::vector` churn.
- `zenserverprocess`: trim transient allocations in spawn paths.
- Sessions HTTP intake / broadcast: drop transient `std::string` allocs.

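
The in-proc timestamp fix described under "In-proc log timestamps" amounts to keeping the sub-second remainder when converting a `std::chrono` time point to 100 ns ticks. The following is a minimal sketch of that arithmetic, not the actual `InProcSessionLogSink::TimePointToDateTime` code: `ToTicks100nsSketch` is a hypothetical name, and it counts ticks from the Unix epoch, whereas the epoch used by the project's date/time type is not stated in the commit message.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: converts a system_clock time point to 100 ns ticks since
// the Unix epoch (assumed epoch), preserving the sub-second part instead of
// truncating to whole seconds.
inline int64_t ToTicks100nsSketch(std::chrono::system_clock::time_point Tp)
{
    using namespace std::chrono;
    constexpr int64_t TicksPerSecond = 10'000'000;  // 1 s = 10^7 ticks of 100 ns

    const auto SinceEpoch   = Tp.time_since_epoch();
    const auto WholeSeconds = duration_cast<seconds>(SinceEpoch);
    const auto SubSecondNs  = duration_cast<nanoseconds>(SinceEpoch - WholeSeconds);

    // Whole seconds alone would render every entry at .000 ms; adding the
    // remainder (nanoseconds / 100) keeps millisecond precision.
    return WholeSeconds.count() * TicksPerSecond + SubSecondNs.count() / 100;
}
```

For a time point 5 s and 123 ms after the epoch, the whole-second part contributes 50,000,000 ticks and the remainder 1,230,000 ticks.
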
Diffstat (limited to 'src/zen/trace/trace_model.cpp')
-rw-r--r-- src/zen/trace/trace_model.cpp 209
1 files changed, 209 insertions, 0 deletions
diff --git a/src/zen/trace/trace_model.cpp b/src/zen/trace/trace_model.cpp
index ac81161a1..c11e2c47c 100644
--- a/src/zen/trace/trace_model.cpp
+++ b/src/zen/trace/trace_model.cpp
@@ -387,6 +387,26 @@ begin_outline(CsvProfiler, Metadata)
field(uint8[], Key)
field(uint8[], Value)
end_outline()
+
+// Counters trace events (UE CountersTrace / zen ZEN_TRACE_INT_VALUE).
+begin_outline(Counters, Spec)
+ field(uint16, Id)
+ field(uint8, Type)
+ field(uint8, DisplayHint)
+ field(uint8[], Name)
+end_outline()
+
+begin_outline(Counters, SetValueInt)
+ field(uint64, Cycle)
+ field(int64, Value)
+ field(uint16, CounterId)
+end_outline()
+
+begin_outline(Counters, SetValueFloat)
+ field(uint64, Cycle)
+ field(double, Value)
+ field(uint16, CounterId)
+end_outline()
// clang-format on
//////////////////////////////////////////////////////////////////////////////
@@ -1471,6 +1491,138 @@ private:
};
//////////////////////////////////////////////////////////////////////////////
+// Counters analyzer -- consumes Counters.Spec / SetValueInt / SetValueFloat
+// (UE TRACE_INT_VALUE / TRACE_FLOAT_VALUE / zen ZEN_TRACE_INT_VALUE etc.).
+// Spec events register a counter id; SetValue events emit a sample. We keep
+// per-counter time series for the viewer / report to render.
+
+class CountersAnalyzer : public Analyzer
+{
+public:
+ explicit CountersAnalyzer(const TraceTiming* Timing = nullptr) : m_Timing(Timing) {}
+
+ void subscribe(Vector<Subscription>& Subs) override
+ {
+ Subs.emplace_back(this, &CountersAnalyzer::OnSpec);
+ Subs.emplace_back(this, &CountersAnalyzer::OnSetValueInt);
+ Subs.emplace_back(this, &CountersAnalyzer::OnSetValueFloat);
+ }
+
+ struct EditableSeries
+ {
+ zen::trace_detail::TraceModel::CounterSeries Series;
+ bool HasMin = false;
+ };
+
+ const eastl::hash_map<uint16_t, zen::trace_detail::TraceModel::CounterDef>& Defs() const { return m_Defs; }
+ eastl::hash_map<uint16_t, EditableSeries>& MutableSeriesMap() { return m_Series; }
+
+private:
+ uint32_t CycleToTimeUs(uint64_t Cycle) const
+ {
+ if (!m_Timing || m_Timing->Freq == 0)
+ {
+ return 0;
+ }
+ uint64_t Elapsed = (Cycle >= m_Timing->Base) ? (Cycle - m_Timing->Base) : 0;
+ return uint32_t(Elapsed * 1'000'000 / m_Timing->Freq);
+ }
+
+ static std::string DecodeAnsiName(const Array<uint8[]>& Data)
+ {
+ const uint8_t* P = Data.get();
+ size_t Size = Data.get_size();
+ if (!P || Size == 0)
+ {
+ return {};
+ }
+ return std::string(reinterpret_cast<const char*>(P), Size);
+ }
+
+ void OnSpec(const Counters_Spec& Ev)
+ {
+ uint16_t Id = uint16_t(Ev.Id());
+ if (Id == 0)
+ {
+ return;
+ }
+ zen::trace_detail::TraceModel::CounterDef Def;
+ Def.Id = Id;
+ Def.Type = uint8_t(Ev.Type());
+ Def.DisplayHint = uint8_t(Ev.DisplayHint());
+ Def.Name = DecodeAnsiName(Ev.Name());
+ if (Def.Name.empty())
+ {
+ Def.Name = fmt::format("counter_{}", Id);
+ }
+ m_Defs[Id] = std::move(Def);
+ }
+
+ EditableSeries& EnsureSeries(uint16_t Id, uint8_t Type)
+ {
+ auto It = m_Series.find(Id);
+ if (It == m_Series.end())
+ {
+ EditableSeries E;
+ E.Series.Id = Id;
+ E.Series.Type = Type;
+ It = m_Series.emplace(Id, std::move(E)).first;
+ }
+ return It->second;
+ }
+
+ void OnSetValueInt(const Counters_SetValueInt& Ev)
+ {
+ uint16_t Id = uint16_t(Ev.CounterId());
+ if (Id == 0)
+ {
+ return;
+ }
+ EditableSeries& E = EnsureSeries(Id, /*Int*/ 0);
+ double Value = double(int64_t(Ev.Value()));
+ uint32_t TimeUs = CycleToTimeUs(Ev.Cycle());
+ E.Series.Samples.push_back({TimeUs, Value});
+ ++E.Series.Count;
+ if (!E.HasMin || Value < E.Series.Min)
+ {
+ E.Series.Min = Value;
+ E.HasMin = true;
+ }
+ if (Value > E.Series.Max)
+ {
+ E.Series.Max = Value;
+ }
+ }
+
+ void OnSetValueFloat(const Counters_SetValueFloat& Ev)
+ {
+ uint16_t Id = uint16_t(Ev.CounterId());
+ if (Id == 0)
+ {
+ return;
+ }
+ EditableSeries& E = EnsureSeries(Id, /*Float*/ 1);
+ double Value = double(Ev.Value());
+ uint32_t TimeUs = CycleToTimeUs(Ev.Cycle());
+ E.Series.Samples.push_back({TimeUs, Value});
+ ++E.Series.Count;
+ if (!E.HasMin || Value < E.Series.Min)
+ {
+ E.Series.Min = Value;
+ E.HasMin = true;
+ }
+ if (Value > E.Series.Max)
+ {
+ E.Series.Max = Value;
+ }
+ }
+
+ const TraceTiming* m_Timing = nullptr;
+ eastl::hash_map<uint16_t, zen::trace_detail::TraceModel::CounterDef> m_Defs;
+ eastl::hash_map<uint16_t, EditableSeries> m_Series;
+};
+
+//////////////////////////////////////////////////////////////////////////////
// Analyzers
class CpuAnalyzer : public Analyzer
@@ -3165,6 +3317,7 @@ BuildTraceModel(const std::filesystem::path& FilePath, WorkerThreadPool& ThreadP
LogAnalyzer LogAn(&Timing);
BookmarksAnalyzer BookmarkAn(&Timing);
CsvProfilerAnalyzer CsvAn(&Timing);
+ CountersAnalyzer CountersAn(&Timing);
AllocationAnalyzer AllocAn(&Timing);
CallstackAnalyzer CallstackAn;
@@ -3182,6 +3335,7 @@ BuildTraceModel(const std::filesystem::path& FilePath, WorkerThreadPool& ThreadP
Dispatch.add_analyzer(LogAn);
Dispatch.add_analyzer(BookmarkAn);
Dispatch.add_analyzer(CsvAn);
+ Dispatch.add_analyzer(CountersAn);
Dispatch.add_analyzer(AllocAn);
Dispatch.add_analyzer(CallstackAn);
@@ -3434,6 +3588,61 @@ BuildTraceModel(const std::filesystem::path& FilePath, WorkerThreadPool& ThreadP
Model.CsvEvents.size());
}
+ // Counters (TRACE_INT_VALUE / TRACE_FLOAT_VALUE)
+ {
+ using CounterDefT = zen::trace_detail::TraceModel::CounterDef;
+ using CounterSeriesT = zen::trace_detail::TraceModel::CounterSeries;
+ using CounterSampleT = zen::trace_detail::TraceModel::CounterSample;
+
+ Model.CounterDefs.reserve(CountersAn.Defs().size());
+ for (const auto& [Id, Def] : CountersAn.Defs())
+ {
+ Model.CounterDefs.push_back(Def);
+ }
+
+ auto& SeriesMap = CountersAn.MutableSeriesMap();
+ Model.CounterTimeSeries.reserve(SeriesMap.size());
+ for (auto& [Id, Editable] : SeriesMap)
+ {
+ // Each counter's samples were appended in stream order. Tourist
+ // guarantees per-thread monotonicity but counters can be set from
+ // any thread, so a final sort by TimeUs is required.
+ eastl::sort(Editable.Series.Samples.begin(),
+ Editable.Series.Samples.end(),
+ [](const CounterSampleT& A, const CounterSampleT& B) { return A.TimeUs < B.TimeUs; });
+ // If the counter never produced a Spec event we still want the
+ // series visible. Synthesize a default def so the viewer has a name.
+ if (CountersAn.Defs().find(Id) == CountersAn.Defs().end())
+ {
+ CounterDefT Synth;
+ Synth.Id = Id;
+ Synth.Type = Editable.Series.Type;
+ Synth.Name = fmt::format("counter_{}", Id);
+ Model.CounterDefs.push_back(std::move(Synth));
+ }
+ Model.CounterTimeSeries.push_back(std::move(Editable.Series));
+ }
+ eastl::sort(Model.CounterDefs.begin(), Model.CounterDefs.end(), [](const CounterDefT& A, const CounterDefT& B) {
+ return A.Id < B.Id;
+ });
+ eastl::sort(Model.CounterTimeSeries.begin(), Model.CounterTimeSeries.end(), [](const CounterSeriesT& A, const CounterSeriesT& B) {
+ return A.Id < B.Id;
+ });
+
+ size_t TotalSamples = 0;
+ for (const CounterSeriesT& S : Model.CounterTimeSeries)
+ {
+ TotalSamples += S.Samples.size();
+ }
+ if (!Model.CounterDefs.empty() || !Model.CounterTimeSeries.empty())
+ {
+ ZEN_INFO("Counters: {} defined, {} with samples ({} samples total)",
+ Model.CounterDefs.size(),
+ Model.CounterTimeSeries.size(),
+ zen::ThousandsNum(TotalSamples));
+ }
+ }
+
// Memory allocation data
{
AllocAn.EmitFinalSample(Model.TraceEndUs);
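
Two details of `CountersAnalyzer` in the diff above are worth illustrating in isolation. First, `CycleToTimeUs` computes `Elapsed * 1'000'000 / Freq` in 64-bit arithmetic, so the intermediate product wraps once `Elapsed` exceeds roughly 1.8e13 cycles, and the `uint32_t` result wraps after about 71.6 minutes of trace time; whether either limit matters for real traces is not stated in the commit. Second, `Min` is guarded by `HasMin` while `Max` is compared against its initial value, which this hunk does not show. The sketch below is a hedged alternative under those observations, with hypothetical names (`CyclesToMicrosSketch`, `RunningRangeSketch`) that do not appear in the source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a cycle counter to microseconds relative to Base.
// Splitting into whole seconds plus a remainder avoids overflowing the
// intermediate product, and returns the same floor value as
// Elapsed * 1'000'000 / Freq would in unbounded arithmetic.
inline uint64_t CyclesToMicrosSketch(uint64_t Cycle, uint64_t Base, uint64_t Freq)
{
    if (Freq == 0)
    {
        return 0;  // mirrors the diff: no timing information means time zero
    }
    const uint64_t Elapsed      = (Cycle >= Base) ? (Cycle - Base) : 0;
    const uint64_t WholeSeconds = Elapsed / Freq;
    const uint64_t Remainder    = Elapsed % Freq;  // always < Freq
    return WholeSeconds * 1'000'000 + Remainder * 1'000'000 / Freq;
}

// Hypothetical min/max tracker that seeds both bounds from the first sample,
// so an all-negative series does not depend on the initial value of Max.
struct RunningRangeSketch
{
    double Min      = 0.0;
    double Max      = 0.0;
    bool   HasValue = false;

    void Add(double Value)
    {
        if (!HasValue)
        {
            Min = Max = Value;
            HasValue  = true;
            return;
        }
        Min = std::min(Min, Value);
        Max = std::max(Max, Value);
    }
};
```

The sketch returns a 64-bit microsecond count; narrowing it to the model's `TimeUs` type would reintroduce the 32-bit limit, so that trade-off stays with the model definition rather than with the conversion.
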