| Commit message | Author | Age | Files | Lines |
|
Retracted is an explicit, instigator-initiated request to pull an action
back and reschedule it on a different runner (e.g. capacity opened up
elsewhere). Unlike Failed/Abandoned auto-retry, rescheduling from
Retracted does not increment RetryCount since nothing went wrong.
- Add Retracted enum value after Cancelled with static_assert guarding
ordinal placement so runner-side transitions cannot override it
- Implement idempotent RetractAction() CAS method on RunnerAction
- Extend ResetActionStateToPending() to accept Retracted without
incrementing RetryCount
- Add RetractAction() to ComputeServiceSession with pending/running
map lookup and runner cancellation for running actions
- Handle Retracted in HandleActionUpdates() scheduler loop (remove
from active maps, reset to Pending, no history/results entry)
- Add POST jobs/{lsn}/retract and queues/{queueref}/jobs/{lsn}/retract
HTTP endpoints
- Bump ActionHistoryEntry::Timestamps array from [8] to [9]
- Add state machine documentation and per-state comments to RunnerAction
- Add tests: retract_pending, retract_not_terminal, retract_http
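The retract semantics described above can be sketched roughly as follows. This is a minimal illustration, not the actual RunnerAction code: the enum members other than Cancelled and Retracted, their order, the forward-only rule in SetActionState(), and the type name RunnerActionSketch are all assumptions made for the example. Only the placement of Retracted after Cancelled, the idempotent CAS, and the RetryCount behaviour come from the commit message.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical state list; only "Retracted comes after Cancelled" is from the commit.
enum class ActionState : uint8_t
{
    New, Pending, Submitting, Running, Completed, Failed, Abandoned, Cancelled, Retracted
};

// Guard the ordinal placement so forward-only transitions can never leave Retracted.
static_assert(ActionState::Retracted > ActionState::Cancelled,
              "Retracted must be ordered after Cancelled");

struct RunnerActionSketch
{
    std::atomic<ActionState> State{ActionState::Pending};
    std::atomic<int>         RetryCount{0};

    // Assumed runner-side rule: a transition only succeeds if it moves forward.
    bool SetActionState(ActionState NewState)
    {
        ActionState Current = State.load();
        while (Current < NewState)
        {
            if (State.compare_exchange_weak(Current, NewState))
                return true;
        }
        return false;
    }

    // Idempotent: retracting an already-retracted action reports success;
    // actions that already reached a terminal state are left alone.
    bool RetractAction()
    {
        ActionState Current = State.load();
        for (;;)
        {
            if (Current == ActionState::Retracted)
                return true;
            if (Current != ActionState::Pending && Current != ActionState::Submitting &&
                Current != ActionState::Running)
                return false;
            if (State.compare_exchange_weak(Current, ActionState::Retracted))
                return true;
        }
    }

    // Failed/Abandoned count as a retry; Retracted does not, since nothing went wrong.
    bool ResetActionStateToPending()
    {
        ActionState Current = State.load();
        for (;;)
        {
            const bool IsRetry = Current == ActionState::Failed || Current == ActionState::Abandoned;
            if (!IsRetry && Current != ActionState::Retracted)
                return false;
            if (State.compare_exchange_weak(Current, ActionState::Pending))
            {
                if (IsRetry)
                    RetryCount.fetch_add(1);
                return true;
            }
        }
    }
};
```

In this sketch a late runner-side update such as SetActionState(Completed) is rejected once the action is Retracted, because Retracted has the highest ordinal.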
|
NotifyQueueActionComplete (which decrements ActiveCount) was called after
releasing m_ResultsLock. GetActionResult acquires m_ResultsLock to consume
the result, so a caller could observe ActiveCount still at 1 immediately
after GetActionResult returned OK if the scheduler thread was preempted
between releasing m_ResultsLock and reaching NotifyQueueActionComplete.
Fix by calling NotifyQueueActionComplete before the m_ResultsLock block
that publishes the result into m_ResultsMap. This guarantees that by the
time GetActionResult can return OK, the queue counters are already updated.
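The ordering fix can be illustrated with a small sketch. The type ResultsSketch, the int LSN key, and the string result payload are placeholders chosen for the example; only the member names ActiveCount, m_ResultsLock, m_ResultsMap, NotifyQueueActionComplete and GetActionResult echo the commit message, and their real signatures are not known here.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

struct ResultsSketch
{
    std::atomic<int>                     ActiveCount{0};
    std::mutex                           m_ResultsLock;
    std::unordered_map<int, std::string> m_ResultsMap;

    void NotifyQueueActionComplete() { ActiveCount.fetch_sub(1); }

    // Scheduler side: update the queue counters BEFORE the result becomes visible.
    void PublishResult(int Lsn, std::string Result)
    {
        NotifyQueueActionComplete();
        std::lock_guard<std::mutex> Guard(m_ResultsLock);
        m_ResultsMap.emplace(Lsn, std::move(Result));
    }

    // Client side: consumes the result if present. Returns false while the
    // action is still in flight (the HTTP layer would answer 202 in that case).
    bool GetActionResult(int Lsn, std::string& OutResult)
    {
        std::lock_guard<std::mutex> Guard(m_ResultsLock);
        auto It = m_ResultsMap.find(Lsn);
        if (It == m_ResultsMap.end())
            return false;
        OutResult = std::move(It->second);
        m_ResultsMap.erase(It);
        return true;
    }
};
```

Because the decrement happens before the lock-protected insert, any thread that finds the result in the map is guaranteed to see the updated counter as well.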
|
AbandonAllActions() scans m_PendingActions to mark actions as Abandoned,
but EnqueueAction posts actions to m_UpdatedActions first — the scheduler
inserts them into m_PendingActions on its next tick. If the session
transitions to Abandoned in that window, AbandonAllActions() sees an empty
m_PendingActions and the actions are later inserted as Pending with no one
left to abandon them, causing GetActionResult to return 202 indefinitely.
Fix: in HandleActionUpdates, when processing a Pending-state action, check
if the session is already Abandoned and if so call SetActionState(Abandoned)
immediately rather than inserting into the pending map. SetActionState calls
PostUpdate internally, so the action re-enters m_UpdatedActions as Abandoned
and flows into m_ResultsMap on the next scheduler pass.
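A single-threaded sketch of the race and the fix is shown below. It deliberately omits locking and uses simplified, hypothetical types (AbandonSketch, SketchAction, SketchState, an int LSN); the container and function names mirror the commit message, but their real definitions are assumptions.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <memory>

enum class SketchState { Pending, Abandoned };

struct SketchAction
{
    explicit SketchAction(int InLsn) : Lsn(InLsn) {}
    int         Lsn;
    SketchState State = SketchState::Pending;
};

struct AbandonSketch
{
    using ActionPtr = std::shared_ptr<SketchAction>;

    bool                       SessionAbandoned = false;
    std::deque<ActionPtr>      m_UpdatedActions;
    std::map<int, ActionPtr>   m_PendingActions;
    std::map<int, SketchState> m_ResultsMap;

    void PostUpdate(const ActionPtr& Action) { m_UpdatedActions.push_back(Action); }
    void EnqueueAction(int Lsn) { PostUpdate(std::make_shared<SketchAction>(Lsn)); }

    // Setting a state re-posts the action so the scheduler sees it again.
    void SetActionState(const ActionPtr& Action, SketchState NewState)
    {
        Action->State = NewState;
        PostUpdate(Action);
    }

    // Only sees actions the scheduler has already moved into m_PendingActions.
    void AbandonAllActions()
    {
        SessionAbandoned = true;
        for (auto& Entry : m_PendingActions)
            SetActionState(Entry.second, SketchState::Abandoned);
        m_PendingActions.clear();
    }

    // One scheduler tick.
    void HandleActionUpdates()
    {
        std::deque<ActionPtr> Batch;
        Batch.swap(m_UpdatedActions);
        for (const ActionPtr& Action : Batch)
        {
            if (Action->State == SketchState::Pending)
            {
                if (SessionAbandoned)
                    SetActionState(Action, SketchState::Abandoned); // the fix
                else
                    m_PendingActions[Action->Lsn] = Action;
            }
            else
            {
                m_ResultsMap[Action->Lsn] = Action->State;
            }
        }
    }
};
```

Calling EnqueueAction(), then AbandonAllActions(), then HandleActionUpdates() twice leaves the action in m_ResultsMap as Abandoned instead of stranded in m_PendingActions.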
|
Log hostname, platform, CPU count and total memory alongside the
worker URI so operators can identify machines at a glance.
|
- Consolidate duplicated action submission logic in httpcomputeservice into a single HandleSubmitAction method supporting both single-action and batch (actions array) payloads
- Group actions by queue in RemoteHttpRunner and submit as batches with configurable chunk size, falling back to individual submission on failure
- Extract shared helpers in computeservice: MakeErrorResult, ValidateQueueForEnqueue, ActivateActionInQueue, RemoveActionFromActiveMaps
- Add WriteCompactBinaryObject to zencore
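The grouping and fallback behaviour described for RemoteHttpRunner might look roughly like the sketch below. Everything here is hypothetical: PendingActionSketch, SubmitGrouped and the two callback types stand in for the real HTTP submission calls, which are not shown in the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

struct PendingActionSketch
{
    int QueueId;
    int Lsn;
};

// Placeholders for "POST a batch to a queue" and "POST a single action".
using BatchSubmitFn  = std::function<bool(int QueueId, const std::vector<int>& Lsns)>;
using SingleSubmitFn = std::function<bool(int QueueId, int Lsn)>;

// Groups actions by queue, submits each group in chunks of at most ChunkSize,
// and falls back to one-by-one submission for any chunk whose batch call fails.
// Returns the number of actions that were accepted.
std::size_t SubmitGrouped(const std::vector<PendingActionSketch>& Actions,
                          std::size_t                             ChunkSize,
                          const BatchSubmitFn&                    SubmitBatch,
                          const SingleSubmitFn&                   SubmitOne)
{
    if (ChunkSize == 0)
        ChunkSize = 1;

    std::map<int, std::vector<int>> ByQueue;
    for (const PendingActionSketch& Action : Actions)
        ByQueue[Action.QueueId].push_back(Action.Lsn);

    std::size_t Accepted = 0;
    for (const auto& Entry : ByQueue)
    {
        const std::vector<int>& Lsns = Entry.second;
        for (std::size_t Begin = 0; Begin < Lsns.size(); Begin += ChunkSize)
        {
            const std::size_t End = std::min(Begin + ChunkSize, Lsns.size());
            const std::vector<int> Chunk(Lsns.begin() + static_cast<std::ptrdiff_t>(Begin),
                                         Lsns.begin() + static_cast<std::ptrdiff_t>(End));
            if (SubmitBatch(Entry.first, Chunk))
            {
                Accepted += Chunk.size();
                continue;
            }
            for (int Lsn : Chunk) // batch rejected: retry each action individually
            {
                if (SubmitOne(Entry.first, Lsn))
                    ++Accepted;
            }
        }
    }
    return Accepted;
}
```

Keeping the fallback per chunk rather than per queue limits the cost of one failed batch to at most ChunkSize individual requests.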
|
- Added local process runners for Linux/Wine and Mac, with some sandboxing support
- Horde & Nomad provisioning for development and testing
- Client session queues with lifecycle management (active/draining/cancelled), automatic retry with configurable limits, and manual reschedule API
- Improved web UI for orchestrator, compute, and hub dashboards with WebSocket push updates
- Some security hardening
- Improved scalability and `zen exec` command
Still experimental - compute support is disabled by default
|