author Stefan Boberg <[email protected]> 2026-03-17 11:31:59 +0100
committer Stefan Boberg <[email protected]> 2026-03-17 11:31:59 +0100
commit 642c09cdae77c1d4c4f46b1145a7156c878ef2d6
tree 8b1278ab72f0377fe921628ae209c6b8786910ad
parent Clarify StripAnsiSgrSequences comment to note CSI non-SGR sequences are not s...
Fix queue ActiveCount race in HandleActionUpdates terminal path

NotifyQueueActionComplete (which decrements ActiveCount) was called after releasing m_ResultsLock. GetActionResult acquires m_ResultsLock to consume the result, so a caller could observe ActiveCount still at 1 immediately after GetActionResult returned OK if the scheduler thread was preempted between releasing m_ResultsLock and reaching NotifyQueueActionComplete.

Fix by calling NotifyQueueActionComplete before the m_ResultsLock block that publishes the result into m_ResultsMap. This guarantees that by the time GetActionResult can return OK, the queue counters are already updated.
-rw-r--r-- src/zencompute/computeservice.cpp 8
1 files changed, 7 insertions, 1 deletions
diff --git a/src/zencompute/computeservice.cpp b/src/zencompute/computeservice.cpp
index 2e49c1114..43031955a 100644
--- a/src/zencompute/computeservice.cpp
+++ b/src/zencompute/computeservice.cpp
@@ -1896,6 +1896,13 @@ ComputeServiceSession::Impl::HandleActionUpdates()
RemoveActionFromActiveMaps(ActionLsn);
+ // Update queue counters BEFORE publishing the result into
+ // m_ResultsMap. GetActionResult erases from m_ResultsMap
+ // under m_ResultsLock, so if we updated counters after
+ // releasing that lock, a caller could observe ActiveCount
+ // still at 1 immediately after GetActionResult returned OK.
+ NotifyQueueActionComplete(Action->QueueId, ActionLsn, TerminalState);
+
m_ResultsLock.WithExclusiveLock([&] {
m_ResultsMap[ActionLsn] = Action;
@@ -1927,7 +1934,6 @@ ComputeServiceSession::Impl::HandleActionUpdates()
Action->ActionId,
ActionLsn,
TerminalState == RunnerAction::State::Completed ? "SUCCESS" : "FAILURE");
- NotifyQueueActionComplete(Action->QueueId, ActionLsn, TerminalState);
break;
}
}
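
The ordering argument in the commit message can be sketched with a small standalone model. This is not the zencompute implementation: MiniSession, Enqueue, CompleteAction and ObserveActiveCountAfterResult are hypothetical names, std::mutex stands in for m_ResultsLock, and a plain atomic decrement stands in for NotifyQueueActionComplete. It only illustrates why updating the counter before publishing under the lock means a consumer that sees the result also sees the updated counter.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>
#include <optional>
#include <thread>

// Hypothetical stand-in for the compute session; all names are illustrative.
class MiniSession
{
public:
    // Registers one in-flight action.
    void Enqueue() { m_ActiveCount.fetch_add(1); }

    // Scheduler side: update the counter first, then publish under the lock.
    void CompleteAction(int Lsn, int Result)
    {
        m_ActiveCount.fetch_sub(1);  // plays the role of NotifyQueueActionComplete
        std::lock_guard<std::mutex> Guard(m_ResultsLock);
        m_ResultsMap[Lsn] = Result;  // result becomes visible only here
    }

    // Caller side: consume (find and erase) the result under the same lock.
    std::optional<int> GetActionResult(int Lsn)
    {
        std::lock_guard<std::mutex> Guard(m_ResultsLock);
        auto It = m_ResultsMap.find(Lsn);
        if (It == m_ResultsMap.end())
        {
            return std::nullopt;
        }
        int Result = It->second;
        m_ResultsMap.erase(It);
        return Result;
    }

    int ActiveCount() const { return m_ActiveCount.load(); }

private:
    std::mutex         m_ResultsLock;
    std::map<int, int> m_ResultsMap;
    std::atomic<int>   m_ActiveCount{0};
};

// Returns the ActiveCount a caller observes right after the result is consumed.
int ObserveActiveCountAfterResult()
{
    MiniSession Session;
    Session.Enqueue();
    std::thread Scheduler([&] { Session.CompleteAction(1, 42); });
    while (!Session.GetActionResult(1))
    {
        std::this_thread::yield();  // poll until the result is published
    }
    int Observed = Session.ActiveCount();
    Scheduler.join();
    return Observed;
}
```

In this model the decrement is sequenced before the scheduler takes the lock, and the caller can only find the result after acquiring the same lock, so the observed count is 0. Moving the fetch_sub to after the lock_guard scope would reintroduce the window described above, where the caller may still read 1.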