
Commit 5e35604

dipinhora authored and SeanTAllen committed
Fix multi-worker performance related to dynamic scheduler scaling
As part of performance testing Wallaroo using multiple workers, @JONBRWN discovered a regression in both throughput and latency. He tracked the issue down to the commit that re-enabled dynamic scheduler scaling (fc80968). NOTE: This performance issue did not exist for single-worker runs of Wallaroo.

Some head scratching and testing led to the current commit to resolve the multi-worker performance issue. My best guess is that before this change the `steal` loop depended on a memory access, as its initial check, to determine whether dynamic scheduler scaling needed to suspend a thread. This would lead to somewhat erratic behavior where the `steal` loop would sometimes take a long time and other times would not, depending on how long the memory load took. This had a follow-on impact on actor execution for ASIO messages, because they would not be picked up off the queue for work as quickly as they could be due to the extra memory accesses.

This commit changes the ordering of some operations to ensure more consistent memory accesses in the loop, resulting in more consistent actor execution for ASIO messages and resolving the multi-worker performance issue that @JONBRWN discovered.
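The reordered condition in `perhaps_suspend_scheduler` relies on C's short-circuit `&&`: when the first operand is false, the remaining operands are never evaluated. The sketch below illustrates only that language rule. Every name in it (`sketch_sched_t`, `sketch_scheduler`, `sketch_min_count`, `identity_checks`, `is_highest_active`, `may_suspend`) is a hypothetical stand-in, not the libponyrt definition, and it makes no claim about the actual performance effect in the runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical stand-ins for runtime state; not the real libponyrt types.
typedef struct { uint32_t index; bool terminate; } sketch_sched_t;

static sketch_sched_t sketch_scheduler[4];
static uint32_t sketch_min_count = 4;

// Counts how often the pointer comparison is actually evaluated.
static unsigned identity_checks = 0;

static bool is_highest_active(sketch_sched_t* sched, uint32_t active_count)
{
  assert(active_count > 0);
  identity_checks++;
  return sched == &sketch_scheduler[active_count - 1];
}

// Same operand order as the new code: compare the counts first, so the
// identity check is skipped whenever no thread is allowed to suspend.
static bool may_suspend(sketch_sched_t* sched, uint32_t active_count)
{
  return (active_count > sketch_min_count)
    && is_highest_active(sched, active_count)
    && !sched->terminate;
}
```

With `sketch_min_count` equal to the active count, `may_suspend` returns false without ever calling `is_highest_active`; once the minimum is lower than the active count, the identity check runs on every call.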
1 parent f5a43b0 commit 5e35604

1 file changed: +4 -7 lines changed


src/libponyrt/sched/scheduler.c

@@ -543,8 +543,8 @@ static pony_actor_t* perhaps_suspend_scheduler(
   // if we're the highest active scheduler thread
   // and there are more active schedulers than the minimum requested
   // and we're not terminating
-  if ((sched == &scheduler[current_active_scheduler_count - 1])
-    && (current_active_scheduler_count > min_scheduler_count)
+  if ((current_active_scheduler_count > min_scheduler_count)
+    && (sched == &scheduler[current_active_scheduler_count - 1])
     && (!sched->terminate)
 #if defined(USE_SCHEDULER_SCALING_PTHREADS)
     // try to acquire mutex if using pthreads
@@ -677,10 +677,10 @@ static pony_actor_t* steal(scheduler_t* sched)
     // By waiting 1 millisecond before sending a block message, we are going to
     // delay quiescence by a small amount of time but also optimize work
     // stealing for generating far fewer block/unblock messages.
+    uint32_t current_active_scheduler_count = get_active_scheduler_count();
+
     if (!block_sent)
     {
-      uint32_t current_active_scheduler_count = get_active_scheduler_count();
-
       // make sure thread scaling order is still valid. we should never be
       // active if the active_scheduler_count isn't larger than our index.
       pony_assert(current_active_scheduler_count > (uint32_t)sched->index);
@@ -718,9 +718,6 @@ static pony_actor_t* steal(scheduler_t* sched)
       // if we do suspend, we'll send a unblock message first to ensure cnf/ack
       // cycle works as expected

-      // get active scheduler count
-      uint32_t current_active_scheduler_count = get_active_scheduler_count();
-
       // make sure thread scaling order is still valid. we should never be
       // active if the active_scheduler_count isn't larger than our index.
       pony_assert(current_active_scheduler_count > (uint32_t)sched->index);
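The two `steal` hunks move the read of the active scheduler count out of the individual branches, so it happens once per loop iteration regardless of which branch runs. A minimal sketch of that shape follows, assuming the count is a shared atomic read with a relaxed load. That assumption, the branch bodies, and all names (`sketch_active_count`, `sketch_get_active_count`, `sketch_steal_iteration`) are illustrative and not taken from the runtime.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical shared counter standing in for the runtime's active count.
static _Atomic uint32_t sketch_active_count = 4;

static uint32_t sketch_get_active_count(void)
{
  // Assumption: a relaxed atomic load; the real accessor may differ.
  return atomic_load_explicit(&sketch_active_count, memory_order_relaxed);
}

// One iteration of a heavily simplified steal loop. Returns true if this
// thread would be a candidate for suspension.
static bool sketch_steal_iteration(uint32_t my_index, bool block_sent,
  uint32_t min_count)
{
  // Read the count once, before branching, so every path through the
  // iteration performs the same load.
  uint32_t count = sketch_get_active_count();

  // Thread scaling order: an active thread's index is below the count.
  assert(count > my_index);

  if(!block_sent)
    return false; // placeholder for the block-message path

  // Placeholder for the suspend check, cheap comparison first.
  return (count > min_count) && (my_index == count - 1);
}
```

Both branches reuse the single local `count`, mirroring how the diff leaves one declaration of `current_active_scheduler_count` at the top of the loop body instead of one per branch.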
