Pulumi's engineering team describes how they built a custom distributed work scheduling system called the 'background activity system' to power Pulumi Cloud's workflow runners. Rather than using off-the-shelf queues like SQS or RabbitMQ, they built on top of their existing database to support self-hosted and air-gapped ...

13m read time
From pulumi.com
Table of contents
Where we started
Why not use an off-the-shelf queue?
Design constraints
The background activity
Leases: distributed execution without coordination
Routing work to the right runner pool
Dependencies and multi-step workflows
Two execution modes, one interface
Putting it all together
Retries and scheduling
Observability
What we learned
Wrapping it up
