Railway's engineering team details how it automates bare metal server provisioning at scale. The process starts by using Redfish APIs to scrape hardware details from each server's BMC and build stable device identifiers for NICs and NVMe drives. A Temporal workflow orchestrates the entire import process, matching hardware against known configurations that are exposed to Ansible via a custom plugin. OS installation is automated via PXE/Pixiecore with preseed files tailored per machine, and Claude monitors install progress by interpreting server screen images captured through Supermicro's CaptureScreen Redfish API. Networking uses BGP unnumbered with FRR to create a uniform, scalable Clos topology that requires no per-rack reconfiguration as the cluster grows.
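
As a rough illustration of the "stable device identifier" idea, here is a minimal Python sketch. It is not Railway's actual scheme: the function name `stable_device_id` and the hashing approach are hypothetical, and it assumes the inventory has already been fetched as dictionaries carrying the Redfish-style properties `PermanentMACAddress` (for NICs) and `Model` / `SerialNumber` (for drives). The point is only that the identifier is derived from attributes burned into the hardware rather than from OS-assigned names such as `eth0` or `nvme0n1`, which can change between boots.

```python
import hashlib


def stable_device_id(kind: str, attrs: dict) -> str:
    """Derive an identifier from hardware-intrinsic attributes (hypothetical scheme).

    kind  -- "nic" or "nvme"
    attrs -- a dict assumed to hold Redfish-style properties for the device
    """
    if kind == "nic":
        # Normalise the MAC so formatting differences do not change the ID.
        key = attrs["PermanentMACAddress"].lower().replace("-", ":")
    elif kind == "nvme":
        key = f'{attrs["Model"].strip()}|{attrs["SerialNumber"].strip()}'
    else:
        raise ValueError(f"unsupported device kind: {kind}")
    # A short hash keeps the ID compact while staying deterministic.
    digest = hashlib.sha256(f"{kind}|{key}".encode()).hexdigest()[:12]
    return f"{kind}-{digest}"


# Example inputs are made up for illustration.
nic_id = stable_device_id("nic", {"PermanentMACAddress": "3C-EC-EF-00-11-22"})
drive_id = stable_device_id("nvme", {"Model": "ExampleDrive", "SerialNumber": "SN123"})
```

Because the identifier depends only on the normalised MAC address or the model and serial number, re-scraping the same machine yields the same IDs, which is what lets later steps match hardware against known configurations.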

10m read time. From blog.railway.com
Table of contents
Sorting your LEGO pieces
Who needs webhooks when we’ve got Claude
Low Config Networking with BGP Unnumbered
Building Software to Run Hardware to Run Software
