The current Python service can handle 192 RPS on a single node, with about 400 pairs per request, at only about 20% average CPU utilization. The limiting factors were the language, the serving framework, and the network call to the feature store. I chose the Fiber framework for the REST API: it seemed the most welcoming, with good documentation and an Express.js-like API. It took less than an hour.
Sort: