The current Python service with single node can do 192 RPS, about 400 pairs each. Only about 20% average CPU utilization. The limiting factor now was the language, the serving framework and the network call to feature store. I chose Fiber framework for REST API, it seemed most welcoming, good documentation, expressjs like API. Took less than an hour.

5m read timeFrom ai.ragv.in
Post cover image
Table of contents
Why Golang? #Porting the existing MAB to Golang #Results #

Sort: