When building a chatbot on top of OpenAI, a single user can exhaust your organization's rate limits across both requests per minute (RPM) and tokens per minute (TPM), effectively causing a denial of service for all other users. A practical solution is to track per-user RPM and TPM in Redis using its increment and expire operations.
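The pattern above can be sketched as a fixed-window counter: each user gets a key that is incremented on every request, with an expiry set on the first hit of the window. Here is a minimal Python sketch; the `InMemoryStore` and `PerUserRateLimiter` names are hypothetical, and the in-memory store is a stand-in for a real Redis client (redis-py exposes the same `incr`/`expire` methods, so the limiter works unchanged against `redis.Redis`).

```python
import time


class InMemoryStore:
    """Stand-in with Redis-like INCR/EXPIRE semantics, for illustration only.
    A real deployment would pass a redis.Redis client instead."""

    def __init__(self):
        self._data = {}  # key -> [count, expires_at or None]

    def incr(self, key, amount=1):
        now = time.monotonic()
        entry = self._data.get(key)
        # Treat an expired key as absent, like Redis does.
        if entry is None or (entry[1] is not None and entry[1] <= now):
            entry = [0, None]
            self._data[key] = entry
        entry[0] += amount
        return entry[0]

    def expire(self, key, seconds):
        if key in self._data:
            self._data[key][1] = time.monotonic() + seconds


class PerUserRateLimiter:
    """Fixed-window limiter: one counter per user per window."""

    def __init__(self, store, limit, window_seconds=60):
        self.store = store
        self.limit = limit
        self.window = window_seconds

    def allow(self, user_id, amount=1):
        key = f"rate:{user_id}"
        count = self.store.incr(key, amount)
        if count == amount:
            # First hit in this window: start the countdown.
            self.store.expire(key, self.window)
        return count <= self.limit
```

For RPM, call `allow(user_id)` once per request; for TPM, the same limiter can count tokens by passing the request's token count as `amount` (Redis supports this via `INCRBY`), with a separate key per metric.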

6 min read · From thoughtbot.com
Table of contents
Our base
Rate limit messages
Limit token usage
Calculating per-user rate limits
Additional considerations
Wrapping up
