Rate Limits

Rate limits are restrictions that our API enforces on how often users can access our services within a given time period. When a request exceeds a rate limit, the API responds with an HTTP 429 status code.

The following rate limits apply to our serverless deployments. If you require higher limits, consider using a dedicated deployment or contacting support@predibase.com.

Rate Limits by Tier

Tier | Rate Limit
Free | 1 request / sec
Paid (Developer or Enterprise) | 100 requests / sec
VPC | Unlimited
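
One way to stay under a per-second limit is to pace requests on the client side. The following is a minimal sketch, not part of any official SDK: `RequestPacer` is a hypothetical helper name, and the `clock` and `sleep` parameters exist only so the pacing logic can be tested without real delays.

```python
import time


class RequestPacer:
    """Hypothetical helper: spaces calls at least 1/max_per_sec seconds apart."""

    def __init__(self, max_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_sec
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next request may be sent

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following slot one interval after this request's slot.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

Calling `pacer.wait()` immediately before each API request, with `RequestPacer(1)` for the Free tier or `RequestPacer(100)` for a paid tier, keeps a single client at or below the listed rate. It does not coordinate across multiple processes or machines that share a tenant.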

Daily Rate Limit

Each tenant is limited to 1 million tokens per day.

Rate Limits in Headers

Header | Description
x-envoy-ratelimited | Whether the rate limit has been reached
x-ratelimit-limit | The maximum number of requests allowed before the rate limit is reached
x-ratelimit-remaining | The number of requests remaining before the rate limit is reached
x-ratelimit-reset | The amount of time (in seconds) until you can query again
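
The headers above can be used to decide how long to back off after a rejected request. The sketch below is illustrative only: `retry_delay_seconds` is a hypothetical function name, the one-second fallback is an arbitrary choice, and the code assumes `x-ratelimit-reset` carries a plain number of seconds as described in the table.

```python
def retry_delay_seconds(status_code, headers, default=1.0):
    """Return seconds to wait before retrying, or None if not rate limited.

    `headers` is any mapping of response header names to string values.
    """
    if status_code != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(normalized["x-ratelimit-reset"]))
    except (KeyError, ValueError):
        # Header missing or not numeric: fall back to an assumed default.
        return default
```

With most HTTP clients, the status code and headers of a response can be passed in directly, and the caller can sleep for the returned number of seconds before sending the request again.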