Rate Limits
Rate limits restrict how often you can call our API within a given time period. A request that exceeds the limit receives an HTTP 429 (Too Many Requests) response.
The following rate limits apply to our serverless deployments. If you require higher limits, consider using a dedicated deployment or contacting support@predibase.com.
Rate Limits by Tier
Tier | Rate Limit |
---|---|
Free | 1 request / sec |
Paid (Developer or Enterprise) | 100 requests / sec |
VPC | Unlimited |
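To stay under a per-second limit, a client can pace its own requests instead of waiting for a 429. The sketch below is one possible approach, not part of any official SDK: `Throttle` is a hypothetical helper that spaces calls at least `1 / requests_per_second` seconds apart. The `clock` and `sleep` parameters are assumptions added only to make the helper easy to test.

```python
import time


class Throttle:
    """Hypothetical client-side pacer: spaces calls at least 1/rate seconds apart."""

    def __init__(self, requests_per_second, clock=time.monotonic, sleep=time.sleep):
        if requests_per_second <= 0:
            raise ValueError("requests_per_second must be positive")
        self.min_interval = 1.0 / requests_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next request may be sent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        # Reserve the next slot one interval after this call starts.
        self._next_allowed = now + self.min_interval
        return delay
```

For example, on the Free tier you might create `throttle = Throttle(1)` and call `throttle.wait()` immediately before each request. This only paces a single process; several clients sharing one tenant can still exceed the limit together.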
Daily Rate Limit
Each tenant is limited to 1 million tokens per day.
Rate Limits in Headers
Header | Description |
---|---|
x-envoy-ratelimited | Whether the rate limit has been reached |
x-ratelimit-limit | The maximum number of requests allowed before the rate limit is reached |
x-ratelimit-remaining | The number of requests remaining before the rate limit is reached |
x-ratelimit-reset | Amount of time (seconds) until you can query again |
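A client can use these headers to decide how long to back off after a 429. The following is a minimal sketch under stated assumptions, not an official client: `retry_delay` and `call_with_retry` are hypothetical names, `send` stands in for whatever function performs your HTTP request and returns `(status_code, headers, body)`, and the value of `x-ratelimit-reset` is assumed to parse as a number of seconds.

```python
import time


def retry_delay(status_code, headers, default_delay=1.0):
    """Return seconds to wait before retrying, or None if not rate limited."""
    if status_code != 429:
        return None
    # HTTP header names are case-insensitive; normalize before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    try:
        return max(0.0, float(lowered["x-ratelimit-reset"]))
    except (KeyError, TypeError, ValueError):
        # Header missing or unparseable: fall back to an assumed default.
        return default_delay


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it is not rate limited or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers, body).
    Returns the (status_code, body) of the last attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        delay = retry_delay(status, headers)
        if delay is None or attempt == max_attempts - 1:
            return status, body
        sleep(delay)
```

If every attempt is rate limited, the caller gets the final 429 back and can decide whether to queue the work or surface the error.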