If your bot is encountering rate limit errors, don't worry: this is a common challenge that can be resolved with the right approach. This article explains Discord's rate-limiting system and offers practical solutions.

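What this article covers:

Understanding Discord's Rate Limit Types
How to Identify Your Rate Limit Issue
Best Practices for Handling Rate Limits
Gateway Considerations and Sharding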

Understanding Discord's Rate Limit Types

Discord uses multiple types of rate limiting to protect the API. Identifying which type you're encountering is crucial for finding the right solution:

Global Rate Limits

Limit: 50 requests per second across most endpoints

Scope: Applies to your entire application

Identification: Look for X-RateLimit-Scope: global in response headers

Per-Route Rate Limits

Limit: Varies by endpoint

Scope: Specific to individual API routes

Identification: Look for X-RateLimit-Scope: user in response headers

Resource-Specific Rate Limits

Resource-specific rate limits can be reached by multiple sources (other users, bots, webhooks, etc.) and may not indicate that your application is solely responsible.

Limit: Independent limits for specific guilds, channels, or webhooks

Scope: Applies to actions on specific resources

Identification: Look for X-RateLimit-Scope: shared in headers

Invalid Request Limits

Limit: 10,000 invalid requests per 10 minutes

Common Cause: Unhandled errors (401, 403, or 429) causing request spikes. Note that 429 responses returned with X-RateLimit-Scope: shared do not count towards your invalid request limit.

Result: Temporary Cloudflare ban
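To stay clear of this limit, it can help to count your own invalid responses. The following is a minimal sketch, not an official mechanism: the InvalidRequestTracker class and its method names are hypothetical, and the 80% warning threshold is an arbitrary assumption. It counts 401, 403, and 429 responses over a sliding 10-minute window and skips 429s whose scope is shared.

```python
import time
from collections import deque
from typing import Callable, Deque, Optional

INVALID_STATUSES = {401, 403, 429}
WINDOW_SECONDS = 600       # 10 minutes
INVALID_LIMIT = 10_000     # invalid requests allowed per window


class InvalidRequestTracker:
    """Hypothetical helper that counts invalid responses in a sliding window."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._timestamps: Deque[float] = deque()

    def record(self, status: int, scope: Optional[str] = None) -> None:
        """Record one response; only invalid ones are counted."""
        if status not in INVALID_STATUSES:
            return
        # 429s with X-RateLimit-Scope: shared are not counted.
        if status == 429 and scope == "shared":
            return
        self._timestamps.append(self._clock())

    def count(self) -> int:
        """Return the number of invalid responses in the last 10 minutes."""
        cutoff = self._clock() - WINDOW_SECONDS
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)

    def near_limit(self, threshold: float = 0.8) -> bool:
        """True once the count reaches the given fraction of the limit."""
        return self.count() >= INVALID_LIMIT * threshold
```

A bot could call record() after every response and pause or alert when near_limit() returns True, rather than waiting for a ban.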

How to Identify Your Rate Limit Issue

The most reliable way to determine which limit you're hitting is by examining the HTTP response headers when you receive a 429 status code. Key headers to check:

X-RateLimit-Limit: The rate limit ceiling for that endpoint

X-RateLimit-Remaining: Number of requests remaining in the current window

X-RateLimit-Reset: When the rate limit window resets (Unix timestamp)

X-RateLimit-Reset-After: Seconds until the limit resets

X-RateLimit-Scope: Indicates the type of rate limit (global, user, or shared)

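retry_after: Seconds to wait before making another request (may include a fractional part). This is a field in the JSON body of the 429 response rather than a header; the Retry-After header carries the same wait time.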
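As an illustration of reading these values, here is a small sketch that pulls the X-RateLimit-* headers out of a response and maps the scope to the limit types described earlier. The RateLimitInfo class and both function names are hypothetical, and the headers are assumed to arrive as a plain string-to-string mapping, which is how most HTTP clients expose them.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: Optional[int]
    remaining: Optional[int]
    reset_after: Optional[float]
    scope: Optional[str]


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    """Extract the rate limit headers from a response's header mapping."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name: str) -> Optional[int]:
        value = lowered.get(name)
        return int(value) if value is not None else None

    def as_float(name: str) -> Optional[float]:
        value = lowered.get(name)
        return float(value) if value is not None else None

    return RateLimitInfo(
        limit=as_int("x-ratelimit-limit"),
        remaining=as_int("x-ratelimit-remaining"),
        reset_after=as_float("x-ratelimit-reset-after"),
        scope=lowered.get("x-ratelimit-scope"),
    )


def describe_scope(scope: Optional[str]) -> str:
    """Map an X-RateLimit-Scope value to the limit type it indicates."""
    return {
        "global": "global limit: applies to your entire application",
        "user": "per-route limit: specific to an individual API route",
        "shared": "resource-specific limit: may be caused by other sources",
    }.get(scope, "unknown scope")
```

Logging describe_scope(info.scope) alongside each 429 makes it much easier to tell later which kind of limit a bot keeps hitting.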

Best Practices for Handling Rate Limits

Implement Proper Backoff Strategies

Always respect the retry_after value in rate limit responses. This tells you exactly how long to wait before retrying.
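A retry loop along these lines might look like the sketch below. It is deliberately independent of any HTTP library: send is a hypothetical callable you supply that performs one request and returns a (status_code, wait_seconds) pair, where wait_seconds is the retry_after value already converted to seconds. The max_attempts default is an arbitrary assumption.

```python
import time
from typing import Callable, Tuple

# Hypothetical response shape for this sketch: (status_code, wait_seconds).
Response = Tuple[int, float]


def send_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and, on a 429, wait the advertised time before retrying."""
    response = send()
    for _ in range(max_attempts - 1):
        status, wait_seconds = response
        if status != 429:
            return response
        # Wait exactly as long as the rate limit response asked for.
        sleep(wait_seconds)
        response = send()
    # Either a non-429 response or the last 429 after exhausting attempts.
    return response
```

Capping the number of attempts matters: retrying forever on 429s is one of the ways a bot can run into the invalid request limit.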

Consider Using Interactions Where Possible

Application commands and message components are an excellent alternative to prefix commands and can reduce excessive API requests and messages in channels.

Bonus tip: Make Interaction Responses and follow-up messages ephemeral since they do not count towards the rate limits.

Cache Data Effectively

Reduce API calls by caching frequently accessed data, like:

Guild information
Channel details
User profiles
Role data
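One simple way to do this is a time-to-live (TTL) cache placed in front of your API calls. The sketch below is illustrative only: the TTLCache class is hypothetical, fetch stands in for whatever function actually calls the API, and a suitable TTL depends on how stale your bot can tolerate the data being.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Hypothetical cache that re-fetches a value only after it expires."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (expiry time, cached value).
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        """Return the cached value, calling fetch() only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()
        self._entries[key] = (now + self._ttl, value)
        return value
```

For example, cache.get_or_fetch(("guild", guild_id), fetch_guild) would call the API at most once per TTL window for each guild. Many Discord libraries already keep a cache populated from Gateway events, so check what yours provides before adding another layer.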

Use Request Throttling

Throttling is a proactive approach to preventing rate limits by controlling the pace of your requests before hitting the limit.

For example, if your bot needs to send welcome messages to 200 new members, instead of sending all 200 messages immediately, place them in a queue that releases 4 requests every 100 milliseconds. This maintains a steady rate of 40 requests per second, staying safely below the limit of 50 requests per second while ensuring all messages are sent in about 5 seconds.
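The queue described in this example can be sketched as follows. This is a simplified, synchronous illustration: drain_throttled is a hypothetical name, each job is assumed to be a zero-argument callable that sends one request, and the time each job itself takes is ignored, so the real rate would be at or below the configured one.

```python
import time
from collections import deque
from typing import Any, Callable, Iterable, List


def drain_throttled(
    jobs: Iterable[Callable[[], Any]],
    batch_size: int = 4,
    interval_seconds: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """Run queued jobs in batches, pausing between batches."""
    queue = deque(jobs)
    results: List[Any] = []
    while queue:
        # Release up to batch_size jobs from the front of the queue.
        for _ in range(min(batch_size, len(queue))):
            job = queue.popleft()
            results.append(job())
        # Pause only if more work remains.
        if queue:
            sleep(interval_seconds)
    return results
```

With the defaults, 200 jobs run as 50 batches of 4 separated by 49 pauses of 100 milliseconds, which matches the roughly 5 seconds quoted above. Per-route limits still apply on top of this pacing, so a throttle like this complements, rather than replaces, handling 429 responses.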

Global Rate Limits

If you're hitting global rate limits, your program may have an underlying issue that needs to be addressed.

Here's how to optimize your code to stay within limits:

Implement proper caching
Migrate to interaction-based features

If these solutions don't resolve your global rate limit issues, we encourage you to ask in the #api-help channel of the Discord Developer Server or to contact Developer Support.

Gateway Considerations and Sharding

For bots handling real-time events through Discord's Gateway (a WebSocket connection), sharding becomes essential as your bot grows.

What is Sharding?

Sharding splits your bot into multiple instances, each handling a subset of guilds. This distributes the load across multiple WebSocket connections, helping you stay within rate limits.

It's recommended to start planning for sharding implementation when approaching 2,000 guilds, as sharding must be enabled at 2,500+ guilds. For optimal performance, follow the best practice of maintaining approximately 1 shard per 1,000 guilds.
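To make the numbers above concrete, the sketch below estimates a shard count from the guideline of roughly 1 shard per 1,000 guilds and shows how a guild maps to a shard. The function names are hypothetical. The routing formula, (guild_id >> 22) % num_shards, comes from Discord's Gateway documentation rather than from this article, and in practice many libraries compute both values for you.

```python
import math


def recommended_shard_count(guild_count: int, guilds_per_shard: int = 1000) -> int:
    """Estimate a shard count from the ~1 shard per 1,000 guilds guideline."""
    return max(1, math.ceil(guild_count / guilds_per_shard))


def shard_id_for_guild(guild_id: int, num_shards: int) -> int:
    """Return the shard that receives events for a given guild ID."""
    # Formula from Discord's Gateway documentation.
    return (guild_id >> 22) % num_shards
```

For instance, recommended_shard_count(2500) gives 3, so a bot at the point where sharding becomes mandatory would run about three shards under this guideline.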

Remember, rate limits exist to ensure a stable experience for all Discord users. By following these best practices, you can build a bot that scales effectively while respecting these limits.
