Gateway and sharding
A Shard is one WebSocket connection; the ShardManager coordinates the shards in a process. Below roughly 2,500 guilds you can rely on the defaults.
Startup sequence
- connect() reads the recommended shard count and session limits from Discord.
- With maxShards: 'auto' that count is used; shardConcurrency: 'auto' uses Discord's max_concurrency.
- Each shard opens its socket, gets HELLO, sends IDENTIFY, then receives READY and a burst of GUILD_CREATE.
- After guildCreateTimeout, the shard fires shardReady. When all shards are ready, the client emits ready.
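The first two steps can be sketched as a small pure function. This is an illustration rather than Athena's implementation: resolveShardPlan is a hypothetical helper, and its gatewayBot argument assumes the response shape Discord documents for GET /gateway/bot (a shards count plus session_start_limit). The numbers in the example are made up.

```javascript
// Hypothetical helper: resolves 'auto' options against Discord's recommendation.
// `gatewayBot` assumes the documented GET /gateway/bot response shape.
function resolveShardPlan(options, gatewayBot) {
  const limits = gatewayBot.session_start_limit;
  const maxShards =
    options.maxShards === 'auto' ? gatewayBot.shards : options.maxShards;
  const shardConcurrency =
    options.shardConcurrency === 'auto'
      ? limits.max_concurrency
      : options.shardConcurrency;
  return { maxShards, shardConcurrency, remainingIdentifies: limits.remaining };
}

// Example values are invented for illustration only.
const exampleGatewayBot = {
  shards: 4,
  session_start_limit: {
    total: 1000,
    remaining: 996,
    reset_after: 3600000,
    max_concurrency: 1,
  },
};

console.log(
  resolveShardPlan({ maxShards: 'auto', shardConcurrency: 'auto' }, exampleGatewayBot),
);
```

Explicit numbers pass through unchanged, so the same function covers both the auto and the fixed configuration.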
Sharding modes
Single process, auto:
new Client(token, { maxShards: 'auto', shardConcurrency: 'auto' });
Multi-process, contiguous ranges (the total must match across processes, since Discord routes each guild by (guild_id >> 22) % total):
new Client(token, { maxShards: 16, firstShardID: 0, lastShardID: 7 }); // process A
new Client(token, { maxShards: 16, firstShardID: 8, lastShardID: 15 }); // process B
Explicit IDs:
new Client(token, { maxShards: 16, shards: [0, 2, 4, 6, 8, 10, 12, 14] });
Connection concurrency
Shards sharing id % max_concurrency are in the same identify bucket. Athena waits for a free bucket, then enforces a 5 second cooldown before the next identify in that bucket (skipped for resumes). max_concurrency comes from Discord and rises with scale; very large bots request elevated access.
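A short sketch of the two modulo rules involved: which shard receives a guild, and which identify bucket a shard falls into. The routing function uses Discord's documented formula, shard_id = (guild_id >> 22) % num_shards. The helper names are hypothetical and not part of Athena.

```javascript
// Discord's documented routing formula: (guild_id >> 22) % num_shards.
// Guild IDs are 64-bit snowflakes, so BigInt avoids precision loss.
function shardIdForGuild(guildID, totalShards) {
  return Number((BigInt(guildID) >> 22n) % BigInt(totalShards));
}

// Shards with the same `id % max_concurrency` share one identify bucket.
function identifyBucket(shardID, maxConcurrency) {
  return shardID % maxConcurrency;
}

// Groups shard IDs by bucket. Within one bucket, identifies are spaced by
// the 5 second cooldown described above.
function groupByBucket(shardIDs, maxConcurrency) {
  const buckets = new Map();
  for (const id of shardIDs) {
    const key = identifyBucket(id, maxConcurrency);
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(id);
  }
  return buckets;
}

// The guild ID below is an arbitrary example value.
console.log(shardIdForGuild('81384788765712384', 16));
console.log(groupByBucket([0, 1, 2, 3, 4, 5, 6, 7], 4));
```

Because the shard ID depends on the total, two processes configured with different totals would disagree about which shard owns a guild.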
Session start limits
Discord caps how many times a token can identify per day (1,000 by default; exhausting the budget forces a token reset). connect() stores the current values on client.sessionStartLimit ({ total, remaining, reset_after, max_concurrency }) and emits a warn when the remaining daily identifies run low. Resumes do not consume the budget; reconnect loops that re-identify do.
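Before a large restart it can help to check the budget yourself. The following sketch assumes only the { total, remaining, reset_after, max_concurrency } shape listed above, with reset_after in milliseconds as in Discord's documentation. checkIdentifyBudget and its reserve parameter are hypothetical, not an Athena API.

```javascript
// Hypothetical guard: does starting `shardsToIdentify` fresh sessions fit in
// the remaining daily budget, while keeping `reserve` identifies spare?
function checkIdentifyBudget(limit, shardsToIdentify, reserve = 0, now = Date.now()) {
  const available = Math.max(limit.remaining - reserve, 0);
  return {
    ok: shardsToIdentify <= available,
    available,
    // reset_after is the number of milliseconds until the budget resets.
    resetsAt: new Date(now + limit.reset_after),
  };
}

// Example values are invented; in practice pass client.sessionStartLimit.
const budget = checkIdentifyBudget(
  { total: 1000, remaining: 40, reset_after: 3600000, max_concurrency: 1 },
  16,
  30,
);
console.log(budget.ok, budget.available);
```

Keeping a reserve leaves room for shards that lose their session and have to re-identify later in the day.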
Compression and encoding
The gateway URL always pins encoding=json explicitly. Transport compression is set with options.compress (boolean | 'zlib' | 'zstd', default true) and is decompressed entirely by the built-in node:zlib module, so there are no native addons and nothing to install.
| Value | Stream |
|---|---|
| true | Best available: zstd-stream when the runtime supports it (Node >= 22.15), zlib-stream otherwise. |
| 'zstd' | zstd-stream; on runtimes without node zstd, Athena warns and falls back to zlib-stream. |
| 'zlib' | zlib-stream, always available. |
| false | None. |
Athena never requests payload compression in IDENTIFY: transport compression covers the whole stream, and Discord disables payload compression when transport compression is active anyway.
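The option-to-stream mapping can be expressed as a pure function. This is a sketch of the rules in this section, not Athena's source: resolveCompression and buildGatewayURL are hypothetical names, and the caller supplies hasZstd (one plausible check, stated here as an assumption, is whether node:zlib exposes createZstdDecompress).

```javascript
// Maps options.compress to a transport stream name, or null for none.
// `hasZstd` is supplied by the caller; `warn` is injectable for testing.
function resolveCompression(compress, hasZstd, warn = console.warn) {
  if (compress === false) return null;
  if (compress === 'zlib') return 'zlib-stream';
  if (compress === 'zstd' && !hasZstd) {
    warn('zstd is unavailable on this runtime; falling back to zlib-stream');
    return 'zlib-stream';
  }
  // Either true (best available) or 'zstd' on a capable runtime.
  return hasZstd ? 'zstd-stream' : 'zlib-stream';
}

// Builds a gateway URL with encoding pinned to json and the chosen stream.
function buildGatewayURL(base, compress, hasZstd) {
  const url = new URL(base);
  url.searchParams.set('encoding', 'json');
  const stream = resolveCompression(compress, hasZstd);
  if (stream !== null) url.searchParams.set('compress', stream);
  return url.toString();
}

console.log(buildGatewayURL('wss://gateway.discord.gg', true, true));
console.log(buildGatewayURL('wss://gateway.discord.gg', false, true));
```

With compress set to false the compress query parameter is simply omitted, so only encoding=json remains.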
WebSocket transport
gateway.transport defaults to 'auto': shards run on the runtime's built-in WebSocket client (Node 22+, Bun, Deno) unless you set custom options.ws, so a default install needs no WebSocket dependency. The ws package is optional; install it (npm install ws) only for proxy agents or custom headers, which select it automatically (or force it with transport: 'ws'). transport: 'native' forces the built-in client.
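The selection rules read roughly as follows. This is a hypothetical sketch, not Athena's code: selectTransport is an invented name, wsOptions stands in for options.ws, and the fallback to 'ws' when no built-in client exists is an assumption that the text above does not confirm.

```javascript
// Hypothetical transport selection mirroring the rules described above.
function selectTransport(
  transport = 'auto',
  wsOptions = undefined,
  hasNativeWebSocket = typeof WebSocket === 'function',
) {
  if (transport === 'ws' || transport === 'native') return transport; // forced
  // 'auto': custom ws options (proxy agent, headers) select the ws package.
  const hasCustomOptions =
    wsOptions !== undefined && Object.keys(wsOptions).length > 0;
  if (hasCustomOptions) return 'ws';
  // Assumption: without a built-in client, fall back to the ws package.
  return hasNativeWebSocket ? 'native' : 'ws';
}

console.log(selectTransport('auto', undefined, true));
console.log(selectTransport('auto', { headers: { 'X-Example': '1' } }, true));
```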
Heartbeats, reconnect, resume
Athena heartbeats at Discord's interval and tears the socket down if an ACK is missed (zombie connection). The first heartbeat is jittered per the gateway docs, so mass-reconnected shards do not heartbeat in lockstep. On disconnect, if a session ID is held it issues RESUME to replay missed events; if the session expired or maxResumeAttempts is exceeded it re-identifies. An Invalid Session with d: true means the session is still resumable, and Athena resumes it instead of re-identifying; close code 4007 (invalid sequence) invalidates the session, so Athena starts a fresh one. Reconnect delay follows reconnectDelay(lastDelay, attempts).
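The jitter and the resume-or-identify decision can be sketched as two small functions. The names and the state object are hypothetical. The jitter follows Discord's gateway docs (first heartbeat after heartbeat_interval multiplied by a random value between 0 and 1), and the decision rules restate the paragraph above.

```javascript
// First heartbeat delay: heartbeat_interval * jitter, jitter in [0, 1).
// `random` is injectable so the behaviour can be tested deterministically.
function firstHeartbeatDelay(heartbeatInterval, random = Math.random) {
  return Math.floor(heartbeatInterval * random());
}

// Hypothetical decision helper: what to send on the next connection.
// `invalidSession` is the `d` value of an Invalid Session payload, if any.
function nextHandshake(state) {
  const { sessionID, resumeAttempts, maxResumeAttempts, closeCode, invalidSession } = state;
  if (!sessionID) return 'identify'; // no session to resume
  if (closeCode === 4007) return 'identify'; // invalid sequence
  if (invalidSession === false) return 'identify'; // session not resumable
  if (resumeAttempts > maxResumeAttempts) return 'identify'; // limit exceeded
  return 'resume';
}

console.log(firstHeartbeatDelay(41250));
console.log(nextHandshake({ sessionID: 'abc', resumeAttempts: 0, maxResumeAttempts: 10 }));
```

Resuming whenever possible matters for the daily budget too, since only a fresh identify consumes a session start.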
Member chunking
For many members on one guild, Discord requires the gateway, not REST:
const members = await shard.requestGuildMembers({ guild_id: guildID, query: '', limit: 0 });
This call waits for the chunks to arrive, up to the configured request timeout, rather than returning early. getAllUsers: true chunks every guild at startup and significantly delays ready on large bots.
Member chunk rate limit
Since 2025-10-01, Discord allows 1 all-members request (limit 0, empty query) per guild per 30 seconds; exceeding it triggers a RATE_LIMITED dispatch instead of a chunk. Athena defers a duplicate all-members request for the same guild client-side until the window passes, auto-retries a rate limited request after retry_after (up to 3 times), and surfaces every RATE_LIMITED dispatch as a rateLimited client event with the raw payload.
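The client-side deferral amounts to a per-guild window. A minimal sketch of that idea, assuming a 30 second window and using a hypothetical MemberChunkGate class (not Athena's internal implementation):

```javascript
// Hypothetical per-guild gate for all-members requests (limit 0, empty query).
class MemberChunkGate {
  constructor(windowMs = 30000) {
    this.windowMs = windowMs;
    this.nextSlot = new Map(); // guildID -> time the last request was scheduled
  }

  // Returns how many milliseconds to wait before sending a request for
  // `guildID`, and records the time at which that request will go out.
  reserve(guildID, now = Date.now()) {
    const last = this.nextSlot.get(guildID);
    const wait = last === undefined ? 0 : Math.max(0, last + this.windowMs - now);
    this.nextSlot.set(guildID, now + wait);
    return wait;
  }
}

const gate = new MemberChunkGate();
console.log(gate.reserve('guild-a', 0)); // first request goes out immediately
console.log(gate.reserve('guild-a', 10000)); // deferred until the window passes
console.log(gate.reserve('guild-b', 10000)); // other guilds are unaffected
```

A caller would wait for the returned number of milliseconds (for example with setTimeout) before sending the request.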
Channel info and soundboard requests
Two more gateway request/response pairs live on both Client and Shard:
// Opcode 43: ephemeral voice channel data for a guild.
const entries = await client.requestChannelInfo(guildID);
// entries: Array<{ id, status?, voice_start_time? }>
// Opcode 31: soundboard sounds for one or more guilds.
client.requestSoundboardSounds([guildID]);
requestChannelInfo(guildID, fields?) requests voice channel status and voice session start times (fields defaults to both); the response also fires the channelInfo event, and live changes arrive as voiceChannelStatusUpdate / voiceChannelStartTimeUpdate. requestSoundboardSounds(guildIDs) answers arrive as soundboardSounds events, one per guild.
Presence and voice
shard.setPresence(PresenceUpdateStatus.DoNotDisturb, [{ name: 'with code', type: ActivityType.Playing }]);
shard.sendWS(GatewayOpcodes.VoiceStateUpdate, { guild_id, channel_id, self_mute: false, self_deaf: false });
Athena does not bundle a voice-connection (audio) implementation; pair it with a dedicated voice library if you need audio.
Useful shard surface
shard.status; // 'disconnected' | 'connecting' | 'handshaking' | 'ready' | ...
shard.latency; // ms
shard.sessionID;
shard.connect(); shard.disconnect({ reconnect: true });
shard.requestGuildMembers(options);
shard.requestChannelInfo(guildID, fields?);
shard.requestSoundboardSounds(guildIDs);
Tuning
- Default to maxShards: 'auto'.
- Keep compress on (transport compression cuts gateway bandwidth a lot).
- Use disableEvents for noisy events you ignore.
- Stagger reconnects with a randomised reconnectDelay to avoid thundering herds after an outage.
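One way to randomise the delay is exponential backoff with full jitter. The sketch below keeps the reconnectDelay(lastDelay, attempts) signature mentioned on this page. The factory name, the base and maximum delays, and the injectable random source are illustrative assumptions, as is passing the result as the reconnectDelay client option.

```javascript
// Exponential backoff with full jitter, capped at maxDelay (milliseconds).
// Constants are illustrative; `random` is injectable for deterministic tests.
function makeReconnectDelay({ baseDelay = 1000, maxDelay = 30000, random = Math.random } = {}) {
  // `lastDelay` is accepted to match the signature but is not needed here,
  // because the ceiling is derived from the attempt count alone.
  return function reconnectDelay(lastDelay, attempts) {
    const ceiling = Math.min(maxDelay, baseDelay * 2 ** attempts);
    return Math.floor(random() * ceiling);
  };
}

const reconnectDelay = makeReconnectDelay();
for (let attempts = 0; attempts < 5; attempts++) {
  console.log(attempts, reconnectDelay(0, attempts));
}
```

Each shard draws its own random value, so shards that dropped at the same moment spread their reconnects across the window instead of retrying together.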
For the cluster + cache picture at the very top end, see Scaling to millions.