Gateway and sharding

A Shard is one WebSocket connection to the gateway; the ShardManager coordinates all the shards in a process. Below roughly 2,500 guilds you can rely on the defaults.

Startup sequence

connect() reads the recommended shard count and session limits from Discord.
With maxShards: 'auto' that count is used; shardConcurrency: 'auto' uses Discord's max_concurrency.
Each shard opens its socket, gets HELLO, sends IDENTIFY, then receives READY and a burst of GUILD_CREATE.
After guildCreateTimeout, the shard fires shardReady. When all shards are ready, the client emits ready.

Sharding modes

Single process, auto:

new Client(token, { maxShards: 'auto', shardConcurrency: 'auto' });

Multi-process, contiguous ranges (the total must match across processes, since Discord routes each guild by (guild_id >> 22) % total):

new Client(token, { maxShards: 16, firstShardID: 0, lastShardID: 7 });   // process A
new Client(token, { maxShards: 16, firstShardID: 8, lastShardID: 15 });  // process B
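
Why the total has to agree across processes is easiest to see in the routing formula itself. The sketch below is a standalone helper, not an Athena API (the name shardIdForGuild is hypothetical). It applies Discord's documented formula, shard_id = (guild_id >> 22) % num_shards, and parses the ID as a BigInt because snowflakes exceed JavaScript's safe integer range.

```typescript
// Hypothetical helper, not an Athena API: which shard receives a guild's events.
// Discord's documented formula is shard_id = (guild_id >> 22) % num_shards.
function shardIdForGuild(guildId: string, totalShards: number): number {
  if (!Number.isInteger(totalShards) || totalShards < 1) {
    throw new RangeError("totalShards must be a positive integer");
  }
  // Snowflakes are 64-bit, so parse them as BigInt rather than Number.
  const shifted = BigInt(guildId) >> BigInt(22);
  return Number(shifted % BigInt(totalShards));
}

// The same guild can land on a different shard when the total changes,
// which is why two processes must never disagree about maxShards.
const exampleGuildId = "175928847299117063"; // arbitrary snowflake for illustration
console.log(shardIdForGuild(exampleGuildId, 16), shardIdForGuild(exampleGuildId, 32));
```

A process that owns shards 0 to 7 of 16 only ever sees guilds for which this function returns 0 to 7, so a second process configured with a different total would leave some guilds unserved and others served twice.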

Explicit IDs:

new Client(token, { maxShards: 16, shards: [0, 2, 4, 6, 8, 10, 12, 14] });

Connection concurrency

Shards sharing id % max_concurrency are in the same identify bucket. Athena waits for a free bucket, then enforces a 5 second cooldown before the next identify in that bucket (skipped for resumes). max_concurrency comes from Discord and rises with scale; very large bots request elevated access.
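
The bucket rule can be sketched as a small planning function. This is an illustration only, not Athena's scheduler: identifyBucket and planIdentifyOffsets are hypothetical names, and the plan assumes every bucket is free at time 0 and every identify succeeds on the first attempt.

```typescript
// Hypothetical helpers, not Athena APIs.
const IDENTIFY_COOLDOWN_MS = 5000; // the 5 second per-bucket cooldown described above

// Shards with the same key share one identify bucket.
function identifyBucket(shardId: number, maxConcurrency: number): number {
  return shardId % maxConcurrency;
}

// Earliest identify offset (ms from start) for each shard, in the order given.
function planIdentifyOffsets(shardIds: number[], maxConcurrency: number): Map<number, number> {
  const nextFreeAt = new Map<number, number>(); // bucket -> ms offset when it is free again
  const offsets = new Map<number, number>();    // shard  -> ms offset of its identify
  for (const shardId of shardIds) {
    const bucket = identifyBucket(shardId, maxConcurrency);
    const startAt = nextFreeAt.get(bucket) ?? 0;
    offsets.set(shardId, startAt);
    nextFreeAt.set(bucket, startAt + IDENTIFY_COOLDOWN_MS);
  }
  return offsets;
}

// Eight shards with max_concurrency 4: shards 0-3 identify together,
// shards 4-7 follow one cooldown later.
const identifyPlan = planIdentifyOffsets([0, 1, 2, 3, 4, 5, 6, 7], 4);
console.log(identifyPlan);
```

With max_concurrency 1 the same eight shards would identify strictly one after another, which is why startup time on large bots depends so heavily on that value.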

Session start limits

Discord caps how many times a token can identify per day (1,000 by default; exhausting the budget forces a token reset). connect() stores the current values on client.sessionStartLimit ({ total, remaining, reset_after, max_concurrency }) and emits a warn when the remaining daily identifies run low. Resumes do not consume the budget; reconnect loops that re-identify do.
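
A pre-flight check against that budget might look like the sketch below. The interface mirrors the four fields listed above; checkIdentifyBudget and its reserve parameter are hypothetical and not part of Athena. reset_after is treated as milliseconds, as in Discord's gateway documentation.

```typescript
// Shape taken from the fields listed above.
interface SessionStartLimit {
  total: number;
  remaining: number;
  reset_after: number; // milliseconds until the budget resets
  max_concurrency: number;
}

interface BudgetCheck {
  ok: boolean;
  shortBy: number;   // identifies missing, 0 when ok
  retryInMs: number; // how long to wait for a fresh budget, 0 when ok
}

// Hypothetical helper: is there room to identify `shardsToStart` shards
// while keeping `reserve` identifies back for unexpected reconnects?
function checkIdentifyBudget(limit: SessionStartLimit, shardsToStart: number, reserve: number = 0): BudgetCheck {
  const needed = shardsToStart + reserve;
  const ok = limit.remaining >= needed;
  return {
    ok,
    shortBy: ok ? 0 : needed - limit.remaining,
    retryInMs: ok ? 0 : limit.reset_after,
  };
}

const sampleLimit: SessionStartLimit = { total: 1000, remaining: 20, reset_after: 3600000, max_concurrency: 1 };
console.log(checkIdentifyBudget(sampleLimit, 16, 10));
```

Holding back a reserve is a design choice rather than a requirement: it leaves headroom for the re-identifies that a reconnect loop can burn through.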

Compression and encoding

The gateway URL always pins encoding=json explicitly. Transport compression is options.compress (boolean | 'zlib' | 'zstd', default true), decompressed entirely by the built-in node:zlib: no native addons, nothing to install.

Value	Stream
`true`	Best available: zstd-stream when the runtime supports it (Node >= 22.15), zlib-stream otherwise.
`'zstd'`	zstd-stream; on runtimes without node zstd, Athena warns and falls back to zlib-stream.
`'zlib'`	zlib-stream, always available.
`false`	None.

Athena never requests payload compression in IDENTIFY: transport compression covers the whole stream, and Discord disables payload compression when transport compression is active anyway.
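
The table above reduces to a small decision function. The sketch below is not Athena's internal code: resolveCompressStream and gatewayQuery are hypothetical names, and zstd support is passed in as a flag rather than detected from node:zlib.

```typescript
type CompressOption = boolean | "zlib" | "zstd";
type CompressStream = "zstd-stream" | "zlib-stream" | null;

// Hypothetical helper mapping options.compress to the stream that gets requested.
// zstdSupported stands in for a runtime check (Node >= 22.15 per the table).
function resolveCompressStream(option: CompressOption, zstdSupported: boolean): CompressStream {
  if (option === false) return null;
  if (option === "zlib") return "zlib-stream";
  // true and 'zstd' both prefer zstd-stream and fall back to zlib-stream
  // (Athena additionally warns in the explicit 'zstd' case).
  return zstdSupported ? "zstd-stream" : "zlib-stream";
}

// Query string for the gateway URL: encoding is always pinned to json,
// compress is added only when a stream was selected.
function gatewayQuery(stream: CompressStream): string {
  const params = ["encoding=json"];
  if (stream !== null) params.push("compress=" + stream);
  return params.join("&");
}

console.log(gatewayQuery(resolveCompressStream(true, false)));
```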

WebSocket transport

gateway.transport defaults to 'auto': shards run on the runtime's built-in WebSocket client (Node 22+, Bun, Deno) unless you set custom options.ws, so a default install needs no WebSocket dependency. The ws package is optional; install it (npm install ws) only for proxy agents or custom headers, which select it automatically (or force it with transport: 'ws'). transport: 'native' forces the built-in client.
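
The selection rule can be written as a short function. This is a sketch of the behaviour described above, not Athena's code: resolveTransport is a hypothetical name, and how a forced 'native' interacts with custom options.ws is not covered by this page, so the sketch simply honours any forced setting.

```typescript
type TransportSetting = "auto" | "native" | "ws";
type ResolvedTransport = "native" | "ws";

// Hypothetical helper: which WebSocket client a shard would use.
function resolveTransport(setting: TransportSetting, hasCustomWsOptions: boolean): ResolvedTransport {
  // An explicit setting wins.
  if (setting !== "auto") return setting;
  // 'auto': proxy agents or custom headers need the ws package;
  // otherwise the runtime's built-in client is enough.
  return hasCustomWsOptions ? "ws" : "native";
}

console.log(resolveTransport("auto", false), resolveTransport("auto", true));
```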

Heartbeats, reconnect, resume

Athena heartbeats at Discord's interval and tears the socket down if an ACK is missed (zombie connection). The first heartbeat is jittered per the gateway docs, so mass-reconnected shards do not heartbeat in lockstep. On disconnect, if a session ID is held it issues RESUME to replay missed events; if the session expired or maxResumeAttempts is exceeded it re-identifies. An Invalid Session with d: true means the session is still resumable, and Athena resumes it instead of re-identifying; close code 4007 (invalid sequence) invalidates the session, so Athena starts a fresh one. Reconnect delay follows reconnectDelay(lastDelay, attempts).
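
The resume-or-identify decision and the jittered first heartbeat can be sketched as two pure functions. Both are illustrations with hypothetical names and types, not Athena internals. The sketch treats resumeAttempts as the number of resumes already tried and ignores close codes that forbid reconnecting at all.

```typescript
interface ReconnectState {
  sessionId: string | null;
  resumeAttempts: number;    // resumes already tried for this session
  maxResumeAttempts: number;
}

type Trigger =
  | { kind: "close"; code: number }
  | { kind: "invalidSession"; resumable: boolean } // the `d` field of Invalid Session
  | { kind: "missedAck" };

// Hypothetical helper: what to send after the next socket opens.
function nextHandshake(state: ReconnectState, trigger: Trigger): "resume" | "identify" {
  // 4007 (invalid sequence) invalidates the session.
  if (trigger.kind === "close" && trigger.code === 4007) return "identify";
  // Invalid Session with d: false means the session is gone.
  if (trigger.kind === "invalidSession" && !trigger.resumable) return "identify";
  // Nothing to resume without a session ID.
  if (state.sessionId === null) return "identify";
  // One more resume would exceed the configured limit.
  if (state.resumeAttempts >= state.maxResumeAttempts) return "identify";
  return "resume";
}

// First heartbeat after HELLO: a random fraction of the interval, so shards
// that reconnected together do not heartbeat in lockstep.
function firstHeartbeatDelay(intervalMs: number, random: () => number = Math.random): number {
  return Math.floor(intervalMs * random());
}

const heldSession: ReconnectState = { sessionId: "abc", resumeAttempts: 0, maxResumeAttempts: 3 };
console.log(nextHandshake(heldSession, { kind: "missedAck" }), firstHeartbeatDelay(41250));
```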

Member chunking

For many members on one guild, Discord requires the gateway, not REST:

const members = await shard.requestGuildMembers({ guild_id: guildID, query: '', limit: 0 });

The call waits for the chunks to arrive, up to the configured request timeout, rather than returning early. getAllUsers: true chunks every guild at startup and significantly delays ready on large bots.
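
To show what waiting for the chunks involves, here is a minimal collector over GUILD_MEMBERS_CHUNK payloads. It is a sketch, not Athena's implementation: ChunkCollector is a hypothetical class, the field names follow Discord's chunk payload (chunk_index, chunk_count, members), and the timeout handling is left out.

```typescript
// Subset of Discord's GUILD_MEMBERS_CHUNK payload used by this sketch.
interface MemberChunk<M> {
  guild_id: string;
  members: M[];
  chunk_index: number; // 0-based position of this chunk
  chunk_count: number; // total chunks expected for the request
  nonce?: string;
}

// Hypothetical collector: accumulates chunks for one request.
class ChunkCollector<M> {
  private readonly received = new Set<number>();
  private readonly members: M[] = [];

  // Returns the full member list once every chunk has arrived, otherwise null.
  add(chunk: MemberChunk<M>): M[] | null {
    if (!this.received.has(chunk.chunk_index)) {
      this.received.add(chunk.chunk_index);
      this.members.push(...chunk.members);
    }
    return this.received.size === chunk.chunk_count ? this.members : null;
  }
}

const collector = new ChunkCollector<string>();
const partial = collector.add({ guild_id: "1", members: ["a", "b"], chunk_index: 0, chunk_count: 2 });
const complete = collector.add({ guild_id: "1", members: ["c"], chunk_index: 1, chunk_count: 2 });
console.log(partial, complete);
```

A real implementation would pair each collector with the request's nonce and reject or resolve with partial data when the request timeout elapses.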

Member chunk rate limit

Since 2025-10-01, Discord allows 1 all-members request (limit 0, empty query) per guild per 30 seconds; exceeding it triggers a RATE_LIMITED dispatch instead of a chunk. Athena defers a duplicate all-members request for the same guild client-side until the window passes, auto-retries a rate-limited request after retry_after (up to 3 times), and surfaces every RATE_LIMITED dispatch as a rateLimited client event with the raw payload.
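
The client-side deferral can be pictured as a per-guild window tracker. The class below is a hypothetical sketch, not Athena's code; it only computes how long a request would have to wait, and takes an injectable clock so the behaviour can be checked without real delays.

```typescript
const ALL_MEMBERS_WINDOW_MS = 30000; // 1 all-members request per guild per 30 seconds

// Hypothetical tracker for all-members requests (limit 0, empty query).
class AllMembersLimiter {
  private readonly lastSentAt = new Map<string, number>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  // Milliseconds to wait before this guild may be requested again (0 = send now).
  delayFor(guildId: string): number {
    const last = this.lastSentAt.get(guildId);
    if (last === undefined) return 0;
    return Math.max(0, last + ALL_MEMBERS_WINDOW_MS - this.now());
  }

  // Record that a request for this guild was just sent.
  markSent(guildId: string): void {
    this.lastSentAt.set(guildId, this.now());
  }
}

const limiter = new AllMembersLimiter();
limiter.markSent("123");
console.log(limiter.delayFor("123"), limiter.delayFor("456"));
```

The window is tracked per guild, so a deferred request for one guild never holds up requests for another.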

Channel info and soundboard requests

Two more gateway request/response pairs live on both Client and Shard:

// Opcode 43: ephemeral voice channel data for a guild.
const entries = await client.requestChannelInfo(guildID);
// entries: Array<{ id, status?, voice_start_time? }>
 
// Opcode 31: soundboard sounds for one or more guilds.
client.requestSoundboardSounds([guildID]);

requestChannelInfo(guildID, fields?) requests voice channel status and voice session start times (fields defaults to both); the response also fires the channelInfo event, and live changes arrive as voiceChannelStatusUpdate / voiceChannelStartTimeUpdate. Answers to requestSoundboardSounds(guildIDs) arrive as soundboardSounds events, one per guild.

Presence and voice

shard.setPresence(PresenceUpdateStatus.DoNotDisturb, [{ name: 'with code', type: ActivityType.Playing }]);
shard.sendWS(GatewayOpcodes.VoiceStateUpdate, { guild_id, channel_id, self_mute: false, self_deaf: false });

Athena does not bundle a voice-connection (audio) implementation; pair it with a dedicated voice library if you need audio.

Useful shard surface

shard.status;     // 'disconnected' | 'connecting' | 'handshaking' | 'ready' | ...
shard.latency;    // ms
shard.sessionID;
shard.connect(); shard.disconnect({ reconnect: true });
shard.requestGuildMembers(options);
shard.requestChannelInfo(guildID, fields?);
shard.requestSoundboardSounds(guildIDs);

Tuning

Default to maxShards: 'auto'.
Keep compress on (transport compression cuts gateway bandwidth a lot).
Use disableEvents for noisy events you ignore.
Stagger reconnects with a randomised reconnectDelay to avoid thundering herds after an outage.
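
A randomised delay could look like the sketch below. It matches the reconnectDelay(lastDelay, attempts) shape mentioned earlier, but the function name, the constants, and the choice of decorrelated jitter are all assumptions for illustration; whether the option accepts a function in exactly this form should be checked against the client options reference.

```typescript
const BASE_DELAY_MS = 1000;  // assumed floor for any reconnect delay
const MAX_DELAY_MS = 60000;  // assumed ceiling

// Hypothetical reconnect delay using decorrelated jitter: each delay is drawn
// at random between the base and three times the previous delay, then capped.
// `_attempts` is accepted to match the signature but unused by this strategy.
function jitteredReconnectDelay(
  lastDelay: number,
  _attempts: number,
  random: () => number = Math.random,
): number {
  const upper = Math.max(BASE_DELAY_MS, lastDelay) * 3;
  const delay = BASE_DELAY_MS + random() * (upper - BASE_DELAY_MS);
  return Math.min(MAX_DELAY_MS, Math.floor(delay));
}

// Simulate a few consecutive reconnect attempts for one shard.
let previousDelay = 0;
for (let attempt = 1; attempt <= 4; attempt++) {
  previousDelay = jitteredReconnectDelay(previousDelay, attempt);
  console.log("attempt " + attempt + ": wait " + previousDelay + " ms");
}
```

Because each shard draws its own random delay, shards that dropped at the same moment spread their reconnects out instead of hitting the gateway together.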

For the cluster + cache picture at the very top end, see Scaling to millions.
