DISCORD.PY — ASYNC
// LESSON 01 — DISCORD BOT SETUP

Intents, Bot, and the Message Loop

Build the foundation: Discord gateway connection, privileged intents, and the on_message handler that routes incoming messages to your LLM backend.

// FIELD ANALOGY
You've placed a radio operator on a multiplexed net. Every transmission on the net comes through — the operator monitors all of them. When they hear your callsign ("robot", "mrrobot"), they patch it through to command (the LLM). Everything else stays in the net. Intents are the frequencies your radio is tuned to. message_content is a privileged frequency — you need clearance (the Developer Portal toggle). on_message is the operator. BOT_TRIGGER_NAMES is the callsign filter. bot.run(TOKEN) is the operator going on air.

// NOMENCLATURE — DISCORD.PY + ASYNCIO

discord.Intents
Discord's permission system for gateway events; it controls which events the bot receives. Intents.default() enables everything except the privileged intents. message_content is privileged: it must be switched on in the Discord Developer Portal as well as in code, otherwise message.content arrives empty for most guild messages.

commands.Bot
Subclass of discord.Client that adds the prefix-command system and the cog/extension loader. Even if you don't define any commands yet, Bot is a sensible default over Client because the routing infrastructure is already in place.

@bot.event
Decorator that registers a coroutine as an event handler. The function name determines which event it handles (on_ready, on_message, etc.). discord.py matches handlers by naming convention, not explicit registration.

async def on_message
Raw event hook. Fires on EVERY message in every channel/DM the bot can see, so you must filter manually. Registering on_message with @bot.event replaces Bot's built-in handler, so always call await bot.process_commands(message) at the end or prefix commands stop working.

load_dotenv(override=True)
Loads the .env file into os.environ. override=True means .env values win over existing shell environment variables, so a stale shell variable can't silently shadow your .env configuration.

any(name in content for name in ...)
Generator expression inside any(). Returns True if at least one trigger name appears as a substring of the lowercased message content. Short-circuits on the first match, which keeps the check cheap with multiple trigger names.

asyncio event loop
discord.py is fully async: all I/O happens in a single-threaded event loop via coroutines. Any LLM call made from on_message must either be awaitable or run in an executor. Blocking the event loop freezes all Discord interactions.
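The rule about blocking calls can be sketched with the standard library alone. In this minimal example, slow_llm_call is a hypothetical stand-in for a synchronous LLM client and heartbeat stands in for gateway traffic; neither is part of the bot. asyncio.to_thread (Python 3.9+) runs the blocking function on a worker thread so the loop keeps ticking while it waits.

```python
import asyncio
import time

def slow_llm_call(prompt):
    """Hypothetical blocking client: stands in for a synchronous HTTP request."""
    time.sleep(0.2)
    return f"echo: {prompt}"

async def heartbeat(ticks):
    """Stands in for gateway events that must keep being serviced."""
    for _ in range(4):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())

async def main():
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    # Run the blocking call on a worker thread; the event loop stays free.
    reply = await asyncio.to_thread(slow_llm_call, "status report")
    done_at = time.monotonic()
    await hb
    # Count heartbeats that fired while the blocking call was still running.
    overlapped = sum(1 for t in ticks if t < done_at)
    return reply, overlapped

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling slow_llm_call("status report") directly inside main() instead would stall every heartbeat until it returned, which is the freeze described above.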
// REFERENCE — mrrobot.py, intents + bot init (annotated)
import os
import discord
from discord.ext import commands
from dotenv import load_dotenv

load_dotenv(override=True)  # .env wins over shell env vars — consistent config

BOT_TOKEN = os.getenv("DISCORD_BOT_TOKEN")
BOT_TRIGGER_NAMES = ["robot", "mrrobot", "mr robot"]  # callsigns that activate the bot

intents = discord.Intents.default()
intents.message_content = True  # PRIVILEGED — must enable in Discord Dev Portal
intents.members = True  # also PRIVILEGED: enable Server Members Intent in the Dev Portal

bot = commands.Bot(command_prefix="!", intents=intents)

@bot.event
async def on_ready():
    print(f"[+] logged in as {bot.user}")

@bot.event
async def on_message(message):
    if message.author == bot.user:   # don't respond to yourself — infinite loop otherwise
        return
    content = message.content.lower()
    triggered = any(name in content for name in BOT_TRIGGER_NAMES)  # callsign check
    if triggered:
        await handle_llm(message)        # route to your LLM handler (defined elsewhere in the bot)
    await bot.process_commands(message)  # always let the prefix command router see the message

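if not BOT_TOKEN:
    raise SystemExit("DISCORD_BOT_TOKEN is not set; check your .env")  # fail early with a clear message
bot.run(BOT_TOKEN)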
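One caveat with the callsign check: `name in content` is a substring test, so "robotics" also contains "robot" and would wake the bot. If that matters for your server, a word-boundary regex is one possible alternative. This is a sketch, not part of the reference bot; TRIGGER_RE and is_triggered are illustrative names.

```python
import re

BOT_TRIGGER_NAMES = ["robot", "mrrobot", "mr robot"]

# One alternation of all callsigns, each escaped, wrapped in word boundaries.
TRIGGER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(name) for name in BOT_TRIGGER_NAMES) + r")\b",
    re.IGNORECASE,  # replaces the manual .lower() on the message content
)

def is_triggered(content):
    """Return True only when a callsign appears as a whole word or phrase."""
    return TRIGGER_RE.search(content) is not None
```

In on_message this would replace the any(...) line with `triggered = is_triggered(message.content)`.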
// TASK — type the bot skeleton
DISCORD.PY + AIOHTTP
// LESSON 02 — PERSONA INJECTION

System Prompt as Personality Engine

The bot becomes whoever you tell it to be via a SYSTEM prompt prepended to every conversation. This is the core mechanism: inject a persona, maintain per-channel message history, forward to Ollama.

// FIELD ANALOGY
Before every radio transmission, the operator reads the mission brief. The brief defines their identity, their authority, their constraints. No matter who calls in, they respond through that identity. SYSTEM_PROMPT is the doctrine brief — read before every exchange. The per-channel history deque is the mission log, bounded to the last 12 entries so it never overflows. build_messages() assembles [brief + history + incoming message] and hands it to the LLM. Ollama is the remote command post that processes and replies.

// NOMENCLATURE — PERSONA + ASYNC HTTP

SYSTEM_PROMPT
The personality definition. Read from the environment (populated from .env) so it can be swapped without touching code. Sent first in every API call as a {"role": "system"} message. The entire character (voice, doctrine, constraints, capabilities) lives here.

collections.deque(maxlen=N)
Double-ended queue with a maximum length. Once the deque is full, each append drops the oldest element automatically. Used as a sliding-window conversation history per channel, so RAM stays bounded regardless of how long the bot runs.

defaultdict(lambda: deque(...))
Dictionary that auto-creates a new deque for any new channel_id key. No need to check if the key exists before appending. The lambda creates a fresh bounded deque on first access.

build_messages()
Assembles the messages array for the Ollama API: [system prompt] + [history] + [current user message]. This is the chat format that Ollama and OpenAI-compatible APIs expect. System first, history in order, user last.

aiohttp.ClientSession
Async HTTP client. Used inside the Discord event loop, where blocking HTTP calls (the requests library) would freeze all Discord interactions. aiohttp is non-blocking: other events can be processed while waiting for the Ollama response.

"stream": False
Tells Ollama to return the complete response in one JSON object rather than streaming tokens. For Discord (where you send the complete reply), this is simpler than assembling a streaming response.

history append pattern
After getting the reply, append the user message AND the assistant reply to history. Both are needed for context. The deque's maxlen handles eviction of old messages automatically. Note that maxlen counts messages, so a depth of 12 holds 6 full exchanges.
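The sliding-window behaviour is easy to see in isolation. The sketch below uses a deliberately small depth of 4 so eviction shows up after three exchanges; record_exchange and the channel IDs 101 and 202 are made up for the demonstration.

```python
from collections import defaultdict, deque

HISTORY_DEPTH = 4  # small on purpose so eviction is visible

# Each new channel_id gets its own bounded deque on first access.
history = defaultdict(lambda: deque(maxlen=HISTORY_DEPTH))

def record_exchange(channel_id, user_text, reply_text):
    """Append both sides of one exchange, mirroring the history append pattern."""
    history[channel_id].append({"role": "user", "content": user_text})
    history[channel_id].append({"role": "assistant", "content": reply_text})

for turn in range(1, 4):  # three exchanges = six messages
    record_exchange(101, f"question {turn}", f"answer {turn}")

# Only the newest four messages survive: exchanges 2 and 3.
contents = [m["content"] for m in history[101]]
print(contents)
```

Because each exchange appends two messages, an even depth keeps the window aligned on a user message; an odd depth would leave an orphaned assistant reply at the front.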
// REFERENCE — persona injection + Ollama call (annotated)
import os
import aiohttp
from collections import defaultdict, deque

SYSTEM_PROMPT = os.getenv("SYSTEM_PROMPT",
    "You are VADER, a machine-spirit serving the 22nd Survey Division. "
    "Respond with military precision. No corporate fluff.")

OLLAMA_URL  = os.getenv("OLLAMA_URL", "http://localhost:11434/api/chat")
MODEL_NAME  = os.getenv("MODEL_NAME", "huihui_ai/qwen2.5-coder-abliterate:7b")
HISTORY_DEPTH = int(os.getenv("HISTORY_DEPTH", "12"))

_history = defaultdict(lambda: deque(maxlen=HISTORY_DEPTH))  # auto-creates bounded deque per channel

def build_messages(channel_id, user_content):
    msgs = [{"role": "system", "content": SYSTEM_PROMPT}]  # doctrine brief first
    msgs.extend(_history[channel_id])                        # last N messages (N/2 exchanges)
    msgs.append({"role": "user", "content": user_content})  # current message last
    return msgs

async def query_ollama(channel_id, user_content):
    payload = {
        "model": MODEL_NAME,
        "messages": build_messages(channel_id, user_content),
        "stream": False,   # complete response in one JSON blob
    }
    async with aiohttp.ClientSession() as session:
        async with session.post(OLLAMA_URL, json=payload) as resp:
            resp.raise_for_status()    # fail loudly on HTTP errors instead of a KeyError later
            data = await resp.json()   # non-blocking: event loop continues while waiting
    reply = data["message"]["content"]
    _history[channel_id].append({"role": "user",    "content": user_content})  # log both sides
    _history[channel_id].append({"role": "assistant","content": reply})
    return reply
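Lesson 01 calls handle_llm(message) without defining it. One way to bridge the two lessons is sketched below, under stated assumptions: the query function is passed in as a parameter (in the bot you would pass query_ollama, or bind it with functools.partial to keep the one-argument call), split_reply is a hypothetical helper, and the error text is a placeholder. The 2000-character chunking reflects Discord's limit on a standard message body.

```python
DISCORD_MAX_CHARS = 2000  # Discord rejects longer standard message bodies

def split_reply(text, limit=DISCORD_MAX_CHARS):
    """Slice text into consecutive chunks of at most `limit` characters."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]

async def handle_llm(message, query):
    """Hypothetical glue: ask the backend, then send the reply in chunks.

    `query` is any coroutine function shaped like
    query_ollama(channel_id, user_content) -> str.
    """
    try:
        reply = await query(message.channel.id, message.content)
    except Exception:
        # Keep the bot alive if the backend is down or returns bad JSON.
        await message.channel.send("[!] LLM backend unavailable")
        return
    for chunk in split_reply(reply):
        await message.channel.send(chunk)
```

An empty reply produces no chunks, so nothing is sent; Discord would reject an empty message anyway.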
// TASK — write the persona + Ollama function
PYTHON 3 — AUDIO + THREADING
// LESSON 03 — VOICE CLONE + TTS

Cloning the Commander's Voice — ElevenLabs PCM to Discord

Feed the bot's response text to your cloned ElevenLabs voice. Build a WAV in memory with stdlib wave, play it locally with winsound, and upload it to Discord simultaneously — all from a background thread.

// FIELD ANALOGY
You cloned the general's voice and loaded it into a voice synthesiser. When the bot replies, you type the text — the synthesiser speaks it in the general's voice over the radio net while simultaneously broadcasting the audio file to the channel. ElevenLabs is the voice synthesiser. PCM bytes are the raw audio signal. wave.open builds the transmission container. winsound.PlaySound is the local loudspeaker. asyncio.run_coroutine_threadsafe is the multiplexer that sends the audio to Discord's net from outside the async loop.

// NOMENCLATURE — PCM AUDIO + THREADING

pcm_22050
Raw signed 16-bit PCM audio at a 22050 Hz sample rate. No container, no compression, just raw samples. The ElevenLabs PCM output format returns this directly, so there is no MP3 decode step before local playback.

wave.open(buf, 'wb')
Python stdlib WAV writer. Opens a BytesIO buffer for writing. You configure the format (channels, sample width, frame rate) then call writeframes(pcm) to embed the raw audio. The result is a valid WAV file in memory.

setsampwidth(2)
2 bytes per sample = 16-bit audio, the matching value for pcm_22050 (signed 16-bit). winsound.PlaySound needs a valid WAV header, so the header fields must agree with the data.

winsound.PlaySound
Windows-only stdlib audio playback. SND_FILENAME means the first argument is a file path. Without SND_ASYNC the call blocks until playback completes, which is why it must run in a background thread; running it on the async event loop would freeze all Discord interaction during playback.

asyncio.run_coroutine_threadsafe
The thread-safe way to submit a coroutine to an event loop from a non-async (background) thread. Takes the coroutine and the running event loop. Returns a concurrent.futures.Future; call .result(timeout=15) to wait for completion and surface upload errors.

threading.Thread(daemon=True)
daemon=True means the thread is killed automatically when the main process exits. Without it, a playback thread that is still running would keep the process alive after bot.close() is called.

tempfile.NamedTemporaryFile
Creates a named temporary file that winsound can reference by path. delete=False is critical: winsound needs the file to stay on disk during playback. The finally: os.unlink() cleans it up afterwards.
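The PCM-to-WAV step can be tried without an API key. This sketch synthesises a short sine tone as signed 16-bit PCM (a stand-in for the bytes ElevenLabs would return), wraps it with the stdlib wave module, and reads the header back. make_tone_pcm and pcm_to_wav_bytes are illustrative names, and the 440 Hz tone is arbitrary.

```python
import io
import math
import struct
import wave

RATE = 22050  # Hz, matching the pcm_22050 format

def make_tone_pcm(freq_hz=440.0, seconds=0.1, rate=RATE):
    """Synthesise signed 16-bit little-endian mono PCM (stand-in for API output)."""
    n = int(rate * seconds)
    samples = (int(12000 * math.sin(2 * math.pi * freq_hz * i / rate))
               for i in range(n))
    return b"".join(struct.pack("<h", s) for s in samples)

def pcm_to_wav_bytes(pcm, rate=RATE):
    """Wrap raw PCM in a WAV container held entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wf.setframerate(rate)
        wf.writeframes(pcm)
    return buf.getvalue()

pcm = make_tone_pcm()
wav_bytes = pcm_to_wav_bytes(pcm)

# Read the header back to confirm it agrees with the data.
with wave.open(io.BytesIO(wav_bytes), "rb") as rf:
    info = (rf.getnchannels(), rf.getsampwidth(), rf.getframerate(), rf.getnframes())
print(info)
```

If the header values disagree with the actual data (for example, a frame rate of 44100 on 22050 Hz samples), playback still works but at the wrong speed and pitch.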
// REFERENCE — mrrobot.py, ElevenLabs PCM fetch + speak() (annotated)
import os, io, wave, threading, tempfile, asyncio
import httpx
import discord

_EL_API_KEY  = os.environ.get("EL_API_KEY", "")
_EL_VOICE_ID = os.environ.get("EL_VOICE_ID", "")
_EL_RATE     = 22050  # sample rate: must match the pcm_22050 endpoint

def _fetch_el_pcm(text):
    """Call ElevenLabs, return raw PCM bytes or None on failure."""
    if not _EL_API_KEY or not _EL_VOICE_ID:
        return None              # no credentials → silent fallback
    try:
        url  = f"https://api.elevenlabs.io/v1/text-to-speech/{_EL_VOICE_ID}"
        hdrs = {"xi-api-key": _EL_API_KEY, "Content-Type": "application/json"}
        body = {
            "text": text[:5000],  # cap request size (actual limit varies by ElevenLabs model and plan)
            "model_id": "eleven_multilingual_v2",
            "voice_settings": {
                "stability":         0.30,
                "similarity_boost":  0.80,
                "style":             0.55,
                "use_speaker_boost": True,
            },
        }
        r = httpx.post(url, json=body, headers=hdrs,
                       params={"output_format": "pcm_22050"}, timeout=30)
        return r.content if r.status_code == 200 else None  # return raw PCM or None
    except Exception:
        return None

def speak(text, channel=None, loop=None):
    def _run():
        pcm = _fetch_el_pcm(text)
        if pcm is None:
            return            # failed → silent, don't crash the thread
        buf = io.BytesIO()
        with wave.open(buf, 'wb') as wf:  # wrap PCM in WAV container
            wf.setnchannels(1)            # mono
            wf.setsampwidth(2)            # 16-bit = 2 bytes
            wf.setframerate(_EL_RATE)     # 22050 Hz
            wf.writeframes(pcm)
        tmp = tempfile.NamedTemporaryFile(
            prefix='vader_tts_', suffix='.wav', delete=False)  # delete=False: winsound needs the file
        tmp.write(buf.getvalue())
        tmp.close()               # close so winsound and discord.File can reopen it by path
        try:
            import winsound  # Windows-only stdlib module; imported here so the file still loads on other OSes
            upload_fut = None
            if channel is not None and loop is not None:
                async def _upload():
                    await channel.send(
                        file=discord.File(tmp.name, filename="voice.wav"))
                upload_fut = asyncio.run_coroutine_threadsafe(_upload(), loop)  # schedule from thread
            winsound.PlaySound(tmp.name, winsound.SND_FILENAME)  # blocks until done
            if upload_fut is not None:
                try:
                    upload_fut.result(timeout=15)  # wait for upload, max 15s
                except Exception:
                    pass                           # upload failure must not block cleanup
        finally:
            try:
                os.unlink(tmp.name)  # clean up temp file
            except Exception:
                pass
    threading.Thread(target=_run, daemon=True).start()  # daemon: dies with main process
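The thread-to-loop handoff inside speak() can be isolated from Discord and audio. In this minimal sketch, upload is a hypothetical coroutine standing in for channel.send(...), and worker plays the role of _run: it executes in a plain thread, submits the coroutine to the running loop, and waits on the returned future.

```python
import asyncio
import threading

async def upload(name):
    """Stands in for channel.send(...): must execute on the event loop."""
    await asyncio.sleep(0.01)
    return f"uploaded {name}"

def worker(loop, results):
    """Runs in a background thread; hands the coroutine to the loop and waits."""
    fut = asyncio.run_coroutine_threadsafe(upload("voice.wav"), loop)
    try:
        results.append(fut.result(timeout=5))
    except Exception as exc:  # timeout, or an error raised inside upload()
        results.append(f"failed: {exc!r}")

async def main():
    loop = asyncio.get_running_loop()
    results = []
    t = threading.Thread(target=worker, args=(loop, results), daemon=True)
    t.start()
    # Poll without blocking so the loop stays free to run upload().
    while t.is_alive():
        await asyncio.sleep(0.01)
    return results

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling t.join() directly inside main() would block the loop that upload() needs in order to run, so the worker could only time out. That is the same reason speak() returns immediately after starting its thread.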
// TASK — write _fetch_el_pcm() and speak()

// COURSE COMPLETE

You built it. Discord gateway + privileged intents. Persona injection via system prompt. ElevenLabs PCM piped through stdlib wave, played locally, uploaded to Discord from a background thread.
