DISCORD.PY — ASYNC
// LESSON 01 — DISCORD BOT SETUP
Intents, Bot, and the Message Loop
Build the foundation: Discord gateway connection, privileged intents, and the on_message handler that routes incoming messages to your LLM backend.
// NOMENCLATURE — DISCORD.PY + ASYNCIO
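discord.Intents | Discord's permission system for gateway events. Controls which events the bot receives. Intents.default() enables everything except the privileged intents (presences, members, message_content). message_content is privileged: set intents.message_content = True in code and enable it in the Discord Developer Portal. If it is not requested in code, most guild messages arrive with empty content; if it is requested in code but not enabled in the portal, the bot fails to connect. |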
commands.Bot | Subclass of discord.Client that adds the command prefix system and cog/extension loader. Even if you don't use prefix commands, Bot is preferred over Client for the extra routing infrastructure. |
@bot.event | Decorator that registers a coroutine as an event handler. The function name determines which event it handles — on_ready, on_message, etc. Discord.py uses naming convention, not explicit registration. |
async def on_message | Raw event hook. Fires on EVERY message in every channel/DM the bot can see. You must filter manually. Always call await bot.process_commands(message) at the end so prefix commands still work. |
load_dotenv(override=True) | Loads .env file into os.environ. override=True means .env values win over existing shell env vars — ensures your .env configuration isn't silently shadowed by a mismatched shell variable. |
any(name in content for name in ...) | Generator expression inside any(). Returns True if at least one trigger name appears in the lowercased message content. Short-circuits on first match — efficient for multiple trigger names. |
asyncio event loop | Discord.py is fully async: all I/O happens in a single-threaded event loop via coroutines. Any LLM call made from on_message must either be awaitable or run in an executor. Blocking the event loop freezes all Discord interactions. |
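The "awaitable or run in an executor" rule can be sketched with the standard library alone. This is a minimal illustration, assuming Python 3.9+ for asyncio.to_thread; blocking_llm_call and heartbeat are hypothetical stand-ins, not part of mrrobot.py.

```python
import asyncio
import time

def blocking_llm_call(prompt: str) -> str:
    """Hypothetical stand-in for a synchronous LLM client call."""
    time.sleep(0.2)  # simulates blocking network I/O
    return f"echo: {prompt}"

async def heartbeat(ticks: list) -> None:
    """Stands in for other Discord events the loop must keep serving."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())

async def main() -> tuple:
    ticks = []
    # to_thread runs the blocking function in a worker thread,
    # so the heartbeat coroutine keeps running in the meantime.
    reply, _ = await asyncio.gather(
        asyncio.to_thread(blocking_llm_call, "status report"),
        heartbeat(ticks),
    )
    return reply, len(ticks)

reply, tick_count = asyncio.run(main())
```

Calling blocking_llm_call directly inside a coroutine would stall the heartbeat for the full 0.2 seconds; wrapping it in to_thread is one way to keep the loop responsive.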
// REFERENCE — mrrobot.py, intents + bot init (annotated)
import os

import discord
from discord.ext import commands
from dotenv import load_dotenv

load_dotenv(override=True)  # .env wins over shell env vars, for consistent config

BOT_TOKEN = os.getenv("DISCORD_BOT_TOKEN")
BOT_TRIGGER_NAMES = ["robot", "mrrobot", "mr robot"]  # callsigns that activate the bot

intents = discord.Intents.default()
intents.message_content = True  # PRIVILEGED: must enable in Discord Dev Portal
intents.members = True          # also privileged: enable it in the portal too

bot = commands.Bot(command_prefix="!", intents=intents)

@bot.event
async def on_ready():
    print(f"[+] logged in as {bot.user}")

@bot.event
async def on_message(message):
    if message.author == bot.user:  # don't respond to yourself: infinite loop otherwise
        return
    content = message.content.lower()
    triggered = any(name in content for name in BOT_TRIGGER_NAMES)  # callsign check
    if not triggered:
        await bot.process_commands(message)  # pass non-trigger messages to prefix command router
        return
    await handle_llm(message)  # route to LLM handler (not shown in this excerpt)
    await bot.process_commands(message)

bot.run(BOT_TOKEN)
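One caveat of the substring check above is that "robot" also matches inside words such as "robotics". A possible refinement, offered as a sketch and not as what mrrobot.py does, is a word-boundary regex; is_triggered is a hypothetical helper name.

```python
import re

BOT_TRIGGER_NAMES = ["robot", "mrrobot", "mr robot"]

# One alternation, longest names first. \b keeps "robot" from
# matching inside a longer word such as "robotics".
_TRIGGER_RE = re.compile(
    r"\b(?:"
    + "|".join(
        re.escape(name)
        for name in sorted(BOT_TRIGGER_NAMES, key=len, reverse=True)
    )
    + r")\b",
    re.IGNORECASE,
)

def is_triggered(content: str) -> bool:
    """Return True if any callsign appears as a whole word or phrase."""
    return _TRIGGER_RE.search(content) is not None

# The plain substring check fires on this message; the regex does not.
substring_hit = any(n in "i study robotics" for n in BOT_TRIGGER_NAMES)
regex_hit = is_triggered("I study robotics")
```

Because the pattern is compiled with re.IGNORECASE, the message does not need to be lowercased first.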
// TASK — type the bot skeleton
DISCORD.PY + AIOHTTP
// LESSON 02 — PERSONA INJECTION
System Prompt as Personality Engine
The bot becomes whoever you tell it to be via a SYSTEM prompt prepended to every conversation. This is the core mechanism: inject a persona, maintain per-channel message history, forward to Ollama.
// NOMENCLATURE — PERSONA + ASYNC HTTP
SYSTEM_PROMPT | The personality definition. Loaded from .env so it can be swapped without touching code. Prepended to every API call as a {"role": "system"} message. The entire character — voice, doctrine, constraints, capabilities — lives here. |
collections.deque(maxlen=N) | Double-ended queue with a maximum length. When maxlen is exceeded, the oldest element drops off automatically. Used as a sliding window conversation history per channel — bounded RAM regardless of how long the bot runs. |
defaultdict(lambda: deque(...)) | Dictionary that auto-creates a new deque for any new channel_id key. No need to check if the key exists before appending. The lambda creates a fresh bounded deque on first access. |
build_messages() | Assembles the messages array for the Ollama API: [system prompt] + [history] + [current user message]. This is the exact format both Ollama and OpenAI-compatible APIs expect. System first, history in order, user last. |
aiohttp.ClientSession | Async HTTP client. Used inside the Discord event loop — blocking HTTP calls (requests library) would freeze all Discord interactions. aiohttp is non-blocking: other events can be processed while waiting for the Ollama response. |
"stream": False | Tells Ollama to return the complete response in one JSON object rather than streaming tokens. For Discord (where you send the complete reply), this is simpler than assembling a streaming response. |
history append pattern | After getting the reply, append the user message AND the assistant reply to history. Both are needed for context. The deque's maxlen handles eviction of old messages automatically. |
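The sliding-window behaviour of the bounded deque is easiest to see with a small depth. The sketch below uses a deliberately tiny HISTORY_DEPTH of 4 and a hypothetical remember() helper; the channel IDs are made up for illustration.

```python
from collections import defaultdict, deque

HISTORY_DEPTH = 4  # small on purpose so eviction is visible

# A fresh bounded deque is created the first time a channel_id is touched.
history = defaultdict(lambda: deque(maxlen=HISTORY_DEPTH))

def remember(channel_id: int, user_text: str, reply: str) -> None:
    """Append both sides of one exchange to the channel's window."""
    history[channel_id].append({"role": "user", "content": user_text})
    history[channel_id].append({"role": "assistant", "content": reply})

for i in range(3):
    remember(101, f"question {i}", f"answer {i}")
remember(202, "hello", "hi")  # a second channel gets its own deque

window = [m["content"] for m in history[101]]
# window is now ["question 1", "answer 1", "question 2", "answer 2"]
```

Since messages are appended in user/assistant pairs, an even depth keeps the window starting on a user message; an odd depth would leave an orphaned assistant reply at the front once eviction begins.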
// REFERENCE — persona injection + Ollama call (annotated)
import os
import aiohttp
from collections import defaultdict, deque

SYSTEM_PROMPT = os.getenv("SYSTEM_PROMPT",
    "You are VADER, a machine-spirit serving the 22nd Survey Division. "
    "Respond with military precision. No corporate fluff.")
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434/api/chat")
MODEL_NAME = os.getenv("MODEL_NAME", "huihui_ai/qwen2.5-coder-abliterate:7b")
HISTORY_DEPTH = int(os.getenv("HISTORY_DEPTH", "12"))

_history = defaultdict(lambda: deque(maxlen=HISTORY_DEPTH))  # auto-creates bounded deque per channel

def build_messages(channel_id, user_content):
    msgs = [{"role": "system", "content": SYSTEM_PROMPT}]   # doctrine brief first
    msgs.extend(_history[channel_id])                       # last N exchanges
    msgs.append({"role": "user", "content": user_content})  # current message last
    return msgs

async def query_ollama(channel_id, user_content):
    payload = {
        "model": MODEL_NAME,
        "messages": build_messages(channel_id, user_content),
        "stream": False,  # complete response in one JSON blob
    }
    async with aiohttp.ClientSession() as session:
        async with session.post(OLLAMA_URL, json=payload) as resp:
            data = await resp.json()  # non-blocking: event loop continues while waiting
    reply = data["message"]["content"]
    _history[channel_id].append({"role": "user", "content": user_content})  # log both sides
    _history[channel_id].append({"role": "assistant", "content": reply})
    return reply
// TASK — write the persona + Ollama function
PYTHON 3 — AUDIO + THREADING
// LESSON 03 — VOICE CLONE + TTS
Cloning the Commander's Voice — ElevenLabs PCM to Discord
Feed the bot's response text to your cloned ElevenLabs voice. Build a WAV in memory with stdlib wave, play it locally with winsound, and upload it to Discord simultaneously — all from a background thread.
// NOMENCLATURE — PCM AUDIO + THREADING
pcm_22050 | Raw signed 16-bit PCM audio at 22050 Hz sample rate. No container, no compression — just raw audio data. The ElevenLabs PCM endpoint returns this directly. Lighter than MP3 decode for local playback. |
wave.open(buf, 'wb') | Python stdlib WAV writer. Opens a BytesIO buffer for writing. You configure the format (channels, sample width, frame rate) then call writeframes(pcm) to embed the raw audio. The result is a valid WAV file in memory. |
setsampwidth(2) | 2 bytes per sample = 16-bit audio. winsound.PlaySound requires a valid WAV header — this is the matching value for pcm_22050 (signed 16-bit). |
winsound.PlaySound | Windows stdlib audio playback. SND_FILENAME = play from a file path. Blocks until playback completes — this is why it must run in a background thread. Running it in the async event loop would freeze all Discord interaction during playback. |
asyncio.run_coroutine_threadsafe | The standard thread-safe way to schedule a coroutine from a non-async (background) thread. Takes the coroutine and the running event loop. Returns a concurrent.futures.Future: call .result(timeout=15) from the background thread to wait for completion and catch upload errors. |
threading.Thread(daemon=True) | daemon=True means the thread dies automatically when the main process exits. Without this, audio playback threads would keep the process alive after bot.close() is called. |
tempfile.NamedTemporaryFile | Creates a named temporary file that winsound can reference by path. The delete=False is critical — winsound needs the file to stay on disk during playback. The finally: os.unlink() cleans it up afterwards. |
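The WAV-wrapping step can be tried without an ElevenLabs key. The sketch below substitutes a synthetic 440 Hz tone for the API response (an assumption made purely for illustration), wraps it with the same wave settings, and reads the header back to confirm the format; pcm_to_wav_bytes is a hypothetical helper name.

```python
import io
import math
import struct
import wave

RATE = 22050  # must match the sample rate of the PCM data

def pcm_to_wav_bytes(pcm: bytes, rate: int = RATE) -> bytes:
    """Wrap raw signed 16-bit mono PCM in a WAV container, in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wf.setframerate(rate)
        wf.writeframes(pcm)
    return buf.getvalue()

# Synthetic stand-in for the API response: 0.1 s of a 440 Hz tone,
# packed as little-endian signed 16-bit samples.
n_samples = RATE // 10
pcm = b"".join(
    struct.pack("<h", int(12000 * math.sin(2 * math.pi * 440 * i / RATE)))
    for i in range(n_samples)
)
wav_bytes = pcm_to_wav_bytes(pcm)

# Read the header back to confirm the container matches the PCM format.
with wave.open(io.BytesIO(wav_bytes), "rb") as rf:
    params = (rf.getnchannels(), rf.getsampwidth(),
              rf.getframerate(), rf.getnframes())
```

If the sample width or frame rate written here disagrees with the PCM data, playback still works but sounds sped up, slowed down, or like noise, so these three setters are worth checking first when audio sounds wrong.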
// REFERENCE — mrrobot.py, ElevenLabs PCM fetch + speak() (annotated)
import os, io, wave, threading, tempfile, asyncio
import httpx
import discord

_EL_API_KEY = os.environ.get("EL_API_KEY", "")
_EL_VOICE_ID = os.environ.get("EL_VOICE_ID", "")
_EL_RATE = 22050  # sample rate: must match the pcm_22050 endpoint

def _fetch_el_pcm(text):
    """Call ElevenLabs, return raw PCM bytes or None on failure."""
    if not _EL_API_KEY or not _EL_VOICE_ID:
        return None  # no credentials → silent fallback
    try:
        url = f"https://api.elevenlabs.io/v1/text-to-speech/{_EL_VOICE_ID}"
        hdrs = {"xi-api-key": _EL_API_KEY, "Content-Type": "application/json"}
        body = {
            "text": text[:5000],  # EL limit: 5000 chars per request
            "model_id": "eleven_multilingual_v2",
            "voice_settings": {
                "stability": 0.30,
                "similarity_boost": 0.80,
                "style": 0.55,
                "use_speaker_boost": True,
            },
        }
        r = httpx.post(url, json=body, headers=hdrs,
                       params={"output_format": "pcm_22050"}, timeout=30)
        return r.content if r.status_code == 200 else None  # return raw PCM or None
    except Exception:
        return None

def speak(text, channel=None, loop=None):
    def _run():
        pcm = _fetch_el_pcm(text)
        if pcm is None:
            return  # failed → silent, don't crash the thread
        buf = io.BytesIO()
        with wave.open(buf, 'wb') as wf:  # wrap PCM in WAV container
            wf.setnchannels(1)            # mono
            wf.setsampwidth(2)            # 16-bit = 2 bytes
            wf.setframerate(_EL_RATE)     # 22050 Hz
            wf.writeframes(pcm)
        tmp = tempfile.NamedTemporaryFile(
            prefix='vader_tts_', suffix='.wav', delete=False)  # delete=False: winsound needs the file
        tmp.write(buf.getvalue()); tmp.close()
        try:
            import winsound
            upload_fut = None
            if channel is not None and loop is not None:
                async def _upload():
                    await channel.send(
                        file=discord.File(tmp.name, filename="voice.wav"))
                upload_fut = asyncio.run_coroutine_threadsafe(_upload(), loop)  # schedule from thread
            winsound.PlaySound(tmp.name, winsound.SND_FILENAME)  # blocks until done
            if upload_fut is not None:
                try: upload_fut.result(timeout=15)  # wait for upload, max 15s
                except Exception: pass
        finally:
            try: os.unlink(tmp.name)  # clean up temp file
            except Exception: pass
    threading.Thread(target=_run, daemon=True).start()  # daemon: dies with main process
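The thread-to-loop handoff in speak() can be exercised in isolation. This is a minimal sketch using only the standard library (asyncio.to_thread assumes Python 3.9+); fake_upload and worker are hypothetical stand-ins for channel.send and _run.

```python
import asyncio
import threading

async def fake_upload(name: str) -> str:
    """Hypothetical stand-in for channel.send(file=...)."""
    await asyncio.sleep(0.05)
    return f"uploaded {name}"

def worker(loop: asyncio.AbstractEventLoop, results: list) -> None:
    # Runs in a plain thread, so 'await' is not available here.
    fut = asyncio.run_coroutine_threadsafe(fake_upload("voice.wav"), loop)
    try:
        results.append(fut.result(timeout=5))  # blocks this thread only
    except Exception as exc:
        results.append(f"failed: {exc!r}")

async def main() -> list:
    results = []
    loop = asyncio.get_running_loop()
    t = threading.Thread(target=worker, args=(loop, results), daemon=True)
    t.start()
    # Join from a helper thread so the loop stays free to run fake_upload.
    await asyncio.to_thread(t.join)
    return results

results = asyncio.run(main())
```

Note that fut.result() must be called from the background thread. Calling it on the event loop's own thread would block the loop that is supposed to run the coroutine, and the call would only return by timing out.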
// TASK — write _fetch_el_pcm() and speak()