GLaDOS GUI — Local LLM Style Transfer + Real-Time TTS

GLaDOS GUI lets you type text, have a local LLM rewrite it in GLaDOS's voice, and hear it spoken immediately using a high-quality Piper TTS model. It streams the LLM output into the UI while synthesizing audio in real time, so you hear sentences as they form.

  • Left: editable input box
  • Center: image of GLaDOS; her eye glow follows audio loudness
  • Right: non-editable output (streamed from the LLM) with Copy, Play/Stop, and Save Audio buttons

Works fully offline with a local Ollama model and a local Piper voice.

Highlights

  • Real-time: streams LLM tokens, splits them into sentences, and synthesizes with Piper as they arrive
  • Low-latency audio: raw PCM straight to PortAudio via sounddevice
  • Eye glow: smoothed visual loudness indicator derived from the audio envelope
  • Persistent settings: remembers the Ollama URL, selected model, and Piper ONNX path
  • Simple launcher: run.sh creates a venv, installs deps, and launches the app

Quick Start

  1. Install system prerequisites
  • Install Python 3.10+ and ensure python3 is on your PATH
  • Install Ollama (and start it): https://ollama.com/
  • Install the Piper TTS voice model (ONNX) from Dave's Armoury (see TTS section below)
  2. Clone and run
  • macOS/Linux
    • chmod +x run.sh
    • ./run.sh
  • Or manually
    • python3 -m venv .venv && source .venv/bin/activate
    • pip install --upgrade pip
    • pip install -r requirements.txt
    • python glados_gui.py
  3. In the GUI
  • Model: pick your local Ollama model (recommendation below)
  • Ollama URL: default is http://localhost:11434/api/generate
  • Voice model: pick the glados_piper_medium.onnx file you downloaded
  • Type your text (left) → GLaDOSify → output streams (right) + audio plays
  • Best results: mistral3.2:24b
    • 32 GB of unified memory works well
    • 16 GB can work, but generation may be slower
  • If that tag isn't available on your system, fall back to another LLM. The app lists locally available models via Ollama.

Pull the model in Ollama (example):

  • ollama pull mistral3.2:24b

Ensure the Ollama service is running:

  • ollama serve
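To verify from Python that the server is reachable and see which models are installed, Ollama exposes a /api/tags endpoint. A minimal standard-library sketch (the app itself uses requests; the function names below are made up for illustration):

```python
import json
from urllib.request import urlopen

def extract_names(payload: dict) -> list[str]:
    # /api/tags responds with {"models": [{"name": "...", ...}, ...]}
    return [m["name"] for m in payload.get("models", [])]

def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Return the names of models available in the local Ollama instance."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return extract_names(json.load(resp))
```

If this raises a connection error, ollama serve is not running (or the URL is wrong).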

TTS Voice (Piper / ONNX)

  • Voice: GLaDOS Piper medium ONNX from Dave's Armoury
  • Credits: massive thanks to Dave's Armoury for the GLaDOS TTS model!

Piper CLI

  • The app expects the piper binary on your PATH
  • Installing piper-tts (Python package) provides the CLI on most setups
  • Alternatively, install Piper via your OS package manager if available

Running from the CLI (no GUI)

You can still use the original streaming script directly:

  • python glados_say_stream.py --stream -t "input text"

This streams rewritten text from Ollama and speaks it sentence by sentence with Piper.
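The sentence-by-sentence handoff can be sketched roughly like this (a simplified illustration of the technique, not the actual code in glados_say_stream.py; the function name and boundary regex are assumptions):

```python
import re

# "Safe" sentence boundaries: terminal punctuation followed by whitespace.
_BOUNDARY = re.compile(r'(?<=[.!?])\s+')

def pop_sentences(buffer: str) -> tuple[list[str], str]:
    """Split streamed text into complete sentences plus the unfinished tail.

    Complete sentences can be sent to the TTS engine immediately; the
    remainder is kept in the buffer until more tokens arrive.
    """
    parts = _BOUNDARY.split(buffer)
    return parts[:-1], parts[-1]

sentences, rest = pop_sentences("You monster. The cake is a lie! And yet")
# sentences == ["You monster.", "The cake is a lie!"], rest == "And yet"
```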

Files

  • glados_gui.py: Main GUI
  • glados_say_stream.py: Streaming LLM → TTS engine (also usable as a CLI)
  • speaker.svg: Speaker icon used in the GUI (can be replaced)
  • glados_head.png, glados_eye.png: Center visuals (eye opacity tracks loudness)
  • run.sh: One-shot launcher; creates a venv, installs deps, runs the GUI
  • requirements.txt: Python dependencies for the GUI + streaming

Configuration

The GUI writes a small config.json next to the app with:

  • ollama_url: Ollama HTTP endpoint (default http://localhost:11434/api/generate)
  • ollama_model: last selected model name
  • piper_model: last selected Piper ONNX path

You can change these at any time in the GUI; the app remembers them.
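For reference, a config.json with all three keys might look like this (the model name and path are illustrative):

```json
{
  "ollama_url": "http://localhost:11434/api/generate",
  "ollama_model": "mistral3.2:24b",
  "piper_model": "/home/you/models/glados_piper_medium.onnx"
}
```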

How it works (tech overview)

  • Prompt: The app uses a carefully crafted “GLaDOS style transfer” prompt. Your input is rewritten; no extra narration or meta text is added.
  • Streaming: The GUI posts to Ollama's /api/generate with stream=true. Chunks are filtered to drop any hidden <think> blocks (in case you are using a reasoning model) and appended to the right textbox.
  • Sentence splitting: As streamed text arrives, it's split at safe sentence boundaries; each complete sentence is sent to Piper immediately.
  • Piper: A single Piper process runs during the session. The GUI writes sentences to Piper's stdin and plays the raw PCM via sounddevice in real time. Audio is also mirrored to a WAV file so you can Save Audio at the end.
  • Eye flicker: The audio reader computes an RMS envelope and calls back into the GUI. Opacity is smoothed and slightly delayed to match perceived audio timing.
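The RMS-plus-smoothing idea can be sketched like this (an illustrative approximation, not the GUI's actual code; the gain and smoothing factor are made-up values):

```python
import numpy as np

SMOOTHING = 0.2  # made-up exponential smoothing factor (0..1)

def rms(block: np.ndarray) -> float:
    """Root-mean-square level of one block of float samples."""
    return float(np.sqrt(np.mean(np.square(block))))

class EyeGlow:
    """Smooths raw RMS values so the eye opacity does not flicker harshly."""

    def __init__(self) -> None:
        self.level = 0.0

    def update(self, block: np.ndarray) -> float:
        # Map RMS to a 0..1 opacity target (arbitrary gain, clamped),
        # then move the current level a fraction of the way toward it.
        target = min(1.0, rms(block) * 4.0)
        self.level += SMOOTHING * (target - self.level)
        return self.level
```

Each audio block nudges the opacity toward the new loudness instead of jumping, which is what gives the eye its gradual glow and fade.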

Fonts / Visuals

  • UI typeface: Lucida Console (please install it locally for best results).
  • Colors: black background (#000000), amber text (#e1a101).
  • Borders: dashed “retro” borders around textboxes and control buttons.

Dependencies

Core Python packages (see requirements.txt):

  • requests: Ollama HTTP client
  • sounddevice: PortAudio bridge for low-latency playback
  • numpy: audio level calculations (RMS)
  • Pillow: image scaling and eye opacity levels; icon rendering
  • cairosvg (optional): renders speaker.svg to a crisp PNG at UI size
  • piper-tts: provides the piper CLI (or install Piper separately)

Non-Python:

  • piper CLI on your PATH (see the Piper CLI section above)
  • A running Ollama server

Usage Tips

  • If audio starts clipping, reduce the Piper length/noise parameters (the GUI uses sensible defaults).
  • If the eye glow feels out of sync, small variations in audio driver buffering can be compensated for by adjusting the internal smoothing/delay values (ask for help in the issues).

Credits, Trademarks & Attribution

  • Portal and GLaDOS are properties/trademarks of Valve Corporation. Portal is a registered trademark of Valve. See: https://www.valvesoftware.com/
  • The GLaDOS image shown in the center panel is a screenshot of an in-game model from the Portal series by Valve and is used here for demonstration/fan purposes. No ownership is claimed.
  • The AI voice used by this app was not created by the author of this GUI. The Piper ONNX model was created by Dave's Armoury using audio from the Portal games by Valve. Download: https://huggingface.co/DavesArmoury/GLaDOS_TTS
  • This project is a fan-made tool intended for personal/educational use and is not affiliated with, endorsed by, or sponsored by Valve Corporation.

Troubleshooting

  • “piper not found”: ensure the piper CLI is on PATH. Installing piper-tts via pip often provides it; otherwise install Piper for your OS.
  • “No audio device”: some systems need sounddevice to be configured with a valid output device; use the CLI --list-devices to find indices.
  • “Ollama connection error”: make sure ollama serve is running and that the URL in the GUI is correct. Pull the model you want with ollama pull ... first.
  • Icon looks blurry: install cairosvg so SVG is rendered to the exact UI size (or drop a tuned speaker.png).