GLaDOS GUI — Local LLM Style Transfer + Real-Time TTS

GLaDOS GUI lets you type text, have a local LLM rewrite it in GLaDOS's voice, and hear it spoken immediately using a high-quality Piper TTS model. It streams the LLM output into the UI while synthesizing audio in real time, so you hear sentences as they form.

  • Left: editable input box
  • Center: image of GLaDOS; her eye glow follows audio loudness
  • Right: non-editable output (streamed from the LLM) with Copy, Play/Stop, and Save Audio buttons

Works fully offline with a local Ollama model and a local Piper voice.

Highlights

  • Real-time: streams LLM tokens, splits them into sentences, and synthesizes with Piper as they arrive
  • Low-latency audio: raw PCM straight to PortAudio via sounddevice
  • Eye glow: smoothed visual loudness indicator derived from the audio envelope
  • Persistent settings: remembers the Ollama URL, selected model, and Piper ONNX path
  • Simple launcher: run.sh creates a venv, installs deps, and launches the app

Quick Start

  1. Install system prerequisites
  • Install Python 3.10+ and ensure python3 is on your PATH
  • Install Ollama (and start it): https://ollama.com/
  • Install the Piper TTS voice model (ONNX) from Dave's Armoury (see TTS section below)
  2. Clone and run
  • macOS/Linux
    • chmod +x run.sh
    • ./run.sh
  • Or manually
    • python3 -m venv .venv && source .venv/bin/activate
    • pip install --upgrade pip
    • pip install -r requirements.txt
    • python glados_gui.py
  3. In the GUI
  • Model: pick your local Ollama model (recommendation below)
  • Ollama URL: default is http://localhost:11434/api/generate
  • Voice model: pick the glados_piper_medium.onnx file you downloaded
  • Type your text (left) → GLaDOSify → output streams (right) + audio plays
  • Best results: mistral3.2:24b
    • 32 GB of unified memory works well
    • 16 GB can work, but generation may be slower
  • If that tag isn't available on your system, fall back to another LLM. The app lists locally available models via Ollama.

Pull the model in Ollama (example):

  • ollama pull mistral3.2:24b

Ensure the Ollama service is running:

  • ollama serve
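To verify from Python that the server is reachable and see which models are installed, Ollama exposes a /api/tags endpoint. A minimal standard-library sketch (the app itself uses requests; the function names below are made up for illustration):

```python
import json
from urllib.request import urlopen

def extract_names(payload: dict) -> list[str]:
    # /api/tags responds with {"models": [{"name": "...", ...}, ...]}
    return [m["name"] for m in payload.get("models", [])]

def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Return the names of models available in the local Ollama instance."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return extract_names(json.load(resp))
```

If this raises a connection error, ollama serve is not running (or the URL is wrong).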

TTS Voice (Piper / ONNX)

  • Voice: GLaDOS Piper medium ONNX from Dave's Armoury
  • Credits: massive thanks to Dave's Armoury for the GLaDOS TTS model!

Piper CLI

  • The app expects the piper binary on your PATH
  • Installing piper-tts (Python package) provides the CLI on most setups
  • Alternatively, install Piper via your OS package manager if available

Running from the CLI (no GUI)

You can still use the original streaming script directly:

  • python glados_say_stream.py --stream -t "input text"

This streams rewritten text from Ollama and speaks it sentence by sentence with Piper.
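The sentence-by-sentence handoff can be sketched roughly like this (a simplified illustration of the technique, not the actual code in glados_say_stream.py; the function name and boundary regex are assumptions):

```python
import re

# "Safe" sentence boundaries: terminal punctuation followed by whitespace.
_BOUNDARY = re.compile(r'(?<=[.!?])\s+')

def pop_sentences(buffer: str) -> tuple[list[str], str]:
    """Split streamed text into complete sentences plus the unfinished tail.

    Complete sentences can be sent to the TTS engine immediately; the
    remainder is kept in the buffer until more tokens arrive.
    """
    parts = _BOUNDARY.split(buffer)
    return parts[:-1], parts[-1]

sentences, rest = pop_sentences("You monster. The cake is a lie! And yet")
# sentences == ["You monster.", "The cake is a lie!"], rest == "And yet"
```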

Files

  • glados_gui.py: Main GUI
  • glados_say_stream.py: Streaming LLM → TTS engine (also usable as a CLI)
  • speaker.svg: Speaker icon used in the GUI (can be replaced)
  • glados_head.png, glados_eye.png: Center visuals (eye opacity tracks loudness)
  • run.sh: One-shot launcher; creates a venv, installs deps, runs the GUI
  • requirements.txt: Python dependencies for the GUI + streaming

Configuration

The GUI writes a small config.json next to the app with:

  • ollama_url: Ollama HTTP endpoint (default http://localhost:11434/api/generate)
  • ollama_model: last selected model name
  • piper_model: last selected Piper ONNX path

You can change these at any time in the GUI; the app remembers them.
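For reference, a config.json with all three keys might look like this (the model name and path are illustrative):

```json
{
  "ollama_url": "http://localhost:11434/api/generate",
  "ollama_model": "mistral3.2:24b",
  "piper_model": "/home/you/models/glados_piper_medium.onnx"
}
```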

How it works (tech overview)

  • Prompt: The app uses a carefully crafted “GLaDOS style transfer” prompt. Your input is rewritten; no extra narration or meta text is added.
  • Streaming: The GUI posts to Ollama's /api/generate with stream=true. Chunks are filtered to drop any hidden <think> blocks (in case you are using a reasoning model) and appended to the right textbox.
  • Sentence splitting: As streamed text arrives, it's split at safe sentence boundaries; each complete sentence is sent to Piper immediately.
  • Piper: A single Piper process runs during the session. The GUI writes sentences to Piper's stdin and plays the raw PCM via sounddevice in real time. Audio is also mirrored to a WAV file so you can Save Audio at the end.
  • Eye flicker: The audio reader computes an RMS envelope and calls back into the GUI. Opacity is smoothed and slightly delayed to match perceived audio timing.
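The RMS-plus-smoothing idea can be sketched like this (an illustrative approximation, not the GUI's actual code; the gain and smoothing factor are made-up values):

```python
import numpy as np

SMOOTHING = 0.2  # made-up exponential smoothing factor (0..1)

def rms(block: np.ndarray) -> float:
    """Root-mean-square level of one block of float samples."""
    return float(np.sqrt(np.mean(np.square(block))))

class EyeGlow:
    """Smooths raw RMS values so the eye opacity does not flicker harshly."""

    def __init__(self) -> None:
        self.level = 0.0

    def update(self, block: np.ndarray) -> float:
        # Map RMS to a 0..1 opacity target (arbitrary gain, clamped),
        # then move the current level a fraction of the way toward it.
        target = min(1.0, rms(block) * 4.0)
        self.level += SMOOTHING * (target - self.level)
        return self.level
```

Each audio block nudges the opacity toward the new loudness instead of jumping, which is what gives the eye its gradual glow and fade.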

Fonts / Visuals

  • UI typeface: Lucida Console (please install it locally for best results).
  • Colors: black background (#000000), amber text (#e1a101).
  • Borders: dashed “retro” borders around textboxes and control buttons.

Dependencies

Core Python packages (see requirements.txt):

  • requests: Ollama HTTP client
  • sounddevice: PortAudio bridge for low-latency playback
  • numpy: audio level calculations (RMS)
  • Pillow: image scaling and eye opacity levels; icon rendering
  • cairosvg (optional): renders speaker.svg to a crisp PNG at UI size
  • piper-tts: provides the piper CLI (or install Piper separately)

Non-Python:

  • piper CLI on your PATH (see the Piper CLI section above)
  • A running Ollama server

Usage Tips

  • If audio starts clipping, reduce the Piper length/noise parameters (the GUI uses sensible defaults).
  • If the eye glow feels out of sync, small variations in audio driver buffering can be compensated for by adjusting the internal smoothing/delay values (ask for help in the issues).

Credits, Trademarks & Attribution

  • Portal and GLaDOS are properties/trademarks of Valve Corporation. Portal is a registered trademark of Valve. See: https://www.valvesoftware.com/
  • The GLaDOS image shown in the center panel is a screenshot of an in-game model from the Portal series by Valve and is used here for demonstration/fan purposes. No ownership is claimed.
  • The AI voice used by this app was not created by the author of this GUI. The Piper ONNX model was created by Dave's Armoury using audio from the Portal games by Valve. Download: https://huggingface.co/DavesArmoury/GLaDOS_TTS
  • This project is a fan-made tool intended for personal/educational use and is not affiliated with, endorsed by, or sponsored by Valve Corporation.

Troubleshooting

  • “piper not found”: ensure the piper CLI is on PATH. Installing piper-tts via pip often provides it; otherwise install Piper for your OS.
  • “No audio device”: some systems need sounddevice to be configured with a valid output device; use the CLI --list-devices to find indices.
  • “Ollama connection error”: make sure ollama serve is running and that the URL in the GUI is correct. Pull the model you want with ollama pull ... first.
  • Icon looks blurry: install cairosvg so SVG is rendered to the exact UI size (or drop a tuned speaker.png).