Files
MurMur/README.md

110 lines
5.3 KiB
Markdown
Raw Permalink Normal View History

2025-08-15 15:04:45 +02:00
# MurMur — Audio Bridge / Transcribe / Translate
**MurMur** is a lightweight desktop app that **routes**, **transcribes**, **translates**, and **records** system audio in real time — all locally.
- **ASR (Automatic Speech Recognition) engine:** [`faster-whisper`](https://github.com/SYSTRAN/faster-whisper) driven by a local Whisper model for fast, memory-efficient transcription.
- **Live client:** “nearly-live” Whisper pipeline for continuous updates.
- **Translation:** uses Whispers built-in `translate=True` path (**to English**)
- **Audio bridge:** pick input/output devices, loop back desktop audio, add **virtual gain**, and capture to MP3 (via `ffmpeg`).
- **Bonus use-case:** route desktop audio into **Shazam for Mac** for highly reliable song ID (no mic/room noise).
![MurMur](murmur.png)
> **Note:** Earlier builds experimented with NLLB and SeamlessM4T for wider language translation; **current MurMur does _not_ use these**. Whisper **does** transcribe and translate nearly any language, but can only translate **to** English. In case you want to use other transcription or translation software (non-realtime), the recording-feature was implemented.
---
## Features
- **Two-pane UI:** Live transcript + optional live English translation.
- **Source language:** Auto-detect (Whisper) or manually set.
- **Audio routing:** Low-latency loopback with adjustable **virtual gain**.
- **Recording:** One-click capture to MP3.
---
## Requirements
- **Python** 3.9+
- **ffmpeg** on PATH (for MP3 export) — installable via Homebrew.
- **Virtual audio sink** (macOS) — install **one** of:
- **BlackHole** (open-source, zero-latency loopback).
- **Background Music** (adds a virtual device + per-app volume).
- **Soundflower** (classic virtual device).
- **Loopback** (commercial, flexible routing UI).
These create a virtual output/input so one apps audio can be captured by another (e.g., system output → MurMur input).
---
## Install
Create a venv and install dependencies:
```bash
python -m venv .venv && source .venv/bin/activate
python -m pip install --upgrade pip
pip install pywebview sounddevice huggingface_hub torch whisper-live numpy
```
- **[Pywebview](https://pywebview.flowrl.com/)**: lightweight native window hosting HTML/JS. ([GitHub](https://github.com/r0x0r/pywebview)
- **[python-sounddevice](https://python-sounddevice.readthedocs.io/)**: PortAudio bindings for low-latency I/O.
---
## Quick Start
1. Install and select a **virtual device** (e.g., set system output to **BlackHole 2ch**). [existential.audio](https://existential.audio/blackhole/)
2. Run the app:
```bash
python audiobridge-gui.py
```
3. In **Devices**:
- **Input** → the virtual device (captures desktop audio), or your mic.
- **Output** → your speakers/headphones (or another virtual sink or no output at all).
4. Click **Transcribe** → live captions appear; toggle **Translate** to get **English** translation output via Whisper.
5. Use the **Gain** slider for loopback level.
6. Tap the square **Record** button to capture a timestamped **MP3** (requires `ffmpeg`).
**Tip — [Shazam](https://apps.apple.com/ch/app/shazam-musikerkennung/id897118787) workflow:** set **system output = [BlackHole](https://existential.audio/blackhole/)**, **MurMur input = [BlackHole](https://existential.audio/blackhole/)**, then run **[Shazam for Mac](https://apps.apple.com/ch/app/shazam-musikerkennung/id897118787)** to ID whats playing on your desktop with no ambient noise.
---
## Engines & Model Notes
- **Transcription:** [`faster-whisper`](https://github.com/SYSTRAN/faster-whisper) — Whisper reimplementation using CTranslate2 for speed and lower memory usage. You can swap model sizes (e.g., small/base/large-v3) as needed. [GitHub](https://github.com/SYSTRAN/faster-whisper) [Hugging Face](https://huggingface.co/Systran/faster-whisper-large-v3)
- **Live pipeline:** WhisperLive-style continuous client for “near-real-time” updates. [GitHub](https://github.com/collabora/WhisperLive)
---
## Packaging (optional)
- macOS bundles can be built with **PyInstaller** (not included here).
- Include the provided **1024×1024 icon** and ensure `ffmpeg` is present on target systems (or ship a static build).
---
## Troubleshooting
- **No audio into MurMur:** ensure the **virtual device** is selected as both **system output** and **MurMur input**. For some drivers, restarting CoreAudio can help after install.
- **`sounddevice` errors:** ensure PortAudio is available/installed correctly.
- **Recording fails:** confirm `ffmpeg` is installed and on PATH.
---
## Privacy
All processing runs **locally**. Audio never leaves your machine unless you route it to cloud software yourself.
---
## Acknowledgements
- **[faster-whisper](https://github.com/SYSTRAN/faster-whisper)** by SYSTRAN.
- **[WhisperLive](https://github.com/collabora/WhisperLive)** (real-time pipeline inspiration).
- **[Pywebview](https://pywebview.flowrl.com/)** and **[python-sounddevice](https://python-sounddevice.readthedocs.io/)** communities.
- **[BlackHole](https://existential.audio/blackhole/)**, **[Background Music](https://github.com/kyleneideck/BackgroundMusic)**, **[Soundflower](https://rogueamoeba.com/freebies/soundflower/)**, **[Loopback](https://rogueamoeba.com/loopback/)** for virtual audio devices.