MurMur/README.md
2025-08-15 15:04:45 +02:00

# MurMur — Audio Bridge / Transcribe / Translate
**MurMur** is a lightweight desktop app that **routes**, **transcribes**, **translates**, and **records** system audio in real time — all locally.
- **ASR (Automatic Speech Recognition) engine:** [`faster-whisper`](https://github.com/SYSTRAN/faster-whisper) driven by a local Whisper model for fast, memory-efficient transcription.
- **Live client:** “nearly-live” Whisper pipeline for continuous updates.
- **Translation:** uses Whisper's built-in `translate=True` path (**to English** only).
- **Audio bridge:** pick input/output devices, loop back desktop audio, add **virtual gain**, and capture to MP3 (via `ffmpeg`).
- **Bonus use-case:** route desktop audio into **Shazam for Mac** for highly reliable song ID (no mic/room noise).

![MurMur](murmur.png)

> **Note:** Earlier builds experimented with NLLB and SeamlessM4T for wider language translation; **current MurMur does _not_ use these**. Whisper can transcribe many languages but can only translate **into** English. The recording feature exists so that you can capture audio and pass it to other (non-realtime) transcription or translation software.
---
## Features
- **Two-pane UI:** Live transcript + optional live English translation.
- **Source language:** Auto-detect (Whisper) or manually set.
- **Audio routing:** Low-latency loopback with adjustable **virtual gain**.
- **Recording:** One-click capture to MP3.
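
As a rough illustration of what a **virtual gain** stage does, the sketch below scales a block of samples and clips the result so that boosted audio cannot exceed full scale. It assumes float32 samples in the range [-1.0, 1.0] and a gain expressed in decibels; the `apply_gain` name and the dB scale are hypothetical and not taken from MurMur's source. In a loopback setup a function like this would typically run on each captured block before it is written to the output device.

```python
import numpy as np


def apply_gain(block: np.ndarray, gain_db: float) -> np.ndarray:
    """Scale an audio block by gain_db decibels and clip to [-1.0, 1.0].

    Hypothetical helper: MurMur's actual gain scale is not documented here.
    """
    factor = 10.0 ** (gain_db / 20.0)  # convert decibels to a linear factor
    boosted = block.astype(np.float32) * factor
    # Clip so that boosted samples never exceed full scale.
    return np.clip(boosted, -1.0, 1.0).astype(np.float32)


if __name__ == "__main__":
    samples = np.array([0.1, -0.5, 0.9], dtype=np.float32)
    print(apply_gain(samples, 6.0))  # roughly doubles the amplitude, then clips
```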
---
## Requirements
- **Python** 3.9+
- **ffmpeg** on PATH (for MP3 export) — installable via Homebrew.
- **Virtual audio sink** (macOS) — install **one** of:
  - **BlackHole** (open-source, zero-latency loopback).
  - **Background Music** (adds a virtual device + per-app volume).
  - **Soundflower** (classic virtual device).
  - **Loopback** (commercial, flexible routing UI).

These create a virtual output/input so one app's audio can be captured by another (e.g., system output → MurMur input).

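
To confirm the `ffmpeg` requirement before launching the app, a small check like the one below can help. It only uses the Python standard library; `find_tool` and `ffmpeg_status` are hypothetical helper names and are not part of MurMur.

```python
import shutil
from typing import Optional


def find_tool(name: str = "ffmpeg") -> Optional[str]:
    """Return the absolute path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def ffmpeg_status() -> str:
    """Describe whether ffmpeg can be found on PATH."""
    path = find_tool("ffmpeg")
    return f"ffmpeg found at {path}" if path else "ffmpeg not found on PATH"


if __name__ == "__main__":
    print(ffmpeg_status())
```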
---
## Install
Create a venv and install dependencies:
```bash
python -m venv .venv && source .venv/bin/activate
python -m pip install --upgrade pip
pip install pywebview sounddevice huggingface_hub torch whisper-live numpy
```
- **[Pywebview](https://pywebview.flowrl.com/)**: lightweight native window hosting HTML/JS. ([GitHub](https://github.com/r0x0r/pywebview))
- **[python-sounddevice](https://python-sounddevice.readthedocs.io/)**: PortAudio bindings for low-latency I/O.
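
After installing, one way to confirm that the packages landed in the active venv is to look up their import names without importing them. This is a sketch, not part of MurMur: `missing_modules` is a hypothetical helper, and the list uses import names rather than pip names (`pywebview` is imported as `webview`, `whisper-live` as `whisper_live`).

```python
import importlib.util

# Import names for the packages from the pip command above.
MODULES = ["webview", "sounddevice", "huggingface_hub", "torch", "whisper_live", "numpy"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(MODULES)
    print("All dependencies found" if not missing else f"Missing: {', '.join(missing)}")
```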
---
## Quick Start
1. Install and select a **virtual device** (e.g., set system output to **BlackHole 2ch**). [existential.audio](https://existential.audio/blackhole/)
2. Run the app:
```bash
python audiobridge-gui.py
```
3. In **Devices**:
   - **Input** → the virtual device (captures desktop audio), or your mic.
   - **Output** → your speakers/headphones (or another virtual sink or no output at all).
4. Click **Transcribe** → live captions appear; toggle **Translate** to get **English** translation output via Whisper.
5. Use the **Gain** slider for loopback level.
6. Tap the square **Record** button to capture a timestamped **MP3** (requires `ffmpeg`).

**Tip ([Shazam](https://apps.apple.com/ch/app/shazam-musikerkennung/id897118787) workflow):** set **system output = [BlackHole](https://existential.audio/blackhole/)** and **MurMur input = [BlackHole](https://existential.audio/blackhole/)**, then run **[Shazam for Mac](https://apps.apple.com/ch/app/shazam-musikerkennung/id897118787)** to ID what's playing on your desktop with no ambient noise.

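
For readers curious how a timestamped MP3 capture through `ffmpeg` might work, the sketch below builds an `ffmpeg` command that reads raw float32 PCM from standard input and encodes it with `libmp3lame`. The file name pattern, sample rate, channel count, and bitrate are assumptions chosen for illustration; MurMur's actual recorder may differ. `start_encoder` would need `ffmpeg` on PATH, and captured blocks would be written to the returned process's `stdin`.

```python
import subprocess
from datetime import datetime


def mp3_filename(now: datetime) -> str:
    """Build a timestamped file name (hypothetical naming scheme)."""
    return now.strftime("murmur_%Y%m%d_%H%M%S.mp3")


def ffmpeg_mp3_command(out_path, sample_rate=48000, channels=2, bitrate="192k"):
    """Return an ffmpeg argument list that encodes raw PCM from stdin to MP3."""
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-f", "f32le",            # input format: raw 32-bit float, little-endian
        "-ar", str(sample_rate),  # input sample rate in Hz
        "-ac", str(channels),     # input channel count
        "-i", "pipe:0",           # read the audio from standard input
        "-codec:a", "libmp3lame",
        "-b:a", bitrate,
        out_path,
    ]


def start_encoder(out_path):
    """Start ffmpeg; write raw sample bytes to the returned process's stdin."""
    return subprocess.Popen(ffmpeg_mp3_command(out_path), stdin=subprocess.PIPE)
```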
---
## Engines & Model Notes
- **Transcription:** [`faster-whisper`](https://github.com/SYSTRAN/faster-whisper) — Whisper reimplementation using CTranslate2 for speed and lower memory usage. You can swap model sizes (e.g., small/base/large-v3) as needed. [GitHub](https://github.com/SYSTRAN/faster-whisper) [Hugging Face](https://huggingface.co/Systran/faster-whisper-large-v3)
- **Live pipeline:** WhisperLive-style continuous client for “near-real-time” updates. [GitHub](https://github.com/collabora/WhisperLive)
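
For offline use, for example on a recording made with MurMur, `faster-whisper` can also be called directly. The sketch below uses its file-based API with `task="translate"` to produce English output; it is not MurMur's live pipeline, and the `translate_file` and `format_segment` helpers, the `small` model size, and the CPU/int8 settings are assumptions for illustration. The model is downloaded on first use.

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Format one segment as '[start -> end] text'."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def translate_file(path: str, model_size: str = "small"):
    """Translate the speech in an audio file to English (hypothetical helper)."""
    # Imported lazily so format_segment works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # task="translate" asks Whisper for English output; the source language
    # is auto-detected because no language argument is passed.
    segments, info = model.transcribe(path, task="translate")
    print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
    # `segments` is a generator; iterating over it runs the actual decoding.
    return [format_segment(s.start, s.end, s.text) for s in segments]
```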
---
## Packaging (optional)
- macOS bundles can be built with **PyInstaller** (not included here).
- Include the provided **1024×1024 icon** and ensure `ffmpeg` is present on target systems (or ship a static build).
---
## Troubleshooting
- **No audio into MurMur:** ensure the **virtual device** is selected as both **system output** and **MurMur input**. For some drivers, restarting CoreAudio can help after install.
- **`sounddevice` errors:** ensure PortAudio is available/installed correctly.
- **Recording fails:** confirm `ffmpeg` is installed and on PATH.
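
When no audio reaches MurMur, it can help to check that PortAudio actually sees the virtual device. The sketch below filters the device list returned by `sounddevice.query_devices()` by name and by available channels. `find_devices` and `find_live_devices` are hypothetical helpers, and the `BlackHole` keyword is only an example; substitute the name of whichever virtual device you installed.

```python
def find_devices(devices, keyword, need_input=True):
    """Return (index, name) pairs for devices whose name contains `keyword`.

    `devices` is a sequence of dicts shaped like sounddevice.query_devices()
    entries, with 'name', 'max_input_channels' and 'max_output_channels' keys.
    """
    keyword = keyword.lower()
    channel_key = "max_input_channels" if need_input else "max_output_channels"
    return [
        (index, dev["name"])
        for index, dev in enumerate(devices)
        if keyword in dev["name"].lower() and dev[channel_key] > 0
    ]


def find_live_devices(keyword="BlackHole", need_input=True):
    """Query PortAudio through sounddevice and filter the result."""
    # Imported lazily so find_devices can be used without PortAudio installed.
    import sounddevice as sd

    return find_devices(sd.query_devices(), keyword, need_input)
```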
---
## Privacy
All processing runs **locally**. Audio never leaves your machine unless you route it to cloud software yourself.

---
## Acknowledgements
- **[faster-whisper](https://github.com/SYSTRAN/faster-whisper)** by SYSTRAN.
- **[WhisperLive](https://github.com/collabora/WhisperLive)** (real-time pipeline inspiration).
- **[Pywebview](https://pywebview.flowrl.com/)** and **[python-sounddevice](https://python-sounddevice.readthedocs.io/)** communities.
- **[BlackHole](https://existential.audio/blackhole/)**, **[Background Music](https://github.com/kyleneideck/BackgroundMusic)**, **[Soundflower](https://rogueamoeba.com/freebies/soundflower/)**, **[Loopback](https://rogueamoeba.com/loopback/)** for virtual audio devices.