A small CLI wrapper around the Kokoro TTS pipeline for Japanese speech synthesis. It splits long text at punctuation so the model keeps a natural pace, then writes the result as 24 kHz audio with soundfile.
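To illustrate the chunking idea, here is a minimal sketch of splitting text after punctuation and packing the pieces into size-limited chunks. It is not the code in kokoro_ja.py: the function name chunk_text, the max_chars limit, and the exact punctuation set are assumptions made for this example.

```python
import re

# Zero-width split point right after Japanese or ASCII punctuation.
# The punctuation set is an assumption for this sketch.
_BREAKS = re.compile(r"(?<=[。！？!?、,])")


def chunk_text(text: str, max_chars: int = 80) -> list[str]:
    """Split text after punctuation, then pack pieces into chunks.

    Each chunk holds as many consecutive pieces as fit in max_chars.
    A single piece longer than max_chars is kept whole rather than cut
    mid-phrase.
    """
    pieces = [p for p in _BREAKS.split(text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for piece in pieces:
        if current and len(current) + len(piece) > max_chars:
            chunks.append(current.strip())
            current = piece
        else:
            current += piece
    if current.strip():
        chunks.append(current.strip())
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("こんにちは、ココロです。", max_chars=8):
        print(chunk)
```

Each chunk would then be synthesized separately and the audio segments concatenated before writing the file.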

Quick start

Requires Python 3.11+ with pip.
Run ./run.sh "こんにちは、ココロです。" to create a .venv, install dependencies, download UniDic (first run only), and generate out.wav.

Direct usage

source .venv/bin/activate              # or your own environment
# --voice: e.g. jf_alpha, jf_tebukuro, jm_kumo
# --speed: >1 faster, <1 slower
python kokoro_ja.py "テストです。" \
  --voice jf_alpha \
  --speed 1.0 \
  --out test_ja.wav
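The same invocation can be driven from another Python program. The sketch below only assembles the argument list from the flags shown above (--voice, --speed, --out). The helper name build_command and its default values are hypothetical, and it assumes kokoro_ja.py is in the current directory.

```python
import subprocess
import sys


def build_command(text: str, voice: str = "jf_alpha",
                  speed: float = 1.0, out: str = "test_ja.wav") -> list[str]:
    """Build the argv list for kokoro_ja.py (hypothetical helper).

    Passing a list instead of a shell string avoids quoting problems
    with Japanese text and punctuation.
    """
    return [
        sys.executable, "kokoro_ja.py", text,
        "--voice", voice,
        "--speed", str(speed),
        "--out", out,
    ]


if __name__ == "__main__":
    cmd = build_command("テストです。", voice="jm_kumo", speed=1.2)
    # check=True raises CalledProcessError if the script exits non-zero.
    subprocess.run(cmd, check=True)
```

Using sys.executable keeps the call inside whichever environment is currently active, which matters when the dependencies live in .venv.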

Notes

Output is mono 24 kHz WAV.
UniDic is only downloaded if not already present (handled inside run.sh).
PYTORCH_ENABLE_MPS_FALLBACK=1 is set in run.sh so that PyTorch operations not supported by the macOS MPS (GPU) backend fall back to the CPU; adjust as needed on other platforms.
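If you skip run.sh and use your own environment, the two setup notes above could be reproduced in Python roughly as follows. This is a sketch, not what run.sh does internally: the helper names are hypothetical, and treating a dicrc file as the "dictionary is present" marker is an assumption (MeCab dictionaries ship one, but run.sh may check something else).

```python
import os
import subprocess
import sys
from collections.abc import MutableMapping
from pathlib import Path


def enable_mps_fallback(env: MutableMapping[str, str] = os.environ) -> None:
    """Set the fallback flag unless the user already chose a value.

    Call this before importing torch so the setting is in place when
    PyTorch starts up.
    """
    env.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


def unidic_present(dicdir: Path) -> bool:
    """Assumed marker: the dictionary directory contains a dicrc file."""
    return (Path(dicdir) / "dicrc").is_file()


def ensure_unidic(dicdir: Path) -> bool:
    """Download UniDic only when missing. Returns True if it downloaded."""
    if unidic_present(dicdir):
        return False
    # Download command provided by the unidic package.
    subprocess.run([sys.executable, "-m", "unidic", "download"], check=True)
    return True
```

With the unidic package installed, the directory to pass is typically available as unidic.DICDIR; verify this against the version you have installed.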