Prepare publishable Tauri app

2026-05-04 10:13:03 +02:00
parent dd135a6089
commit ff66d0aea3
20 changed files with 6386 additions and 159 deletions

.gitignore (new file, 44 lines)

@@ -0,0 +1,44 @@
venv
.venv
sync.sh
summaries.db
.env
node_modules
data
package-lock.json
dump_summaries.sql
.DS_Store
.git
__pycache__
.mypy_cache
.pytest_cache
dist
build
out
.next
.nuxt
.turbo
.parcel-cache
.cache
target
src-tauri/target
src-tauri/gen
coverage
logs
tmp
temp
output
tmp*
*.log
*.tmp
*.swp
*.lock
*.exe
Cargo.lock
!src-tauri/Cargo.lock
src-tauri/resources/backend/*
!src-tauri/resources/backend/.gitkeep
src-tauri/resources/ffmpeg/*
!src-tauri/resources/ffmpeg/.gitkeep

MIGRATION_NOTES.md (new file, 26 lines)

@@ -0,0 +1,26 @@
# Migration Notes
## What Was Preserved
- Static frontend design from `ui/index.html`, including the rose color palette, compact header form, list layout, collapsed summary previews and pagination.
- Frontend behavior from `ui/renderer.js`: model loading, local UI preferences, Whisper toggle, auto-translation toggle, per-entry language tabs, delete confirmation, progress updates, expandable summaries and thumbnail external links.
- Tauri command surface: `get_models`, `get_summaries`, `summarize_video`, `delete_summary`, `translate_summary`, `open_external` and `open_file`.
- Local runtime model: SQLite history in the OS app local data directory, media under that data directory, Ollama on `localhost:11434`, and Python helpers for YouTube metadata, transcripts, Whisper, summaries and translation.
- Release bundling path: a PyInstaller-built backend sidecar plus copied `ffmpeg` and `ffprobe` resources under `src-tauri/resources`.
## Electron Reality Check
No active Electron app was present in the source snapshot used for this migration. There was no Electron main process, preload script, `ipcMain`/`ipcRenderer` bridge, `BrowserWindow` setup, `package.json` or Electron build configuration. The working desktop shell was already Tauri 2, so this folder packages that actual implementation as a standalone Tauri project rather than inventing behavior from missing Electron files.
## Important Runtime Details
- The Tauri identifier remains `com.victorgiers.youtube-summarizer` so OS-level app data and history stay aligned with the existing app identity.
- `run.sh` and `run.bat` now change into this folder before creating the Python environment or launching Cargo.
- The frontend still uses `window.__TAURI__` because `withGlobalTauri` is enabled in `src-tauri/tauri.conf.json`.
- Development falls back to local Python scripts when no bundled backend sidecar exists.
## Imported Legacy Data
- The old Electron database from `/Users/giers/Tools/victors-tools/youtube_summarizer/summaries.db` was copied into the Tauri runtime data directory at `/Users/giers/Library/Application Support/com.victorgiers.youtube-summarizer/summaries.db`.
- A copy also exists at `summaries.db` in this folder for local migration reference.
- Thumbnail files from the old `data/` folder were copied so historical entries keep their images. Audio and transcript files were not copied because the Tauri runtime clears those artifact references on startup.
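
The note about cleared artifact references can be illustrated with a small sketch. It assumes a reduced `summaries` table holding only the `id`, `audio` and `transcript` columns that appear in the cleanup query of this commit, plus a hypothetical `thumbnail` column; the `UPDATE` statement and the function name `purge_artifact_references` are assumptions for illustration, not the app's actual Rust implementation, and the sketch only clears the references rather than deleting files.

```python
import sqlite3


def purge_artifact_references(conn: sqlite3.Connection) -> int:
    """Null out stale audio/transcript references and return the affected row count."""
    cursor = conn.execute(
        "UPDATE summaries SET audio = NULL, transcript = NULL "
        "WHERE audio IS NOT NULL OR transcript IS NOT NULL"
    )
    conn.commit()
    return cursor.rowcount


# Demo against an in-memory database with a reduced, assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE summaries ("
    "id INTEGER PRIMARY KEY, audio TEXT, transcript TEXT, thumbnail TEXT)"
)
conn.executemany(
    "INSERT INTO summaries (audio, transcript, thumbnail) VALUES (?, ?, ?)",
    [
        ("a1.mp3", "t1.txt", "thumb1.jpg"),
        (None, "t2.txt", "thumb2.jpg"),
        (None, None, "thumb3.jpg"),
    ],
)

cleared = purge_artifact_references(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM summaries "
    "WHERE audio IS NOT NULL OR transcript IS NOT NULL"
).fetchone()[0]
thumbnails = [
    row[0] for row in conn.execute("SELECT thumbnail FROM summaries ORDER BY id")
]
```

Thumbnail references are left untouched in this sketch, which matches the stated reason for copying only the thumbnail files.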

View File

@@ -1,4 +1,4 @@
# YouTube Summarizer
# YouTube Summarizer Tauri
This is a local-first desktop app for summarizing YouTube videos with Ollama.
@@ -9,6 +9,10 @@ It uses:
- Ollama on `localhost` for summarization and translation
- SQLite for local history
## Migration State
This folder is the standalone Tauri version of the app. The repository snapshot this was created from did not contain an active Electron runtime, `package.json`, preload script or Electron main process; the actual app behavior was already represented by a static HTML/CSS/JS frontend, a Tauri 2 Rust shell and Python backend helpers. The migration work here keeps that behavior and design intact inside `ytsummarizer_tauri` so it can be built and run without depending on files outside this folder.
## What It Does
Given a YouTube URL, the app can:
@@ -48,7 +52,7 @@ For development in this repo you still need:
- FFmpeg in `PATH`
- Ollama running locally on `http://localhost:11434`
Python dependencies are listed in [requirements.txt](/Users/giers/youtube_summarizer/requirements.txt).
Python dependencies are listed in [requirements.txt](requirements.txt).
## Run In Development
@@ -73,7 +77,7 @@ pip install -r requirements.txt
cargo run --manifest-path src-tauri/Cargo.toml
```
The app prefers a bundled backend executable when one is present under [src-tauri/resources/backend](/Users/giers/youtube_summarizer/src-tauri/resources/backend), and otherwise falls back to the local Python environment for development.
The app prefers a bundled backend executable when one is present under [src-tauri/resources/backend](src-tauri/resources/backend), and otherwise falls back to the local Python environment for development.
## Build A Shippable Bundle
@@ -93,21 +97,21 @@ cargo tauri build
What `tools/prepare_bundle.py` does:
- installs PyInstaller into the current Python environment
- builds a single-file backend executable from [backend_cli.py](/Users/giers/youtube_summarizer/backend_cli.py)
- copies that executable into [src-tauri/resources/backend](/Users/giers/youtube_summarizer/src-tauri/resources/backend)
- copies `ffmpeg` and `ffprobe` from the build machine into [src-tauri/resources/ffmpeg](/Users/giers/youtube_summarizer/src-tauri/resources/ffmpeg)
- builds a single-file backend executable from [backend_cli.py](backend_cli.py)
- copies that executable into [src-tauri/resources/backend](src-tauri/resources/backend)
- copies `ffmpeg` and `ffprobe` from the build machine into [src-tauri/resources/ffmpeg](src-tauri/resources/ffmpeg)
Build once on each target OS you want to ship. For Windows 10, build on Windows.
## Build On GitHub Actions
A Windows build workflow is included at [.github/workflows/windows-installer.yml](/Users/giers/youtube_summarizer/.github/workflows/windows-installer.yml).
A Windows build workflow from the original repository can be pointed at this folder by running the same commands from `ytsummarizer_tauri`.
It runs on `windows-latest`, installs `ffmpeg` and NSIS, prepares the bundled Python backend with [tools/prepare_bundle.py](/Users/giers/youtube_summarizer/tools/prepare_bundle.py), builds an NSIS installer, and uploads the result as a workflow artifact named `windows-installer`.
It should run on `windows-latest`, install `ffmpeg` and NSIS, prepare the bundled Python backend with [tools/prepare_bundle.py](tools/prepare_bundle.py), build an NSIS installer, and upload the result as a workflow artifact named `windows-installer`.
## Notes
- If Python is not on your `PATH` for development, set `YTS_PYTHON` to the interpreter you want the Tauri backend to use.
- If you want to test a prebuilt backend executable during development, set `YTS_BACKEND_BIN` to its full path.
- If `ffmpeg` or `ffprobe` are not on `PATH` during bundle prep, set `YTS_FFMPEG` and `YTS_FFPROBE` to their full paths before running [tools/prepare_bundle.py](/Users/giers/youtube_summarizer/tools/prepare_bundle.py).
- If `ffmpeg` or `ffprobe` are not on `PATH` during bundle prep, set `YTS_FFMPEG` and `YTS_FFPROBE` to their full paths before running [tools/prepare_bundle.py](tools/prepare_bundle.py).
- Generated thumbnails and the SQLite database are created on first run in the app's local data directory.
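
As a rough illustration of the fallback described in these notes, the sketch below picks a backend command in one plausible order: an explicit `YTS_BACKEND_BIN`, then a bundled executable, then the Python script. The exact order and file naming used by the Rust shell are not shown in this README, so the function `resolve_backend`, the helper `backend_name` and the `.exe` suffix rule are assumptions; only the environment variable names, `backend_cli.py` and the `yts-backend` base name come from this commit.

```python
import os
import sys
import tempfile
from pathlib import Path
from typing import List, Mapping


def backend_name() -> str:
    """Assumed platform naming for the bundled backend executable."""
    return "yts-backend.exe" if os.name == "nt" else "yts-backend"


def resolve_backend(env: Mapping[str, str], resource_dir: Path, script_dir: Path) -> List[str]:
    """Return the command prefix used to launch the backend (hypothetical order)."""
    override = env.get("YTS_BACKEND_BIN")
    if override and Path(override).is_file():
        return [override]
    bundled = resource_dir / "backend" / backend_name()
    if bundled.is_file():
        return [str(bundled)]
    # Development fallback: run the Python entry point directly.
    python = env.get("YTS_PYTHON") or sys.executable
    return [python, str(script_dir / "backend_cli.py")]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    resources = root / "resources"

    # No bundled executable yet: fall back to Python.
    dev_command = resolve_backend({}, resources, root)
    custom_python = resolve_backend({"YTS_PYTHON": "/opt/py"}, resources, root)

    # Once a bundled executable exists, it wins over the Python fallback.
    bundled_dir = resources / "backend"
    bundled_dir.mkdir(parents=True)
    exe = bundled_dir / backend_name()
    exe.write_bytes(b"")
    bundled_command = resolve_backend({}, resources, root)
    expected_bundled = [str(exe)]
```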

View File

@@ -8,6 +8,7 @@ while still supporting direct Python execution during development.
import argparse
import json
import multiprocessing
import sys
from pathlib import Path
@@ -18,19 +19,38 @@ from youtube_summarizer import process_video
DEFAULT_MODEL = "mistral:latest"
def compact_error_message(exc: BaseException) -> str:
"""Build a short error string without dumping a traceback into the GUI."""
parts = []
current = exc
while current:
text = " ".join(str(current).split())
if text and text not in parts:
parts.append(text)
current = current.__cause__ or current.__context__
return ": ".join(parts) or exc.__class__.__name__
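
A short usage sketch may help show what `compact_error_message` produces for a chained exception. The function body is repeated from the hunk above with indentation restored; the exception messages are made up for the example.

```python
def compact_error_message(exc: BaseException) -> str:
    """Build a short error string without dumping a traceback into the GUI."""
    parts = []
    current = exc
    while current:
        # Collapse all whitespace runs so multi-line messages stay on one line.
        text = " ".join(str(current).split())
        if text and text not in parts:
            parts.append(text)
        current = current.__cause__ or current.__context__
    return ": ".join(parts) or exc.__class__.__name__


try:
    try:
        raise ValueError("bad   url")
    except ValueError as inner:
        raise RuntimeError("download failed") from inner
except RuntimeError as exc:
    chained = compact_error_message(exc)  # "download failed: bad url"

# An exception without a message falls back to its class name.
empty = compact_error_message(KeyError())  # "KeyError"
```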
def configure_stdio() -> None:
"""Keep progress output line-buffered for the desktop app."""
if hasattr(sys.stdout, "reconfigure"):
sys.stdout.reconfigure(line_buffering=True)
sys.stdout.reconfigure(encoding="utf-8", errors="replace", line_buffering=True)
if hasattr(sys.stderr, "reconfigure"):
sys.stderr.reconfigure(line_buffering=True)
sys.stderr.reconfigure(encoding="utf-8", errors="replace", line_buffering=True)
def summarize(args: argparse.Namespace) -> int:
prompt_template = None
if args.prompt_template_file:
prompt_template = Path(args.prompt_template_file).read_text(encoding="utf-8")
elif args.prompt_template:
prompt_template = args.prompt_template
meta = process_video(
args.url,
use_whisper=args.use_whisper,
model=args.model,
prompt_template=prompt_template,
output_json=args.output_json,
)
if not args.output_json:
@@ -44,7 +64,13 @@ def translate(args: argparse.Namespace) -> int:
if not summary_text:
raise SystemExit("Empty summary text!")
translation = translate_summary_text(summary_text, args.lang, args.model)
prompt_template = None
if args.prompt_template_file:
prompt_template = Path(args.prompt_template_file).read_text(encoding="utf-8")
elif args.prompt_template:
prompt_template = args.prompt_template
translation = translate_summary_text(summary_text, args.lang, args.model, prompt_template)
if args.output_file:
Path(args.output_file).write_text(translation, encoding="utf-8")
@@ -60,6 +86,8 @@ def build_parser() -> argparse.ArgumentParser:
summarize_parser = subparsers.add_parser("summarize", help="Summarize a YouTube video")
summarize_parser.add_argument("--url", required=True, help="YouTube video URL")
summarize_parser.add_argument("--model", default=DEFAULT_MODEL, help="Ollama model to use")
summarize_parser.add_argument("--prompt-template", help="Prompt template for the summary LLM call")
summarize_parser.add_argument("--prompt-template-file", help="Path to a prompt template file")
summarize_parser.add_argument(
"--no-whisper",
dest="use_whisper",
@@ -76,6 +104,8 @@ def build_parser() -> argparse.ArgumentParser:
translate_parser.add_argument("--summary-file", required=True, help="Path to the English summary text")
translate_parser.add_argument("--lang", required=True, choices=["de", "jp"], help="Target language")
translate_parser.add_argument("--model", default=DEFAULT_MODEL, help="Ollama model to use")
translate_parser.add_argument("--prompt-template", help="Prompt template for the translation LLM call")
translate_parser.add_argument("--prompt-template-file", help="Path to a translation prompt template file")
translate_parser.add_argument("--output-file", help="Optional path to write the translated text")
translate_parser.set_defaults(handler=translate)
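
Both subcommands accept `--prompt-template` and `--prompt-template-file`, and the handlers above check the file option first. The following minimal sketch isolates that precedence; the helper name `resolve_prompt_template` and the standalone parser are hypothetical and only mirror the two flags from this diff.

```python
import argparse
import tempfile
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--prompt-template")
    parser.add_argument("--prompt-template-file")
    return parser


def resolve_prompt_template(args: argparse.Namespace) -> Optional[str]:
    """File contents win over the inline template; None means 'use the default'."""
    if args.prompt_template_file:
        return Path(args.prompt_template_file).read_text(encoding="utf-8")
    if args.prompt_template:
        return args.prompt_template
    return None


parser = build_parser()

with tempfile.TemporaryDirectory() as tmp:
    template_file = Path(tmp) / "prompt.txt"
    template_file.write_text("From file", encoding="utf-8")
    both = parser.parse_args(
        ["--prompt-template", "Inline", "--prompt-template-file", str(template_file)]
    )
    from_file = resolve_prompt_template(both)

inline_only = resolve_prompt_template(parser.parse_args(["--prompt-template", "Inline"]))
neither = resolve_prompt_template(parser.parse_args([]))
```

Passing the template through a file avoids quoting and length problems on the command line, which is presumably why the Rust shell writes a temporary file instead of using the inline flag.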
@@ -83,10 +113,18 @@ def build_parser() -> argparse.ArgumentParser:
def main() -> int:
multiprocessing.freeze_support()
configure_stdio()
parser = build_parser()
args = parser.parse_args()
return args.handler(args)
try:
return args.handler(args)
except KeyboardInterrupt:
print("[error] Cancelled.", file=sys.stderr, flush=True)
return 130
except Exception as exc:
print(f"[error] {compact_error_message(exc)}", file=sys.stderr, flush=True)
return 1
if __name__ == "__main__":

View File

@@ -1,5 +1,6 @@
@echo off
setlocal
cd /d "%~dp0"
REM 1. Check whether venv exists; create it otherwise
if not exist venv (

run.sh (2 lines changed)

@@ -1,6 +1,8 @@
#!/usr/bin/env bash
set -e
cd "$(dirname "$0")"
# 1. Set up the Python venv
GREEN="\033[0;32m"
CYAN="\033[0;36m"

src-tauri/Cargo.lock (new generated file, 5392 lines; diff suppressed because it is too large)

Binary files (not shown):
src-tauri/icons/128x128.png (new file, 9.2 KiB)
(file name not captured) (new file, 18 KiB)
src-tauri/icons/32x32.png (new file, 4.2 KiB)
src-tauri/icons/icon.icns (new file)
src-tauri/icons/icon.ico (new file, 49 KiB)
(file name not captured) (changed, 1.3 KiB before, 44 KiB after)

View File

@@ -1,8 +1,9 @@
#![cfg_attr(not(debug_assertions), windows_subsystem = "windows")]
use std::{
env, fs,
io::{BufRead, BufReader, ErrorKind},
env,
fs::{self, OpenOptions},
io::{BufRead, BufReader, ErrorKind, Write},
path::{Path, PathBuf},
process::{Command, Stdio},
sync::{Arc, Mutex},
@@ -10,16 +11,22 @@ use std::{
time::{SystemTime, UNIX_EPOCH},
};
#[cfg(target_os = "windows")]
use std::os::windows::process::CommandExt;
use open::that;
use reqwest::blocking::Client;
use rusqlite::{params, Connection, OptionalExtension};
use serde::{Deserialize, Serialize};
use tauri::{AppHandle, Emitter, Manager, State, WebviewWindow};
use tauri::menu::{MenuBuilder, SubmenuBuilder};
use tauri::{path::BaseDirectory, AppHandle, Emitter, Manager, State, WebviewWindow};
const DEFAULT_MODEL: &str = "mistral:latest";
const OLLAMA_TAGS_URL: &str = "http://localhost:11434/api/tags";
const BACKEND_EXECUTABLE_NAME: &str = "yts-backend";
const TARGET_TRIPLE: &str = env!("TAURI_BUILD_TARGET");
#[cfg(target_os = "windows")]
const CREATE_NO_WINDOW: u32 = 0x0800_0000;
#[derive(Clone)]
enum BackendRuntime {
@@ -37,6 +44,7 @@ struct AppState {
app_dir: PathBuf,
media_dir: PathBuf,
db_path: PathBuf,
backend_log_path: PathBuf,
backend: BackendRuntime,
ffmpeg_path: Option<PathBuf>,
ffprobe_path: Option<PathBuf>,
@@ -49,6 +57,7 @@ struct SummarizeVideoRequest {
url: String,
use_whisper: bool,
model: Option<String>,
master_prompt: Option<String>,
}
#[derive(Debug, Deserialize)]
@@ -57,10 +66,12 @@ struct DeleteSummaryRequest {
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
struct TranslateSummaryRequest {
id: i64,
lang: String,
model: Option<String>,
prompt_template: Option<String>,
}
#[derive(Debug, Deserialize)]
@@ -165,6 +176,12 @@ fn normalize_model(model: Option<String>) -> String {
.unwrap_or_else(|| DEFAULT_MODEL.to_string())
}
fn normalize_prompt_template(prompt: Option<String>) -> Option<String> {
prompt
.map(|value| value.trim().to_string())
.filter(|value| !value.is_empty())
}
fn now_millis() -> u128 {
SystemTime::now()
.duration_since(UNIX_EPOCH)
@@ -190,15 +207,16 @@ fn platform_executable_name(base_name: &str) -> String {
fn resolve_resource_file(app: &AppHandle, relative_path: &Path) -> Option<PathBuf> {
let mut candidates = Vec::new();
if let Ok(resource_dir) = app.path().resource_dir() {
candidates.push(resource_dir.join(relative_path));
if let Ok(resource_path) = app.path().resolve(relative_path, BaseDirectory::Resource) {
candidates.push(resource_path);
}
candidates.push(
PathBuf::from(env!("CARGO_MANIFEST_DIR"))
.join("resources")
.join(relative_path),
);
let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
candidates.push(manifest_dir.join(relative_path));
candidates.push(manifest_dir.join("resources").join(relative_path));
if let Ok(project_root) = resolve_project_root() {
candidates.push(project_root.join(relative_path));
}
candidates.into_iter().find(|path| path.exists())
}
@@ -218,9 +236,9 @@ fn resolve_backend_binary(app: &AppHandle) -> Option<PathBuf> {
}
fn resolve_script_dir(app: &AppHandle) -> Result<PathBuf, String> {
if let Ok(resource_dir) = app.path().resource_dir() {
if resource_dir.join("backend_cli.py").exists() {
return Ok(resource_dir);
if let Some(resource_file) = resolve_resource_file(app, Path::new("backend_cli.py")) {
if let Some(parent) = resource_file.parent() {
return Ok(parent.to_path_buf());
}
}
@@ -292,6 +310,21 @@ fn resolve_whisper_cache_dir(app: &AppHandle) -> Result<PathBuf, String> {
Ok(whisper_cache_dir)
}
fn resolve_log_dir(app: &AppHandle) -> Result<PathBuf, String> {
let log_dir = app
.path()
.app_log_dir()
.or_else(|_| {
app.path()
.app_local_data_dir()
.map(|path| path.join("logs"))
})
.map_err(|err| format!("Failed to resolve application log directory: {err}"))?;
fs::create_dir_all(&log_dir)
.map_err(|err| format!("Failed to create application log directory: {err}"))?;
Ok(log_dir)
}
fn open_connection(state: &AppState) -> Result<Connection, String> {
Connection::open(&state.db_path).map_err(|err| format!("Failed to open SQLite database: {err}"))
}
@@ -341,8 +374,10 @@ fn cleanup_artifacts(state: &AppState, audio: Option<&str>, transcript: Option<&
fn purge_existing_artifacts(state: &AppState) -> Result<(), String> {
let db = open_connection(state)?;
let mut stmt = db
.prepare("SELECT id, audio, transcript FROM summaries WHERE audio IS NOT NULL OR transcript IS NOT NULL")
.map_err(|err| format!("Failed to prepare artifact cleanup query: {err}"))?;
.prepare(
"SELECT id, audio, transcript FROM summaries WHERE audio IS NOT NULL OR transcript IS NOT NULL",
)
.map_err(|err| format!("Failed to prepare artifact cleanup query: {err}"))?;
let rows = stmt
.query_map([], |row| {
@@ -372,6 +407,30 @@ fn purge_existing_artifacts(state: &AppState) -> Result<(), String> {
Ok(())
}
fn write_startup_error_log(app: &AppHandle, message: &str) {
let mut candidates = Vec::new();
if let Ok(path) = app.path().app_log_dir() {
candidates.push(path);
}
if let Ok(path) = app.path().app_local_data_dir() {
candidates.push(path);
}
candidates.push(env::temp_dir().join("youtube-summarizer"));
for directory in candidates {
if fs::create_dir_all(&directory).is_ok() {
let log_path = directory.join("startup-error.log");
if fs::write(&log_path, message).is_ok() {
eprintln!("Startup failure written to {}", log_path.display());
return;
}
}
}
eprintln!("{message}");
}
fn ensure_app_state(app: &AppHandle) -> Result<AppState, String> {
let app_dir = app
.path()
@@ -386,6 +445,7 @@ fn ensure_app_state(app: &AppHandle) -> Result<AppState, String> {
ffmpeg_path: resolve_optional_tool_path(app, "YTS_FFMPEG", "ffmpeg"),
ffprobe_path: resolve_optional_tool_path(app, "YTS_FFPROBE", "ffprobe"),
whisper_cache_dir: resolve_whisper_cache_dir(app)?,
backend_log_path: resolve_log_dir(app)?.join("backend.log"),
app_dir: app_dir.clone(),
media_dir,
db_path: app_dir.join("summaries.db"),
@@ -403,8 +463,47 @@ fn emit_progress(app: &AppHandle, window_label: &str, line: &str) {
}
}
fn append_backend_log(log_path: &Path, line: &str) {
if let Ok(mut file) = OpenOptions::new().create(true).append(true).open(log_path) {
let _ = writeln!(file, "{line}");
}
}
fn backend_failure_message(stderr_output: &str, fallback: String) -> String {
for line in stderr_output.lines().rev() {
let trimmed = line.trim();
if let Some(message) = trimmed.strip_prefix("[error]") {
let message = message.trim();
if !message.is_empty() {
return message.to_string();
}
}
}
for line in stderr_output.lines().rev() {
let trimmed = line.trim();
if trimmed.is_empty()
|| trimmed.starts_with("WARNING:")
|| trimmed.starts_with("Traceback")
|| trimmed.starts_with("File ")
|| trimmed.starts_with("During handling")
{
continue;
}
if trimmed.starts_with("ERROR:")
|| trimmed.contains("RuntimeError:")
|| trimmed.contains("SystemExit:")
{
return trimmed.to_string();
}
}
fallback
}
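
To make the selection order in `backend_failure_message` easier to follow, here is a Python port for illustration only (the real implementation is the Rust function above). It prefers the last `[error]` line, then the last line that looks like an error, and otherwise returns the fallback. The sample stderr strings are invented.

```python
def backend_failure_message(stderr_output: str, fallback: str) -> str:
    lines = stderr_output.splitlines()

    # Pass 1: the backend's own "[error] ..." lines take priority, newest first.
    for line in reversed(lines):
        trimmed = line.strip()
        if trimmed.startswith("[error]"):
            message = trimmed[len("[error]"):].strip()
            if message:
                return message

    # Pass 2: skip traceback noise and pick the last recognizable error line.
    noise_prefixes = ("WARNING:", "Traceback", "File ", "During handling")
    for line in reversed(lines):
        trimmed = line.strip()
        if not trimmed or trimmed.startswith(noise_prefixes):
            continue
        if (
            trimmed.startswith("ERROR:")
            or "RuntimeError:" in trimmed
            or "SystemExit:" in trimmed
        ):
            return trimmed

    return fallback


tagged = backend_failure_message(
    "WARNING: slow network\n[error] Ollama is not reachable\n", "exit 1"
)
traceback_only = backend_failure_message(
    'Traceback (most recent call last):\n  File "a.py", line 1\nRuntimeError: boom',
    "exit 1",
)
unknown = backend_failure_message("some unrelated output", "exit 1")
```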
fn apply_backend_env(command: &mut Command, state: &AppState) {
command.env("PYTHONUNBUFFERED", "1");
command.env("PYTHONIOENCODING", "utf-8");
command.env("YTS_WHISPER_CACHE_DIR", &state.whisper_cache_dir);
if let Some(ffmpeg_path) = &state.ffmpeg_path {
@@ -427,6 +526,8 @@ fn build_backend_command(state: &AppState, args: &[String]) -> Command {
command.args(args).current_dir(&state.media_dir);
apply_backend_env(&mut command, state);
#[cfg(target_os = "windows")]
command.creation_flags(CREATE_NO_WINDOW);
command
}
@@ -440,12 +541,20 @@ fn run_backend_json_command(
let mut command_args = args.to_vec();
command_args.push("--output-json".to_string());
command_args.push(output_path.to_string_lossy().into_owned());
append_backend_log(
&state.backend_log_path,
&format!("=== summarize {} ===", command_args.join(" ")),
);
let mut child = build_backend_command(state, &command_args)
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.map_err(|err| format!("Failed to start bundled backend: {err}"))?;
.map_err(|err| {
let message = format!("Failed to start bundled backend: {err}");
append_backend_log(&state.backend_log_path, &message);
message
})?;
let stdout = child
.stdout
@@ -456,26 +565,29 @@ fn run_backend_json_command(
.take()
.ok_or_else(|| "Backend stderr was not captured.".to_string())?;
let stderr_buffer = Arc::new(Mutex::new(String::new()));
let stdout_log_path = state.backend_log_path.clone();
let stderr_log_path = state.backend_log_path.clone();
let stdout_app = app.clone();
let stdout_label = window_label.to_string();
let stdout_handle = thread::spawn(move || {
for line in BufReader::new(stdout).lines() {
match line {
Ok(line) => emit_progress(&stdout_app, &stdout_label, &line),
Ok(line) => {
append_backend_log(&stdout_log_path, &format!("[stdout] {line}"));
emit_progress(&stdout_app, &stdout_label, &line);
}
Err(_) => break,
}
}
});
let stderr_app = app.clone();
let stderr_label = window_label.to_string();
let stderr_buffer_clone = Arc::clone(&stderr_buffer);
let stderr_handle = thread::spawn(move || {
for line in BufReader::new(stderr).lines() {
match line {
Ok(line) => {
emit_progress(&stderr_app, &stderr_label, &line);
append_backend_log(&stderr_log_path, &format!("[stderr] {line}"));
if let Ok(mut buffer) = stderr_buffer_clone.lock() {
buffer.push_str(&line);
buffer.push('\n');
@@ -486,9 +598,15 @@ fn run_backend_json_command(
}
});
let status = child
.wait()
.map_err(|err| format!("Failed to wait for bundled backend: {err}"))?;
let status = child.wait().map_err(|err| {
let message = format!("Failed to wait for bundled backend: {err}");
append_backend_log(&state.backend_log_path, &message);
message
})?;
append_backend_log(
&state.backend_log_path,
&format!("Bundled backend exit status: {status}"),
);
let _ = stdout_handle.join();
let _ = stderr_handle.join();
@@ -498,34 +616,66 @@ fn run_backend_json_command(
.lock()
.map(|buffer| buffer.trim().to_string())
.unwrap_or_else(|_| String::new());
let message = if stderr_output.is_empty() {
format!("Bundled backend exited with status {status}.")
} else {
stderr_output
};
let message = backend_failure_message(
&stderr_output,
format!("Bundled backend exited with status {status}."),
);
append_backend_log(
&state.backend_log_path,
&format!("Backend failure: {message}"),
);
let _ = fs::remove_file(&output_path);
return Err(message);
}
let raw_json = fs::read_to_string(&output_path)
.map_err(|err| format!("Failed to read backend output JSON: {err}"))?;
let raw_json = fs::read_to_string(&output_path).map_err(|err| {
let message = format!("Failed to read backend output JSON: {err}");
append_backend_log(&state.backend_log_path, &message);
message
})?;
let _ = fs::remove_file(&output_path);
serde_json::from_str(&raw_json).map_err(|err| format!("Invalid backend output JSON: {err}"))
serde_json::from_str(&raw_json).map_err(|err| {
let message = format!("Invalid backend output JSON: {err}");
append_backend_log(&state.backend_log_path, &message);
message
})
}
fn run_backend_text_command(state: &AppState, args: &[String]) -> Result<String, String> {
let output = build_backend_command(state, args)
.output()
.map_err(|err| format!("Failed to start translation backend: {err}"))?;
append_backend_log(
&state.backend_log_path,
&format!("=== translate {} ===", args.join(" ")),
);
let output = build_backend_command(state, args).output().map_err(|err| {
let message = format!("Failed to start translation backend: {err}");
append_backend_log(&state.backend_log_path, &message);
message
})?;
for line in String::from_utf8_lossy(&output.stdout).lines() {
append_backend_log(&state.backend_log_path, &format!("[stdout] {line}"));
}
for line in String::from_utf8_lossy(&output.stderr).lines() {
append_backend_log(&state.backend_log_path, &format!("[stderr] {line}"));
}
append_backend_log(
&state.backend_log_path,
&format!("Translation backend exit status: {}", output.status),
);
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr).trim().to_string();
return Err(if stderr.is_empty() {
let message = if stderr.is_empty() {
format!("Translation backend exited with status {}.", output.status)
} else {
stderr
});
};
append_backend_log(
&state.backend_log_path,
&format!("Translation failure: {message}"),
);
return Err(message);
}
let translation = String::from_utf8(output.stdout)
@@ -533,7 +683,9 @@ fn run_backend_text_command(state: &AppState, args: &[String]) -> Result<String,
.trim()
.to_string();
if translation.is_empty() {
return Err("Translation backend returned an empty result.".to_string());
let message = "Translation backend returned an empty result.".to_string();
append_backend_log(&state.backend_log_path, &message);
return Err(message);
}
Ok(translation)
@@ -559,19 +711,42 @@ fn summarize_video_inner(
window_label: &str,
request: SummarizeVideoRequest,
) -> Result<SummaryEntry, String> {
let model = normalize_model(request.model);
let SummarizeVideoRequest {
url,
use_whisper,
model,
master_prompt,
} = request;
let model = normalize_model(model);
let mut args = vec![
"summarize".to_string(),
"--url".to_string(),
request.url,
url,
"--model".to_string(),
model,
];
if !request.use_whisper {
if !use_whisper {
args.push("--no-whisper".to_string());
}
let info = run_backend_json_command(state, app, window_label, &args)?;
let prompt_path = if let Some(prompt) = normalize_prompt_template(master_prompt) {
let path = state
.app_dir
.join(format!("tmp_prompt_{}.txt", now_millis()));
fs::write(&path, prompt)
.map_err(|err| format!("Failed to write temporary prompt file: {err}"))?;
args.push("--prompt-template-file".to_string());
args.push(path.to_string_lossy().into_owned());
Some(path)
} else {
None
};
let result = run_backend_json_command(state, app, window_label, &args);
if let Some(path) = prompt_path {
let _ = fs::remove_file(path);
}
let info = result?;
cleanup_artifacts(state, info.audio.as_deref(), info.transcript.as_deref());
let db = open_connection(state)?;
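
The hunk above writes the custom prompt to a temporary file, runs the backend, and removes the file whether or not the run succeeded. A minimal Python sketch of the same pattern follows; `run_with_prompt_file`, the `run` callback and `fake_backend` are hypothetical names, and only the `--prompt-template-file` flag and the `tmp_prompt_` file prefix are taken from this diff.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable, List, Optional


def run_with_prompt_file(
    args: List[str],
    prompt: Optional[str],
    work_dir: Path,
    run: Callable[[List[str]], str],
) -> str:
    """Pass a non-empty prompt via a temp file and always clean the file up."""
    prompt = (prompt or "").strip()
    prompt_path = None
    if prompt:
        prompt_path = work_dir / f"tmp_prompt_{time.time_ns()}.txt"
        prompt_path.write_text(prompt, encoding="utf-8")
        args = [*args, "--prompt-template-file", str(prompt_path)]
    try:
        return run(args)
    finally:
        # Runs on success and on failure, like the unconditional remove_file call.
        if prompt_path is not None:
            prompt_path.unlink(missing_ok=True)


seen = {}


def fake_backend(args: List[str]) -> str:
    """Stand-in for the backend process: records what it saw, then fails."""
    path = Path(args[args.index("--prompt-template-file") + 1])
    seen["path"] = path
    seen["existed_during_run"] = path.exists()
    seen["content"] = path.read_text(encoding="utf-8")
    raise RuntimeError("backend failed")


failed = False
with tempfile.TemporaryDirectory() as tmp:
    try:
        run_with_prompt_file(["summarize"], "  Summarize briefly.  ", Path(tmp), fake_backend)
    except RuntimeError:
        failed = True
    removed_after_run = not seen["path"].exists()
```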
@@ -601,11 +776,17 @@ fn translate_summary_inner(
state: &AppState,
request: TranslateSummaryRequest,
) -> Result<SummaryEntry, String> {
let TranslateSummaryRequest {
id,
lang,
model,
prompt_template,
} = request;
let db = open_connection(state)?;
let summary_text = db
.query_row(
"SELECT summary_en FROM summaries WHERE id = ?",
[request.id],
[id],
|row| row.get::<_, Option<String>>(0),
)
.optional()
@@ -613,29 +794,49 @@ fn translate_summary_inner(
.flatten()
.ok_or_else(|| "No English summary found for translation.".to_string())?;
let tmp_summary_path =
state
.app_dir
.join(format!("tmp_summary_{}_{}.txt", request.id, now_millis()));
let tmp_summary_path = state
.app_dir
.join(format!("tmp_summary_{}_{}.txt", id, now_millis()));
fs::write(&tmp_summary_path, summary_text)
.map_err(|err| format!("Failed to write temporary summary file: {err}"))?;
let model = normalize_model(request.model);
let args = vec![
let model = normalize_model(model);
let mut args = vec![
"translate".to_string(),
"--summary-file".to_string(),
tmp_summary_path.to_string_lossy().into_owned(),
"--lang".to_string(),
request.lang.clone(),
lang.clone(),
"--model".to_string(),
model,
];
let tmp_prompt_path = if let Some(prompt) = normalize_prompt_template(prompt_template) {
let path = state.app_dir.join(format!(
"tmp_translation_prompt_{}_{}.txt",
id,
now_millis()
));
if let Err(err) = fs::write(&path, prompt) {
let _ = fs::remove_file(&tmp_summary_path);
return Err(format!(
"Failed to write temporary translation prompt file: {err}"
));
}
args.push("--prompt-template-file".to_string());
args.push(path.to_string_lossy().into_owned());
Some(path)
} else {
None
};
let result = run_backend_text_command(state, &args);
let _ = fs::remove_file(&tmp_summary_path);
if let Some(path) = tmp_prompt_path {
let _ = fs::remove_file(path);
}
let translation = result?;
let column = match request.lang.as_str() {
let column = match lang.as_str() {
"de" => "summary_de",
"jp" => "summary_jp",
_ => return Err("Unsupported language code.".to_string()),
@@ -643,11 +844,11 @@ fn translate_summary_inner(
db.execute(
&format!("UPDATE summaries SET {column} = ? WHERE id = ?"),
params![translation, request.id],
params![translation, id],
)
.map_err(|err| format!("Failed to save translated summary: {err}"))?;
get_entry_by_id(state, request.id)
get_entry_by_id(state, id)
}
#[tauri::command]
@@ -733,13 +934,57 @@ fn open_file(file_path: String) -> Result<(), String> {
that(path).map_err(|err| format!("Failed to open file: {err}"))
}
fn install_app_menu(app: &mut tauri::App) -> tauri::Result<()> {
let handle = app.handle();
let menu_builder = MenuBuilder::new(handle);
#[cfg(target_os = "macos")]
let menu_builder = {
let app_menu = SubmenuBuilder::new(handle, "YouTube Summarizer")
.hide()
.hide_others()
.show_all()
.separator()
.quit()
.build()?;
menu_builder.item(&app_menu)
};
let settings_menu = SubmenuBuilder::new(handle, "Settings")
.text("open_settings", "Settings...")
.build()?;
let edit_menu = SubmenuBuilder::new(handle, "Edit")
.undo()
.redo()
.separator()
.cut()
.copy()
.paste()
.select_all()
.build()?;
let menu = menu_builder.item(&settings_menu).item(&edit_menu).build()?;
app.set_menu(menu)?;
Ok(())
}
fn main() {
tauri::Builder::default()
.plugin(tauri_plugin_dialog::init())
.on_menu_event(|app, event| {
if event.id() == "open_settings" {
let _ = app.emit_to("main", "open-settings", ());
}
})
.setup(|app| {
let state = ensure_app_state(app.handle())?;
app.manage(state);
Ok(())
install_app_menu(app)?;
match ensure_app_state(app.handle()) {
Ok(state) => {
app.manage(state);
Ok(())
}
Err(err) => {
write_startup_error_log(app.handle(), &err);
Err(err.into())
}
}
})
.invoke_handler(tauri::generate_handler![
get_models,

View File

@@ -27,13 +27,20 @@
},
"bundle": {
"active": true,
"resources": [
"../backend_cli.py",
"../youtube_summarizer.py",
"../translate_summary.py",
"../requirements.txt",
"resources/backend",
"resources/ffmpeg"
]
"icon": [
"icons/32x32.png",
"icons/128x128.png",
"icons/128x128@2x.png",
"icons/icon.icns",
"icons/icon.ico"
],
"resources": {
"../backend_cli.py": "backend_cli.py",
"../youtube_summarizer.py": "youtube_summarizer.py",
"../translate_summary.py": "translate_summary.py",
"../requirements.txt": "requirements.txt",
"resources/backend": "backend",
"resources/ffmpeg": "ffmpeg"
}
}
}

View File

@@ -4,9 +4,11 @@ import os
import sqlite3
import subprocess
import sys
from pathlib import Path
DB_FILE = os.path.join(os.path.dirname(__file__), 'summaries.db')
TRANSLATE_SCRIPT = os.path.join(os.path.dirname(__file__), 'translate_summary.py')
ROOT = Path(__file__).resolve().parents[1]
DB_FILE = os.environ.get("YTS_DB_FILE", str(ROOT / "summaries.db"))
TRANSLATE_SCRIPT = ROOT / "translate_summary.py"
MODEL = "mistral-small3.1:24b"
def get_entries_needing_translation(conn):
@@ -30,7 +32,7 @@ def translate(summary_text, lang):
# Run the translation script
cmd = [
sys.executable, # uses the current Python interpreter
TRANSLATE_SCRIPT,
str(TRANSLATE_SCRIPT),
"--summary-file", tmp_summary_path,
"--lang", lang,
"--model", MODEL,

View File

@@ -18,20 +18,54 @@ Example:
import sys
import argparse
import json
import math
import requests
LANG_MAP = {
"de": "German",
"jp": "Japanese"
}
OLLAMA_CHARS_PER_TOKEN = 3.5
OLLAMA_OUTPUT_TOKEN_BUDGET = 2048
OLLAMA_CONTEXT_BUCKETS = (4096, 8192, 16384, 32768, 65536)
def translate_summary_text(summary_text, target_language, model="mistral:latest"):
def default_translation_prompt_template(target_language):
if target_language not in LANG_MAP:
raise ValueError("Supported languages: de (German), jp (Japanese)")
return (
f"Translate the following summary into {LANG_MAP[target_language]}. Only output the translated summary, "
"no explanation or intro. If it's already in the target language, do nothing but repeat it.\n\n"
"Summary:\n{summary}\n\nTranslation:"
)
def render_translation_prompt(summary_text, target_language, prompt_template=None):
template = (prompt_template or default_translation_prompt_template(target_language)).strip()
prompt = (
template
.replace("{language}", LANG_MAP[target_language])
.replace("{summary}", summary_text)
)
if "{summary}" not in template:
prompt = f"{prompt}\n\nSummary:\n{summary_text}\n\nTranslation:"
return prompt
def choose_ollama_num_ctx(prompt, output_budget=OLLAMA_OUTPUT_TOKEN_BUDGET):
estimated_input_tokens = math.ceil(len(prompt) / OLLAMA_CHARS_PER_TOKEN)
needed_tokens = estimated_input_tokens + output_budget
for bucket in OLLAMA_CONTEXT_BUCKETS:
if needed_tokens <= bucket:
return bucket
return OLLAMA_CONTEXT_BUCKETS[-1]
def translate_summary_text(summary_text, target_language, model="mistral:latest", prompt_template=None):
if target_language not in LANG_MAP:
raise ValueError("Supported languages: de (German), jp (Japanese)")
prompt = (
f"Translate the following summary into {LANG_MAP[target_language]}. Only output the translated summary, "
"no explanation or intro. If it's already in the target language, do nothing but repeat it.\n\n"
f"Summary:\n{summary_text}\n\nTranslation:"
render_translation_prompt(summary_text, target_language, prompt_template)
)
payload = {
"model": model,
@@ -39,6 +73,9 @@ def translate_summary_text(summary_text, target_language, model="mistral:latest"
{"role": "system", "content": f"You are an expert translator proficient in {LANG_MAP[target_language]} and English."},
{"role": "user", "content": prompt}
],
"options": {
"num_ctx": choose_ollama_num_ctx(prompt)
},
"stream": False
}
resp = requests.post("http://localhost:11434/api/chat", json=payload)
@@ -47,24 +84,31 @@ def translate_summary_text(summary_text, target_language, model="mistral:latest"
return data.get("message", {}).get("content", "").strip()
def translate_summary_file(summary_file, target_language, model="mistral:latest"):
def translate_summary_file(summary_file, target_language, model="mistral:latest", prompt_template=None):
with open(summary_file, "r", encoding="utf-8") as f:
summary_text = f.read().strip()
if not summary_text:
raise ValueError("Empty summary text!")
return translate_summary_text(summary_text, target_language, model)
return translate_summary_text(summary_text, target_language, model, prompt_template)
def main():
parser = argparse.ArgumentParser(description="Translate summary using Ollama")
parser.add_argument("--summary-file", required=True, help="Path to file with English summary text")
parser.add_argument("--lang", required=True, choices=["de", "jp"], help="Target language: 'de' or 'jp'")
parser.add_argument("--model", default="mistral:latest", help="Ollama model to use")
parser.add_argument("--prompt-template", help="Prompt template for the translation LLM call")
parser.add_argument("--prompt-template-file", help="Path to a text file containing the translation prompt template")
parser.add_argument("--output-file", help="Output file for translated summary")
args = parser.parse_args()
prompt_template = args.prompt_template
if args.prompt_template_file:
with open(args.prompt_template_file, "r", encoding="utf-8") as f:
prompt_template = f.read()
# Read summary
try:
translation = translate_summary_file(args.summary_file, args.lang, args.model)
translation = translate_summary_file(args.summary_file, args.lang, args.model, prompt_template)
except Exception as e:
print(f"Translation failed: {e}", file=sys.stderr)
sys.exit(2)
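
Review note on `choose_ollama_num_ctx`: the function estimates input tokens at 3.5 characters per token, adds the 2048-token output budget, and returns the smallest bucket that fits. The sketch below copies the constants and the function from the hunk above and runs three prompt lengths through it. The prompt strings are synthetic placeholders, not real summaries, and the 3.5 ratio is the commit's own heuristic rather than a measured tokenizer value.

```python
import math

# Constants and function copied from the diff above.
OLLAMA_CHARS_PER_TOKEN = 3.5
OLLAMA_OUTPUT_TOKEN_BUDGET = 2048
OLLAMA_CONTEXT_BUCKETS = (4096, 8192, 16384, 32768, 65536)


def choose_ollama_num_ctx(prompt, output_budget=OLLAMA_OUTPUT_TOKEN_BUDGET):
    estimated_input_tokens = math.ceil(len(prompt) / OLLAMA_CHARS_PER_TOKEN)
    needed_tokens = estimated_input_tokens + output_budget
    for bucket in OLLAMA_CONTEXT_BUCKETS:
        if needed_tokens <= bucket:
            return bucket
    return OLLAMA_CONTEXT_BUCKETS[-1]


# Synthetic prompts of known length (placeholders for illustration only).
short_ctx = choose_ollama_num_ctx("x" * 7_000)      # 2000 + 2048 = 4048 tokens
medium_ctx = choose_ollama_num_ctx("x" * 7_350)     # 2100 + 2048 = 4148 tokens
huge_ctx = choose_ollama_num_ctx("x" * 1_000_000)   # larger than every bucket

print(short_ctx, medium_ctx, huge_ctx)
```

Note the last case: a prompt that exceeds the largest bucket is not rejected, it is capped at 65536, so whether the model then sees the full text depends on how the server handles the overflow.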


@@ -98,6 +98,12 @@
.entry .summary {
transition: max-height 0.2s;
}
.entry .summary hr {
border: 0;
border-top: 1px solid currentColor;
color: inherit;
margin: 0.7em 0;
}
.entry.collapsed .summary {
display: -webkit-box !important;
-webkit-line-clamp: 2;
@@ -130,36 +136,160 @@
color: white;
font-weight: bold;
}
header button:disabled {
background-color: #ffe4e6;
color: #9f1239;
opacity: 0.7;
cursor: default;
border: 1px solid #fbb6ce;
}
</style>
</head>
<body>
<header style="display:flex; flex-direction:column; gap:5px;">
<form id="summarize-form" style="display:flex; width:100%; gap:10px; align-items:center; flex-wrap: wrap;">
<input type="text" id="url-input" placeholder="Enter YouTube URL" />
<button type="submit">Summarize!</button>
<div style="display: flex; flex-direction: column; gap: 2px; min-width:120px;">
<label style="font-size:14px; color:#9f1239; display: flex; align-items: center; gap: 5px;">
<input type="checkbox" id="whisper-checkbox" checked />Use Whisper
</label>
<label style="font-size:14px; color:#9f1239; display: flex; align-items: center; gap: 5px;">
<input type="checkbox" id="autotranslate-checkbox" checked />Auto Translate
</label>
</div>
<select id="model-select" style="padding:6px; font-size:14px;">
<option disabled selected>Loading models…</option>
</select>
</form>
<div id="loading" class="loading" style="display:none;">Loading…</div>
</header>
<div id="pagination-top" class="pagination" style="display:none;"></div>
<div id="summaries-container"></div>
header button:disabled {
background-color: #ffe4e6;
color: #9f1239;
opacity: 0.7;
cursor: default;
border: 1px solid #fbb6ce;
}
.settings-dialog[hidden] {
display: none;
}
.settings-dialog {
position: fixed;
inset: 0;
z-index: 1000;
display: flex;
align-items: center;
justify-content: center;
padding: 24px;
background: rgba(69, 10, 10, 0.28);
}
.settings-panel {
width: min(720px, 100%);
max-height: min(760px, calc(100vh - 48px));
overflow: auto;
padding: 18px;
border: 1px solid #fecdd3;
border-radius: 6px;
background: #fff1f2;
box-shadow: 0 16px 40px rgba(159, 18, 57, 0.18);
}
.settings-panel-header {
display: flex;
align-items: center;
justify-content: space-between;
gap: 16px;
margin-bottom: 14px;
}
.settings-panel-header h2 {
margin: 0;
font-size: 20px;
}
.settings-close-button {
width: 30px;
height: 30px;
display: flex;
align-items: center;
justify-content: center;
padding: 0;
border: none;
border-radius: 4px;
background: transparent;
color: #9f1239;
font-size: 24px;
line-height: 1;
cursor: pointer;
}
.settings-close-button:hover {
background: #ffe4e6;
}
.settings-row {
display: flex;
align-items: center;
gap: 8px;
margin: 10px 0;
font-size: 15px;
}
.settings-row input[type="checkbox"] {
accent-color: #9f1239;
}
.settings-field-label {
display: block;
margin: 16px 0 6px;
font-size: 14px;
font-weight: bold;
}
.settings-prompt-textarea {
width: 100%;
min-height: 260px;
box-sizing: border-box;
padding: 10px;
border: 1px solid #fda4af;
border-radius: 4px;
background: #fff;
color: #7f1d1d;
font: 13px/1.45 ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", monospace;
resize: vertical;
}
.settings-prompt-textarea.compact {
min-height: 150px;
}
.settings-actions {
display: flex;
justify-content: flex-end;
gap: 10px;
margin-top: 12px;
}
.settings-actions button {
padding: 8px 12px;
border: 1px solid #9f1239;
border-radius: 4px;
background: #fff1f2;
color: #9f1239;
cursor: pointer;
}
.settings-actions button:hover {
background: #9f1239;
color: #fff;
}
</style>
</head>
<body>
<header style="display:flex; flex-direction:column; gap:5px;">
<form id="summarize-form" style="display:flex; width:100%; gap:10px; align-items:center; flex-wrap: wrap;">
<input type="text" id="url-input" placeholder="Enter YouTube URL" />
<button type="submit">Summarize!</button>
<select id="model-select" style="padding:6px; font-size:14px;">
<option disabled selected>Loading models…</option>
</select>
</form>
<div id="loading" class="loading" style="display:none;">Loading…</div>
</header>
<div id="settings-dialog" class="settings-dialog" hidden role="dialog" aria-modal="true" aria-labelledby="settings-title">
<section class="settings-panel">
<div class="settings-panel-header">
<h2 id="settings-title">Settings</h2>
<button type="button" id="settings-close-button" class="settings-close-button" aria-label="Close settings">&times;</button>
</div>
<label class="settings-row">
<input type="checkbox" id="whisper-checkbox" checked />
<span>Use Whisper</span>
</label>
<label class="settings-row">
<input type="checkbox" id="autotranslate-checkbox" checked />
<span>Auto Translate</span>
</label>
<label for="master-prompt-textarea" class="settings-field-label">Master Prompt</label>
<textarea id="master-prompt-textarea" class="settings-prompt-textarea" spellcheck="false"></textarea>
<div class="settings-actions">
<button type="button" id="reset-master-prompt-button">Reset to default</button>
</div>
<label for="translation-prompt-de-textarea" class="settings-field-label">German Translation Prompt</label>
<textarea id="translation-prompt-de-textarea" class="settings-prompt-textarea compact" spellcheck="false"></textarea>
<div class="settings-actions">
<button type="button" id="reset-translation-prompt-de-button">Reset to default</button>
</div>
<label for="translation-prompt-jp-textarea" class="settings-field-label">Japanese Translation Prompt</label>
<textarea id="translation-prompt-jp-textarea" class="settings-prompt-textarea compact" spellcheck="false"></textarea>
<div class="settings-actions">
<button type="button" id="reset-translation-prompt-jp-button">Reset to default</button>
</div>
</section>
</div>
<div id="pagination-top" class="pagination" style="display:none;"></div>
<div id="summaries-container"></div>
<div id="pagination-bottom" class="pagination" style="display:none;"></div>
<script src="renderer.js"></script>
</body>


@@ -3,6 +3,28 @@ const invoke = tauriApi?.core?.invoke;
const listen = tauriApi?.event?.listen;
const convertFileSrc = tauriApi?.core?.convertFileSrc;
const confirmDialog = tauriApi?.dialog?.confirm;
const DEFAULT_MASTER_PROMPT = `You are an expert summarizer. Summarize the following video concisely:
Title: {title}
Transcript:
{transcript}
Summary:`;
const DEFAULT_TRANSLATION_PROMPTS = {
de: `Translate the following summary into German. Only output the translated summary, no explanation or intro. If it's already in the target language, do nothing but repeat it.
Summary:
{summary}
Translation:`,
jp: `Translate the following summary into Japanese. Only output the translated summary, no explanation or intro. If it's already in the target language, do nothing but repeat it.
Summary:
{summary}
Translation:`
};
if (!invoke || !listen) {
throw new Error('Tauri runtime API is unavailable.');
@@ -21,11 +43,12 @@ function toWebviewFileUrl(filePath) {
window.api = {
getModels: () => invoke('get_models'),
getSummaries: () => invoke('get_summaries'),
summarizeVideo: (url, useWhisper, model) => invoke('summarize_video', {
summarizeVideo: (url, useWhisper, model, masterPrompt) => invoke('summarize_video', {
request: {
url,
useWhisper,
model: model || null
model: model || null,
masterPrompt: masterPrompt || null
}
}),
openExternal: (url) => invoke('open_external', { url }),
@@ -33,16 +56,18 @@ window.api = {
deleteSummary: (id) => invoke('delete_summary', {
request: { id }
}),
translateSummary: (id, lang, model) => invoke('translate_summary', {
translateSummary: (id, lang, model, promptTemplate) => invoke('translate_summary', {
request: {
id,
lang,
model: model || null
model: model || null,
promptTemplate: promptTemplate || null
}
}),
onSummarizeProgress: (callback) => listen('summarize-progress', (event) => {
callback(String(event.payload || ''));
})
}),
onOpenSettings: (callback) => listen('open-settings', callback)
};
window.addEventListener('DOMContentLoaded', async () => {
@@ -56,6 +81,15 @@ window.addEventListener('DOMContentLoaded', async () => {
const paginationBottom = document.getElementById('pagination-bottom');
const summarizeButton = form.querySelector('button[type="submit"]');
const autoTranslateCheckbox = document.getElementById('autotranslate-checkbox');
const settingsDialog = document.getElementById('settings-dialog');
const settingsPanel = settingsDialog.querySelector('.settings-panel');
const settingsCloseButton = document.getElementById('settings-close-button');
const masterPromptTextarea = document.getElementById('master-prompt-textarea');
const resetMasterPromptButton = document.getElementById('reset-master-prompt-button');
const translationPromptDeTextarea = document.getElementById('translation-prompt-de-textarea');
const translationPromptJpTextarea = document.getElementById('translation-prompt-jp-textarea');
const resetTranslationPromptDeButton = document.getElementById('reset-translation-prompt-de-button');
const resetTranslationPromptJpButton = document.getElementById('reset-translation-prompt-jp-button');
let fullSummaries = [];
let currentPage = 1;
@@ -71,8 +105,45 @@ window.addEventListener('DOMContentLoaded', async () => {
loadingIndicator.textContent = message;
}
function getMasterPrompt() {
const savedPrompt = localStorage.getItem('masterPrompt');
if (savedPrompt && savedPrompt.trim()) {
return savedPrompt;
}
return DEFAULT_MASTER_PROMPT;
}
function getTranslationPrompt(lang) {
const savedPrompt = localStorage.getItem(`translationPrompt.${lang}`);
if (savedPrompt && savedPrompt.trim()) {
return savedPrompt;
}
return DEFAULT_TRANSLATION_PROMPTS[lang];
}
function syncSettingsFields() {
whisperCheckbox.checked = localStorage.getItem('useWhisper') === '0' ? false : true;
autoTranslateCheckbox.checked = localStorage.getItem('autoTranslate') === '1' ? true : false;
masterPromptTextarea.value = getMasterPrompt();
translationPromptDeTextarea.value = getTranslationPrompt('de');
translationPromptJpTextarea.value = getTranslationPrompt('jp');
}
function openSettings() {
syncSettingsFields();
settingsDialog.hidden = false;
masterPromptTextarea.focus();
}
function closeSettings() {
settingsDialog.hidden = true;
}
whisperCheckbox.checked = localStorage.getItem('useWhisper') === '0' ? false : true;
autoTranslateCheckbox.checked = localStorage.getItem('autoTranslate') === '1' ? true : false;
masterPromptTextarea.value = getMasterPrompt();
translationPromptDeTextarea.value = getTranslationPrompt('de');
translationPromptJpTextarea.value = getTranslationPrompt('jp');
whisperCheckbox.addEventListener('change', () => {
localStorage.setItem('useWhisper', whisperCheckbox.checked ? '1' : '0');
@@ -80,6 +151,41 @@ window.addEventListener('DOMContentLoaded', async () => {
autoTranslateCheckbox.addEventListener('change', () => {
localStorage.setItem('autoTranslate', autoTranslateCheckbox.checked ? '1' : '0');
});
masterPromptTextarea.addEventListener('input', () => {
localStorage.setItem('masterPrompt', masterPromptTextarea.value);
});
translationPromptDeTextarea.addEventListener('input', () => {
localStorage.setItem('translationPrompt.de', translationPromptDeTextarea.value);
});
translationPromptJpTextarea.addEventListener('input', () => {
localStorage.setItem('translationPrompt.jp', translationPromptJpTextarea.value);
});
resetMasterPromptButton.addEventListener('click', () => {
masterPromptTextarea.value = DEFAULT_MASTER_PROMPT;
localStorage.setItem('masterPrompt', DEFAULT_MASTER_PROMPT);
masterPromptTextarea.focus();
});
resetTranslationPromptDeButton.addEventListener('click', () => {
translationPromptDeTextarea.value = DEFAULT_TRANSLATION_PROMPTS.de;
localStorage.setItem('translationPrompt.de', DEFAULT_TRANSLATION_PROMPTS.de);
translationPromptDeTextarea.focus();
});
resetTranslationPromptJpButton.addEventListener('click', () => {
translationPromptJpTextarea.value = DEFAULT_TRANSLATION_PROMPTS.jp;
localStorage.setItem('translationPrompt.jp', DEFAULT_TRANSLATION_PROMPTS.jp);
translationPromptJpTextarea.focus();
});
settingsCloseButton.addEventListener('click', closeSettings);
settingsDialog.addEventListener('click', (event) => {
if (!settingsPanel.contains(event.target)) {
closeSettings();
}
});
window.addEventListener('keydown', (event) => {
if (event.key === 'Escape' && !settingsDialog.hidden) {
closeSettings();
}
});
function renderSummaries(list) {
summariesContainer.innerHTML = '';
@@ -345,6 +451,8 @@ window.addEventListener('DOMContentLoaded', async () => {
.replace(/^## (.+)$/gm, '<h2>$1</h2>')
.replace(/^# (.+)$/gm, '<h1>$1</h1>');
escaped = escaped.replace(/^---$/gm, '<hr>');
escaped = escaped.replace(
/(^|\n)([ \t]*\* .+(?:\n[ \t]*\* .+)*)/g,
(_, lead, listBlock) => {
@@ -525,7 +633,7 @@ window.addEventListener('DOMContentLoaded', async () => {
setActionLinksDisabled(true);
const selectedModel = modelSelect.value;
window.api.summarizeVideo(url, useWhisper, selectedModel)
window.api.summarizeVideo(url, useWhisper, selectedModel, getMasterPrompt())
.then((newEntry) => {
if (!newEntry || !newEntry.id) {
return window.api.getSummaries().then(setSummaries);
@@ -539,10 +647,10 @@ window.addEventListener('DOMContentLoaded', async () => {
let translationsOk = true;
setLoadingMessage('Translating to German (DE)…');
return window.api.translateSummary(newEntry.id, 'de', selectedModel)
return window.api.translateSummary(newEntry.id, 'de', selectedModel, getTranslationPrompt('de'))
.then(() => {
setLoadingMessage('Translating to Japanese (JP)…');
return window.api.translateSummary(newEntry.id, 'jp', selectedModel);
return window.api.translateSummary(newEntry.id, 'jp', selectedModel, getTranslationPrompt('jp'));
})
.catch(err => {
translationsOk = false;
@@ -575,4 +683,5 @@ window.addEventListener('DOMContentLoaded', async () => {
}
setLoadingMessage(line);
});
window.api.onOpenSettings(openSettings);
});


@@ -35,8 +35,10 @@ import re
import time
import json
import glob
import math
import subprocess
import multiprocessing
import threading
import requests
import yt_dlp
import webvtt
@@ -62,6 +64,17 @@ NUM_SLICES = 8
OVERLAP_SEC = 1
MAX_OVERLAP_WORDS = 7
WHISPER_MODEL = "small" # e.g. "small", "medium", "large-v3" …
OLLAMA_CHARS_PER_TOKEN = 3.5
OLLAMA_OUTPUT_TOKEN_BUDGET = 2048
OLLAMA_CONTEXT_BUCKETS = (4096, 8192, 16384, 32768, 65536)
DEFAULT_SUMMARY_PROMPT_TEMPLATE = """You are an expert summarizer. Summarize the following video concisely:
Title: {title}
Transcript:
{transcript}
Summary:"""
def debug_print(*args, **kwargs):
@@ -70,6 +83,30 @@ def debug_print(*args, **kwargs):
print("[DEBUG]", *args, **kwargs, file=sys.stderr)
class ProgressHeartbeat:
"""Emit periodic progress while a blocking backend operation is active."""
def __init__(self, message_fn, interval: float = 15.0):
self.message_fn = message_fn
self.interval = interval
self._stop = threading.Event()
self._thread = threading.Thread(target=self._run, daemon=True)
def __enter__(self):
self._thread.start()
return self
def __exit__(self, exc_type, exc, tb):
self._stop.set()
self._thread.join(timeout=1.0)
def _run(self):
while not self._stop.wait(self.interval):
message = self.message_fn()
if message:
print(message, flush=True)
def get_ffmpeg_binary() -> str:
"""Return the ffmpeg executable path, preferring a bundled override."""
value = os.environ.get("YTS_FFMPEG", "").strip()
@@ -82,6 +119,31 @@ def get_ffprobe_binary() -> str:
return value or "ffprobe"
def get_ffmpeg_directory() -> Optional[str]:
"""Return the directory containing the configured ffmpeg binary."""
value = os.environ.get("YTS_FFMPEG", "").strip()
if not value:
return None
if os.path.isfile(value):
return os.path.dirname(value)
return value
def get_yt_dlp_ffmpeg_location() -> Optional[str]:
"""Return an ffmpeg location suitable for yt_dlp postprocessors."""
return get_ffmpeg_directory()
def ensure_ffmpeg_on_path() -> None:
"""Expose bundled ffmpeg to libraries that shell out to plain `ffmpeg`."""
ffmpeg_dir = get_ffmpeg_directory()
if not ffmpeg_dir:
return
path_entries = [entry for entry in os.environ.get("PATH", "").split(os.pathsep) if entry]
if ffmpeg_dir not in path_entries:
os.environ["PATH"] = os.pathsep.join([ffmpeg_dir, *path_entries])
def get_whisper_download_root() -> Optional[str]:
"""Return a stable Whisper cache directory when one is configured."""
value = os.environ.get("YTS_WHISPER_CACHE_DIR", "").strip()
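
Review note on `ensure_ffmpeg_on_path`: the hunk above prepends the bundled ffmpeg directory to `PATH` only when it is not already listed, so repeated calls from `download_video_audio`, `transcribe_slice` and `whisper_transcript` should not grow the variable. A minimal sketch of that idempotence, using a hypothetical helper `prepend_to_path` that works on a plain dict instead of `os.environ`, with made-up directory names:

```python
import os


def prepend_to_path(directory, environ):
    """Hypothetical helper mirroring the PATH logic of ensure_ffmpeg_on_path."""
    entries = [e for e in environ.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        environ["PATH"] = os.pathsep.join([directory, *entries])


# Illustrative environment; these directories are assumptions, not app paths.
env = {"PATH": os.pathsep.join(["/usr/bin", "/bin"])}

prepend_to_path("/opt/app/ffmpeg", env)
after_first_call = env["PATH"]

prepend_to_path("/opt/app/ffmpeg", env)  # second call changes nothing
after_second_call = env["PATH"]

print(after_first_call)
```

The membership test is a plain string comparison, so two spellings of the same directory (for example with and without a trailing separator) would both end up on `PATH`.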
@@ -286,6 +348,9 @@ def _download_audio_with_yt_dlp(url: str, vid: str, extractor_args: Optional[dic
}
if extractor_args:
opts["extractor_args"] = extractor_args
ffmpeg_location = get_yt_dlp_ffmpeg_location()
if ffmpeg_location:
opts["ffmpeg_location"] = ffmpeg_location
with yt_dlp.YoutubeDL(opts) as ydl:
ydl.download([url])
if not os.path.exists(audio_fn):
@@ -295,7 +360,8 @@ def _download_audio_with_yt_dlp(url: str, vid: str, extractor_args: Optional[dic
def download_video_audio(url: str, vid: str) -> str:
"""Download the best available audio for a YouTube video."""
print(f"📥 Downloading audio from {url}")
ensure_ffmpeg_on_path()
print(f"Downloading audio from {url} ...")
# Clean up any stale partials that can trigger HTTP 416 resume errors.
_cleanup_audio_artifacts(vid)
@@ -322,16 +388,63 @@ def download_video_audio(url: str, vid: str) -> str:
def get_audio_duration(path: str) -> float:
"""Return the duration of an audio file using ffprobe."""
res = subprocess.run([
try:
file_size = os.path.getsize(path)
except OSError:
file_size = None
ffprobe_args = [
get_ffprobe_binary(), "-v", "error", "-show_entries", "format=duration",
"-of", "default=noprint_wrappers=1:nokey=1", path
], capture_output=True, text=True)
return float(res.stdout.strip())
]
res = subprocess.run(
ffprobe_args,
capture_output=True,
text=True,
encoding="utf-8",
errors="replace",
)
print(f"[ffprobe] command: {ffprobe_args}", file=sys.stderr, flush=True)
print(f"[ffprobe] target: {path}", file=sys.stderr, flush=True)
print(f"[ffprobe] target size bytes: {file_size}", file=sys.stderr, flush=True)
print(f"[ffprobe] returncode: {res.returncode}", file=sys.stderr, flush=True)
print("[ffprobe] stdout raw begin", file=sys.stderr, flush=True)
if res.stdout:
sys.stderr.write(res.stdout)
if not res.stdout.endswith("\n"):
sys.stderr.write("\n")
sys.stderr.flush()
else:
print("<empty>", file=sys.stderr, flush=True)
print("[ffprobe] stdout raw end", file=sys.stderr, flush=True)
print("[ffprobe] stderr raw begin", file=sys.stderr, flush=True)
if res.stderr:
sys.stderr.write(res.stderr)
if not res.stderr.endswith("\n"):
sys.stderr.write("\n")
sys.stderr.flush()
else:
print("<empty>", file=sys.stderr, flush=True)
print("[ffprobe] stderr raw end", file=sys.stderr, flush=True)
stdout_value = res.stdout.strip()
try:
return float(stdout_value)
except ValueError:
match = re.search(r"[-+]?\d+(?:[.,]\d+)?", stdout_value or res.stderr)
if match:
return float(match.group(0).replace(",", "."))
raise RuntimeError(
"ffprobe did not return a parseable duration "
f"(returncode={res.returncode}, stdout={stdout_value!r}, stderr={res.stderr.strip()!r})"
)
def slice_audio(audio_path: str, vid: str) -> List[Tuple[str, float, float]]:
"""Slice a long audio file into overlapping chunks for Whisper."""
print("Slicing audio ")
print("Slicing audio ...")
duration = get_audio_duration(audio_path)
length = duration / NUM_SLICES
slices = []
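
Review note on the new duration parsing in `get_audio_duration`: after `float()` fails, the code searches stdout (or stderr when stdout is empty) for the first number and accepts a comma as decimal separator. The sketch below isolates that logic in a hypothetical function `parse_duration`; every input string is invented for illustration and is not real ffprobe output.

```python
import re


def parse_duration(stdout_text, stderr_text=""):
    """Hypothetical extraction of the fallback parsing shown in the hunk above."""
    value = stdout_text.strip()
    try:
        return float(value)
    except ValueError:
        # Same pattern as the diff: optional sign, digits, optional . or , fraction.
        match = re.search(r"[-+]?\d+(?:[.,]\d+)?", value or stderr_text)
        if match:
            return float(match.group(0).replace(",", "."))
        raise RuntimeError(f"no parseable duration in {value!r}")


plain = parse_duration("123.456\n")           # normal case, float() succeeds
comma = parse_duration("123,456\n")           # locale-style decimal comma
noisy = parse_duration("duration=98.5\n")     # number embedded in other text
from_stderr = parse_duration("", "error 2: file not found")  # invented message

print(plain, comma, noisy, from_stderr)
```

The last call shows a side effect worth weighing: when stdout is empty, any digit in stderr is taken as the duration, so an error message can be read as a very short clip instead of raising.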
@@ -351,6 +464,7 @@ def slice_audio(audio_path: str, vid: str) -> List[Tuple[str, float, float]]:
def transcribe_slice(args: Tuple[str, int, str, str]) -> str:
"""Transcribe a single audio slice using Whisper and save to a text file."""
ensure_ffmpeg_on_path()
slice_path, idx, model_name, vid = args
if whisper is None:
raise RuntimeError("Whisper package is required but not installed")
@@ -398,9 +512,10 @@ def clean_temp(pattern: str) -> None:
def whisper_transcript(url: str, vid: str) -> str:
"""Run the Whisper pipeline and return the final transcript text."""
ensure_ffmpeg_on_path()
audio = download_video_audio(url, vid)
slices = slice_audio(audio, vid)
print("✍️ Transcribing using Whisper...", flush=True)
print("Transcribing using Whisper...", flush=True)
args = [(p, i, WHISPER_MODEL, vid) for i, (p, _, _) in enumerate(slices)]
with multiprocessing.Pool(len(slices)) as pool:
t_files = pool.map(transcribe_slice, args)
@@ -415,17 +530,36 @@ def whisper_transcript(url: str, vid: str) -> str:
# OllamaSummarizer
# -----------------------
def summarize_with_ollama(title: str, transcript: str, model: str = "mistral:latest") -> str:
def render_summary_prompt(title: str, transcript: str, prompt_template: Optional[str] = None) -> str:
template = (prompt_template or DEFAULT_SUMMARY_PROMPT_TEMPLATE).strip()
prompt = template.replace("{title}", title).replace("{transcript}", transcript)
if "{title}" not in template:
prompt = f"{prompt}\n\nTitle: {title}"
if "{transcript}" not in template:
prompt = f"{prompt}\n\nTranscript:\n{transcript}"
return prompt
def choose_ollama_num_ctx(prompt: str, output_budget: int = OLLAMA_OUTPUT_TOKEN_BUDGET) -> int:
estimated_input_tokens = math.ceil(len(prompt) / OLLAMA_CHARS_PER_TOKEN)
needed_tokens = estimated_input_tokens + output_budget
for bucket in OLLAMA_CONTEXT_BUCKETS:
if needed_tokens <= bucket:
return bucket
return OLLAMA_CONTEXT_BUCKETS[-1]
def summarize_with_ollama(
title: str,
transcript: str,
model: str = "mistral:latest",
prompt_template: Optional[str] = None,
) -> str:
"""
Send video title and transcript text to Ollama and return the summary string.
"""
debug_print(f"Preparing summary with model {model}, transcript length={len(transcript)}")
prompt = (
"You are an expert summarizer. Summarize the following video concisely:\n\n"
f"Title: {title}\n\n"
f"Transcript:\n{transcript}\n\n"
"Summary:"
)
prompt = render_summary_prompt(title, transcript, prompt_template)
debug_print(prompt)
payload = {
"model": model,
@@ -433,21 +567,50 @@ def summarize_with_ollama(title: str, transcript: str, model: str = "mistral:lat
{"role": "system", "content": "You are an intelligent summarizer."},
{"role": "user", "content": prompt}
],
"options": {
"num_ctx": choose_ollama_num_ctx(prompt)
},
"stream": True
}
debug_print("Sending request to Ollama ")
resp = requests.post("http://localhost:11434/api/chat", json=payload, stream=True)
debug_print(f"Ollama status: {resp.status_code}")
debug_print("Sending request to Ollama ...")
summary = ""
for line in resp.iter_lines(decode_unicode=True):
if not line:
continue
try:
msg = json.loads(line).get("message", {}).get("content", "")
summary += msg
except Exception:
continue
last_progress_chars = 0
def heartbeat_message() -> str:
if summary:
return f"Ollama is generating summary... {len(summary)} characters received."
return "Waiting for Ollama to start responding..."
try:
with ProgressHeartbeat(heartbeat_message):
resp = requests.post(
"http://localhost:11434/api/chat",
json=payload,
stream=True,
timeout=(10, 1800),
)
debug_print(f"Ollama status: {resp.status_code}")
resp.raise_for_status()
for line in resp.iter_lines(decode_unicode=True):
if not line:
continue
try:
msg = json.loads(line).get("message", {}).get("content", "")
summary += msg
if len(summary) - last_progress_chars >= 1000:
last_progress_chars = len(summary)
print(
f"Ollama is generating summary... {last_progress_chars} characters received.",
flush=True,
)
except Exception:
continue
except requests.RequestException as exc:
raise RuntimeError(f"Ollama request failed: {exc}") from exc
if not summary.strip():
raise RuntimeError("Ollama returned an empty summary.")
debug_print(f"Summary generated, length={len(summary)}")
print("Summary generated.", flush=True)
return summary
@@ -517,7 +680,13 @@ def download_thumbnail(vid: str, thumbnail_url: str) -> Optional[str]:
# Main
# -----------------------
def process_video(url: str, use_whisper: bool, model: str = "mistral:latest", output_json: Optional[str] = None) -> dict:
def process_video(
url: str,
use_whisper: bool,
model: str = "mistral:latest",
output_json: Optional[str] = None,
prompt_template: Optional[str] = None,
) -> dict:
"""
Core processing routine. Retrieves metadata, obtains transcript via the
selected workflow, generates a summary using Ollama and writes the
@@ -552,17 +721,17 @@ def process_video(url: str, use_whisper: bool, model: str = "mistral:latest", ou
# Fetch transcript
if use_whisper:
print("🤖 Using Whisper parallel transcription")
print("Using Whisper parallel transcription...")
transcript_text = whisper_transcript(url, vid)
if not transcript_text.strip():
raise SystemExit("Whisper transcription failed or empty.")
else:
print("▶️ Using classic API/subtitle workflow")
print("Using classic API/subtitle workflow...")
# Try API first
try:
transcript_text = get_transcript_api(vid)
except Exception:
print("API failed, falling back to subtitles")
print("API failed, falling back to subtitles...")
transcript_text = get_subtitles_via_yt_dlp(url)
if not transcript_text:
raise SystemExit("No transcript/subtitles available.")
@@ -601,8 +770,8 @@ def process_video(url: str, use_whisper: bool, model: str = "mistral:latest", ou
audio_filename = None
# Generate summary
print("✍️ Generating summary with Ollama", flush=True)
summary_text = summarize_with_ollama(title, transcript_text, model)
print("Generating summary with Ollama...", flush=True)
summary_text = summarize_with_ollama(title, transcript_text, model, prompt_template)
# Create metadata dictionary
meta = {
@@ -625,7 +794,13 @@ def process_video(url: str, use_whisper: bool, model: str = "mistral:latest", ou
return meta
def rewrite_summary(title: str, transcript_file: str, model: str = "mistral:latest", output_json: Optional[str] = None) -> dict:
def rewrite_summary(
title: str,
transcript_file: str,
model: str = "mistral:latest",
output_json: Optional[str] = None,
prompt_template: Optional[str] = None,
) -> dict:
"""
Regenerate a summary from an existing transcript file using the specified model.
@@ -648,7 +823,7 @@ def rewrite_summary(title: str, transcript_file: str, model: str = "mistral:late
with open(transcript_file, 'r', encoding='utf-8') as f:
transcript_text = f.read()
debug_print(f"Rewriting summary using model {model} for {transcript_file}")
summary_text = summarize_with_ollama(title, transcript_text, model)
summary_text = summarize_with_ollama(title, transcript_text, model, prompt_template)
meta = {'summary': summary_text}
if output_json:
with open(output_json, 'w', encoding='utf-8') as f:
@@ -669,17 +844,25 @@ def main():
help="Ollama model to use for summarization (default: mistral:latest)")
parser.add_argument('--transcript-file', type=str, default=None,
help="Path to an existing transcript file; when provided the script will skip transcription and only generate a summary.")
parser.add_argument('--prompt-template', type=str, default=None,
help="Prompt template for the summary LLM call.")
parser.add_argument('--prompt-template-file', type=str, default=None,
help="Path to a text file containing the prompt template.")
args = parser.parse_args()
use_whisper = not args.no_ai
prompt_template = args.prompt_template
if args.prompt_template_file:
with open(args.prompt_template_file, 'r', encoding='utf-8') as f:
prompt_template = f.read()
try:
# If a transcript file is provided, skip the normal processing and only rewrite summary
if args.transcript_file:
vid, title, _ = fetch_video_metadata(args.url)
meta = rewrite_summary(title, args.transcript_file, args.model, args.output_json)
meta = rewrite_summary(title, args.transcript_file, args.model, args.output_json, prompt_template)
else:
meta = process_video(args.url, use_whisper, args.model, args.output_json)
meta = process_video(args.url, use_whisper, args.model, args.output_json, prompt_template)
# If no JSON output specified, print metadata as JSON to stdout
if not args.output_json:
print(json.dumps(meta, ensure_ascii=False, indent=2))
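
Review note on `ProgressHeartbeat`: the class added in this commit runs a daemon thread that calls `message_fn` every `interval` seconds until the `with` block exits. The sketch below copies the class from the diff and wraps a short `time.sleep` that stands in for a blocking Ollama request. The `status` callback, the 0.05 s interval and the sleep durations are assumptions chosen to keep the example fast.

```python
import threading
import time


class ProgressHeartbeat:
    """Emit periodic progress while a blocking operation is active (copied from the diff)."""

    def __init__(self, message_fn, interval: float = 15.0):
        self.message_fn = message_fn
        self.interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._stop.set()
        self._thread.join(timeout=1.0)

    def _run(self):
        # Event.wait returns False on timeout, True once stop is set.
        while not self._stop.wait(self.interval):
            message = self.message_fn()
            if message:
                print(message, flush=True)


calls = []


def status():
    # Hypothetical callback: records each tick and returns the line to print.
    calls.append(time.monotonic())
    return f"still working... tick {len(calls)}"


with ProgressHeartbeat(status, interval=0.05):
    time.sleep(0.3)  # stand-in for the blocking request

ticks_during_work = len(calls)
time.sleep(0.15)     # the heartbeat should stay silent after the block exits
ticks_after_exit = len(calls)
```

Because the first tick only fires after one full interval, an operation shorter than `interval` produces no heartbeat output at all, which matches the 15-second default used for the Ollama call.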