Prepare publishable Tauri app

2026-05-04 10:13:03 +02:00
parent dd135a6089
commit ff66d0aea3
20 changed files with 6386 additions and 159 deletions

44
.gitignore vendored Normal file
@@ -0,0 +1,44 @@
venv
.venv
sync.sh
summaries.db
.env
node_modules
data
package-lock.json
dump_summaries.sql
.DS_Store
.git
__pycache__
.mypy_cache
.pytest_cache
dist
build
out
.next
.nuxt
.turbo
.parcel-cache
.cache
target
src-tauri/target
src-tauri/gen
coverage
logs
tmp
temp
output
tmp*
*.log
*.tmp
*.swp
*.lock
*.exe
Cargo.lock
!src-tauri/Cargo.lock
src-tauri/resources/backend/*
!src-tauri/resources/backend/.gitkeep
src-tauri/resources/ffmpeg/*
!src-tauri/resources/ffmpeg/.gitkeep

26
MIGRATION_NOTES.md Normal file
@@ -0,0 +1,26 @@
# Migration Notes
## What Was Preserved
- Static frontend design from `ui/index.html`, including the rose color palette, compact header form, list layout, collapsed summary previews and pagination.
- Frontend behavior from `ui/renderer.js`: model loading, local UI preferences, Whisper toggle, auto-translation toggle, per-entry language tabs, delete confirmation, progress updates, expandable summaries and thumbnail external links.
- Tauri command surface: `get_models`, `get_summaries`, `summarize_video`, `delete_summary`, `translate_summary`, `open_external` and `open_file`.
- Local runtime model: SQLite history in the OS app local data directory, media under that data directory, Ollama on `localhost:11434`, and Python helpers for YouTube metadata, transcripts, Whisper, summaries and translation.
- Release bundling path: a PyInstaller-built backend sidecar plus copied `ffmpeg` and `ffprobe` resources under `src-tauri/resources`.
## Electron Reality Check
No active Electron app was present in the source snapshot used for this migration. There was no Electron main process, preload script, `ipcMain`/`ipcRenderer` bridge, `BrowserWindow` setup, `package.json` or Electron build configuration. The working desktop shell was already Tauri 2, so this folder packages that actual implementation as a standalone Tauri project rather than inventing behavior from missing Electron files.
## Important Runtime Details
- The Tauri identifier remains `com.victorgiers.youtube-summarizer` so OS-level app data and history stay aligned with the existing app identity.
- `run.sh` and `run.bat` now change into this folder before creating the Python environment or launching Cargo.
- The frontend still uses `window.__TAURI__` because `withGlobalTauri` is enabled in `src-tauri/tauri.conf.json`.
- Development falls back to local Python scripts when no bundled backend sidecar exists.
## Imported Legacy Data
- The old Electron database from `/Users/giers/Tools/victors-tools/youtube_summarizer/summaries.db` was copied into the Tauri runtime data directory at `/Users/giers/Library/Application Support/com.victorgiers.youtube-summarizer/summaries.db`.
- A copy also exists at `summaries.db` in this folder for local migration reference.
- Thumbnail files from the old `data/` folder were copied so historical entries keep their images. Audio and transcript files were not copied because the Tauri runtime clears those artifact references on startup.
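The startup cleanup mentioned in the last bullet can be pictured with a small sketch. It is written in Python for illustration only (the real cleanup lives in the Rust shell), the function name `clear_artifact_references` is hypothetical, and the reduced schema with a `thumbnail` column is an assumption; only the `summaries` table and its `id`, `audio` and `transcript` columns are taken from the cleanup query in this commit.

```python
import sqlite3


def clear_artifact_references(conn: sqlite3.Connection) -> int:
    """Null out audio/transcript paths on every row that still has one.

    Returns the number of rows that were updated. Thumbnails are left alone,
    which is why historical entries keep their images.
    """
    cursor = conn.execute(
        "UPDATE summaries SET audio = NULL, transcript = NULL "
        "WHERE audio IS NOT NULL OR transcript IS NOT NULL"
    )
    conn.commit()
    return cursor.rowcount


if __name__ == "__main__":
    # In-memory database with an assumed, reduced schema.
    demo = sqlite3.connect(":memory:")
    demo.execute(
        "CREATE TABLE summaries ("
        "id INTEGER PRIMARY KEY, thumbnail TEXT, audio TEXT, transcript TEXT)"
    )
    demo.execute(
        "INSERT INTO summaries (thumbnail, audio, transcript) VALUES (?, ?, ?)",
        ("thumb.jpg", "audio.mp3", "transcript.txt"),
    )
    print("rows cleared:", clear_artifact_references(demo))
```

Because only the two artifact columns are touched, copying audio and transcript files from the old `data/` folder would have had no lasting effect.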

@@ -1,4 +1,4 @@
# YouTube Summarizer
# YouTube Summarizer Tauri
This is a local-first desktop app for summarizing YouTube videos with Ollama.
@@ -9,6 +9,10 @@ It uses:
- Ollama on `localhost` for summarization and translation
- SQLite for local history
## Migration State
This folder is the standalone Tauri version of the app. The repository snapshot this was created from did not contain an active Electron runtime, `package.json`, preload script or Electron main process; the actual app behavior was already represented by a static HTML/CSS/JS frontend, a Tauri 2 Rust shell and Python backend helpers. The migration work here keeps that behavior and design intact inside `ytsummarizer_tauri` so it can be built and run without depending on files outside this folder.
## What It Does
Given a YouTube URL, the app can:
@@ -48,7 +52,7 @@ For development in this repo you still need:
- FFmpeg in `PATH`
- Ollama running locally on `http://localhost:11434`
Python dependencies are listed in [requirements.txt](/Users/giers/youtube_summarizer/requirements.txt).
Python dependencies are listed in [requirements.txt](requirements.txt).
## Run In Development
@@ -73,7 +77,7 @@ pip install -r requirements.txt
cargo run --manifest-path src-tauri/Cargo.toml
```
The app prefers a bundled backend executable when one is present under [src-tauri/resources/backend](/Users/giers/youtube_summarizer/src-tauri/resources/backend), and otherwise falls back to the local Python environment for development.
The app prefers a bundled backend executable when one is present under [src-tauri/resources/backend](src-tauri/resources/backend), and otherwise falls back to the local Python environment for development.
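As a rough illustration of that preference order, here is a minimal Python sketch. The actual resolution happens in the Rust shell; the function name `resolve_backend_command`, the exact precedence of the `YTS_BACKEND_BIN` and `YTS_PYTHON` overrides, and the plain `yts-backend` file name (without any target-triple suffix) are assumptions made for this example.

```python
import os
import sys
from pathlib import Path
from typing import List, Mapping, Optional


def resolve_backend_command(
    project_root: Path,
    env: Optional[Mapping[str, str]] = None,
    exe_name: str = "yts-backend",
) -> List[str]:
    """Return the argv prefix used to start the backend."""
    env = os.environ if env is None else env

    # 1. Explicit override, useful for testing a prebuilt executable.
    override = env.get("YTS_BACKEND_BIN")
    if override and Path(override).is_file():
        return [override]

    # 2. Bundled executable under src-tauri/resources/backend.
    suffix = ".exe" if os.name == "nt" else ""
    bundled = (
        project_root / "src-tauri" / "resources" / "backend" / (exe_name + suffix)
    )
    if bundled.is_file():
        return [str(bundled)]

    # 3. Development fallback: run backend_cli.py with a Python interpreter.
    python = env.get("YTS_PYTHON") or sys.executable
    return [python, str(project_root / "backend_cli.py")]


if __name__ == "__main__":
    print(resolve_backend_command(Path.cwd()))
```

In a fresh checkout with no bundled executable this returns the interpreter plus `backend_cli.py`, which matches the development workflow described above.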
## Build A Shippable Bundle
@@ -93,21 +97,21 @@ cargo tauri build
What `tools/prepare_bundle.py` does:
- installs PyInstaller into the current Python environment
- builds a single-file backend executable from [backend_cli.py](/Users/giers/youtube_summarizer/backend_cli.py)
- copies that executable into [src-tauri/resources/backend](/Users/giers/youtube_summarizer/src-tauri/resources/backend)
- copies `ffmpeg` and `ffprobe` from the build machine into [src-tauri/resources/ffmpeg](/Users/giers/youtube_summarizer/src-tauri/resources/ffmpeg)
- builds a single-file backend executable from [backend_cli.py](backend_cli.py)
- copies that executable into [src-tauri/resources/backend](src-tauri/resources/backend)
- copies `ffmpeg` and `ffprobe` from the build machine into [src-tauri/resources/ffmpeg](src-tauri/resources/ffmpeg)
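
The steps above can be sketched as a plan of commands and file copies. This is not the content of `tools/prepare_bundle.py`; the function name `plan_bundle`, the `yts-backend` output name and the assumption that PyInstaller runs from the project root (so its default `dist/` folder sits next to `backend_cli.py`) are illustrative choices. The sketch only builds the plan and executes nothing.

```python
import sys
from pathlib import Path
from typing import Dict


def plan_bundle(
    root: Path, ffmpeg: Path, ffprobe: Path, python: str = sys.executable
) -> Dict[str, list]:
    """Describe the bundle preparation as commands to run and files to copy."""
    resources = root / "src-tauri" / "resources"
    exe_suffix = ".exe" if sys.platform == "win32" else ""
    backend_exe = "yts-backend" + exe_suffix

    commands = [
        # Install PyInstaller into the current Python environment.
        [python, "-m", "pip", "install", "pyinstaller"],
        # Build a single-file executable from backend_cli.py.
        [python, "-m", "PyInstaller", "--onefile", "--name", "yts-backend",
         str(root / "backend_cli.py")],
    ]
    copies = [
        # (source, destination) pairs; PyInstaller writes to dist/ by default.
        (root / "dist" / backend_exe, resources / "backend" / backend_exe),
        (ffmpeg, resources / "ffmpeg" / ffmpeg.name),
        (ffprobe, resources / "ffmpeg" / ffprobe.name),
    ]
    return {"commands": commands, "copies": copies}


if __name__ == "__main__":
    plan = plan_bundle(Path("."), Path("/usr/bin/ffmpeg"), Path("/usr/bin/ffprobe"))
    for command in plan["commands"]:
        print("run:", " ".join(command))
    for source, destination in plan["copies"]:
        print("copy:", source, "->", destination)
```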
Build once on each target OS you want to ship. For Windows 10, build on Windows.
## Build On GitHub Actions
A Windows build workflow is included at [.github/workflows/windows-installer.yml](/Users/giers/youtube_summarizer/.github/workflows/windows-installer.yml).
A Windows build workflow from the original repository can be pointed at this folder by running the same commands from `ytsummarizer_tauri`.
It runs on `windows-latest`, installs `ffmpeg` and NSIS, prepares the bundled Python backend with [tools/prepare_bundle.py](/Users/giers/youtube_summarizer/tools/prepare_bundle.py), builds an NSIS installer, and uploads the result as a workflow artifact named `windows-installer`.
It should run on `windows-latest`, install `ffmpeg` and NSIS, prepare the bundled Python backend with [tools/prepare_bundle.py](tools/prepare_bundle.py), build an NSIS installer, and upload the result as a workflow artifact named `windows-installer`.
## Notes
- If Python is not on your `PATH` for development, set `YTS_PYTHON` to the interpreter you want the Tauri backend to use.
- If you want to test a prebuilt backend executable during development, set `YTS_BACKEND_BIN` to its full path.
- If `ffmpeg` or `ffprobe` are not on `PATH` during bundle prep, set `YTS_FFMPEG` and `YTS_FFPROBE` to their full paths before running [tools/prepare_bundle.py](/Users/giers/youtube_summarizer/tools/prepare_bundle.py).
- If `ffmpeg` or `ffprobe` are not on `PATH` during bundle prep, set `YTS_FFMPEG` and `YTS_FFPROBE` to their full paths before running [tools/prepare_bundle.py](tools/prepare_bundle.py).
- Generated thumbnails and the SQLite database are created on first run in the app's local data directory.
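
The environment-variable overrides in these notes follow a common pattern: an explicit path wins, otherwise the tool is looked up on `PATH`. A minimal sketch of that lookup, assuming a hypothetical helper named `resolve_tool` (the project's own lookup may differ in details such as validating the override):

```python
import os
import shutil
from typing import Mapping, Optional


def resolve_tool(
    env_var: str, tool_name: str, env: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Return the path for a tool: explicit override first, then PATH lookup."""
    env = os.environ if env is None else env

    explicit = env.get(env_var)
    if explicit:
        return explicit

    # shutil.which returns None when the tool is not found.
    return shutil.which(tool_name, path=env.get("PATH"))


if __name__ == "__main__":
    print("ffmpeg:", resolve_tool("YTS_FFMPEG", "ffmpeg"))
    print("ffprobe:", resolve_tool("YTS_FFPROBE", "ffprobe"))
```

A `None` result is the case where `YTS_FFMPEG` or `YTS_FFPROBE` needs to be set before running the bundle preparation.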

@@ -8,6 +8,7 @@ while still supporting direct Python execution during development.
import argparse
import json
import multiprocessing
import sys
from pathlib import Path
@@ -18,19 +19,38 @@ from youtube_summarizer import process_video
DEFAULT_MODEL = "mistral:latest"
def compact_error_message(exc: BaseException) -> str:
    """Build a short error string without dumping a traceback into the GUI."""
    parts = []
    current = exc
    while current:
        text = " ".join(str(current).split())
        if text and text not in parts:
            parts.append(text)
        current = current.__cause__ or current.__context__
    return ": ".join(parts) or exc.__class__.__name__
def configure_stdio() -> None:
"""Keep progress output line-buffered for the desktop app."""
if hasattr(sys.stdout, "reconfigure"):
sys.stdout.reconfigure(line_buffering=True)
sys.stdout.reconfigure(encoding="utf-8", errors="replace", line_buffering=True)
if hasattr(sys.stderr, "reconfigure"):
sys.stderr.reconfigure(line_buffering=True)
sys.stderr.reconfigure(encoding="utf-8", errors="replace", line_buffering=True)
def summarize(args: argparse.Namespace) -> int:
prompt_template = None
if args.prompt_template_file:
prompt_template = Path(args.prompt_template_file).read_text(encoding="utf-8")
elif args.prompt_template:
prompt_template = args.prompt_template
meta = process_video(
args.url,
use_whisper=args.use_whisper,
model=args.model,
prompt_template=prompt_template,
output_json=args.output_json,
)
if not args.output_json:
@@ -44,7 +64,13 @@ def translate(args: argparse.Namespace) -> int:
if not summary_text:
raise SystemExit("Empty summary text!")
translation = translate_summary_text(summary_text, args.lang, args.model)
prompt_template = None
if args.prompt_template_file:
prompt_template = Path(args.prompt_template_file).read_text(encoding="utf-8")
elif args.prompt_template:
prompt_template = args.prompt_template
translation = translate_summary_text(summary_text, args.lang, args.model, prompt_template)
if args.output_file:
Path(args.output_file).write_text(translation, encoding="utf-8")
@@ -60,6 +86,8 @@ def build_parser() -> argparse.ArgumentParser:
summarize_parser = subparsers.add_parser("summarize", help="Summarize a YouTube video")
summarize_parser.add_argument("--url", required=True, help="YouTube video URL")
summarize_parser.add_argument("--model", default=DEFAULT_MODEL, help="Ollama model to use")
summarize_parser.add_argument("--prompt-template", help="Prompt template for the summary LLM call")
summarize_parser.add_argument("--prompt-template-file", help="Path to a prompt template file")
summarize_parser.add_argument(
"--no-whisper",
dest="use_whisper",
@@ -76,6 +104,8 @@ def build_parser() -> argparse.ArgumentParser:
translate_parser.add_argument("--summary-file", required=True, help="Path to the English summary text")
translate_parser.add_argument("--lang", required=True, choices=["de", "jp"], help="Target language")
translate_parser.add_argument("--model", default=DEFAULT_MODEL, help="Ollama model to use")
translate_parser.add_argument("--prompt-template", help="Prompt template for the translation LLM call")
translate_parser.add_argument("--prompt-template-file", help="Path to a translation prompt template file")
translate_parser.add_argument("--output-file", help="Optional path to write the translated text")
translate_parser.set_defaults(handler=translate)
@@ -83,10 +113,18 @@ def build_parser() -> argparse.ArgumentParser:
def main() -> int:
multiprocessing.freeze_support()
configure_stdio()
parser = build_parser()
args = parser.parse_args()
try:
return args.handler(args)
except KeyboardInterrupt:
print("[error] Cancelled.", file=sys.stderr, flush=True)
return 130
except Exception as exc:
print(f"[error] {compact_error_message(exc)}", file=sys.stderr, flush=True)
return 1
if __name__ == "__main__":

@@ -1,5 +1,6 @@
@echo off
setlocal
cd /d "%~dp0"
REM 1. Check whether the venv exists; create it otherwise
if not exist venv (

2
run.sh
@@ -1,6 +1,8 @@
#!/usr/bin/env bash
set -e
cd "$(dirname "$0")"
# 1. Set up the Python venv
GREEN="\033[0;32m"
CYAN="\033[0;36m"

5392
src-tauri/Cargo.lock generated Normal file

File diff suppressed because it is too large.

BIN
src-tauri/icons/128x128.png Normal file

Binary file not shown. Size: 9.2 KiB

Binary file not shown. Size: 18 KiB

BIN
src-tauri/icons/32x32.png Normal file

Binary file not shown. Size: 4.2 KiB

BIN
src-tauri/icons/icon.icns Normal file

Binary file not shown.

BIN
src-tauri/icons/icon.ico Normal file

Binary file not shown. Size: 49 KiB

Binary file not shown. Size before: 1.3 KiB. Size after: 44 KiB

@@ -1,8 +1,9 @@
#![cfg_attr(not(debug_assertions), windows_subsystem = "windows")]
use std::{
env, fs,
io::{BufRead, BufReader, ErrorKind},
env,
fs::{self, OpenOptions},
io::{BufRead, BufReader, ErrorKind, Write},
path::{Path, PathBuf},
process::{Command, Stdio},
sync::{Arc, Mutex},
@@ -10,16 +11,22 @@ use std::{
time::{SystemTime, UNIX_EPOCH},
};
#[cfg(target_os = "windows")]
use std::os::windows::process::CommandExt;
use open::that;
use reqwest::blocking::Client;
use rusqlite::{params, Connection, OptionalExtension};
use serde::{Deserialize, Serialize};
use tauri::{AppHandle, Emitter, Manager, State, WebviewWindow};
use tauri::menu::{MenuBuilder, SubmenuBuilder};
use tauri::{path::BaseDirectory, AppHandle, Emitter, Manager, State, WebviewWindow};
const DEFAULT_MODEL: &str = "mistral:latest";
const OLLAMA_TAGS_URL: &str = "http://localhost:11434/api/tags";
const BACKEND_EXECUTABLE_NAME: &str = "yts-backend";
const TARGET_TRIPLE: &str = env!("TAURI_BUILD_TARGET");
#[cfg(target_os = "windows")]
const CREATE_NO_WINDOW: u32 = 0x0800_0000;
#[derive(Clone)]
enum BackendRuntime {
@@ -37,6 +44,7 @@ struct AppState {
app_dir: PathBuf,
media_dir: PathBuf,
db_path: PathBuf,
backend_log_path: PathBuf,
backend: BackendRuntime,
ffmpeg_path: Option<PathBuf>,
ffprobe_path: Option<PathBuf>,
@@ -49,6 +57,7 @@ struct SummarizeVideoRequest {
url: String,
use_whisper: bool,
model: Option<String>,
master_prompt: Option<String>,
}
#[derive(Debug, Deserialize)]
@@ -57,10 +66,12 @@ struct DeleteSummaryRequest {
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
struct TranslateSummaryRequest {
id: i64,
lang: String,
model: Option<String>,
prompt_template: Option<String>,
}
#[derive(Debug, Deserialize)]
@@ -165,6 +176,12 @@ fn normalize_model(model: Option<String>) -> String {
.unwrap_or_else(|| DEFAULT_MODEL.to_string())
}
fn normalize_prompt_template(prompt: Option<String>) -> Option<String> {
prompt
.map(|value| value.trim().to_string())
.filter(|value| !value.is_empty())
}
fn now_millis() -> u128 {
SystemTime::now()
.duration_since(UNIX_EPOCH)
@@ -190,15 +207,16 @@ fn platform_executable_name(base_name: &str) -> String {
fn resolve_resource_file(app: &AppHandle, relative_path: &Path) -> Option<PathBuf> {
let mut candidates = Vec::new();
if let Ok(resource_dir) = app.path().resource_dir() {
candidates.push(resource_dir.join(relative_path));
if let Ok(resource_path) = app.path().resolve(relative_path, BaseDirectory::Resource) {
candidates.push(resource_path);
}
candidates.push(
PathBuf::from(env!("CARGO_MANIFEST_DIR"))
.join("resources")
.join(relative_path),
);
let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
candidates.push(manifest_dir.join(relative_path));
candidates.push(manifest_dir.join("resources").join(relative_path));
if let Ok(project_root) = resolve_project_root() {
candidates.push(project_root.join(relative_path));
}
candidates.into_iter().find(|path| path.exists())
}
@@ -218,9 +236,9 @@ fn resolve_backend_binary(app: &AppHandle) -> Option<PathBuf> {
}
fn resolve_script_dir(app: &AppHandle) -> Result<PathBuf, String> {
if let Ok(resource_dir) = app.path().resource_dir() {
if resource_dir.join("backend_cli.py").exists() {
return Ok(resource_dir);
if let Some(resource_file) = resolve_resource_file(app, Path::new("backend_cli.py")) {
if let Some(parent) = resource_file.parent() {
return Ok(parent.to_path_buf());
}
}
@@ -292,6 +310,21 @@ fn resolve_whisper_cache_dir(app: &AppHandle) -> Result<PathBuf, String> {
Ok(whisper_cache_dir)
}
fn resolve_log_dir(app: &AppHandle) -> Result<PathBuf, String> {
let log_dir = app
.path()
.app_log_dir()
.or_else(|_| {
app.path()
.app_local_data_dir()
.map(|path| path.join("logs"))
})
.map_err(|err| format!("Failed to resolve application log directory: {err}"))?;
fs::create_dir_all(&log_dir)
.map_err(|err| format!("Failed to create application log directory: {err}"))?;
Ok(log_dir)
}
fn open_connection(state: &AppState) -> Result<Connection, String> {
Connection::open(&state.db_path).map_err(|err| format!("Failed to open SQLite database: {err}"))
}
@@ -341,7 +374,9 @@ fn cleanup_artifacts(state: &AppState, audio: Option<&str>, transcript: Option<&
fn purge_existing_artifacts(state: &AppState) -> Result<(), String> {
let db = open_connection(state)?;
let mut stmt = db
.prepare("SELECT id, audio, transcript FROM summaries WHERE audio IS NOT NULL OR transcript IS NOT NULL")
.prepare(
"SELECT id, audio, transcript FROM summaries WHERE audio IS NOT NULL OR transcript IS NOT NULL",
)
.map_err(|err| format!("Failed to prepare artifact cleanup query: {err}"))?;
let rows = stmt
@@ -372,6 +407,30 @@ fn purge_existing_artifacts(state: &AppState) -> Result<(), String> {
Ok(())
}
fn write_startup_error_log(app: &AppHandle, message: &str) {
let mut candidates = Vec::new();
if let Ok(path) = app.path().app_log_dir() {
candidates.push(path);
}
if let Ok(path) = app.path().app_local_data_dir() {
candidates.push(path);
}
candidates.push(env::temp_dir().join("youtube-summarizer"));
for directory in candidates {
if fs::create_dir_all(&directory).is_ok() {
let log_path = directory.join("startup-error.log");
if fs::write(&log_path, message).is_ok() {
eprintln!("Startup failure written to {}", log_path.display());
return;
}
}
}
eprintln!("{message}");
}
fn ensure_app_state(app: &AppHandle) -> Result<AppState, String> {
let app_dir = app
.path()
@@ -386,6 +445,7 @@ fn ensure_app_state(app: &AppHandle) -> Result<AppState, String> {
ffmpeg_path: resolve_optional_tool_path(app, "YTS_FFMPEG", "ffmpeg"),
ffprobe_path: resolve_optional_tool_path(app, "YTS_FFPROBE", "ffprobe"),
whisper_cache_dir: resolve_whisper_cache_dir(app)?,
backend_log_path: resolve_log_dir(app)?.join("backend.log"),
app_dir: app_dir.clone(),
media_dir,
db_path: app_dir.join("summaries.db"),
@@ -403,8 +463,47 @@ fn emit_progress(app: &AppHandle, window_label: &str, line: &str) {
}
}
fn append_backend_log(log_path: &Path, line: &str) {
if let Ok(mut file) = OpenOptions::new().create(true).append(true).open(log_path) {
let _ = writeln!(file, "{line}");
}
}
fn backend_failure_message(stderr_output: &str, fallback: String) -> String {
for line in stderr_output.lines().rev() {
let trimmed = line.trim();
if let Some(message) = trimmed.strip_prefix("[error]") {
let message = message.trim();
if !message.is_empty() {
return message.to_string();
}
}
}
for line in stderr_output.lines().rev() {
let trimmed = line.trim();
if trimmed.is_empty()
|| trimmed.starts_with("WARNING:")
|| trimmed.starts_with("Traceback")
|| trimmed.starts_with("File ")
|| trimmed.starts_with("During handling")
{
continue;
}
if trimmed.starts_with("ERROR:")
|| trimmed.contains("RuntimeError:")
|| trimmed.contains("SystemExit:")
{
return trimmed.to_string();
}
}
fallback
}
fn apply_backend_env(command: &mut Command, state: &AppState) {
command.env("PYTHONUNBUFFERED", "1");
command.env("PYTHONIOENCODING", "utf-8");
command.env("YTS_WHISPER_CACHE_DIR", &state.whisper_cache_dir);
if let Some(ffmpeg_path) = &state.ffmpeg_path {
@@ -427,6 +526,8 @@ fn build_backend_command(state: &AppState, args: &[String]) -> Command {
command.args(args).current_dir(&state.media_dir);
apply_backend_env(&mut command, state);
#[cfg(target_os = "windows")]
command.creation_flags(CREATE_NO_WINDOW);
command
}
@@ -440,12 +541,20 @@ fn run_backend_json_command(
let mut command_args = args.to_vec();
command_args.push("--output-json".to_string());
command_args.push(output_path.to_string_lossy().into_owned());
append_backend_log(
&state.backend_log_path,
&format!("=== summarize {} ===", command_args.join(" ")),
);
let mut child = build_backend_command(state, &command_args)
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.map_err(|err| format!("Failed to start bundled backend: {err}"))?;
.map_err(|err| {
let message = format!("Failed to start bundled backend: {err}");
append_backend_log(&state.backend_log_path, &message);
message
})?;
let stdout = child
.stdout
@@ -456,26 +565,29 @@ fn run_backend_json_command(
.take()
.ok_or_else(|| "Backend stderr was not captured.".to_string())?;
let stderr_buffer = Arc::new(Mutex::new(String::new()));
let stdout_log_path = state.backend_log_path.clone();
let stderr_log_path = state.backend_log_path.clone();
let stdout_app = app.clone();
let stdout_label = window_label.to_string();
let stdout_handle = thread::spawn(move || {
for line in BufReader::new(stdout).lines() {
match line {
Ok(line) => emit_progress(&stdout_app, &stdout_label, &line),
Ok(line) => {
append_backend_log(&stdout_log_path, &format!("[stdout] {line}"));
emit_progress(&stdout_app, &stdout_label, &line);
}
Err(_) => break,
}
}
});
let stderr_app = app.clone();
let stderr_label = window_label.to_string();
let stderr_buffer_clone = Arc::clone(&stderr_buffer);
let stderr_handle = thread::spawn(move || {
for line in BufReader::new(stderr).lines() {
match line {
Ok(line) => {
emit_progress(&stderr_app, &stderr_label, &line);
append_backend_log(&stderr_log_path, &format!("[stderr] {line}"));
if let Ok(mut buffer) = stderr_buffer_clone.lock() {
buffer.push_str(&line);
buffer.push('\n');
@@ -486,9 +598,15 @@ fn run_backend_json_command(
}
});
let status = child
.wait()
.map_err(|err| format!("Failed to wait for bundled backend: {err}"))?;
let status = child.wait().map_err(|err| {
let message = format!("Failed to wait for bundled backend: {err}");
append_backend_log(&state.backend_log_path, &message);
message
})?;
append_backend_log(
&state.backend_log_path,
&format!("Bundled backend exit status: {status}"),
);
let _ = stdout_handle.join();
let _ = stderr_handle.join();
@@ -498,34 +616,66 @@ fn run_backend_json_command(
.lock()
.map(|buffer| buffer.trim().to_string())
.unwrap_or_else(|_| String::new());
let message = if stderr_output.is_empty() {
format!("Bundled backend exited with status {status}.")
} else {
stderr_output
};
let message = backend_failure_message(
&stderr_output,
format!("Bundled backend exited with status {status}."),
);
append_backend_log(
&state.backend_log_path,
&format!("Backend failure: {message}"),
);
let _ = fs::remove_file(&output_path);
return Err(message);
}
let raw_json = fs::read_to_string(&output_path)
.map_err(|err| format!("Failed to read backend output JSON: {err}"))?;
let raw_json = fs::read_to_string(&output_path).map_err(|err| {
let message = format!("Failed to read backend output JSON: {err}");
append_backend_log(&state.backend_log_path, &message);
message
})?;
let _ = fs::remove_file(&output_path);
serde_json::from_str(&raw_json).map_err(|err| format!("Invalid backend output JSON: {err}"))
serde_json::from_str(&raw_json).map_err(|err| {
let message = format!("Invalid backend output JSON: {err}");
append_backend_log(&state.backend_log_path, &message);
message
})
}
fn run_backend_text_command(state: &AppState, args: &[String]) -> Result<String, String> {
let output = build_backend_command(state, args)
.output()
.map_err(|err| format!("Failed to start translation backend: {err}"))?;
append_backend_log(
&state.backend_log_path,
&format!("=== translate {} ===", args.join(" ")),
);
let output = build_backend_command(state, args).output().map_err(|err| {
let message = format!("Failed to start translation backend: {err}");
append_backend_log(&state.backend_log_path, &message);
message
})?;
for line in String::from_utf8_lossy(&output.stdout).lines() {
append_backend_log(&state.backend_log_path, &format!("[stdout] {line}"));
}
for line in String::from_utf8_lossy(&output.stderr).lines() {
append_backend_log(&state.backend_log_path, &format!("[stderr] {line}"));
}
append_backend_log(
&state.backend_log_path,
&format!("Translation backend exit status: {}", output.status),
);
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr).trim().to_string();
return Err(if stderr.is_empty() {
let message = if stderr.is_empty() {
format!("Translation backend exited with status {}.", output.status)
} else {
stderr
});
};
append_backend_log(
&state.backend_log_path,
&format!("Translation failure: {message}"),
);
return Err(message);
}
let translation = String::from_utf8(output.stdout)
@@ -533,7 +683,9 @@ fn run_backend_text_command(state: &AppState, args: &[String]) -> Result<String,
.trim()
.to_string();
if translation.is_empty() {
return Err("Translation backend returned an empty result.".to_string());
let message = "Translation backend returned an empty result.".to_string();
append_backend_log(&state.backend_log_path, &message);
return Err(message);
}
Ok(translation)
@@ -559,19 +711,42 @@ fn summarize_video_inner(
window_label: &str,
request: SummarizeVideoRequest,
) -> Result<SummaryEntry, String> {
let model = normalize_model(request.model);
let SummarizeVideoRequest {
url,
use_whisper,
model,
master_prompt,
} = request;
let model = normalize_model(model);
let mut args = vec![
"summarize".to_string(),
"--url".to_string(),
request.url,
url,
"--model".to_string(),
model,
];
if !request.use_whisper {
if !use_whisper {
args.push("--no-whisper".to_string());
}
let info = run_backend_json_command(state, app, window_label, &args)?;
let prompt_path = if let Some(prompt) = normalize_prompt_template(master_prompt) {
let path = state
.app_dir
.join(format!("tmp_prompt_{}.txt", now_millis()));
fs::write(&path, prompt)
.map_err(|err| format!("Failed to write temporary prompt file: {err}"))?;
args.push("--prompt-template-file".to_string());
args.push(path.to_string_lossy().into_owned());
Some(path)
} else {
None
};
let result = run_backend_json_command(state, app, window_label, &args);
if let Some(path) = prompt_path {
let _ = fs::remove_file(path);
}
let info = result?;
cleanup_artifacts(state, info.audio.as_deref(), info.transcript.as_deref());
let db = open_connection(state)?;
@@ -601,11 +776,17 @@ fn translate_summary_inner(
state: &AppState,
request: TranslateSummaryRequest,
) -> Result<SummaryEntry, String> {
let TranslateSummaryRequest {
id,
lang,
model,
prompt_template,
} = request;
let db = open_connection(state)?;
let summary_text = db
.query_row(
"SELECT summary_en FROM summaries WHERE id = ?",
[request.id],
[id],
|row| row.get::<_, Option<String>>(0),
)
.optional()
@@ -613,29 +794,49 @@ fn translate_summary_inner(
.flatten()
.ok_or_else(|| "No English summary found for translation.".to_string())?;
let tmp_summary_path =
state
let tmp_summary_path = state
.app_dir
.join(format!("tmp_summary_{}_{}.txt", request.id, now_millis()));
.join(format!("tmp_summary_{}_{}.txt", id, now_millis()));
fs::write(&tmp_summary_path, summary_text)
.map_err(|err| format!("Failed to write temporary summary file: {err}"))?;
let model = normalize_model(request.model);
let args = vec![
let model = normalize_model(model);
let mut args = vec![
"translate".to_string(),
"--summary-file".to_string(),
tmp_summary_path.to_string_lossy().into_owned(),
"--lang".to_string(),
request.lang.clone(),
lang.clone(),
"--model".to_string(),
model,
];
let tmp_prompt_path = if let Some(prompt) = normalize_prompt_template(prompt_template) {
let path = state.app_dir.join(format!(
"tmp_translation_prompt_{}_{}.txt",
id,
now_millis()
));
if let Err(err) = fs::write(&path, prompt) {
let _ = fs::remove_file(&tmp_summary_path);
return Err(format!(
"Failed to write temporary translation prompt file: {err}"
));
}
args.push("--prompt-template-file".to_string());
args.push(path.to_string_lossy().into_owned());
Some(path)
} else {
None
};
let result = run_backend_text_command(state, &args);
let _ = fs::remove_file(&tmp_summary_path);
if let Some(path) = tmp_prompt_path {
let _ = fs::remove_file(path);
}
let translation = result?;
let column = match request.lang.as_str() {
let column = match lang.as_str() {
"de" => "summary_de",
"jp" => "summary_jp",
_ => return Err("Unsupported language code.".to_string()),
@@ -643,11 +844,11 @@ fn translate_summary_inner(
db.execute(
&format!("UPDATE summaries SET {column} = ? WHERE id = ?"),
params![translation, request.id],
params![translation, id],
)
.map_err(|err| format!("Failed to save translated summary: {err}"))?;
get_entry_by_id(state, request.id)
get_entry_by_id(state, id)
}
#[tauri::command]
@@ -733,13 +934,57 @@ fn open_file(file_path: String) -> Result<(), String> {
that(path).map_err(|err| format!("Failed to open file: {err}"))
}
fn install_app_menu(app: &mut tauri::App) -> tauri::Result<()> {
let handle = app.handle();
let menu_builder = MenuBuilder::new(handle);
#[cfg(target_os = "macos")]
let menu_builder = {
let app_menu = SubmenuBuilder::new(handle, "YouTube Summarizer")
.hide()
.hide_others()
.show_all()
.separator()
.quit()
.build()?;
menu_builder.item(&app_menu)
};
let settings_menu = SubmenuBuilder::new(handle, "Settings")
.text("open_settings", "Settings...")
.build()?;
let edit_menu = SubmenuBuilder::new(handle, "Edit")
.undo()
.redo()
.separator()
.cut()
.copy()
.paste()
.select_all()
.build()?;
let menu = menu_builder.item(&settings_menu).item(&edit_menu).build()?;
app.set_menu(menu)?;
Ok(())
}
fn main() {
tauri::Builder::default()
.plugin(tauri_plugin_dialog::init())
.on_menu_event(|app, event| {
if event.id() == "open_settings" {
let _ = app.emit_to("main", "open-settings", ());
}
})
.setup(|app| {
let state = ensure_app_state(app.handle())?;
install_app_menu(app)?;
match ensure_app_state(app.handle()) {
Ok(state) => {
app.manage(state);
Ok(())
}
Err(err) => {
write_startup_error_log(app.handle(), &err);
Err(err.into())
}
}
})
.invoke_handler(tauri::generate_handler![
get_models,

@@ -27,13 +27,20 @@
},
"bundle": {
"active": true,
"resources": [
"../backend_cli.py",
"../youtube_summarizer.py",
"../translate_summary.py",
"../requirements.txt",
"resources/backend",
"resources/ffmpeg"
]
"icon": [
"icons/32x32.png",
"icons/128x128.png",
"icons/128x128@2x.png",
"icons/icon.icns",
"icons/icon.ico"
],
"resources": {
"../backend_cli.py": "backend_cli.py",
"../youtube_summarizer.py": "youtube_summarizer.py",
"../translate_summary.py": "translate_summary.py",
"../requirements.txt": "requirements.txt",
"resources/backend": "backend",
"resources/ffmpeg": "ffmpeg"
}
}
}

@@ -4,9 +4,11 @@ import os
import sqlite3
import subprocess
import sys
from pathlib import Path
DB_FILE = os.path.join(os.path.dirname(__file__), 'summaries.db')
TRANSLATE_SCRIPT = os.path.join(os.path.dirname(__file__), 'translate_summary.py')
ROOT = Path(__file__).resolve().parents[1]
DB_FILE = os.environ.get("YTS_DB_FILE", str(ROOT / "summaries.db"))
TRANSLATE_SCRIPT = ROOT / "translate_summary.py"
MODEL = "mistral-small3.1:24b"
def get_entries_needing_translation(conn):
@@ -30,7 +32,7 @@ def translate(summary_text, lang):
# Run the translation script
cmd = [
sys.executable, # uses the current Python interpreter
TRANSLATE_SCRIPT,
str(TRANSLATE_SCRIPT),
"--summary-file", tmp_summary_path,
"--lang", lang,
"--model", MODEL,

@@ -18,20 +18,54 @@ Example:
import sys
import argparse
import json
import math
import requests
LANG_MAP = {
"de": "German",
"jp": "Japanese"
}
OLLAMA_CHARS_PER_TOKEN = 3.5
OLLAMA_OUTPUT_TOKEN_BUDGET = 2048
OLLAMA_CONTEXT_BUCKETS = (4096, 8192, 16384, 32768, 65536)
def translate_summary_text(summary_text, target_language, model="mistral:latest"):
def default_translation_prompt_template(target_language):
    if target_language not in LANG_MAP:
        raise ValueError("Supported languages: de (German), jp (Japanese)")
    return (
        f"Translate the following summary into {LANG_MAP[target_language]}. Only output the translated summary, "
        "no explanation or intro. If it's already in the target language, do nothing but repeat it.\n\n"
        "Summary:\n{summary}\n\nTranslation:"
    )


def render_translation_prompt(summary_text, target_language, prompt_template=None):
    template = (prompt_template or default_translation_prompt_template(target_language)).strip()
    prompt = (
        template
        .replace("{language}", LANG_MAP[target_language])
        .replace("{summary}", summary_text)
    )
    if "{summary}" not in template:
        prompt = f"{prompt}\n\nSummary:\n{summary_text}\n\nTranslation:"
    return prompt


def choose_ollama_num_ctx(prompt, output_budget=OLLAMA_OUTPUT_TOKEN_BUDGET):
    estimated_input_tokens = math.ceil(len(prompt) / OLLAMA_CHARS_PER_TOKEN)
    needed_tokens = estimated_input_tokens + output_budget
    for bucket in OLLAMA_CONTEXT_BUCKETS:
        if needed_tokens <= bucket:
            return bucket
    return OLLAMA_CONTEXT_BUCKETS[-1]
def translate_summary_text(summary_text, target_language, model="mistral:latest", prompt_template=None):
if target_language not in LANG_MAP:
raise ValueError("Supported languages: de (German), jp (Japanese)")
prompt = (
f"Translate the following summary into {LANG_MAP[target_language]}. Only output the translated summary, "
"no explanation or intro. If it's already in the target language, do nothing but repeat it.\n\n"
f"Summary:\n{summary_text}\n\nTranslation:"
render_translation_prompt(summary_text, target_language, prompt_template)
)
payload = {
"model": model,
@@ -39,6 +73,9 @@ def translate_summary_text(summary_text, target_language, model="mistral:latest"
{"role": "system", "content": f"You are an expert translator proficient in {LANG_MAP[target_language]} and English."},
{"role": "user", "content": prompt}
],
"options": {
"num_ctx": choose_ollama_num_ctx(prompt)
},
"stream": False
}
resp = requests.post("http://localhost:11434/api/chat", json=payload)
@@ -47,24 +84,31 @@ def translate_summary_text(summary_text, target_language, model="mistral:latest"
return data.get("message", {}).get("content", "").strip()
def translate_summary_file(summary_file, target_language, model="mistral:latest"):
def translate_summary_file(summary_file, target_language, model="mistral:latest", prompt_template=None):
with open(summary_file, "r", encoding="utf-8") as f:
summary_text = f.read().strip()
if not summary_text:
raise ValueError("Empty summary text!")
return translate_summary_text(summary_text, target_language, model)
return translate_summary_text(summary_text, target_language, model, prompt_template)
def main():
parser = argparse.ArgumentParser(description="Translate summary using Ollama")
parser.add_argument("--summary-file", required=True, help="Path to file with English summary text")
parser.add_argument("--lang", required=True, choices=["de", "jp"], help="Target language: 'de' or 'jp'")
parser.add_argument("--model", default="mistral:latest", help="Ollama model to use")
parser.add_argument("--prompt-template", help="Prompt template for the translation LLM call")
parser.add_argument("--prompt-template-file", help="Path to a text file containing the translation prompt template")
parser.add_argument("--output-file", help="Output file for translated summary")
args = parser.parse_args()
prompt_template = args.prompt_template
if args.prompt_template_file:
with open(args.prompt_template_file, "r", encoding="utf-8") as f:
prompt_template = f.read()
# Read summary
try:
translation = translate_summary_file(args.summary_file, args.lang, args.model)
translation = translate_summary_file(args.summary_file, args.lang, args.model, prompt_template)
except Exception as e:
print(f"Translation failed: {e}", file=sys.stderr)
sys.exit(2)
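The template handling introduced in this file can be illustrated on its own. The sketch below copies the placeholder logic of `render_translation_prompt` into a standalone function; `render_prompt` and the sample templates are hypothetical. It appears to rely on plain `str.replace` rather than `str.format`, so literal braces elsewhere in a custom template should pass through unchanged.

```python
LANG_MAP = {"de": "German", "jp": "Japanese"}


def render_prompt(summary_text, target_language, template):
    """Fill {language} and {summary}; append the summary if the template omits it."""
    template = template.strip()
    prompt = (
        template
        .replace("{language}", LANG_MAP[target_language])
        .replace("{summary}", summary_text)
    )
    # A custom template without {summary} still gets the text appended.
    if "{summary}" not in template:
        prompt = f"{prompt}\n\nSummary:\n{summary_text}\n\nTranslation:"
    return prompt


with_placeholder = render_prompt(
    "Cats purr.", "de", "Translate into {language}:\n{summary}"
)
without_placeholder = render_prompt(
    "Cats purr.", "jp", "Translate into {language}."
)
```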


@@ -98,6 +98,12 @@
.entry .summary {
transition: max-height 0.2s;
}
.entry .summary hr {
border: 0;
border-top: 1px solid currentColor;
color: inherit;
margin: 0.7em 0;
}
.entry.collapsed .summary {
display: -webkit-box !important;
-webkit-line-clamp: 2;
@@ -137,27 +143,151 @@
cursor: default;
border: 1px solid #fbb6ce;
}
.settings-dialog[hidden] {
display: none;
}
.settings-dialog {
position: fixed;
inset: 0;
z-index: 1000;
display: flex;
align-items: center;
justify-content: center;
padding: 24px;
background: rgba(69, 10, 10, 0.28);
}
.settings-panel {
width: min(720px, 100%);
max-height: min(760px, calc(100vh - 48px));
overflow: auto;
padding: 18px;
border: 1px solid #fecdd3;
border-radius: 6px;
background: #fff1f2;
box-shadow: 0 16px 40px rgba(159, 18, 57, 0.18);
}
.settings-panel-header {
display: flex;
align-items: center;
justify-content: space-between;
gap: 16px;
margin-bottom: 14px;
}
.settings-panel-header h2 {
margin: 0;
font-size: 20px;
}
.settings-close-button {
width: 30px;
height: 30px;
display: flex;
align-items: center;
justify-content: center;
padding: 0;
border: none;
border-radius: 4px;
background: transparent;
color: #9f1239;
font-size: 24px;
line-height: 1;
cursor: pointer;
}
.settings-close-button:hover {
background: #ffe4e6;
}
.settings-row {
display: flex;
align-items: center;
gap: 8px;
margin: 10px 0;
font-size: 15px;
}
.settings-row input[type="checkbox"] {
accent-color: #9f1239;
}
.settings-field-label {
display: block;
margin: 16px 0 6px;
font-size: 14px;
font-weight: bold;
}
.settings-prompt-textarea {
width: 100%;
min-height: 260px;
box-sizing: border-box;
padding: 10px;
border: 1px solid #fda4af;
border-radius: 4px;
background: #fff;
color: #7f1d1d;
font: 13px/1.45 ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", monospace;
resize: vertical;
}
.settings-prompt-textarea.compact {
min-height: 150px;
}
.settings-actions {
display: flex;
justify-content: flex-end;
gap: 10px;
margin-top: 12px;
}
.settings-actions button {
padding: 8px 12px;
border: 1px solid #9f1239;
border-radius: 4px;
background: #fff1f2;
color: #9f1239;
cursor: pointer;
}
.settings-actions button:hover {
background: #9f1239;
color: #fff;
}
</style>
</head>
<body>
</head>
<body>
<header style="display:flex; flex-direction:column; gap:5px;">
<form id="summarize-form" style="display:flex; width:100%; gap:10px; align-items:center; flex-wrap: wrap;">
<input type="text" id="url-input" placeholder="Enter YouTube URL" />
<button type="submit">Summarize!</button>
<div style="display: flex; flex-direction: column; gap: 2px; min-width:120px;">
<label style="font-size:14px; color:#9f1239; display: flex; align-items: center; gap: 5px;">
<input type="checkbox" id="whisper-checkbox" checked />Use Whisper
</label>
<label style="font-size:14px; color:#9f1239; display: flex; align-items: center; gap: 5px;">
<input type="checkbox" id="autotranslate-checkbox" checked />Auto Translate
</label>
</div>
<select id="model-select" style="padding:6px; font-size:14px;">
<option disabled selected>Loading models…</option>
</select>
</form>
<div id="loading" class="loading" style="display:none;">Loading…</div>
</header>
<div id="settings-dialog" class="settings-dialog" hidden role="dialog" aria-modal="true" aria-labelledby="settings-title">
<section class="settings-panel">
<div class="settings-panel-header">
<h2 id="settings-title">Settings</h2>
<button type="button" id="settings-close-button" class="settings-close-button" aria-label="Close settings">&times;</button>
</div>
<label class="settings-row">
<input type="checkbox" id="whisper-checkbox" checked />
<span>Use Whisper</span>
</label>
<label class="settings-row">
<input type="checkbox" id="autotranslate-checkbox" checked />
<span>Auto Translate</span>
</label>
<label for="master-prompt-textarea" class="settings-field-label">Master Prompt</label>
<textarea id="master-prompt-textarea" class="settings-prompt-textarea" spellcheck="false"></textarea>
<div class="settings-actions">
<button type="button" id="reset-master-prompt-button">Reset to default</button>
</div>
<label for="translation-prompt-de-textarea" class="settings-field-label">German Translation Prompt</label>
<textarea id="translation-prompt-de-textarea" class="settings-prompt-textarea compact" spellcheck="false"></textarea>
<div class="settings-actions">
<button type="button" id="reset-translation-prompt-de-button">Reset to default</button>
</div>
<label for="translation-prompt-jp-textarea" class="settings-field-label">Japanese Translation Prompt</label>
<textarea id="translation-prompt-jp-textarea" class="settings-prompt-textarea compact" spellcheck="false"></textarea>
<div class="settings-actions">
<button type="button" id="reset-translation-prompt-jp-button">Reset to default</button>
</div>
</section>
</div>
<div id="pagination-top" class="pagination" style="display:none;"></div>
<div id="summaries-container"></div>
<div id="pagination-bottom" class="pagination" style="display:none;"></div>


@@ -3,6 +3,28 @@ const invoke = tauriApi?.core?.invoke;
const listen = tauriApi?.event?.listen;
const convertFileSrc = tauriApi?.core?.convertFileSrc;
const confirmDialog = tauriApi?.dialog?.confirm;
const DEFAULT_MASTER_PROMPT = `You are an expert summarizer. Summarize the following video concisely:
Title: {title}
Transcript:
{transcript}
Summary:`;
const DEFAULT_TRANSLATION_PROMPTS = {
de: `Translate the following summary into German. Only output the translated summary, no explanation or intro. If it's already in the target language, do nothing but repeat it.
Summary:
{summary}
Translation:`,
jp: `Translate the following summary into Japanese. Only output the translated summary, no explanation or intro. If it's already in the target language, do nothing but repeat it.
Summary:
{summary}
Translation:`
};
if (!invoke || !listen) {
throw new Error('Tauri runtime API is unavailable.');
@@ -21,11 +43,12 @@ function toWebviewFileUrl(filePath) {
window.api = {
getModels: () => invoke('get_models'),
getSummaries: () => invoke('get_summaries'),
summarizeVideo: (url, useWhisper, model) => invoke('summarize_video', {
summarizeVideo: (url, useWhisper, model, masterPrompt) => invoke('summarize_video', {
request: {
url,
useWhisper,
model: model || null
model: model || null,
masterPrompt: masterPrompt || null
}
}),
openExternal: (url) => invoke('open_external', { url }),
@@ -33,16 +56,18 @@ window.api = {
deleteSummary: (id) => invoke('delete_summary', {
request: { id }
}),
translateSummary: (id, lang, model) => invoke('translate_summary', {
translateSummary: (id, lang, model, promptTemplate) => invoke('translate_summary', {
request: {
id,
lang,
model: model || null
model: model || null,
promptTemplate: promptTemplate || null
}
}),
onSummarizeProgress: (callback) => listen('summarize-progress', (event) => {
callback(String(event.payload || ''));
})
}),
onOpenSettings: (callback) => listen('open-settings', callback)
};
window.addEventListener('DOMContentLoaded', async () => {
@@ -56,6 +81,15 @@ window.addEventListener('DOMContentLoaded', async () => {
const paginationBottom = document.getElementById('pagination-bottom');
const summarizeButton = form.querySelector('button[type="submit"]');
const autoTranslateCheckbox = document.getElementById('autotranslate-checkbox');
const settingsDialog = document.getElementById('settings-dialog');
const settingsPanel = settingsDialog.querySelector('.settings-panel');
const settingsCloseButton = document.getElementById('settings-close-button');
const masterPromptTextarea = document.getElementById('master-prompt-textarea');
const resetMasterPromptButton = document.getElementById('reset-master-prompt-button');
const translationPromptDeTextarea = document.getElementById('translation-prompt-de-textarea');
const translationPromptJpTextarea = document.getElementById('translation-prompt-jp-textarea');
const resetTranslationPromptDeButton = document.getElementById('reset-translation-prompt-de-button');
const resetTranslationPromptJpButton = document.getElementById('reset-translation-prompt-jp-button');
let fullSummaries = [];
let currentPage = 1;
@@ -71,8 +105,45 @@ window.addEventListener('DOMContentLoaded', async () => {
loadingIndicator.textContent = message;
}
function getMasterPrompt() {
const savedPrompt = localStorage.getItem('masterPrompt');
if (savedPrompt && savedPrompt.trim()) {
return savedPrompt;
}
return DEFAULT_MASTER_PROMPT;
}
function getTranslationPrompt(lang) {
const savedPrompt = localStorage.getItem(`translationPrompt.${lang}`);
if (savedPrompt && savedPrompt.trim()) {
return savedPrompt;
}
return DEFAULT_TRANSLATION_PROMPTS[lang];
}
function syncSettingsFields() {
whisperCheckbox.checked = localStorage.getItem('useWhisper') === '0' ? false : true;
autoTranslateCheckbox.checked = localStorage.getItem('autoTranslate') === '1' ? true : false;
masterPromptTextarea.value = getMasterPrompt();
translationPromptDeTextarea.value = getTranslationPrompt('de');
translationPromptJpTextarea.value = getTranslationPrompt('jp');
}
function openSettings() {
syncSettingsFields();
settingsDialog.hidden = false;
masterPromptTextarea.focus();
}
function closeSettings() {
settingsDialog.hidden = true;
}
whisperCheckbox.checked = localStorage.getItem('useWhisper') === '0' ? false : true;
autoTranslateCheckbox.checked = localStorage.getItem('autoTranslate') === '1' ? true : false;
masterPromptTextarea.value = getMasterPrompt();
translationPromptDeTextarea.value = getTranslationPrompt('de');
translationPromptJpTextarea.value = getTranslationPrompt('jp');
whisperCheckbox.addEventListener('change', () => {
localStorage.setItem('useWhisper', whisperCheckbox.checked ? '1' : '0');
@@ -80,6 +151,41 @@ window.addEventListener('DOMContentLoaded', async () => {
autoTranslateCheckbox.addEventListener('change', () => {
localStorage.setItem('autoTranslate', autoTranslateCheckbox.checked ? '1' : '0');
});
masterPromptTextarea.addEventListener('input', () => {
localStorage.setItem('masterPrompt', masterPromptTextarea.value);
});
translationPromptDeTextarea.addEventListener('input', () => {
localStorage.setItem('translationPrompt.de', translationPromptDeTextarea.value);
});
translationPromptJpTextarea.addEventListener('input', () => {
localStorage.setItem('translationPrompt.jp', translationPromptJpTextarea.value);
});
resetMasterPromptButton.addEventListener('click', () => {
masterPromptTextarea.value = DEFAULT_MASTER_PROMPT;
localStorage.setItem('masterPrompt', DEFAULT_MASTER_PROMPT);
masterPromptTextarea.focus();
});
resetTranslationPromptDeButton.addEventListener('click', () => {
translationPromptDeTextarea.value = DEFAULT_TRANSLATION_PROMPTS.de;
localStorage.setItem('translationPrompt.de', DEFAULT_TRANSLATION_PROMPTS.de);
translationPromptDeTextarea.focus();
});
resetTranslationPromptJpButton.addEventListener('click', () => {
translationPromptJpTextarea.value = DEFAULT_TRANSLATION_PROMPTS.jp;
localStorage.setItem('translationPrompt.jp', DEFAULT_TRANSLATION_PROMPTS.jp);
translationPromptJpTextarea.focus();
});
settingsCloseButton.addEventListener('click', closeSettings);
settingsDialog.addEventListener('click', (event) => {
if (!settingsPanel.contains(event.target)) {
closeSettings();
}
});
window.addEventListener('keydown', (event) => {
if (event.key === 'Escape' && !settingsDialog.hidden) {
closeSettings();
}
});
function renderSummaries(list) {
summariesContainer.innerHTML = '';
@@ -345,6 +451,8 @@ window.addEventListener('DOMContentLoaded', async () => {
.replace(/^## (.+)$/gm, '<h2>$1</h2>')
.replace(/^# (.+)$/gm, '<h1>$1</h1>');
escaped = escaped.replace(/^---$/gm, '<hr>');
escaped = escaped.replace(
/(^|\n)([ \t]*\* .+(?:\n[ \t]*\* .+)*)/g,
(_, lead, listBlock) => {
@@ -525,7 +633,7 @@ window.addEventListener('DOMContentLoaded', async () => {
setActionLinksDisabled(true);
const selectedModel = modelSelect.value;
window.api.summarizeVideo(url, useWhisper, selectedModel)
window.api.summarizeVideo(url, useWhisper, selectedModel, getMasterPrompt())
.then((newEntry) => {
if (!newEntry || !newEntry.id) {
return window.api.getSummaries().then(setSummaries);
@@ -539,10 +647,10 @@ window.addEventListener('DOMContentLoaded', async () => {
let translationsOk = true;
setLoadingMessage('Translating to German (DE)…');
return window.api.translateSummary(newEntry.id, 'de', selectedModel)
return window.api.translateSummary(newEntry.id, 'de', selectedModel, getTranslationPrompt('de'))
.then(() => {
setLoadingMessage('Translating to Japanese (JP)…');
return window.api.translateSummary(newEntry.id, 'jp', selectedModel);
return window.api.translateSummary(newEntry.id, 'jp', selectedModel, getTranslationPrompt('jp'));
})
.catch(err => {
translationsOk = false;
@@ -575,4 +683,5 @@ window.addEventListener('DOMContentLoaded', async () => {
}
setLoadingMessage(line);
});
window.api.onOpenSettings(openSettings);
});


@@ -35,8 +35,10 @@ import re
import time
import json
import glob
import math
import subprocess
import multiprocessing
import threading
import requests
import yt_dlp
import webvtt
@@ -62,6 +64,17 @@ NUM_SLICES = 8
OVERLAP_SEC = 1
MAX_OVERLAP_WORDS = 7
WHISPER_MODEL = "small" # e.g. "small", "medium", "large-v3" …
OLLAMA_CHARS_PER_TOKEN = 3.5
OLLAMA_OUTPUT_TOKEN_BUDGET = 2048
OLLAMA_CONTEXT_BUCKETS = (4096, 8192, 16384, 32768, 65536)
DEFAULT_SUMMARY_PROMPT_TEMPLATE = """You are an expert summarizer. Summarize the following video concisely:
Title: {title}
Transcript:
{transcript}
Summary:"""
def debug_print(*args, **kwargs):
@@ -70,6 +83,30 @@ def debug_print(*args, **kwargs):
print("[DEBUG]", *args, **kwargs, file=sys.stderr)
class ProgressHeartbeat:
"""Emit periodic progress while a blocking backend operation is active."""
def __init__(self, message_fn, interval: float = 15.0):
self.message_fn = message_fn
self.interval = interval
self._stop = threading.Event()
self._thread = threading.Thread(target=self._run, daemon=True)
def __enter__(self):
self._thread.start()
return self
def __exit__(self, exc_type, exc, tb):
self._stop.set()
self._thread.join(timeout=1.0)
def _run(self):
while not self._stop.wait(self.interval):
message = self.message_fn()
if message:
print(message, flush=True)
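A usage sketch for the heartbeat pattern above. The class here is a standalone copy with one assumption added for testability: an `emit` callback (hypothetical, not in the commit) replaces the hard-coded `print`. The short interval and `time.sleep` stand in for a long blocking call.

```python
import threading
import time


class Heartbeat:
    """Call emit(message_fn()) every `interval` seconds until the block exits."""

    def __init__(self, message_fn, emit=print, interval=15.0):
        self.message_fn = message_fn
        self.emit = emit  # hypothetical hook; the commit prints directly
        self.interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._stop.set()
        self._thread.join(timeout=1.0)

    def _run(self):
        # Event.wait returns False on timeout and True once stop is set.
        while not self._stop.wait(self.interval):
            message = self.message_fn()
            if message:
                self.emit(message)


received = []
with Heartbeat(lambda: "still working...", emit=received.append, interval=0.05):
    time.sleep(0.3)  # stand-in for a blocking operation
count_at_exit = len(received)
```

Because the worker waits on the event instead of sleeping, leaving the `with` block wakes it immediately rather than after a full interval.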
def get_ffmpeg_binary() -> str:
"""Return the ffmpeg executable path, preferring a bundled override."""
value = os.environ.get("YTS_FFMPEG", "").strip()
@@ -82,6 +119,31 @@ def get_ffprobe_binary() -> str:
return value or "ffprobe"
def get_ffmpeg_directory() -> Optional[str]:
"""Return the directory containing the configured ffmpeg binary."""
value = os.environ.get("YTS_FFMPEG", "").strip()
if not value:
return None
if os.path.isfile(value):
return os.path.dirname(value)
return value
def get_yt_dlp_ffmpeg_location() -> Optional[str]:
"""Return an ffmpeg location suitable for yt_dlp postprocessors."""
return get_ffmpeg_directory()
def ensure_ffmpeg_on_path() -> None:
"""Expose bundled ffmpeg to libraries that shell out to plain `ffmpeg`."""
ffmpeg_dir = get_ffmpeg_directory()
if not ffmpeg_dir:
return
path_entries = [entry for entry in os.environ.get("PATH", "").split(os.pathsep) if entry]
if ffmpeg_dir not in path_entries:
os.environ["PATH"] = os.pathsep.join([ffmpeg_dir, *path_entries])
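The PATH adjustment in `ensure_ffmpeg_on_path` can be expressed as a pure function for experimentation. This is a sketch only; `prepend_to_path` is a hypothetical name, and it returns the new value instead of mutating `os.environ`.

```python
import os


def prepend_to_path(directory, current_path):
    """Return current_path with directory first, unless it is already listed."""
    entries = [entry for entry in current_path.split(os.pathsep) if entry]
    if directory in entries:
        return current_path  # already reachable; leave PATH untouched
    return os.pathsep.join([directory, *entries])


base = os.pathsep.join(["/usr/bin", "/bin"])
added = prepend_to_path("/opt/ffmpeg", base)
unchanged = prepend_to_path("/usr/bin", base)
from_empty = prepend_to_path("/opt/ffmpeg", "")
```

Putting the bundled directory first means it should take precedence over any system-wide `ffmpeg` for child processes that resolve the bare name.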
def get_whisper_download_root() -> Optional[str]:
"""Return a stable Whisper cache directory when one is configured."""
value = os.environ.get("YTS_WHISPER_CACHE_DIR", "").strip()
@@ -286,6 +348,9 @@ def _download_audio_with_yt_dlp(url: str, vid: str, extractor_args: Optional[dic
}
if extractor_args:
opts["extractor_args"] = extractor_args
ffmpeg_location = get_yt_dlp_ffmpeg_location()
if ffmpeg_location:
opts["ffmpeg_location"] = ffmpeg_location
with yt_dlp.YoutubeDL(opts) as ydl:
ydl.download([url])
if not os.path.exists(audio_fn):
@@ -295,7 +360,8 @@ def _download_audio_with_yt_dlp(url: str, vid: str, extractor_args: Optional[dic
def download_video_audio(url: str, vid: str) -> str:
"""Download the best available audio for a YouTube video."""
print(f"📥 Downloading audio from {url}")
ensure_ffmpeg_on_path()
print(f"Downloading audio from {url} ...")
# Clean up any stale partials that can trigger HTTP 416 resume errors.
_cleanup_audio_artifacts(vid)
@@ -322,16 +388,63 @@ def download_video_audio(url: str, vid: str) -> str:
def get_audio_duration(path: str) -> float:
"""Return the duration of an audio file using ffprobe."""
res = subprocess.run([
try:
file_size = os.path.getsize(path)
except OSError:
file_size = None
ffprobe_args = [
get_ffprobe_binary(), "-v", "error", "-show_entries", "format=duration",
"-of", "default=noprint_wrappers=1:nokey=1", path
], capture_output=True, text=True)
return float(res.stdout.strip())
]
res = subprocess.run(
ffprobe_args,
capture_output=True,
text=True,
encoding="utf-8",
errors="replace",
)
print(f"[ffprobe] command: {ffprobe_args}", file=sys.stderr, flush=True)
print(f"[ffprobe] target: {path}", file=sys.stderr, flush=True)
print(f"[ffprobe] target size bytes: {file_size}", file=sys.stderr, flush=True)
print(f"[ffprobe] returncode: {res.returncode}", file=sys.stderr, flush=True)
print("[ffprobe] stdout raw begin", file=sys.stderr, flush=True)
if res.stdout:
sys.stderr.write(res.stdout)
if not res.stdout.endswith("\n"):
sys.stderr.write("\n")
sys.stderr.flush()
else:
print("<empty>", file=sys.stderr, flush=True)
print("[ffprobe] stdout raw end", file=sys.stderr, flush=True)
print("[ffprobe] stderr raw begin", file=sys.stderr, flush=True)
if res.stderr:
sys.stderr.write(res.stderr)
if not res.stderr.endswith("\n"):
sys.stderr.write("\n")
sys.stderr.flush()
else:
print("<empty>", file=sys.stderr, flush=True)
print("[ffprobe] stderr raw end", file=sys.stderr, flush=True)
stdout_value = res.stdout.strip()
try:
return float(stdout_value)
except ValueError:
match = re.search(r"[-+]?\d+(?:[.,]\d+)?", stdout_value or res.stderr)
if match:
return float(match.group(0).replace(",", "."))
raise RuntimeError(
"ffprobe did not return a parseable duration "
f"(returncode={res.returncode}, stdout={stdout_value!r}, stderr={res.stderr.strip()!r})"
)
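The parsing fallback at the end of `get_audio_duration` can be exercised without running ffprobe. A minimal sketch, assuming the same regular expression as the diff; `parse_duration` is a hypothetical helper and the sample strings are invented for illustration.

```python
import re


def parse_duration(stdout, stderr=""):
    """Parse a duration in seconds, tolerating a comma decimal separator."""
    value = stdout.strip()
    try:
        return float(value)
    except ValueError:
        # Fall back to the first number found; stderr is used only when
        # stdout is empty.
        match = re.search(r"[-+]?\d+(?:[.,]\d+)?", value or stderr)
        if match:
            return float(match.group(0).replace(",", "."))
        raise RuntimeError(f"no parseable duration in {value!r}")


plain = parse_duration("212.480000\n")
comma = parse_duration("212,48")
embedded = parse_duration("duration=12.5")
```

One caveat of this approach: when stdout is empty, the first number in stderr is taken as the duration, which may be unrelated to the audio length.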
def slice_audio(audio_path: str, vid: str) -> List[Tuple[str, float, float]]:
"""Slice a long audio file into overlapping chunks for Whisper."""
print("Slicing audio ")
print("Slicing audio ...")
duration = get_audio_duration(audio_path)
length = duration / NUM_SLICES
slices = []
@@ -351,6 +464,7 @@ def slice_audio(audio_path: str, vid: str) -> List[Tuple[str, float, float]]:
def transcribe_slice(args: Tuple[str, int, str, str]) -> str:
"""Transcribe a single audio slice using Whisper and save to a text file."""
ensure_ffmpeg_on_path()
slice_path, idx, model_name, vid = args
if whisper is None:
raise RuntimeError("Whisper package is required but not installed")
@@ -398,9 +512,10 @@ def clean_temp(pattern: str) -> None:
def whisper_transcript(url: str, vid: str) -> str:
"""Run the Whisper pipeline and return the final transcript text."""
ensure_ffmpeg_on_path()
audio = download_video_audio(url, vid)
slices = slice_audio(audio, vid)
print("✍️ Transcribing using Whisper...", flush=True)
print("Transcribing using Whisper...", flush=True)
args = [(p, i, WHISPER_MODEL, vid) for i, (p, _, _) in enumerate(slices)]
with multiprocessing.Pool(len(slices)) as pool:
t_files = pool.map(transcribe_slice, args)
@@ -415,17 +530,36 @@ def whisper_transcript(url: str, vid: str) -> str:
# OllamaSummarizer
# -----------------------
def summarize_with_ollama(title: str, transcript: str, model: str = "mistral:latest") -> str:
def render_summary_prompt(title: str, transcript: str, prompt_template: Optional[str] = None) -> str:
template = (prompt_template or DEFAULT_SUMMARY_PROMPT_TEMPLATE).strip()
prompt = template.replace("{title}", title).replace("{transcript}", transcript)
if "{title}" not in template:
prompt = f"{prompt}\n\nTitle: {title}"
if "{transcript}" not in template:
prompt = f"{prompt}\n\nTranscript:\n{transcript}"
return prompt
def choose_ollama_num_ctx(prompt: str, output_budget: int = OLLAMA_OUTPUT_TOKEN_BUDGET) -> int:
estimated_input_tokens = math.ceil(len(prompt) / OLLAMA_CHARS_PER_TOKEN)
needed_tokens = estimated_input_tokens + output_budget
for bucket in OLLAMA_CONTEXT_BUCKETS:
if needed_tokens <= bucket:
return bucket
return OLLAMA_CONTEXT_BUCKETS[-1]
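A worked example of the bucket selection above, using the constants from the diff. The function takes a character count directly (an assumption made here to keep the arithmetic visible); `choose_num_ctx` is a hypothetical stand-in for `choose_ollama_num_ctx`.

```python
import math

CHARS_PER_TOKEN = 3.5
OUTPUT_TOKEN_BUDGET = 2048
CONTEXT_BUCKETS = (4096, 8192, 16384, 32768, 65536)


def choose_num_ctx(prompt_chars):
    """Pick the smallest bucket that fits estimated input plus output tokens."""
    needed = math.ceil(prompt_chars / CHARS_PER_TOKEN) + OUTPUT_TOKEN_BUDGET
    for bucket in CONTEXT_BUCKETS:
        if needed <= bucket:
            return bucket
    return CONTEXT_BUCKETS[-1]  # cap at the largest bucket


# 1,000 chars  -> ceil(285.7) + 2048 = 2334   -> 4096
# 7,168 chars  -> 2048 + 2048        = 4096   -> 4096 (exact fit)
# 7,169 chars  -> 2049 + 2048        = 4097   -> 8192
# 60,000 chars -> 17143 + 2048       = 19191  -> 32768
sizes = [choose_num_ctx(n) for n in (1000, 7168, 7169, 60000, 1_000_000)]
```

Prompts that exceed the largest bucket are still sent with a 65536-token context, so very long transcripts may be truncated on the model side.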
def summarize_with_ollama(
title: str,
transcript: str,
model: str = "mistral:latest",
prompt_template: Optional[str] = None,
) -> str:
"""
Send video title and transcript text to Ollama and return the summary string.
"""
debug_print(f"Preparing summary with model {model}, transcript length={len(transcript)}")
prompt = (
"You are an expert summarizer. Summarize the following video concisely:\n\n"
f"Title: {title}\n\n"
f"Transcript:\n{transcript}\n\n"
"Summary:"
)
prompt = render_summary_prompt(title, transcript, prompt_template)
debug_print(prompt)
payload = {
"model": model,
@@ -433,21 +567,50 @@ def summarize_with_ollama(title: str, transcript: str, model: str = "mistral:lat
{"role": "system", "content": "You are an intelligent summarizer."},
{"role": "user", "content": prompt}
],
"options": {
"num_ctx": choose_ollama_num_ctx(prompt)
},
"stream": True
}
debug_print("Sending request to Ollama ")
resp = requests.post("http://localhost:11434/api/chat", json=payload, stream=True)
debug_print(f"Ollama status: {resp.status_code}")
debug_print("Sending request to Ollama ...")
summary = ""
last_progress_chars = 0
def heartbeat_message() -> str:
if summary:
return f"Ollama is generating summary... {len(summary)} characters received."
return "Waiting for Ollama to start responding..."
try:
with ProgressHeartbeat(heartbeat_message):
resp = requests.post(
"http://localhost:11434/api/chat",
json=payload,
stream=True,
timeout=(10, 1800),
)
debug_print(f"Ollama status: {resp.status_code}")
resp.raise_for_status()
for line in resp.iter_lines(decode_unicode=True):
if not line:
continue
try:
msg = json.loads(line).get("message", {}).get("content", "")
summary += msg
if len(summary) - last_progress_chars >= 1000:
last_progress_chars = len(summary)
print(
f"Ollama is generating summary... {last_progress_chars} characters received.",
flush=True,
)
except Exception:
continue
except requests.RequestException as exc:
raise RuntimeError(f"Ollama request failed: {exc}") from exc
if not summary.strip():
raise RuntimeError("Ollama returned an empty summary.")
debug_print(f"Summary generated, length={len(summary)}")
print("Summary generated.", flush=True)
return summary
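The streaming loop above accumulates `message.content` from newline-delimited JSON chunks. The sketch below reproduces that accumulation over an in-memory list so it can run without an Ollama server; `collect_stream`, `report` and the sample chunks are hypothetical, and the exception list is narrower than the `except Exception` used in the commit.

```python
import json


def collect_stream(lines, progress_every=1000, report=print):
    """Concatenate message.content from newline-delimited JSON chunks."""
    summary = ""
    last_reported = 0
    for line in lines:
        if not line:
            continue  # blank keep-alive line
        try:
            summary += json.loads(line).get("message", {}).get("content", "")
        except (json.JSONDecodeError, AttributeError, TypeError):
            continue  # skip malformed chunks
        if len(summary) - last_reported >= progress_every:
            last_reported = len(summary)
            report(f"{last_reported} characters received")
    return summary


chunks = [
    '{"message": {"content": "Hello, "}}',
    "",
    "not json",
    '{"message": {"content": "world"}}',
    '{"done": true}',
]
reports = []
text = collect_stream(chunks, progress_every=5, report=reports.append)
```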
@@ -517,7 +680,13 @@ def download_thumbnail(vid: str, thumbnail_url: str) -> Optional[str]:
# Main
# -----------------------
def process_video(url: str, use_whisper: bool, model: str = "mistral:latest", output_json: Optional[str] = None) -> dict:
def process_video(
url: str,
use_whisper: bool,
model: str = "mistral:latest",
output_json: Optional[str] = None,
prompt_template: Optional[str] = None,
) -> dict:
"""
Core processing routine. Retrieves metadata, obtains transcript via the
selected workflow, generates a summary using Ollama and writes the
@@ -552,17 +721,17 @@ def process_video(url: str, use_whisper: bool, model: str = "mistral:latest", ou
# Fetch transcript
if use_whisper:
print("🤖 Using Whisper parallel transcription")
print("Using Whisper parallel transcription...")
transcript_text = whisper_transcript(url, vid)
if not transcript_text.strip():
raise SystemExit("Whisper transcription failed or empty.")
else:
print("▶️ Using classic API/subtitle workflow")
print("Using classic API/subtitle workflow...")
# Try API first
try:
transcript_text = get_transcript_api(vid)
except Exception:
print("API failed, falling back to subtitles")
print("API failed, falling back to subtitles...")
transcript_text = get_subtitles_via_yt_dlp(url)
if not transcript_text:
raise SystemExit("No transcript/subtitles available.")
@@ -601,8 +770,8 @@ def process_video(url: str, use_whisper: bool, model: str = "mistral:latest", ou
audio_filename = None
# Generate summary
print("✍️ Generating summary with Ollama", flush=True)
summary_text = summarize_with_ollama(title, transcript_text, model)
print("Generating summary with Ollama...", flush=True)
summary_text = summarize_with_ollama(title, transcript_text, model, prompt_template)
# Create metadata dictionary
meta = {
@@ -625,7 +794,13 @@ def process_video(url: str, use_whisper: bool, model: str = "mistral:latest", ou
return meta
def rewrite_summary(title: str, transcript_file: str, model: str = "mistral:latest", output_json: Optional[str] = None) -> dict:
def rewrite_summary(
title: str,
transcript_file: str,
model: str = "mistral:latest",
output_json: Optional[str] = None,
prompt_template: Optional[str] = None,
) -> dict:
"""
Regenerate a summary from an existing transcript file using the specified model.
@@ -648,7 +823,7 @@ def rewrite_summary(title: str, transcript_file: str, model: str = "mistral:late
with open(transcript_file, 'r', encoding='utf-8') as f:
transcript_text = f.read()
debug_print(f"Rewriting summary using model {model} for {transcript_file}")
summary_text = summarize_with_ollama(title, transcript_text, model)
summary_text = summarize_with_ollama(title, transcript_text, model, prompt_template)
meta = {'summary': summary_text}
if output_json:
with open(output_json, 'w', encoding='utf-8') as f:
@@ -669,17 +844,25 @@ def main():
help="Ollama model to use for summarization (default: mistral:latest)")
parser.add_argument('--transcript-file', type=str, default=None,
help="Path to an existing transcript file; when provided the script will skip transcription and only generate a summary.")
parser.add_argument('--prompt-template', type=str, default=None,
help="Prompt template for the summary LLM call.")
parser.add_argument('--prompt-template-file', type=str, default=None,
help="Path to a text file containing the prompt template.")
args = parser.parse_args()
use_whisper = not args.no_ai
prompt_template = args.prompt_template
if args.prompt_template_file:
with open(args.prompt_template_file, 'r', encoding='utf-8') as f:
prompt_template = f.read()
try:
# If a transcript file is provided, skip the normal processing and only rewrite summary
if args.transcript_file:
vid, title, _ = fetch_video_metadata(args.url)
meta = rewrite_summary(title, args.transcript_file, args.model, args.output_json)
meta = rewrite_summary(title, args.transcript_file, args.model, args.output_json, prompt_template)
else:
meta = process_video(args.url, use_whisper, args.model, args.output_json)
meta = process_video(args.url, use_whisper, args.model, args.output_json, prompt_template)
# If no JSON output specified, print metadata as JSON to stdout
if not args.output_json:
print(json.dumps(meta, ensure_ascii=False, indent=2))
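The two new CLI flags interact: when both are given, the file contents replace the inline template. A minimal sketch of that precedence, assuming only the two flags shown in the diff; `resolve_prompt_template` is a hypothetical wrapper and the temporary file exists only for the demonstration.

```python
import argparse
import os
import tempfile


def resolve_prompt_template(argv):
    """Return the template text; --prompt-template-file wins over --prompt-template."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--prompt-template", type=str, default=None)
    parser.add_argument("--prompt-template-file", type=str, default=None)
    args = parser.parse_args(argv)
    template = args.prompt_template
    if args.prompt_template_file:
        with open(args.prompt_template_file, "r", encoding="utf-8") as f:
            template = f.read()
    return template


with tempfile.TemporaryDirectory() as tmp_dir:
    template_path = os.path.join(tmp_dir, "template.txt")
    with open(template_path, "w", encoding="utf-8") as f:
        f.write("From file: {transcript}")
    both = resolve_prompt_template(
        ["--prompt-template", "inline", "--prompt-template-file", template_path]
    )
inline_only = resolve_prompt_template(["--prompt-template", "inline"])
neither = resolve_prompt_template([])
```

Passing neither flag yields `None`, which the summarizer treats as a request for the default template.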