Backend
- /chat: support streaming via StreamingResponse; save the full reply after the stream ends (see the sketch after this list). Non-stream path unchanged.
- ChatRequest: add stream flag (default false).
- GenerateTitleRequest: add a model field and use it instead of the hardcoded llama3.
- ollama_client.chat_stream(): new async generator that parses Ollama's streaming JSON (both formats).
- Remove response_model from /chat to allow streaming; non-stream still returns { reply }.
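A minimal sketch of how these backend pieces fit together, assuming FastAPI; ChatRequest's exact fields and the save_reply persistence hook are stand-ins for the real ones:

from typing import Dict, List

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

import ollama_client

app = FastAPI()


class ChatRequest(BaseModel):
    model: str
    messages: List[Dict[str, str]]
    stream: bool = False  # new flag, default false


def save_reply(text: str) -> None:
    ...  # hypothetical persistence hook, not the real one


@app.post("/chat")  # no response_model, so both reply shapes are allowed
async def chat(req: ChatRequest):
    if not req.stream:
        reply = await ollama_client.chat(req.model, req.messages)
        return {"reply": reply}  # non-stream shape unchanged

    async def gen():
        parts = []
        async for chunk in ollama_client.chat_stream(req.model, req.messages):
            parts.append(chunk)
            yield chunk
        save_reply("".join(parts))  # persist the full reply after the stream ends

    return StreamingResponse(gen(), media_type="text/plain")
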
Electron
- Open external links in system browser (setWindowOpenHandler, shell.openExternal).
- New IPC: update-settings, open-external-link.
- Set minimum window size; preload exposes updateSettings and openExternalLink.
Frontend (React)
- Streaming UI with live chunking; sticky-bottom scrolling only while the user is at the bottom.
- Per-session scroll persistence and robust restore.
- New-message tip to jump to the latest reply when scrolled up.
- Disable Send while a request is in flight; show a spinner.
- General Settings: stream output toggle; propagate model/stream changes.
- Apply color scheme at boot; extract colorSchemes helper.
- Sidebar UX tweaks and unread badges.
Markdown/rendering
- Code blocks: language title bar and wrapper.
- Tables: GitHub-style parsing, per-cell borders, rounded wrapper, spacing, alignment.
- Headings: remove blank line after h1-h4.
- <hr>: handle horizontal rules after tables; strip the whitespace that follows them.
- Links: target=_blank with icon and URL tooltip.
Styles
- Add styles for code/table wrappers, new-message tip, toggle, spinner; hover/active vars; narrower sidebar.
API notes / breaking changes
- /chat accepts stream=true and returns text/plain streamed chunks.
- generate-title now requires a model.
- Non-stream /chat response shape unchanged.
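For reference, consuming the streamed /chat from Python could look like this (a sketch; the host/port and model name are assumptions):

import asyncio

import httpx


async def main():
    payload = {
        "model": "llama3",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    }
    async with httpx.AsyncClient(timeout=600.0) as client:
        async with client.stream("POST", "http://127.0.0.1:8000/chat", json=payload) as r:
            r.raise_for_status()
            async for chunk in r.aiter_text():  # text/plain chunks, not JSON
                print(chunk, end="", flush=True)


asyncio.run(main())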
ollama_client.py:

import httpx
import json
from typing import Dict, Any, List, AsyncGenerator

OLLAMA_URL = "http://127.0.0.1:11434"


async def list_models() -> Dict[str, Any]:
    async with httpx.AsyncClient(timeout=30.0) as client:
        r = await client.get(f"{OLLAMA_URL}/api/tags")
        r.raise_for_status()
        data = r.json()
        # Normalize to a simple list of model names
        models = [m.get("name") for m in data.get("models", [])]
        return {"models": models}

async def chat(model: str, messages: List[Dict[str, str]]) -> str:
    payload = {
        "model": model,
        "messages": messages,
        "stream": False,
    }
    async with httpx.AsyncClient(timeout=600.0) as client:
        r = await client.post(f"{OLLAMA_URL}/api/chat", json=payload)
        r.raise_for_status()
        data = r.json()
        # Ollama usually wraps the reply in a single "message" object;
        # fall back to other shapes so a format change does not break us.
        try:
            return data["message"]["content"]
        except (KeyError, TypeError):
            # Some responses carry a "messages" list instead
            msgs = data.get("messages") or []
            if msgs:
                return msgs[-1].get("content", "")
            return data.get("content", "")

async def chat_stream(model: str, messages: List[Dict[str, str]]) -> AsyncGenerator[str, None]:
    payload = {
        "model": model,
        "messages": messages,
        "stream": True,
    }
    async with httpx.AsyncClient(timeout=600.0) as client:
        async with client.stream("POST", f"{OLLAMA_URL}/api/chat", json=payload) as r:
            r.raise_for_status()
            # Ollama streams one JSON object per line
            async for line in r.aiter_lines():
                if not line:
                    continue
                try:
                    chunk = json.loads(line)
                except json.JSONDecodeError:
                    continue  # ignore partial or invalid JSON lines
                if "content" in chunk:  # flat {"content": ...} shape
                    yield chunk["content"]
                elif "message" in chunk and "content" in chunk["message"]:  # {"message": {"content": ...}} shape
                    yield chunk["message"]["content"]
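
For reference, a minimal way to exercise chat_stream from a script (a sketch; the model name is just an example and a local Ollama instance on the default port is assumed):

import asyncio

from ollama_client import chat_stream


async def demo():
    messages = [{"role": "user", "content": "Say hi in one sentence."}]
    async for token in chat_stream("llama3", messages):  # example model name
        print(token, end="", flush=True)
    print()


if __name__ == "__main__":
    asyncio.run(demo())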