steipete/summarize

Daily Information Dashboard · 2026-03-08
Category: Open-source project
Source: github_search
Score: 2
Published: 2026-03-08T01:44:11Z

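AI summary

The open-source GitHub project summarize has published a 0.12.0 preview. It offers summaries of web pages, files, and audio/video through a Chrome/Firefox sidebar and a CLI, and adds chat, YouTube slide OCR, and other features. Its significance lies in bringing multimodal content summarization and local streaming processing into one unified toolchain.
#GitHub #repo #open-source #CLI #YouTube #OCR

Content excerpt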

Summarize 📝 — Chrome Side Panel + CLI

Fast summaries from URLs, files, and media. Works in the terminal, a Chrome Side Panel and Firefox Sidebar.

**0.12.0 preview (unreleased):** this README reflects the upcoming release.
0.12.0 preview highlights (most interesting first)
Chrome Side Panel **chat** (streaming agent + history) inside the sidebar.
**YouTube slides**: screenshots + OCR + transcript cards, timestamped seek, OCR/Transcript toggle.
Media-aware summaries: auto‑detect video/audio vs page content.
Streaming Markdown + metrics + cache‑aware status.
CLI supports URLs, files, podcasts, YouTube, audio/video, PDFs.
Feature overview
URLs, files, and media: web pages, PDFs, images, audio/video, YouTube, podcasts, RSS.
Slide extraction for video sources (YouTube/direct media) with OCR + timestamped cards.
Transcript-first media flow: published transcripts when available, then Groq/ONNX/whisper.cpp/AssemblyAI/Gemini/OpenAI/FAL transcription fallback when not.
Streaming output with Markdown rendering, metrics, and cache-aware status.
Local, paid, and free models: OpenAI‑compatible local endpoints, paid providers, plus an OpenRouter free preset.
Output modes: Markdown/text, JSON diagnostics, extract-only, metrics, timing, and cost estimates.
Smart default: if content is shorter than the requested length, we return it as-is (use --force-summary to override).
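
The smart default above can be pictured as a simple length comparison. The sketch below is illustrative only: `should_summarize` and its arguments are hypothetical names, not part of summarize, and the real tool may measure length differently (for example in tokens rather than characters).

```shell
# Hypothetical helper, not part of summarize: decide whether a summary is needed.
# $1 = content length, $2 = requested length, $3 = "force" to override (optional)
should_summarize() {
  if [ "${3:-}" = "force" ]; then
    echo "summarize"        # mirrors the effect of --force-summary
  elif [ "$1" -lt "$2" ]; then
    echo "return-as-is"     # content is already shorter than the requested length
  else
    echo "summarize"
  fi
}

should_summarize 800 1500          # prints "return-as-is"
should_summarize 800 1500 force    # prints "summarize"
should_summarize 9000 1500         # prints "summarize"
```

In practice the override is just the documented flag: pass --force-summary when a summary is wanted even for short content.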
Get the extension (recommended)

[Screenshot: Summarize extension]

One‑click summarizer for the current tab. Chrome Side Panel + Firefox Sidebar + local daemon for streaming Markdown.

**Chrome Web Store:** Summarize Side Panel

[Screenshot: YouTube slide screenshots (from the browser)]
Beginner quickstart (extension)
1. Install the CLI (choose one):
   - **npm** (cross‑platform): npm i -g @steipete/summarize
   - **Homebrew** (macOS arm64): brew install steipete/tap/summarize
2. Install the extension (Chrome Web Store link above) and open the Side Panel.
3. The panel shows a token + install command. Run it in Terminal:
   - summarize daemon install --token <TOKEN>

Why a daemon/service?
The extension can’t run heavy extraction inside the browser. It talks to a local background service on 127.0.0.1 for fast streaming and media tools (yt‑dlp, ffmpeg, OCR, transcription).
The service autostarts (launchd/systemd/Scheduled Task) so the Side Panel is always ready.

If you only want the **CLI**, you can skip the daemon install entirely.

Notes:
Summarization only runs when the Side Panel is open.
Auto mode summarizes on navigation (incl. SPAs); otherwise use the button.
Daemon is localhost-only and requires a shared token; rerunning summarize daemon install --token <TOKEN> adds another paired browser token instead of invalidating the old one.
Autostart: macOS (launchd), Linux (systemd user), Windows (Scheduled Task).
Tip: configure the free preset by running summarize refresh-free (requires OPENROUTER_API_KEY). Add --set-default to also set model=free as the default.
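
The autostart note above names one mechanism per platform. As a rough illustration, the sketch below maps `uname -s` output to those names. `autostart_mechanism` is a hypothetical helper, not a summarize command, and the Windows patterns assume a Git Bash, MSYS, or Cygwin shell.

```shell
# Hypothetical helper: map a kernel name (as printed by `uname -s`)
# to the autostart mechanism listed in the notes above.
autostart_mechanism() {
  case "$1" in
    Darwin)               echo "launchd" ;;
    Linux)                echo "systemd user" ;;
    MINGW*|MSYS*|CYGWIN*) echo "Scheduled Task" ;;
    *)                    echo "unknown" ;;
  esac
}

# Report the mechanism that would apply on the current machine.
autostart_mechanism "$(uname -s)"
```

This only tells you where to look when checking whether the service is registered; the daemon install command itself handles the registration.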

More:
Step-by-step install: apps/chrome-extension/README.md
Architecture + troubleshooting: docs/chrome-extension.md
Firefox compatibility notes: apps/chrome-extension/docs/firefox.md
Slides (extension)
Select **Video + Slides** in the Summarize picker.
Slides render at the top; expand to full‑width cards with timestamps.
Click a slide to seek the video; toggle **Transcript/OCR** when OCR is significant.
Requirements: yt-dlp + ffmpeg for extraction; tesseract for OCR. Missing tools show an in‑panel notice.
Advanced (unpacked / dev)
1. Build + load the extension (unpacked):
   - Chrome: pnpm -C apps/chrome-extension build
     - chrome://extensions → Developer mode → Load unpacked
     - Pick: apps/chrome-extension/.output/chrome-mv3
   - Firefox: pnpm -C apps/chrome-extension build:firefox
     - about:debugging#/runtime/this-firefox → Load Temporary Add-on
     - Pick: apps/chrome-extension/.output/firefox-mv3/manifest.json
2. Open Side Panel/Sidebar → copy token.
3. Install daemon in dev mode:
   - pnpm summarize daemon install --token <TOKEN> --dev
CLI

[Screenshot: Summarize CLI]
Install

Requires Node 22+.
npx (no install):
npm (global): npm i -g @steipete/summarize
npm (library / minimal deps):
Homebrew (custom tap): brew install steipete/tap/summarize

Homebrew availability depends on the current tap formula for your architecture.
If Homebrew install fails on Intel/x64, use the npm global install above.
Optional local dependencies

Install these if you want media-heavy features:
ffmpeg: required for --slides and many local media/transcription flows
yt-dlp: required for YouTube slide extraction and some remote media flows
tesseract: optional OCR for --slides-ocr
Optional cloud transcription providers:
GROQ_API_KEY
ASSEMBLYAI_API_KEY
GEMINI_API_KEY / GOOGLE_GENERATIVE_AI_API_KEY / GOOGLE_API_KEY
OPENAI_API_KEY
FAL_KEY

macOS (Homebrew):

If --slides is enabled and these tools are missing, Summarize warns and continues without slides.
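
Before relying on --slides, it can help to confirm which of the optional tools and provider keys are actually present. The following is a minimal preflight sketch. The tool and variable names come from the lists above, but `check_media_tools` and `configured_transcription_keys` are hypothetical helpers, not summarize commands.

```shell
# Hypothetical helper: report which optional media tools are available on PATH.
check_media_tools() {
  for tool in ffmpeg yt-dlp tesseract; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: missing"
    fi
  done
}

# Hypothetical helper: print the names of the transcription provider keys
# that are set (non-empty) in the current environment. Values are never printed.
configured_transcription_keys() {
  for name in GROQ_API_KEY ASSEMBLYAI_API_KEY GEMINI_API_KEY \
              GOOGLE_GENERATIVE_AI_API_KEY GOOGLE_API_KEY OPENAI_API_KEY FAL_KEY; do
    # Indirect variable lookup via eval keeps this POSIX sh compatible.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      echo "$name"
    fi
  done
}

check_media_tools
configured_transcription_keys
```

A "missing" line for ffmpeg or yt-dlp corresponds to the case described above, where Summarize warns and continues without slides.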
CLI vs extension
**CLI only:** just install via npm/Homebrew and run summarize ... (no daemon needed).
**Chrome/Firefox extension:** install the CLI **and** run summarize daemon install --token <TOKEN> so the Side Panel can stream results and use local tools.
Quickstart
Inputs

URLs or local paths:

Stdin (pipe content using -):

**Notes:**
Stdin has a 50MB size limit
The - argument tells summarize to read from standard input
Text stdin is treated as UTF-8 text (whitespace-only input is rejected as empty)
Binary stdin is preserved as raw bytes and file type is auto-detected when possible
Useful for piping clipboard content or command output
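
As a rough guard against the stdin limit noted above, a wrapper can check the file size before piping. This is a sketch under stated assumptions: `fits_stdin_limit` and `pipe_if_small` are hypothetical helpers, and the limit is taken as 50 × 1024 × 1024 bytes because the notes do not say whether "MB" means 10^6 or 2^20 bytes.

```shell
# Assumption: 50MB is interpreted as 50 * 1024 * 1024 bytes.
STDIN_LIMIT_BYTES=$((50 * 1024 * 1024))

# Hypothetical helper: succeed if the file is small enough to pipe via "-".
fits_stdin_limit() {
  size=$(wc -c < "$1" | tr -d ' ')
  [ "$size" -le "$STDIN_LIMIT_BYTES" ]
}

# Hypothetical wrapper: pipe the file to summarize only when it fits
# and the summarize binary is available on PATH.
pipe_if_small() {
  if ! fits_stdin_limit "$1"; then
    echo "skip: $1 exceeds the stdin limit" >&2
    return 1
  fi
  if command -v summarize >/dev/null 2>&1; then
    summarize - < "$1"
  else
    echo "summarize not found on PATH" >&2
    return 127
  fi
}
```

For files above the limit, passing the local path as an input instead of piping avoids the stdin restriction.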

YouTube (supports youtube.com and youtu.be):

Podcast RSS (transcribes latest enclosure):

Apple Podcasts episode page:

Spotify episode page (best-effort; may fail for exclusives):
Output length

--length controls how much output we ask for (guideline), not a hard cap.
Presets: short|medium|long|xl|xxl
Character targets: 1500, 20k, 20000
Optional hard cap: --max-output-tokens <count> (e.g. 2000, 2k)
Provider/model APIs still enforce their own maximum output…
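
The examples above use a "k" shorthand (20k for 20000, 2k for 2000). To make that equivalence concrete, here is a small sketch of the expansion. `expand_count` is a hypothetical helper written for illustration. How summarize itself parses these values is not shown in this excerpt, and the sketch handles whole numbers only.

```shell
# Hypothetical helper: expand a trailing "k" into thousands, e.g. 20k -> 20000.
# Values without a suffix are passed through unchanged.
expand_count() {
  case "$1" in
    *k|*K) echo $(( ${1%[kK]} * 1000 )) ;;
    *)     echo "$1" ;;
  esac
}

expand_count 1500    # prints 1500
expand_count 20k     # prints 20000
expand_count 2k      # prints 2000
```

So --length 20k and --length 20000 request the same character target, and --max-output-tokens 2k caps output at 2000 tokens.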