Build Your AI Second Brain from Videos and Podcasts: The 4-Step PKM Method (2026)

Turn YouTube and podcasts into a working second brain. The 4-step Capture-Distill-Connect-Express method, with deep Notion / Obsidian / Readwise integrations via BibiGPT — trusted by 1M+ users.

BibiGPT Team


As of 2026-04-28 | Built for Notion / Obsidian / Readwise / Cubox users

TL;DR: The real bottleneck of a Second Brain isn't "how much you save" — it's "how fast you digest." Podcasts, YouTube, lectures, and Bilibili are 90% of the modern knowledge worker's raw material, but classic PKM frameworks (PARA, Zettelkasten) were designed for text. This guide gives the 2026 video-first version: Capture → Distill → Connect → Express, and shows how BibiGPT plugs that pipeline into your existing Notion / Obsidian system.

Try it: paste a video link

Supports YouTube, Bilibili, Douyin, Xiaohongshu, and 30+ other platforms

Why Classic PKM Breaks in the Video Era

  • Text scans fast, video doesn't — you can't "skim" a 1-hour podcast
  • Note apps only accept text input — videos and audio are black boxes, unsearchable from the inside
  • "Watched once, forgotten by tomorrow" is the default
  • Format silos between Notion, Obsidian, and Readwise — no cross-vault search

Methodologies like PARA (Tiago Forte) and Zettelkasten (Niklas Luhmann) answered "how to organize text," but they never answered "how does video enter the system." AI is the patch.

The 4-Step Method at a Glance

| Step | Goal | Output |
|---|---|---|
| 1. Capture | Pull videos / audio into the system | Transcript, link, metadata |
| 2. Distill | Turn raw material into knowledge | Summary, key points, mind map |
| 3. Connect | Wire the knowledge into your second brain | Backlinks, tags, indexes |
| 4. Express | Make knowledge produce output | Articles, slides, flashcards |

See BibiGPT's AI summary in action

Let's build GPT: from scratch, in code, spelled out


Andrej Karpathy walks through building a tiny GPT in PyTorch — tokenizer, attention, transformer block, training loop.

Summary

Andrej Karpathy spends two hours rebuilding a tiny but architecturally faithful version of GPT in a single Jupyter notebook. He starts from a 1MB Shakespeare text file with a character-level tokenizer, derives self-attention from a humble running average, layers in queries/keys/values, scales up to multi-head attention, and stacks the canonical transformer block. By the end the model produces uncanny pseudo-Shakespeare and the audience has a complete mental map of pretraining, supervised fine-tuning, and RLHF — the three stages that turn a next-token predictor into ChatGPT.

Highlights

  • 🧱 Build the dumbest version first. A bigram baseline gives a working training loop and a loss number to beat before any attention is introduced.
  • 🧮 Self-attention rederived three times. Explicit loop → triangular matmul → softmax-weighted matmul, so the formula clicks instead of being memorised.
  • 🎯 Queries, keys, values are just learned linear projections. Once you see them as that, the famous attention diagram stops being magical.
  • 🩺 Residuals + LayerNorm are what make depth trainable. Karpathy shows how each one earns its place in a transformer block.
  • 🌍 Pretraining is only stage one. The toy model is what we built; supervised fine-tuning and RLHF are what turn it into an assistant.

Questions

  • Why a character-level tokenizer instead of BPE? To keep the vocabulary tiny (65 symbols) and the focus on the model. Production GPTs use BPE for efficiency, but the architecture is identical.
  • Why scale attention scores by the square root of the head dimension? It keeps the variance of the scores roughly constant as the head dimension grows, so the softmax does not collapse to a one-hot distribution.
  • What separates this toy model from ChatGPT? Scale (billions vs. tens of millions of parameters), data, and two extra training stages: supervised fine-tuning on conversation data and reinforcement learning from human feedback.

Key Terms

  • Bigram model: A baseline language model that predicts the next token using only the previous token, implemented as a single embedding lookup.
  • Self-attention: A mechanism where each token attends to all earlier tokens via softmax-weighted dot products of query and key projections.
  • LayerNorm (pre-norm): Normalisation applied before each sublayer in modern transformers; keeps activations well-conditioned and lets you train deeper.
  • RLHF: Reinforcement learning from human feedback — the alignment stage that nudges a pretrained model toward responses humans actually prefer.

Want to summarize your own videos?

BibiGPT supports YouTube, Bilibili, Douyin, and 30+ other platforms: get an AI summary in one click

Try BibiGPT for free

Step 1 — Capture: Turn "watched/listened" into "processable"

1.1 Sources

1.2 Be selective

PKM Law #1: More capture is not better capture.

Not every video deserves a place in your second brain. Use PARA's 4 categories: only capture material relevant to a current Project, long-term Area, future Resource, or worth Archiving. Pure entertainment? Watch and let it go.

Step 2 — Distill: Transcripts to consumable knowledge

2.1 Three layers of structured output

BibiGPT outputs three layers by default:

  1. 30-second summary — decides "should I keep going"
  2. Section-level highlights — full content in 5-10 minutes
  3. Full transcript with timestamps — drill in any time

This maps directly onto Tiago Forte's Progressive Summarization — match effort to need.
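As a concrete shape for those three layers, here is a minimal Python sketch of one video's record. The field names (such as `summary_30s`) are assumptions for illustration, not BibiGPT's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class TranscriptSegment:
    start_sec: float   # where this segment begins in the recording
    text: str          # verbatim transcript text

@dataclass
class VideoNote:
    url: str
    summary_30s: str                                                   # layer 1: go/no-go decision
    highlights: list[str] = field(default_factory=list)                # layer 2: section-level points
    transcript: list[TranscriptSegment] = field(default_factory=list)  # layer 3: drill-down detail

note = VideoNote(
    url="https://www.youtube.com/watch?v=kCc8FmEb1nY",
    summary_30s="Karpathy rebuilds a tiny GPT from scratch in one notebook.",
    highlights=["Bigram baseline first", "Self-attention derived three times"],
    transcript=[TranscriptSegment(0.0, "Hi everyone.")],
)
```

Each layer stays cheap to read on its own: you consult layer 3 only after layers 1 and 2 have earned your attention.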

2.2 Mind maps — the visual skeleton

Video naturally fits mind maps: topic → sub-topic → example. BibiGPT generates them in one click; export to SVG / PNG / Markmap, drop straight into Notion or Obsidian Canvas.
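Markmap renders a plain Markdown outline as an interactive mind map, so exporting one only requires turning the summary's hierarchy into nested Markdown. A minimal sketch, where the `outline` dict is a hypothetical intermediate format, not BibiGPT's export:

```python
def to_markmap(node: dict, depth: int = 0) -> str:
    """Render a nested {title, children} outline as Markmap-flavoured
    Markdown: a heading for the root, indented bullets below it."""
    lines = []
    if depth == 0:
        lines.append(f"# {node['title']}")
    else:
        lines.append("  " * (depth - 1) + f"- {node['title']}")
    for child in node.get("children", []):
        lines.append(to_markmap(child, depth + 1))
    return "\n".join(lines)

outline = {
    "title": "Let's build GPT",
    "children": [
        {"title": "Tokenizer", "children": [{"title": "Character-level"}]},
        {"title": "Self-attention"},
    ],
}
print(to_markmap(outline))
```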

2.3 AI chat — pull, don't read

The best distillation is active questioning, not passive reading. BibiGPT's AI video chat with source tracing lets you ask:

  • "What specific number did the guest cite?"
  • "Where does the guest disagree with last episode's guest?"
  • "If I'm building a SaaS, how does this argument apply?"

Answers come with click-through timestamps. This is the line between "knowledge work" and "knowledge lookup."
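Under the hood, a source-traced answer is just text plus a deep link into the video. A sketch of the timestamp-to-link step using YouTube's standard `t=` URL parameter; the `cite` helper is illustrative, not BibiGPT's API:

```python
def youtube_deep_link(url: str, seconds: float) -> str:
    """Append a start-time parameter so a citation jumps straight
    to the quoted moment (YouTube's ?t= / &t= convention)."""
    sep = "&" if "?" in url else "?"
    return f"{url}{sep}t={int(seconds)}s"

def cite(answer: str, url: str, seconds: float) -> str:
    """Format an answer with a clickable mm:ss timestamp link."""
    m, s = divmod(int(seconds), 60)
    return f"{answer} [{m:02d}:{s:02d}]({youtube_deep_link(url, seconds)})"

print(cite("Queries, keys, and values are learned linear projections.",
           "https://www.youtube.com/watch?v=kCc8FmEb1nY", 3125))
```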

Step 3 — Connect: Wire it into your second brain

3.1 Pick your note system

| System | Recommended workflow | Reference |
|---|---|---|
| Notion | Auto-archive each summary as a database row via Notion API | Notion + BibiGPT workflow |
| Obsidian | Export Markdown to vault with auto backlinks | Obsidian + BibiGPT video notes |
| Readwise | Auto-sync highlights | YouTube → Readwise |
| Cubox | Send summary + outline + timestamps via Cubox API | Configure Cubox API in settings |
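For the Notion route, each summary becomes one row created through Notion's public `POST /v1/pages` endpoint. A minimal sketch assuming an integration token and a database whose properties are named `Name`, `URL`, and `Summary`; adjust both placeholders and property names to your own setup:

```python
import json
import urllib.request

NOTION_TOKEN = "secret_xxx"       # assumption: your Notion integration token
DATABASE_ID = "your-database-id"  # assumption: the target database's ID

def build_page_payload(title: str, url: str, summary: str) -> dict:
    """Shape a video note as a row for Notion's POST /v1/pages endpoint.
    Property names must match your database schema exactly."""
    return {
        "parent": {"database_id": DATABASE_ID},
        "properties": {
            "Name": {"title": [{"text": {"content": title}}]},
            "URL": {"url": url},
            # Notion caps a rich_text block at 2000 characters
            "Summary": {"rich_text": [{"text": {"content": summary[:2000]}}]},
        },
    }

def archive_to_notion(title: str, url: str, summary: str) -> None:
    """Send the row to Notion; raises urllib.error.HTTPError on failure."""
    req = urllib.request.Request(
        "https://api.notion.com/v1/pages",
        data=json.dumps(build_page_payload(title, url, summary)).encode(),
        headers={
            "Authorization": f"Bearer {NOTION_TOKEN}",
            "Notion-Version": "2022-06-28",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    urllib.request.urlopen(req)
```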

3.2 Tags

PKM Law #2: A standalone note has no value — connections do.

Tag every video note with at least three tags:

  • Topic tag (e.g., #AI-Agents #Podcasts #Interviews)
  • Author / source tag (e.g., #Lex-Fridman #Lenny)
  • Status tag (e.g., #to-digest #applied #cited)

Obsidian users add [[]] backlinks so each new video joins your existing graph.
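For the Obsidian route, the export is just a Markdown file with YAML front matter and wiki-links. A sketch of what such a note writer could look like; the layout below is an assumption, not BibiGPT's exact export format:

```python
from pathlib import Path
import tempfile

def write_video_note(vault: Path, title: str, url: str, summary: str,
                     tags: list[str], backlinks: list[str]) -> Path:
    """Write a Markdown note with YAML front matter (tags) plus
    [[wiki-links]] so Obsidian indexes it into the graph."""
    links = " ".join(f"[[{b}]]" for b in backlinks)
    body = (
        "---\n"
        f"tags: [{', '.join(tags)}]\n"
        f"source: {url}\n"
        "---\n\n"
        f"# {title}\n\n{summary}\n\nRelated: {links}\n"
    )
    path = vault / f"{title}.md"
    path.write_text(body, encoding="utf-8")
    return path

# demo in a throwaway vault directory
note_path = write_video_note(
    Path(tempfile.mkdtemp()),
    "Lets build GPT",
    "https://www.youtube.com/watch?v=kCc8FmEb1nY",
    "Karpathy rebuilds a tiny GPT from scratch.",
    ["AI-Agents", "to-digest"],
    ["Transformers MOC"],
)
```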

3.3 Index notes

Build a monthly index — one list of every video that month + a one-line summary + jump links. This is the video version of Maps of Content (MOC).
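A monthly index is mechanical enough to script. A sketch that builds the MOC note from (title, one-liner) pairs; the file naming is an arbitrary choice:

```python
from pathlib import Path
import tempfile

def build_monthly_index(vault: Path, month: str,
                        notes: list[tuple[str, str]]) -> Path:
    """Create a Map of Content note: one wiki-link plus a one-line
    summary per video watched that month."""
    lines = [f"# {month} Video Index", ""]
    lines += [f"- [[{title}]]: {one_liner}" for title, one_liner in notes]
    path = vault / f"{month} Video Index.md"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path

index_path = build_monthly_index(
    Path(tempfile.mkdtemp()), "2026-04",
    [("Lets build GPT", "tiny GPT from scratch"),
     ("AI Agents panel", "three builders on agent reliability")],
)
```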

Step 4 — Express: Make knowledge ship

4.1 Output formats

| Output | BibiGPT feature |
|---|---|
| Newsletter / blog | Video-to-article |
| Slides | One-click PPT from summary |
| Anki flashcards | Flashcard export |
| Cross-video synthesis | Collection summary |
| Multi-platform repurposing | Video → article → short-video script |
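Anki's text import accepts tab-separated front/back pairs, so flashcard export reduces to serialization. A sketch; the card contents are examples drawn from the Karpathy video above:

```python
import csv
import io

def to_anki_tsv(cards: list[tuple[str, str]]) -> str:
    """Serialize (front, back) pairs as tab-separated text that
    Anki's File > Import accepts as Basic notes."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(cards)
    return buf.getvalue()

cards = [
    ("Why divide attention scores by sqrt(d_k)?",
     "Keeps score variance constant so the softmax stays diffuse."),
]
print(to_anki_tsv(cards))
```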

4.2 Feynman test

PKM Law #3: Knowledge isn't digested until it leaves your head.

Apply the Feynman Technique — re-explain in your own words, find the gaps. BibiGPT's AI chat is a natural Feynman partner: paste your re-explanation and ask, "What did I get wrong?"

See our deep dives: Feynman + Bilibili learning loop and video-learning science system.

4.3 Cross-video synthesis (where compounding lives)

The compounding effect of PKM comes from connecting multiple videos. BibiGPT's collection summary turns 10 episodes on a single topic into one synthesis:

"Compare the 10 AI-Agent videos I watched last month — what are their core arguments, where do they disagree, where are the actionable bets?"

That's PKM's multiplier — single videos have limited value; ten connected videos generate insight no single watch can.
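Mechanically, a collection summary is one prompt assembled from N per-video summaries so a single LLM call can compare them. A hypothetical sketch of that assembly step, not BibiGPT's actual prompt:

```python
def synthesis_prompt(topic: str, summaries: list[str]) -> str:
    """Number each per-video summary and wrap them in one
    compare-and-contrast instruction."""
    numbered = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(summaries))
    return (
        f"Below are summaries of {len(summaries)} videos about {topic}.\n"
        "Compare their core arguments, note where they disagree, "
        "and list the actionable bets.\n\n" + numbered
    )

print(synthesis_prompt("AI Agents", [
    "Agents need tool use plus memory.",
    "Reliability, not capability, is the bottleneck.",
]))
```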

Cross-Walk Against Other Methodologies

MethodText-era answerVideo-era patch (via BibiGPT)
PARAFiles into P/A/R/AVideo notes same buckets + searchable transcripts
ZettelkastenAtomic notes + backlinksChapter summaries = atomic notes + timestamp citations
Building a Second BrainCODE: Capture-Organize-Distill-ExpressSame four moves with video as raw layer
Linking Your ThinkingMOCsMonthly video-index notes = video MOCs
Progressive Summarization4 layers of bolding30-sec / section / transcript + AI chat

FAQ

Q1: I already have Notion + Readwise. Where does BibiGPT fit?

A: BibiGPT is your raw + distill layer. Capture, transcript, summary, mind map all happen in BibiGPT; final archive and linkage stay in your Notion / Readwise.

Q2: Do I run all four steps for every video?

A: No — apply PARA filtering. Only run the full pipeline for project-/area-/resource-relevant videos.

Q3: How much does cross-video synthesis cost?

A: Plus / Pro tiers include collection summaries, billed by video count. See pricing.

Q4: How do I handle local podcasts?

A: Use local audio-to-text: drag the file in. For sensitive content, enable Local Privacy Mode.

Q5: How is this different from Whisper + ChatGPT?

A: Whisper + ChatGPT gives you transcripts and one-shot summaries. They can't do mind maps, source-traced chat, collection synthesis, knowledge-tool integrations, flashcards, or video-to-article. BibiGPT is a PKM pipeline, not a one-trick tool.

Q6: How do I avoid information overload?

A: That's exactly why PKM has selectivity rules. PARA filtering + monthly index notes + cross-video synthesis are the three gates against overload.

Closing: The 2026 Shape of a Second Brain

A Second Brain isn't "store every video you watched." It's "make every watched video participate in some future decision or output." AI compresses "watched → digested → reused" into 10 minutes. BibiGPT's role is simple: make video and audio first-class citizens in your note system.

Start building your video-first second brain:

Try BibiGPT now

Want to experience these powerful features? Visit BibiGPT and start your smart audio and video summarization journey!

Get started
