A large language model is, at its core, just two things: a file of model parameters and the code that runs them. Taking Llama 2 70B as an example, the parameters are the neural network's weights, while the inference code can be as little as a few lines of C and runs locally with no internet connection. The model works by training on a 'lossy compression' of vast amounts of internet text, learning to predict the next word in a sequence — and in doing so mathematically captures a surprising amount of knowledge about the world.
Pretraining and fine-tuning: building an assistant
Building an LLM happens in two stages. Pretraining burns millions of dollars of GPU time to process terabytes of text, compressing internet-scale knowledge into the weights. Fine-tuning then uses a smaller, high-quality set of question-answer conversations — usually human-labeled — to turn that raw 'document generator' into a genuinely useful assistant. A third stage, reinforcement learning from human feedback (RLHF), pushes usability and alignment further still.
Tools and multimodality: the LLM as an expert system
Modern LLMs are no longer simple text generators; they increasingly act like the kernel of an operating system that orchestrates external tools. By calling a browser to search, a Python interpreter to compute, and multimodal abilities to handle vision and audio, the model can tackle complex, composite tasks. The analogy to an OS is deliberate — the context window serves as working memory, much like process management, dramatically increasing the model's practical usefulness.
The future: system-2 thinking and self-improvement
The frontier of LLM progress lies in giving models a human-like 'system-2' mode of thinking — letting them reason and iterate over a tree of possibilities on hard problems rather than answering on pure intuition. Inspired by AlphaGo, a major open question is how a language model might automatically self-improve within a narrow domain. Through repeated training iterations, models may eventually surpass human performance on specific tasks and spawn a rich ecosystem of customized variants.
As LLMs spread, security threats grow more prominent. The talk breaks down jailbreak attacks (using role-play or encodings to slip past safety limits), prompt injection (hijacking the model with hidden instructions), and data poisoning (planting trigger words via the training data). These attacks show that LLM security is the same cat-and-mouse game traditional software faces, and underscore how important evaluation and defense research are in this new computing paradigm.
Highlights
🧠 An LLM is two files: a large parameter file plus a small amount of code to run it — a lossy "compression" of a huge slice of the internet.
⚙️ Pretraining builds raw knowledge by next-token prediction; fine-tuning on high-quality Q&A turns that into an assistant that follows instructions.
🛠️ With tools (browsers, code interpreters, calculators) and multiple modalities (vision, audio), the LLM grows from a chatbot into an expert system.
🔮 The frontier is "system-2" thinking — trading more compute at inference time for more deliberate reasoning — plus self-improvement loops.
🛡️ New attack surfaces (jailbreaks, prompt injection, data poisoning) make LLM security an ongoing cat-and-mouse game.
Summary
This talk gives an accessible tour of how large language models (LLMs) work. It
frames a model as essentially a next-word predictor — two files, the
parameters and the code to run them — and walks from pretraining (compressing
the internet) through fine-tuning (turning a raw predictor into a helpful
assistant), then on to tools, multimodality, the road ahead, and security.
Terminology
Pretraining: the expensive stage where a model learns general knowledge by predicting the next token across a massive corpus.
Fine-tuning: a cheaper stage that aligns the pretrained model to behave as a helpful, instruction-following assistant.
System-2 thinking: spending extra inference-time compute to reason more carefully, rather than answering in a single pass.
Prompt injection: an attack that hides adversarial instructions in content the model reads, hijacking its behavior.