Describe what an LLM does
Build a working mental model of large language models — one level deeper than "it predicts language." Tokens, training, context windows, and why Claude says what it says, without the engineering lecture.
Start here
In plain terms, explain what Claude is doing when it answers — building a response one piece at a time from patterns it learned in training.
Tell a coworker why long prompts, long files, and long chats all consume the same finite working space — and why that changes how you should use Claude.
Describe what training on a large body of text actually gave Claude, and why a knowledge cutoff is not the same thing as memory.
The 3 ideas that unlock the rest
The vendor fed Claude a vast body of text and code over months of compute. Claude did not memorize that material — it learned statistical patterns about how language is used. Training has a cutoff date, after which the model knows nothing new unless you provide it.
When you send a prompt, Claude builds the reply one piece at a time, predicting useful next text from your prompt plus everything it learned. It is not retrieving a stored answer — it is composing one in the moment.
Everything in the conversation — system instructions, your prompt, attached files, prior turns, Claude's own running output — sits in a single working space called the context window. It is finite, and it is what Claude can actually "see" when it predicts the next word.
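To make "one piece at a time" concrete, here is a deliberately tiny sketch in Python. It is a toy, not how Claude is built: the "training" just counts which word follows which in a two-sentence corpus, and the names `CORPUS`, `train`, and `generate` are made up for illustration. A real model learns far richer patterns from far more text, but the loop has the same shape: look at what is there so far, pick a plausible next piece, repeat.

```python
import random
from collections import defaultdict

# Tiny stand-in for training data; a real model learns from vastly more text.
CORPUS = "the cat sat on the mat . the dog sat on the rug ."


def train(text):
    """Record which word follows which. These counts are the learned 'patterns'."""
    follows = defaultdict(list)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current].append(nxt)
    return follows


def generate(follows, prompt, max_words=8, seed=0):
    """Compose a reply one word at a time from the prompt plus learned patterns."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    output = prompt.split()
    for _ in range(max_words):
        # Look only at the most recent word and ask what tended to follow it.
        candidates = follows.get(output[-1]) if output else None
        if not candidates:  # nothing learned about this word, so stop
            break
        output.append(rng.choice(candidates))
    return " ".join(output)


model = train(CORPUS)
print(generate(model, "the dog"))
```

Nothing in `generate` looks up a stored sentence. Each word is chosen in the moment from what came before, which is why the same prompt can produce text that never appeared in the corpus.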
Inside a conversation
Always on: The standing rules Claude operates under, set by the product (Claude.ai, Cowork, an enterprise integration). Small but always present.
You control: Everything you type and every file you attach. A short question costs almost nothing. A 40-page PDF can eat most of the budget on its own.
Grows silently: Every prior turn, yours and Claude's, stays in the window as the chat grows. This is why long chats start to drift: the model is reading more and more material each turn.
Mechanics playbook
Put the most important context, instructions, and questions at the top of your prompt. Claude reads like a person skimming on deadline — the headline goes first.
Trim ruthlessly before sending. Extra paragraphs of background, copy-pasted boilerplate, and "just in case" detail all crowd out the answer.
If a document is what you actually want Claude to work from, attach it. Paraphrasing a 14-page disclosure in two sentences gives Claude two sentences to reason from.
When answers start drifting — wrong tone, contradictions, repeated mistakes — open a clean chat and paste only the important context. You're resetting the budget.
On high-stakes work, ask Claude to walk through its thinking and cite which part of the source it used. Hidden reasoning is harder to verify than visible reasoning.
Prompt upgrade
Tell me about our underwriting policy on appraisal gaps.
Mechanics-aware prompt
Attached is our 2026 underwriting guide (PDF). On page 14, the section on appraisal gaps. In plain English, what are the options when an appraisal comes in $8,000 low on a conventional loan? Quote the exact line for each option.
Mechanics pay off here
Attaching a long file and putting the right question up front gets a far better answer than pasting the document text into a long prompt.
Working through one problem in a tight, focused chat — instead of stacking unrelated questions on top of it — keeps the context clean.
When comparing two documents: attaching both, labeling them clearly ("Source A:", "Source B:"), and asking a specific comparison question, instead of pasting both into one wall of text.
Pulling names, dates, or numbers out of unstructured text. The cleaner the prompt and the leaner the source, the more reliable the extraction.
Quality control
Build the mental model
Pick three. Save anything that produces a finding you'll remember.
You've finished this module when you can explain — in two sentences each — what training gave Claude, what tokens are, and why long chats drift. Bonus credit if you can describe one habit you'll change because of it.
Context visual
A fresh chat: a small system prompt, your one short question, and almost all the room left for Claude to think and reply.
Every part of a conversation spends from the same budget. When the budget is mostly spent on an attachment or a long history, Claude has less room to reason and to reply.
When the budget feels tight — drift, contradictions, forgetful answers — start a clean chat with just the context that matters. You've reset the working space.
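The shared budget can be sketched as simple arithmetic. The Python below is a rough illustration under two loudly stated assumptions: about 4 characters per token (a common rule of thumb for English text, not an exact count) and a made-up window size of 200,000 tokens that does not describe any particular product. The function names are hypothetical.

```python
# Assumption: roughly 4 characters per token, a rule of thumb for English text.
CHARS_PER_TOKEN = 4
# Assumption: an illustrative window size, not a real product limit.
WINDOW_TOKENS = 200_000


def estimate_tokens(text):
    """Very rough token estimate from character count."""
    if not text:
        return 0
    return max(1, len(text) // CHARS_PER_TOKEN)


def budget_report(parts, window=WINDOW_TOKENS):
    """Show how much of the window each part spends and what is left over."""
    used = {name: estimate_tokens(text) for name, text in parts.items()}
    used["remaining"] = max(0, window - sum(used.values()))
    return used


# A fresh chat: small system prompt, one short question, no history yet.
fresh = budget_report({
    "system": "x" * 2_000,
    "prompt": "x" * 200,
    "history": "",
})
print(fresh)

# The same chat after a large attachment and many turns.
crowded = budget_report({
    "system": "x" * 2_000,
    "prompt": "x" * 200,
    "attachment": "x" * 400_000,
    "history": "x" * 300_000,
})
print(crowded)
```

Every entry draws from the same `window`, so a large attachment or a long history shrinks `remaining` no matter how short the latest question is. Starting a clean chat amounts to calling `budget_report` again with an empty history.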