Epistemology of AI Knowledge
What does it mean for a machine to 'know' something? Epistemic frameworks for AI.
Plato had a theory about knowledge. He thought knowledge was justified true belief: you know something when you believe it, it’s true, and you have good reasons for believing it. That definition held up for about 2,400 years, which is a pretty good run for any theory. Then in 1963, Edmund Gettier published a three-page paper that blew it apart.
Gettier showed you could have justified true belief and still not have knowledge. His examples were simple. You believe your coworker owns a Ford because you’ve seen him driving one every day. From that, you deduce “someone in my office owns a Ford.” It turns out your coworker’s Ford is a rental and he actually drives a Toyota. But someone else in the office, someone you know nothing about, does own a Ford. Your belief (“someone in my office owns a Ford”) is true. Your justification (you see your coworker driving one) is reasonable. And yet you clearly don’t know it, because your reasoning path was completely wrong. You arrived at truth by accident.
Why am I telling you about a 1963 epistemology paper? Because large language models are Gettier machines.
The Most Sophisticated Gettier Case Ever Built
GPT-4, Claude, Gemini, Llama: they produce statements that are often true, generated through processes that look like justification, but arrived at through mechanisms that have no reliable connection to truth. When Claude tells you that the Battle of Hastings happened in 1066, it’s not because Claude traced the causal chain of historical evidence. It’s because “1066” appeared near “Battle of Hastings” in enough training documents that the statistical association is strong. The output is true. The process is reliable for this particular fact. But the reason it’s reliable has nothing to do with the model understanding anything about Norman conquests or Anglo-Saxon England.
This matters philosophically because it forces us to ask what we actually mean by “justification.” And the answer, it turns out, depends heavily on which epistemological framework you choose.
Let me walk through the major frameworks and show how each one handles the question of whether AI systems can have knowledge.
Classical JTB and Its Discontents
The justified true belief (JTB) account says knowledge requires three things:
| Component | What it means | Does an LLM satisfy it? |
|---|---|---|
| Belief | The agent holds the proposition as true | Unclear. LLMs don’t have beliefs in any obvious sense. They produce token sequences. |
| Truth | The proposition corresponds to reality | Sometimes. Depends on the query, the training data, and luck. |
| Justification | The agent has adequate reasons | This is where it gets interesting. |
The justification component is the hard part. When a human knows that water boils at 100 degrees Celsius at sea level, their justification might include: personal experience boiling water, understanding of molecular kinetics, reading reliable sources, or trusting a teacher who demonstrated it. Each of these justification paths connects the knower to the fact through some causal or rational chain.
When an LLM “knows” this same fact, the justification would need to be something like: “the statistical patterns in my training data reliably associate this temperature with this phenomenon.” Is that justification? Goldman (1979) would say it depends on whether the process is reliable. Plantinga would say it depends on whether the cognitive faculty is functioning properly in the environment it was designed for. BonJour might ask whether the agent has any internal awareness of why this belief is justified.
The problem is that LLMs satisfy a strange subset of these criteria. The process is reliable for well-attested facts, which looks like it should count for something. But there’s no internal model of why the output is correct, no awareness that the process could fail, and no mechanism for the model to distinguish between cases where its statistical associations track truth and cases where they don’t.
This is exactly the structure of a Gettier case. True output. Seemingly justified process. But the connection between the justification and the truth is fragile in ways the agent can’t detect.
Reliabilism: Goldman’s Gift to AI Epistemology
Alvin Goldman introduced reliabilism in 1979 with a simple idea: forget about whether the agent can articulate their justification. What matters is whether the process that produced the belief is reliable. If you have good eyesight and you see a barn, you know there’s a barn there, not because you can explain the optics of vision, but because vision is a reliable belief-forming process.
This is the epistemological framework most friendly to AI knowledge claims. An LLM that consistently produces true statements about chemistry could be said to “know” chemistry under reliabilism, because the process (statistical pattern matching over scientific text) is reliable in that domain.
But Goldman himself identified a killer problem with naive reliabilism, and it maps perfectly onto LLMs. He called it the “generality problem”: how do you define the relevant process type? Consider:
| Process description | Reliability | Verdict |
|---|---|---|
| "LLM generating text about chemistry" | ~92% (let's say) | Reliable enough? Maybe. |
| "LLM generating text about chemistry involving compounds first synthesized after 2023" | ~61% | Not reliable. |
| "LLM generating text" | ~78% | Depends on your threshold. |
The reliability of the process changes dramatically depending on how you describe the process. This isn’t a technicality. It’s the central challenge of applying reliabilism to AI systems, because the same model, using the same weights, using the same inference procedure, can be reliable in one domain and wildly unreliable in an adjacent domain. And the model itself has no way to tell which domain it’s in.
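The way the number shifts with the description can be made concrete with a toy evaluation. Everything below is invented for illustration: the records, the domains, and the resulting figures are hypothetical, not measurements of any real model.

```python
# Toy illustration of the generality problem: the same evaluation data
# yields different "reliability" figures depending on how broadly or
# narrowly you type the belief-forming process. All records are invented.
records = [
    # (domain, novel_after_2023, output_was_true)
    ("chemistry", False, True),
    ("chemistry", False, True),
    ("chemistry", False, True),
    ("chemistry", True,  False),
    ("chemistry", True,  True),
    ("history",   False, True),
    ("history",   False, False),
    ("history",   False, True),
]

def reliability(predicate):
    """Fraction of true outputs among records matching a process description."""
    matching = [r for r in records if predicate(r)]
    return sum(1 for _, _, ok in matching if ok) / len(matching)

# Three descriptions of the "same" process, three reliability figures:
broad  = reliability(lambda r: True)                          # "LLM generating text"
domain = reliability(lambda r: r[0] == "chemistry")           # "...about chemistry"
narrow = reliability(lambda r: r[0] == "chemistry" and r[1])  # "...post-2023 compounds"

print(f"broad: {broad:.2f}, domain: {domain:.2f}, narrow: {narrow:.2f}")
# -> broad: 0.75, domain: 0.80, narrow: 0.50
```

Same weights, same data, three verdicts: the "process" is whatever slice of queries you decide to count over.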
This is why a 2025 paper from the MDPI journal Philosophies (“Epistemology in the Age of Large Language Models”) argues that LLMs require a domain-indexed reliabilism: we can only assess reliability relative to specific, well-defined query types. The moment you step outside the characterized domain, all reliability guarantees evaporate.
The Frame Problem: What AI Can’t Not Know
In 1969, John McCarthy and Patrick Hayes identified the frame problem in AI. The original formulation was narrow and technical: how do you efficiently represent what doesn’t change when an action occurs? If I move a cup from the table to the shelf, I need to update the cup’s location. But I also need my system to understand that the table still exists, the shelf still exists, gravity still works, and the cup is still a cup. In a formal logical system, you’d need explicit “frame axioms” for every property that doesn’t change, and the number of those axioms explodes combinatorially.
The philosophical version of the frame problem, articulated by Jerry Fodor (1987) and Daniel Dennett (1984), is much deeper: how does a cognitive system determine what’s relevant to a given situation? Humans do this effortlessly. You walk into a room and you instantly know that the color of the walls is irrelevant to whether you should sit down, that the presence of a table is relevant to where you can put your coffee, and that the faint hum of the air conditioner is irrelevant to everything. You don’t reason through these relevance judgments. You just… know.
LLMs have an interesting relationship with the frame problem. In one sense, they’ve “solved” it better than any previous AI system. Because they were trained on human language (which implicitly encodes relevance judgments), they have surprisingly good intuitions about what matters in a given context. Ask Claude about cooking pasta and it won’t start talking about the gravitational constant, even though gravity is technically relevant to boiling water. The relevance filtering happens naturally through the statistical structure of the training data.
But in another sense, LLMs have the frame problem worse than ever. They can’t distinguish between what they know and what they don’t know. They can’t tell the difference between a context where their training data gives them reliable information and a context where it doesn’t. They have no frame for their own knowledge, no meta-awareness of the boundaries of their competence.
This connects to the "robot's dilemma" that Dennett dramatized: a robot that reasons about everything is paralyzed by computational cost, but a robot that ignores irrelevant information needs to already know what's irrelevant, which seems to presuppose the very capacity it's trying to build.
LLMs sidestep the robot’s dilemma through statistical shortcuts, but they pay for it with a new dilemma: they can’t tell when their shortcuts will fail.
Hallucination as an Epistemological Phenomenon
Let’s talk about hallucination. The AI industry treats it as a bug, a technical failure to be engineered away with better training data, retrieval augmentation, or Constitutional AI. But from an epistemological perspective, hallucination is something more fundamental. It’s a window into the relationship between information processing and knowledge.
When Claude confidently states that a nonexistent paper was published in a nonexistent journal, what’s happening epistemologically? The model has generated a token sequence that is:
- Syntactically well-formed
- Semantically coherent
- Consistent with the statistical patterns of academic citation
- Completely false
This is not a malfunction. This is the system working exactly as designed. The training objective (predict the next token) does not distinguish between true and false completions. It distinguishes between likely and unlikely completions. Truth and likelihood are correlated, sometimes strongly. But they’re not the same thing, and the gap between them is where hallucination lives.
A 2026 paper from Duke University’s library system (“It’s 2026. Why Are LLMs Still Hallucinating?”) makes the case that hallucination is architectural, not incidental. The transformer architecture compresses information into statistical associations between tokens. Some facts have strong statistical signals (the capital of France, the chemical formula for water). Some facts have weak signals (the publication date of a specific paper, the population of a small town in a specific year). The model treats both with equal confidence because confidence is not calibrated to evidence strength. It’s calibrated to token probability.
This is epistemologically revealing because it shows us something about what knowledge isn’t. Knowledge isn’t just producing true statements. It’s producing true statements through a process that’s sensitive to the truth. A broken clock is right twice a day. Nobody thinks a broken clock knows the time.
Harry Frankfurt’s concept of “bullshit” (from his 2005 book On Bullshit) is useful here. Frankfurt distinguished between lying and bullshitting. A liar knows the truth and deliberately says something false. A bullshitter doesn’t care about truth at all. The bullshitter’s statements might be true or false, but the truth value is incidental to the production process.
LLMs are, in this strict Frankfurtian sense, bullshit machines. Not because they’re trying to deceive, but because their production process is indifferent to truth. They’re optimized for plausibility, not veracity. When they happen to produce truth (which is often), it’s because plausibility and truth are correlated, not because the system is tracking truth as such.
A 2025 arXiv paper (“Epistemic Integrity in Large Language Models”) introduces the concept of “epistemic integrity” to describe what LLMs lack: the alignment between a model’s actual confidence and its expressed confidence. The paper shows that models routinely express high linguistic confidence (“The answer is clearly X”) even when their internal probability distributions are nearly flat. The model has no mechanism for translating genuine uncertainty into hedged language, because hedged language was associated with lower human preference scores during RLHF training.
This is a trained epistemological pathology. We literally trained these systems to be epistemically vicious: to express confidence they don’t have, because confident-sounding responses score better with human raters.
Calibration and the Quest for Epistemic Humility
Calibration is the technical term for what epistemologists call “epistemic humility”: knowing what you don’t know. A perfectly calibrated system would express 70% confidence on statements that are true 70% of the time, 90% confidence on statements that are true 90% of the time, and so on.
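That definition is directly measurable. The sketch below computes a simple binned calibration gap over hypothetical (stated confidence, was correct) pairs; the data is invented, and the binning is the crudest version of what the calibration literature does.

```python
# Sketch: binned calibration. A system is calibrated when, within each
# confidence bin, stated confidence matches observed accuracy.
# The (confidence, was_correct) pairs below are invented for illustration.
preds = [
    (0.9, True), (0.9, True), (0.9, False), (0.9, True),  # stated 90%
    (0.7, True), (0.7, False), (0.7, True),               # stated 70%
    (0.5, False), (0.5, True),                            # stated 50%
]

def calibration_report(preds):
    """Group by stated confidence; compare against observed accuracy per group."""
    report = {}
    for conf in sorted({c for c, _ in preds}):
        group = [ok for c, ok in preds if c == conf]
        accuracy = sum(group) / len(group)
        report[conf] = (accuracy, abs(conf - accuracy))  # (observed, gap)
    return report

report = calibration_report(preds)
for conf, (acc, gap) in report.items():
    print(f"stated {conf:.0%} -> observed {acc:.0%} (gap {gap:.2f})")
```

A perfectly calibrated system drives every gap to zero; a bullshit machine in Frankfurt's sense has no internal pressure in that direction at all.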
Human experts are famously poorly calibrated. Philip Tetlock’s research on expert political judgment (2005) showed that domain experts were barely better than random chance at predicting geopolitical events, while consistently expressing high confidence. Weathermen, by contrast, are among the best-calibrated experts: when they say 70% chance of rain, it rains about 70% of the time. The difference is feedback loops. Weathermen get immediate, unambiguous feedback. Political pundits don’t.
LLMs have a calibration problem that’s different from both of these cases. They don’t have a meaningful internal confidence metric that corresponds to truthfulness. The softmax probability of the next token tells you how likely that token is given the context, not how likely the resulting statement is to be true. These are fundamentally different quantities.
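To make the distinction concrete: the quantity a decoder actually has is a softmax over next-token logits, a conditional likelihood given the context. The logits below are invented numbers, not values from any real model.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution over next tokens."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Context: "The Battle of Hastings happened in ..."
# Invented logits: the model strongly favors "1066" because of co-occurrence
# statistics, not because it has weighed any historical evidence.
logits = {"1066": 9.2, "1067": 4.1, "1866": 2.0, "banana": -3.5}
probs = softmax(logits)

assert abs(sum(probs.values()) - 1.0) < 1e-9
best = max(probs, key=probs.get)  # "1066"
```

The distribution answers "which token is likely here?", never "is the resulting sentence true?" High token probability happens to track truth for well-attested facts, which is exactly why the two quantities are so easy to conflate.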
Recent work on LLM calibration has tried to extract calibrated confidence from models by asking them to self-report uncertainty. The results are mixed. Models can sometimes distinguish between things they “know well” and things they “know poorly,” but they do so unreliably, and they’re systematically overconfident. A 2025 arXiv study (“Epistemic Fragility in Large Language Models: Prompt Framing”) found that LLM epistemic judgments are fragile to prompt framing: the same model will express different confidence levels for the same factual claim depending on how the question is phrased. If the prompt signals expertise (“As a domain expert, what would you say about…”), the model becomes more confident. If the prompt signals uncertainty (“I’m not sure about this, but…”), the model becomes less confident. The model’s “confidence” is tracking the social context of the prompt, not its own reliability.
This is not calibration. This is performative epistemic posturing.
Compare this to how a well-calibrated human expert operates. A good doctor says “I’m confident this is a viral infection based on the symptoms, but let’s run a test to rule out bacterial causes.” The confidence is grounded in domain knowledge, tempered by awareness of edge cases, and accompanied by a plan to resolve remaining uncertainty. The doctor’s epistemic humility isn’t performative. It reflects a genuine internal model of what they know and don’t know.
Can we build this into AI systems? Maybe. But it would require a fundamental shift in training objectives. Instead of optimizing for “helpful, harmless, and honest” responses (where “honest” is evaluated by human raters who reward confident-sounding answers), we’d need to optimize for calibrated uncertainty expression. That’s a research direction, not a shipped product.
The DIKW Pyramid and Where LLMs Live
The Data-Information-Knowledge-Wisdom (DIKW) hierarchy, originally articulated by Russell Ackoff (1989), provides a useful lens for thinking about where LLMs sit in the epistemic landscape.
| Level | Definition | Example | LLM capability |
|---|---|---|---|
| Data | Raw symbols without context | “42” | N/A (LLMs don’t store raw data) |
| Information | Data with context and meaning | “The temperature is 42 degrees F” | Strong. LLMs are excellent at organizing information. |
| Knowledge | Information integrated with experience and judgment | “Bring a coat, it’s cold” | Simulated. LLMs produce knowledge-like outputs but lack experiential grounding. |
| Wisdom | Knowledge applied with ethical judgment and long-term perspective | “We should invest in insulation rather than buying more coats” | Absent. LLMs have no values, no long-term stakes, no skin in the game. |
LLMs are, I think, best understood as information-to-knowledge translators. They take information (patterns in training data) and produce outputs that have the form of knowledge (contextual, integrated, actionable statements) without the substance of knowledge (experiential grounding, causal understanding, epistemic self-awareness).
This is why the term “knowledge” in “knowledge base” or “knowledge graph” has always been misleading. What we call “knowledge representation” in AI has always been information representation. The difference between information and knowledge isn’t format. It’s the relationship between the representation and the agent holding it.
Michael Polanyi drew this distinction clearly in 1966 with his concept of “tacit knowledge”: the things we know but can’t articulate. You know how to ride a bicycle, but you can’t fully explain the physics of balance, steering correction, and momentum management that you’re unconsciously computing. Tacit knowledge is embedded in practice, in the body, in lived experience. It’s not propositional. It can’t be written down completely. And it can’t be extracted from text corpora, because it was never in the text to begin with.
LLMs have zero tacit knowledge. Everything they “know” is explicit, propositional, and derived from text. This is a fundamental epistemological limitation that no amount of scaling will fix, because tacit knowledge isn’t a data problem. It’s an embodiment problem.
What LLMs Taught Us About Internalism vs. Externalism
There’s an old debate in epistemology between internalists and externalists. Internalists say that justification must be accessible to the agent. You know something only if you can, in principle, articulate why you believe it. Externalists (like Goldman) say justification can be external to the agent. You know something if the process that produced your belief is reliable, even if you can’t explain that process.
LLMs make this debate viscerally concrete.
From an internalist perspective, LLMs know nothing. They can’t introspect on their own reasoning processes. They can’t distinguish between reliable and unreliable outputs. They have no internal access to their justification, because their “justification” is a set of hundreds of billions of weights that no one, including the model itself, can interpret.
From an externalist perspective, LLMs know quite a lot, at least in well-attested domains. The process (pattern matching over a massive, curated text corpus) is reliable for many categories of factual claim. The fact that the model can’t explain why it’s reliable doesn’t matter under externalism, just as the fact that you can’t explain the neuroscience of perception doesn’t undermine your perceptual knowledge.
What’s philosophically interesting is that LLMs push both positions to their breaking points. Pure internalism seems too restrictive: LLMs produce genuinely useful, often true information, and denying them any epistemic status feels like a philosophical tantrum. Pure externalism seems too permissive: accepting that LLMs “know” things requires accepting that the broken clock “knows” the time twice a day, unless you add some anti-luck condition, which brings you right back to Gettier.
The emerging consensus (if there is one) seems to be a kind of process-indexed externalism: LLMs can be ascribed knowledge in domains where we have empirical evidence that their process is reliable, but this ascription is always conditional, domain-specific, and defeasible. It’s knowledge on probation, not knowledge with a warrant.
The Testimony Problem
There’s another epistemological angle that matters here: the epistemology of testimony. Most of what humans know, we know through testimony. I’ve never been to Australia, but I know it exists because reliable sources told me so. I’ve never performed a physics experiment, but I know about quantum mechanics because I trust the physicists who did.
C.A.J. Coady (1992) argued that testimony is a fundamental source of knowledge, not reducible to perception or inference. We don’t verify every claim someone tells us. We rely on a complex web of social trust, institutional credibility, and track record.
When you ask Claude a question, what you’re getting is something that looks like testimony. But is it? Traditional testimony requires a testifier who has knowledge and intends to communicate it. The testifier believes what they’re saying, has reasons for believing it, and takes responsibility for its truth. LLMs satisfy none of these conditions. They don’t believe their outputs. They don’t have reasons. They take no responsibility.
And yet, practically, people treat LLM outputs exactly like testimony. They trust it, act on it, and cite it. This creates what we might call the “epistemic gap”: the difference between how LLM outputs function socially (as testimony) and what they are epistemically (statistical pattern completion).
The worry is that this gap erodes epistemic standards. If people get used to treating unreliable sources as testimony, they may become less discerning about testimony in general. This isn’t a speculative concern. A 2025 paper in the journal Episteme (“AI and Epistemic Agency”) documents how regular LLM use changes people’s belief revision processes, making them more likely to accept claims uncritically when those claims are fluently presented.
Epistemic Dependence and the Outsourcing of Knowledge
There’s a deeper structural issue here that goes beyond individual claims. As AI systems become more capable, humans are outsourcing not just information retrieval but judgment. When a doctor uses an AI diagnostic system, they’re not just getting information. They’re delegating an epistemic task: the task of weighing evidence, considering differentials, and arriving at a conclusion.
John Hardwig (1985) wrote about “epistemic dependence”: the condition of relying on others for knowledge you can’t independently verify. His point was that epistemic dependence is unavoidable in a complex society. No one can verify all the claims they rely on. The question is whether the dependence is well-placed.
Dependence on human experts is (at least in principle) well-placed because human experts have:
- Accountability: they can be questioned, challenged, and held responsible
- Transparency: they can explain their reasoning
- Skin in the game: their reputation depends on being right
- Awareness of limits: they know what they don’t know
Dependence on LLMs lacks all four. LLMs can’t be held accountable (who do you sue when Claude gives bad medical advice?). They can’t genuinely explain their reasoning (they can confabulate explanations, but that’s not the same thing). They have no skin in the game. And they’re not aware of their limits.
This suggests that the epistemological challenge of AI isn’t really about whether machines can “know” things. It’s about what happens to human knowledge when humans delegate epistemic tasks to systems that don’t satisfy the conditions that make delegation epistemologically safe.
What Building AI Has Taught Us About Epistemology
Here’s what I find most interesting about this whole topic. The question “can AI know things?” is less important than what trying to answer it has revealed about knowledge itself.
Before LLMs, epistemologists could wave their hands about justification because human cognition is opaque enough that we can’t fully specify what counts as a reliable process. Now we have systems where we can (in principle) inspect every weight, trace every computation, and analyze every statistical association. And it turns out that having complete access to the “justification process” doesn’t resolve the question of whether knowledge is present. If anything, it makes the question harder.
Here are the lessons I take from the AI epistemology literature:
1. Justification is not just reliability.
Goldman was right that reliability matters, but LLMs show that reliability alone is insufficient. A process can be reliable in aggregate while being unreliable in undetectable individual cases. Justification needs to include some notion of sensitivity: the process should be such that, if the truth were different, the output would be different too. LLMs fail this test comprehensively. They would produce the same confident output regardless of whether the underlying fact were true or false, as long as the statistical signal was the same.
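The sensitivity condition gestured at here has a standard formal statement, usually credited to Nozick (1981). Writing $\Box\!\rightarrow$ for the counterfactual conditional and $B_S(p)$ for "S believes that p":

```latex
% Sensitivity: were p false, S would not believe p.
\neg p \;\Box\!\rightarrow\; \neg B_S(p)
```

An LLM fails this condition precisely when the statistical signal, not the fact, fixes the output: flip the fact while holding the training distribution fixed, and the same confident token sequence comes out.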
2. Knowledge requires meta-knowledge.
You don’t really know something unless you know (at least roughly) the conditions under which your belief would be false. A doctor who diagnoses strep throat knows that if the rapid test comes back negative, they should reconsider. An LLM that “diagnoses” strep throat has no corresponding meta-knowledge. It doesn’t know what evidence would falsify its output, because it doesn’t know what evidence supports it.
3. The difference between information and knowledge is relational, not formal.
You can’t tell whether a statement constitutes knowledge just by looking at the statement. You have to look at the relationship between the statement and the agent making it. The same sentence (“water boils at 100 degrees Celsius at sea level”) can be information (when spoken by a system that memorized it), knowledge (when spoken by a chemist who understands phase transitions), or bullshit (when spoken by someone who heard it once and is guessing).
4. Embodiment might matter more than we thought.
The rationalist tradition from Descartes through classical AI assumed that knowledge is fundamentally propositional and disembodied: you can, in principle, know everything worth knowing through pure reasoning about propositions. LLMs are the most sophisticated test of this assumption ever built. They process propositions at superhuman scale and speed, and they still can’t be said to genuinely “know” things. Maybe knowledge requires a body, sensory experience, and causal interaction with the world. Maybe the empiricists were right.
5. Epistemic institutions matter more than epistemic agents.
The real story of human knowledge isn’t about individual knowers. It’s about institutions: peer review, replication, education, credentialing, professional accountability. These institutions create the conditions under which testimony is reliable, expertise is trustworthy, and errors get caught. The challenge of AI epistemology is less about the AI itself and more about the institutional structures (or lack thereof) surrounding its use.
The Honest Conclusion
I don’t think LLMs know things. Not in any philosophically robust sense. They produce useful information, often true, frequently actionable, and genuinely valuable. But knowledge requires more than true output from a sometimes-reliable process. It requires sensitivity to truth conditions, awareness of one’s own epistemic limits, and some kind of connection between the agent and the world that goes beyond statistical correlation.
That said, the question of whether machines can know things is ultimately less interesting than the question of what machines have taught us about knowing. They’ve taught us that justification is richer than reliability. That meta-cognition is central to knowledge. That embodiment might be non-negotiable. That the difference between information and knowledge isn’t the information, it’s the knower.
Plato’s theory held up for 2,400 years, then Gettier broke it in three pages. AI might be the next three-page paper: a concrete demonstration that our best theories of knowledge are still missing something fundamental. We built the most sophisticated information-processing systems in history, pointed them at the entirety of human text, and got back something that looks like knowledge from a distance and dissolves into statistical correlation under scrutiny.
That tells us something important. Not about machines, but about us.