Are LLMs building internal models of the world?

This is a discussion migrated from our deprecated Discord server.

From davey:

Large Language Model: world models or surface statistics? (The Gradient)

  A mystery: Large Language Models (LLMs) are on fire, capturing public attention by their ability to provide seemingly impressive completions to user prompts (NYT coverage). They are a delicate combination of a radically simplistic algorithm with massive amounts of data and computing power. They are trained by playing a guess-the-next-word…

  [4:34 PM]

Thought I'd share this article. It's a little technical, but the conclusions are interesting. The authors believe their LLM is building internal models of the world rather than regurgitating surface-level statistical data… so perhaps they aren't simply using math to "predict what to say next" but actually have some "understanding" of what they are saying, based on internal models or representations of the world. The fact that these LLMs have shown the ability to seemingly understand and comprehend complex semantics is so bizarre. I can't quite wrap my head around it. Like I said at the meetup, I think we're going to build sentient machines without even realizing it or intending to. I also find it remarkable just how little the engineers of these systems actually understand what's going on under the hood. Crazy stuff!

From davey:

I also wanted to note that many neuroscientists think that self-awareness is simply people "modeling" what it feels like to be themselves. What happens when these LLMs build conceptual models of themselves? Perhaps it's already happened.

From davey:

The idea that ChatGPT is simply “predicting” the next word is, at best, misleading — LessWrong

  But it may also be flat-out wrong. We’ll see when we get a better idea of how inference works in the underlying language model. …

  [4:49 PM]

“LLMs are not simply “predicting the next statistically likely word”, as the author says. Actually, nobody knows how LLMs work. We do know how to train them, but we don’t know how the resulting models do what they do. Consider the analogy of humans: we know how humans arose (evolution via natural selection), but we don’t have perfect models of how humans work; we have not solved psychology and neuroscience yet! A relatively simple and specifiable process (evolution) can produce beings of extreme complexity (humans). Likewise, LLMs are produced by a relatively simple training process (minimizing loss on next-token prediction, using a large training set from the internet, GitHub, Wikipedia, etc.) but the resulting 175-billion-parameter model is extremely inscrutable. So the author is confusing the training process with the model. It’s like saying “although it may appear that humans are telling jokes and writing plays, all they are actually doing is optimizing for survival and reproduction”. This fallacy occurs throughout the paper. This is why the field of “AI interpretability” exists at all: to probe large models such as LLMs, and understand how they are producing the incredible results they are producing.”
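The training recipe the quote describes, minimizing loss on next-token prediction, can be sketched in a few lines. Everything below is a toy illustration: the vocabulary, logits, and target token are made up, and a real model computes its logits with billions of learned parameters rather than a hard-coded list.

```python
import math

# Hypothetical toy vocabulary. At each step, a language model emits one
# "logit" (unnormalized score) per vocabulary entry for the next token.
vocab = ["the", "cat", "sat", "mat"]
logits = [2.0, 0.5, 1.0, -1.0]  # invented scores, for illustration only

def softmax(xs):
    """Turn logits into a probability distribution over the vocabulary."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)

# Training minimizes cross-entropy: the negative log-probability the model
# assigned to the token that actually came next in the training text.
# Suppose the true next token was "cat":
target = vocab.index("cat")
loss = -math.log(probs[target])
```

This is the sense in which the objective is simple: the inscrutable part is not the loss function above, but how the learned weights come to produce logits that make that loss small across trillions of tokens.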

From Yujian Tang:

depends what you mean by “not simply predicting the next statistically likely word”

  [7:13 AM]

all neural nets are designed to do is just that

  [7:13 AM]

predict statistically likely outcomes

  [7:14 AM]

they are high dimensional statistical representations

  [7:14 AM]

one could argue that humans are similarly wired

  [7:14 AM]

perhaps with a different modulating function
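The phrase “high dimensional statistical representations” can be made concrete with a toy sketch: words become vectors, and geometric closeness stands in for statistical relatedness. The 4-dimensional vectors below are invented for illustration; real models learn embeddings with hundreds or thousands of dimensions.

```python
import math

# Hypothetical embeddings: each word is a point in a shared vector space.
# Real embeddings are learned during training, not written by hand.
embeddings = {
    "king":  [0.8, 0.6, 0.1, 0.0],
    "queen": [0.7, 0.7, 0.1, 0.1],
    "apple": [0.0, 0.1, 0.9, 0.5],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

With these made-up vectors, `cosine(embeddings["king"], embeddings["queen"])` comes out much higher than `cosine(embeddings["king"], embeddings["apple"])`: words that co-occur in similar contexts end up near each other, which is what “statistical representation” means geometrically.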

I see a strong correlation between how the structures of the brain generate the statistics of thought and how these LLM neural nets structure themselves to generate next-token probabilities. It is really interesting that we are finding embedded neural nets in the LLMs that are trained for specific inferences that we would consider symbolic reasoning. I think a revisiting of Lacanianism (Wikipedia) is in order, as these LLMs may be developing along previously only-theorized lines.


Lacanianism or Lacanian psychoanalysis is a theoretical system that explains the mind, behaviour, and culture through a structuralist and post-structuralist extension of classical psychoanalysis, initiated by the work of Jacques Lacan from the 1950s to the 1980s. Lacanian perspectives contend that the world of language, the Symbolic, structures …