This paper convinced me LLMs are not just “applied statistics”, but learn world models and structure

Link: https://thegradient.pub/othello/

You can take an LLM trained on Othello moves and extract, from its internal state, the current state of the board after each move you give it. In other words, an LLM trained only on move sequences like “E3, D3, ...” contains within it a model of an 8×8 board grid and the current state of each square.
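To make "the current state of the board" concrete, here is a minimal sketch of the mapping from a move list to a board, which is the thing being read out of the model's internal state. It is plain Othello rules, not code from the article. The starting layout (D4 and E5 white, E4 and D5 black, black moves first), the `X`/`O` symbols, and all function names are my assumptions, and the article's own notation may orient the board differently. It also skips pass handling, so it only works for sequences where the players strictly alternate.

```python
EMPTY, BLACK, WHITE = ".", "X", "O"

# The eight directions in which discs can be captured.
DIRECTIONS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def new_board():
    """Return an 8x8 board with the four starting discs (assumed layout)."""
    board = [[EMPTY] * 8 for _ in range(8)]
    board[3][3] = board[4][4] = WHITE  # D4, E5
    board[3][4] = board[4][3] = BLACK  # E4, D5
    return board


def parse_move(move):
    """Turn a move like 'F5' into (row, col), with A1 at (0, 0)."""
    col = ord(move[0].upper()) - ord("A")
    row = int(move[1]) - 1
    return row, col


def apply_move(board, move, player):
    """Place a disc for `player` and flip every captured opponent disc."""
    row, col = parse_move(move)
    if board[row][col] != EMPTY:
        raise ValueError(f"{move} is already occupied")
    opponent = WHITE if player == BLACK else BLACK
    flipped = []
    for dr, dc in DIRECTIONS:
        line = []
        r, c = row + dr, col + dc
        # Walk over a run of opponent discs in this direction.
        while 0 <= r < 8 and 0 <= c < 8 and board[r][c] == opponent:
            line.append((r, c))
            r, c = r + dr, c + dc
        # The run is captured only if it ends on one of the player's own discs.
        if line and 0 <= r < 8 and 0 <= c < 8 and board[r][c] == player:
            flipped.extend(line)
    if not flipped:
        raise ValueError(f"{move} is not a legal move for {player}")
    board[row][col] = player
    for r, c in flipped:
        board[r][c] = player


def replay(moves):
    """Replay a move list from the start, alternating players (no passes)."""
    board = new_board()
    player = BLACK
    for move in moves:
        apply_move(board, move, player)
        player = WHITE if player == BLACK else BLACK
    return board


if __name__ == "__main__":
    for row in replay(["F5", "D6"]):
        print("".join(row))
```

The model never sees anything like `board`. It only sees the move tokens, so finding that a per-square state like this can be recovered from its activations is what makes the result interesting.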

#nlp