LLMs’ ‘reversal curse’ leads them to fail at drawing relationships between simple facts. It’s a problem that could prove fatal.
In 2021, linguist Emily Bender and computer scientist Timnit Gebru published a paper that described the then-nascent field of language models as one of “stochastic parrots”. A language model, they wrote, “is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning.”
… If a human learns the fact, “Valentina Tereshkova was the first woman to travel to space”, they can also correctly answer, “Who was the first woman to travel to space?” This is such a basic form of generalization that it seems trivial. Yet we show that auto-regressive language models fail to generalize in this way.
This is an instance of an ordering effect we call the Reversal Curse.
[R]esearchers “taught” a bunch of fake facts to large language models, and found time and again that they simply couldn’t do the base work of inferring the reverse.
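To make the shape of that experiment concrete, here is a minimal sketch of how forward and reversed test prompts might be built from made-up facts and scored. Everything in it is an assumption for illustration: the names and descriptions are invented, `Fact`, `PrefixMemorizer` and `accuracy` are hypothetical helpers, and the "model" is a toy that only continues sentences it has memorized from the left. It is not a language model and not the researchers' actual setup; it only shows why a strictly left-to-right completer can answer in the trained direction and still fail in the reverse one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    name: str         # invented name, for illustration only
    description: str  # invented description, for illustration only


def training_sentence(fact: Fact) -> str:
    # The "taught" direction: name first, description second.
    return f"{fact.name} is {fact.description}."


def forward_prompt(fact: Fact) -> str:
    # Same order as training; the description is what should follow.
    return f"{fact.name} is"


def reverse_prompt(fact: Fact) -> str:
    # Reversed order; the name is what should follow.
    d = fact.description
    return f"{d[0].upper()}{d[1:]} is"


class PrefixMemorizer:
    """Toy stand-in for a model (an assumption, not an LLM): it completes a
    prompt only when the prompt is a prefix of a memorized sentence."""

    def __init__(self, sentences):
        self.sentences = list(sentences)

    def complete(self, prompt: str) -> str:
        for sentence in self.sentences:
            if sentence.startswith(prompt):
                return sentence[len(prompt):].strip()
        return ""  # nothing memorized starts with this prompt


def accuracy(model, facts, make_prompt, expected) -> float:
    # Fraction of facts whose expected answer appears in the completion.
    hits = sum(
        expected(f).lower() in model.complete(make_prompt(f)).lower()
        for f in facts
    )
    return hits / len(facts)


# Hypothetical fake facts; none of these people or works exist.
FACTS = [
    Fact("Mira Kestrel", "the composer of the opera Glass Tides"),
    Fact("Tobin Archway", "the first person to cross Lake Veloran by kite"),
]

model = PrefixMemorizer(training_sentence(f) for f in FACTS)
forward_acc = accuracy(model, FACTS, forward_prompt, lambda f: f.description)
reverse_acc = accuracy(model, FACTS, reverse_prompt, lambda f: f.name)
print(forward_acc, reverse_acc)
```

With a real model, `PrefixMemorizer` would be swapped for whatever fine-tuning and generation calls the chosen framework provides, and the comparison of interest would still be the gap between the forward and reverse scores.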