Challenges in Defining True General Intelligence for AI

I’ve been thinking about the concept of artificial general intelligence and, more broadly, what we even mean by general intelligence. There are many proposed tests meant to benchmark AI against this supposed gold standard, but I think there’s a fundamental problem with how these tests are framed.

If we treat modern humans as the baseline for general intelligence, then any valid test of general intelligence should be passable by the vast majority of humans. If humans are generally intelligent, yet most humans fail a given test, that suggests the test itself is flawed rather than that the test-takers lack intelligence.

Here’s where it gets interesting. Humans today are not meaningfully different, biologically speaking, from humans 10,000 years ago. The human brain from that time is effectively identical to the modern human brain. If you took a newborn from 10,000 years ago and raised them in today’s world, there’s no reason to believe they couldn’t grow up, attend school, graduate from college, and work in any modern profession. Biologically, they would be indistinguishable from anyone else.

If that’s true, then it follows logically that humans 10,000 years ago were also generally intelligent. And if they were generally intelligent, then any legitimate test of general intelligence should be something that the majority of humans from 10,000 years ago could have passed as well.

This creates a serious challenge. How do you design a test that is passable by modern humans, yet also by humans from deep prehistory: people with the same cognitive capacity but radically different knowledge, culture, and environment? Most proposed AGI benchmarks don't seem to meet this standard. Instead, they look like increasingly sophisticated versions of IQ tests or academic aptitude exams.

That raises the question: are these tests actually measuring general intelligence, or are they measuring how closely an entity resembles what we consider a “smart modern human”? Those are not the same thing. General intelligence should be substrate- and culture-independent, but many current tests are tightly coupled to modern education, language, and abstract symbolic reasoning.

You could even extend this argument beyond humans. Some non-human animals, octopuses for example, demonstrate remarkable problem-solving and adaptability. If they are, in some meaningful sense, generally intelligent, then a true test of general intelligence should at least plausibly accommodate them as well.

Designing such a test is extremely difficult. But that difficulty itself suggests that many existing AGI benchmarks are misnamed. They may be measuring proficiency in modern human cognitive tasks, not general intelligence in the deeper, more fundamental sense.
