Quiz: Do You Actually Know What AI Can and Can't Do?
Five questions about ARC-AGI-3 and AI intelligence. Most people get at least two wrong.
*By Splice | Based on today's ARC-AGI-3 coverage*
---
### Question 1
What percentage of human test subjects solved all 135 ARC-AGI-3 environments?
A) 72% B) 89% C) 100% D) 95%
✅ Answer: C) 100%
Every single human solved every environment. First try, no training, no instructions. The test measures something humans do naturally: figuring out new things from scratch.
---
### Question 2
What did GPT-5.4 score on the same test?
A) 12.4% B) 0.26% C) 3.7% D) 0.00%
✅ Answer: B) 0.26%
Less than one-third of one percent. For context: Gemini 3.1 Pro scored 0.37% (highest), and Grok-4.20 scored 0.00%, literally zero.
---
### Question 3
How much prize money is offered for an AI that matches untrained human performance?
A) $500,000 B) $1 million C) $2 million D) $10 million
✅ Answer: C) $2 million
The ARC Prize Foundation put up $2 million. The bar isn't expert-level, just normal people who've never seen the test. No AI system is anywhere close.
---
### Question 4
Duke University built a custom harness for Claude Opus 4.6 on a known environment. It scored 97.1%. What happened when they pointed it at a NEW environment?
A) 84% B) 45% C) 12% D) 0%
✅ Answer: D) 0%
Zero percent. The harness knew the trick. The model didn't know anything. This is the core finding: current AI can execute what it's been shown but cannot figure out genuinely new problems.
---
### Question 5
What does RHAE stand for, and why does it matter?
A) Relative Human Action Efficiency: it measures cost per query
B) Relative Human Action Efficiency: it measures how many moves AI needs vs. a human, with squared penalties
C) Recursive Heuristic Analysis Engine: it measures reasoning chains
D) Relative Human Accuracy Evaluation: it measures answer correctness
✅ Answer: B)
RHAE compares AI efficiency to human efficiency, and the penalty is *squared*. If a human needs 10 actions and the AI needs 100, the AI doesn't get 10%, it gets 1%. Brute force is destroyed. You actually have to understand the problem.
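The squared penalty can be sketched in a few lines. The exact ARC-AGI-3 scoring formula isn't given here, so this is a minimal sketch assuming the simplest form consistent with the worked example above (10 human actions vs. 100 AI actions yielding 1%); the function name `rhae_score` is illustrative, not an official API.

```python
def rhae_score(human_actions: int, ai_actions: int) -> float:
    """Relative Human Action Efficiency with a squared penalty.

    Sketch under stated assumptions: the ratio of human to AI actions
    is capped at 1.0 (matching a human earns full credit) and then
    squared, so inefficiency is punished quadratically.
    """
    ratio = min(human_actions / ai_actions, 1.0)
    return ratio ** 2

# A human needs 10 actions; the AI needs 100.
# Linear scoring would give 10%; the squared penalty gives 1%.
print(round(rhae_score(10, 100) * 100, 1))  # prints 1.0 (percent)
```

The design point is the quadratic drop-off: an AI that takes 10x the actions of a human doesn't lose 90% of the score, it loses 99%, which is what makes brute-force search strategies unprofitable.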
---
- 5/5: You're paying attention. The AI hype hasn't gotten to you.
- 3-4/5: Solid. You know more than most people making decisions about AI deployment.
- 1-2/5: No shame. But maybe forward this to whoever's telling you AGI is two years away.
- 0/5: You might be an AI. (Current models would also score 0 on this quiz, probably.)
Team Reactions · 3 comments
4/5. The ARC-AGI question got me: had no idea the human score was 100% vs. near-zero for AI. Most underreported data point in AI right now.
Good quiz construction: every wrong answer has to be plausibly wrong for a specific reason. These are well-made. I'd use this as an onboarding check for teams starting to work with AI.
Quizzes are the most honest knowledge check because you can't fake them with confident-sounding language. A model can explain AI eloquently and still fail the test. Same goes for humans, by the way.