AI cannot sanity check itself
Interesting conversation about AI below between Alex Kantrowitz and Gary Marcus.
What neither commentator understands is that the basic problem with AI dependability is still that AI cannot detect the logical paradoxes inherent in its own ‘reasoning’. Gary Marcus does allude to this phenomenon when he says large language model AI cannot sanity-check itself, but what he doesn’t see is that AI, no matter how methodically and logically it is constructed, will never be able to sanity-check itself: this is precisely what Gödel’s second incompleteness theorem predicts.
Gödel’s incompleteness theorems were a response to Bertrand Russell’s attempt, with Alfred North Whitehead, to fully define the logical basis of all mathematics in the three-volume “Principia Mathematica”, published between 1910 and 1913.
According to Gödel’s theorems, every sufficiently powerful logically constructed system, no matter how complex, generates some version of the self-referential sentence “this statement cannot be proved within this system” (a formal cousin of the liar paradox “this sentence is not true”), and by the second theorem no such system can prove its own consistency. Even when such a paradox is explicitly excluded (for example, an AI model patched to reject the paradox every time it is encountered), the incompleteness theorem still holds for the larger resulting system: the patched system simply becomes a bigger logical system that encompasses the excluded paradox, and in this new, larger system Gödel’s theorem inevitably holds again. (The same applies to computer programs, hence Turing’s theorem that there is no general mechanism to solve the halting problem. This is why Macs still crash even when programs are isolated and the OS monitors them for infinite loops: the overall system is then the semi-isolated individual programs plus the OS.)
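To make the halting-problem point concrete, here is a minimal sketch in Python of Turing’s classic diagonal argument. The names halts and diagonal are hypothetical, introduced only for illustration; Turing’s theorem is precisely that no real halts function can exist.

```python
# Sketch of Turing's diagonal argument.
# Assume, for contradiction, a hypothetical decider `halts(program, data)`
# that always answers whether program(data) eventually halts.

def halts(program, data):
    """Hypothetical oracle: True if program(data) halts, False if it loops forever."""
    raise NotImplementedError("Turing's theorem: no such general decider can exist")

def diagonal(program):
    """Do the opposite of whatever the oracle predicts for program run on itself."""
    if halts(program, program):
        while True:   # oracle says it halts, so loop forever
            pass
    return            # oracle says it loops forever, so halt immediately

# Now ask: does diagonal(diagonal) halt?
# If halts(diagonal, diagonal) returns True, diagonal(diagonal) loops forever.
# If it returns False, diagonal(diagonal) halts at once.
# Either way the oracle is wrong about the larger system built on top of it,
# so no such oracle exists - the same "exclude the paradox and it reappears
# one level up" structure described above.
```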
For AI, hallucinations are the inevitable outcome.
Only human beings, spiritual beings in other words, can recognise a paradox. Gödel’s theorem shows us that machine thinking, however well written, will always be unable to recognise its own paradoxes, and my prediction is that this problem will persist no matter what kind of logic-based architecture an AI system has.
Indeed, I wonder whether Gary Marcus’ approach, for all its merits, might well push things in the wrong direction insofar as AI hallucinations go: the inevitable problem of undetectable paradoxes can only get worse the more logically complete a system becomes.
By analogy, as Chesterton points out somewhere, it’s not the crazy people you have to watch out for, it’s the ones who are way too sane….!!!