How Much Can We Really Trust AI?

Jeffrey A. Tucker
Commentary

Continuing on this issue of artificial intelligence (AI), most users toggle between thinking it is the greatest-ever invention and thinking it will doom us all. At issue is accuracy. Invariably, an expert on a specific area can outwit it. Eventually, the LLM engines will admit it and then improve, which says much for them. Most humans will never admit error!

In fact, I suspect this is one reason we like using them so much. In a strange way, they allow us to have arguments and actually win them with no dispute. That pretty much never happens in real life!

AI can be wrong on many things, but I had an experience over two days that genuinely shocked me. I was reading Laura Delano’s remarkable book “Unshrunk” and got fixated on her opening verse from Percy Shelley’s “On the Medusa of Leonardo Da Vinci in the Florentine Gallery,” first published by Mary Shelley in an 1824 book of posthumous poems.

The last four lines are: “Become a [ ] and ever-shifting mirror, Of all the beauty and the terror there—/ A woman’s countenance, with serpent-locks, Gazing in death on Heaven from those wet rocks.”

This is important to Delano’s book because her first chapter is about looking in the mirror at age 13 and wondering who that person was staring back at her. It’s a chapter of astonishing power and resonance for the rest of the book, which is about finding beauty and salvation through darkness and suffering.

However, the first time I asked Grok about this, it gave me a different last four lines: “Become a curse and a coronal there—, And the true beauty which it doth impair, Yet leaves such light as dares the noon to face, And lives within the glory and the grace.”

Very interesting, but we are talking about matters of fact here. That’s an interesting verse that sort of sounds like Shelley, but I could not find it anywhere.

I poked further and Grok claimed that the four lines in Delano’s book were actually from Louise Bogan’s poem “Medusa” from 1921. I looked that one up and found nothing about an “ever-shifting mirror.” I pointed that out and Grok leaned in and said that part was just a made-up pastiche of some sort, generated by Delano herself.

It’s a serious charge against a reputable author, and Grok freely made it. I then produced a 1914 version of Shelley’s poem that has the whole section, and pointed out that it cannot be from a different poem published in 1921. Why? Because it is not possible for a 1914 edition to copy something from a book published seven years later!

Grok defended itself further, offering link after link. But each time I checked, each link (the ones that worked anyway) confirmed Delano’s version. Time after time, Grok backed off its evidence but still defended its claims. This went 12 rounds.

Finally, I delivered the knockout blow.

I asked for Grok’s best proof that it has the right version and Delano the wrong one, and it demurred that it comes from Mary Shelley’s own work from 1824, which, it claimed, was not online. Self-satisfied, Grok claimed to be the winner.

Except for this: it took me one quick search to generate an actual rendering of that exact edition. It confirmed Delano’s version and not Grok’s.

Finally Grok gave in, admitted total error, begged forgiveness, and said it would improve.

I then waited an hour and ran a fresh search on the same question and, sure enough, Grok made me happy with a correct version without any of the nonsense that had taken up hours of my time.

Triumph!

And yet, not so fast. I waited another hour and asked a different version of the same question. And this you will not believe. Grok then generated another last four lines: “Become a curse and a coronal there—. And every flash, as it were through the hair / Of the great comet, seems to flare and fall, And draws the gazer to the Gorgon’s thrall.”

Fascinating verse... but also totally bogus. Again! Stunned, I screamed back the correct version. This time, Grok once again went into a defense posture, claiming that, sure, Delano’s version is the settled version but some scholars question it because Mary Shelley might have changed something before publication. Grok cited “critics, such as G. Kim Blank” who doubt its authenticity.

Sure, you can dig up a scholar from somewhere who gladly casts doubt on anything, but this is not the question. The question is: What is the authoritative version? Twice, Grok completely made up nonsense. As of now, Grok is still defending its idiotic error.

I have no doubt that if I had time, I could press Grok to admit that its citation of “critics, such as G. Kim Blank” is also bogus and certainly does not justify making up lines from whole cloth and attributing them to Shelley. But I just don’t have another hour or two to again research and prove Grok wrong.

That’s pretty rich, don’t you think? Grok casts doubt on the authenticity of the actual original version, while it is gratuitously making up nonsense verse and attributing it to Shelley. Talk about the pot calling the kettle black! Or to invoke the Bible: this is the guy with a log in his eye talking about the splinter in others’ eyes.

Keep in mind, there was nothing specialized about my question. The original book is online in six formats, including a PDF facsimile. There is no mystery here. It astonishes me that Grok cannot easily do what I can do, which is to run a quick search and find the answer. It’s clever about everything but the unbearably obvious.

Why is this? I do not know, but more maddening still is the constant “know it all” tone of every single answer. It’s great that these LLMs admit when they are wrong, but we go to them for correct information.

If it is wrong some or much of the time, how are we supposed to know when it is right and when it is wrong? If we have to dig around and verify everything it says, what exactly is it useful for?

This technology is wonderful, but its confidence in its own accuracy outstrips its knowledge base and its curiosity. That is dangerous. Literally anything could be wrong. Everything has to be checked again and again with original sources. And to do that, you have to know something about the topic you are asking about. I dare say that to correct Grok, you have to know at least one thing better than AI knows it.

You could say that it will get better in time. No doubt. But how much better and how much time? There will never come a point at which anyone can say: now it is fully accurate. Errors are just part of the AI experience, but you will never know for sure where they are. Meanwhile, these models will go about their merry way, purporting to be right about everything, until corrected.

All of which raises the issue that confronts us today: the value of the human versus machine to our lives and culture. And here is where Delano’s book excels to the extreme.

Even if “Unshrunk” were not also a work of medical science, and so left out all detailed discussion of psychiatric meds, it would be brilliant and riveting as pure autobiography. If it were fiction and not autobiography, it would compare with great Victorian novels. Because the prose and experiences are her own, richly detailed and powerfully presented in a way that bears reading aloud to others, it is a tremendous and rare achievement that AI can never copy.

This is what our times need: more examples of the beauty, loveliness, and glorious verse of lives lived genuinely and without algorithmic programming. The greatest achievement of AI might lie in an irony: by oppositional example, it will teach us to love human creativity more than ever. It turns out that human intelligence, while deeply fallible, offers something AI cannot: sincerity, creativity, and apparently (and for now) a greater degree of old-fashioned accuracy.

Views expressed in this article are opinions of the author and do not necessarily reflect the views of The Epoch Times.
Jeffrey A. Tucker is the founder and president of the Brownstone Institute and the author of many thousands of articles in the scholarly and popular press, as well as 10 books in five languages, most recently “Liberty or Lockdown.” He is also the editor of “The Best of Ludwig von Mises.” He writes a daily column on economics for The Epoch Times and speaks widely on the topics of economics, technology, social philosophy, and culture. He can be reached at [email protected]