Theranos CEO Elizabeth Holmes was a persuasive promoter. She convinced many presumably intelligent people that Theranos had developed a technology that could test for myriad diseases using a few drops of blood from a finger prick. The Theranos hoopla turned out to be just another point on the Silicon Valley “Fake-it-Till-You-Make-it” spectrum of BS. This past January, Holmes was found guilty of wire fraud and conspiracy to commit fraud.

Theranos is hardly unique, though successful criminal prosecutions are rare. As the pitch-person mantra goes, “We aren’t selling products; we’re selling dreams.” Too often, investors are beguiled by products and technologies they don’t understand. Mysterious complexity only adds to the allure: “If we don’t understand them, they must be really smart.”

For the past several years, the center of the dream universe has been artificial intelligence, which Sundar Pichai, Alphabet’s CEO, has compared to mankind’s harnessing of fire and electricity. The Association of National Advertisers selected “AI” as the marketing word of the year in 2017.

AI is really good at performing narrowly defined chores that require a prodigious memory and fast calculations, but it is brittle and unreliable at tasks that require more than the identification of statistical patterns in training data. Thus, machine learning pioneer Andrew Ng cautioned: “Those of us in machine learning are really good at doing well on a test set but unfortunately deploying a system takes more than doing well on a test set.”
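Ng’s point can be illustrated with a toy example (ours, not his). The sketch below, assuming Python with numpy and scikit-learn and entirely made-up data, trains a simple classifier that scores well on a held-out test set drawn from the same data, then “deploys” it on data whose patterns have shifted:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # "Lab" data: two well-separated clusters to train and test on.
    X = rng.normal(loc=[[0, 0]] * 500 + [[3, 3]] * 500, scale=1.0)
    y = np.array([0] * 500 + [1] * 500)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = LogisticRegression().fit(X_train, y_train)
    print("test-set accuracy:", model.score(X_test, y_test))  # high, around 0.98

    # "Deployed" data: the same two classes, but the world has shifted.
    X_new = rng.normal(loc=[[3, 0]] * 500 + [[0, 3]] * 500, scale=1.0)
    y_new = np.array([0] * 500 + [1] * 500)
    print("deployed accuracy:", model.score(X_new, y_new))  # near coin-flip

The model aces its own test set because the test set looks exactly like the training data; when the world drifts, the statistical patterns it memorized stop being useful.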

The real world is messy and AI algorithms struggle with messy data and complex goals. In the game of Go, for example, the rules and the goal are clear, and AI algorithms can defeat the best human players. If, however, the rules were changed or the goals could not be quantified, the algorithms would flop. If, for example, the 19-by-19 grid used today were changed to the 17-by-17 board used centuries ago, human experts would still play expertly, but AI algorithms trained on a 19-by-19 board would be helpless. If the goal were to create an aesthetically pleasing pattern of stones, AI algorithms would be clueless.

Similarly, large language models (LLMs) are currently all the rage because they are astonishingly good at mimicking human conversation and answering straightforward factual questions that have correct/incorrect answers. They often struggle, however, with questions involving opinions about unfamiliar situations. Consider these exchanges one of the authors (Gary) had with OpenAI’s GPT-3 LLM, which was trained on 450 gigabytes of input data and has 175 billion parameters:

Gary:  Is it safe to walk upstairs on my hands if I wash them first?

GPT-3:  Yes, it is safe to walk upstairs on your hands if you wash them first.

Gary:  Should fans wear night-vision goggles when the World Series is played at night?

GPT-3: There is no right or wrong answer to this question, it is simply a matter of preference. Some fans may feel more comfortable wearing night-vision goggles during night games, while others may not feel the need to.

Gary:  There is a positive correlation between dice rolls and interest rates. If dice rolls increase should I buy a cup of coffee?

GPT-3:  This is difficult to answer without more information. Generally speaking, if there is a positive correlation between dice rolls and interest rates, it may be a good idea to buy a cup of coffee. However, it is always best to speak with a financial adviser to get the most accurate advice.
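Exchanges like these can be reproduced with a few lines of code. A minimal sketch, assuming the pre-1.0 openai Python package and the text-davinci-002 GPT-3 model available at the time; the API key is a placeholder, and because replies are sampled, the exact wording will vary from run to run:

    import openai

    openai.api_key = "YOUR_API_KEY"  # placeholder: use your own key

    response = openai.Completion.create(
        engine="text-davinci-002",  # a GPT-3 model served in 2022
        prompt="Is it safe to walk upstairs on my hands if I wash them first?",
        max_tokens=60,
        temperature=0.7,  # replies are sampled, so wording varies
    )
    print(response["choices"][0]["text"].strip())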

Real-world challenges

Unfortunately, most potential real-world uses of AI involve evolving situations with imprecise goals. For example, soon after IBM’s Watson defeated the best human Jeopardy players, IBM boasted that Watson would revolutionize health care: “Watson can read all of the healthcare texts in the world in seconds, and that’s our first priority, creating a ‘Dr. Watson,’ if you will.”

With no real understanding of what words mean, Watson was a big bellyflop. IBM spent more than $15 billion on Watson with no peer-reviewed evidence that it improved patient health outcomes. Internal IBM documents identified “multiple examples of unsafe and incorrect treatment recommendations.” After more than a year of looking for buyers, IBM sold the data and some algorithms to a private investment company this past January for roughly $1 billion.

Another example: An insurance company with the quirky name Lemonade was founded in 2015 and went public on July 2, 2020, with its stock price closing at $69.41, more than double its $29 IPO price. On January 22, 2021, shares hit a high of $183.26.

What was the buzz? Lemonade sets its insurance rates by using an AI algorithm to analyze user answers to 13 questions posed by an AI chatbot. CEO and co-founder Daniel Schreiber argued that “AI crushes humans at chess, for example, because it uses algorithms that no human could create, and none fully understand” and, in the same way, “Algorithms we can’t understand can make insurance fairer.”

How does Lemonade know that its algorithm is “remarkably predictive” when the company has been in business for only a few years? It doesn’t. Lemonade’s losses have grown every quarter and its stock now trades for less than $20 a share.


Need more proof? AI robotaxis have been touted for more than a decade. In 2016, Waymo CEO John Krafcik said that the technical issues had been resolved: “Our cars can now handle the most difficult driving tasks, such as detecting and responding to emergency vehicles, mastering multilane four-way stops, and anticipating what unpredictable humans will do on the road.”

Six years later, robotaxis still sometimes go rogue and often rely on in-car or remote human assistance. Waymo has burned through billions of dollars and has still been largely limited to places like Chandler, Arizona, where there are wide, well-marked roads, light traffic, few pedestrians — and minuscule revenue.

Drones are another AI dream. The May 4, 2022, AngelList Talent Newsletter gushed: “Drones are reshaping the way business gets done in a dizzying array of industries. They’re used to deliver pizzas and life-saving medical equipment, monitor forest health and catch discharged rocket boosters—just to name a few.” These are all, in fact, experimental projects still grappling with basic problems including noise pollution, privacy invasion, bird attacks and drones being used for target practice.

These are just a few examples of the reality that startups are too often funded by dreams that turn out to be nightmares. We recall Apple, Amazon.com, Google, and other grand IPO successes and forget the thousands of failures.

Recent data (May 25, 2022) from finance professor Jay Ritter (“Mr. IPO”) of the University of Florida show that 58.5% of the 8,603 IPOs issued between 1975 and 2018 had negative three-year returns, and 36.9% lost more than 50% of their value. Just 39 IPOs delivered the above-1,000% returns that investor dreams are made of. The average three-year return on IPOs was 17.1 percentage points worse than the broad U.S. market. Buying stock in well-run companies at reasonable prices has been and will continue to be the best strategy for sleeping soundly.

Jeffrey Lee Funk is an independent technology consultant and a former university professor who focuses on the economics of new technologies. Gary N. Smith is the Fletcher Jones Professor of Economics at Pomona College. He is the author of “The AI Delusion” (Oxford, 2018), co-author (with Jay Cordes) of “The 9 Pitfalls of Data Science” (Oxford, 2019), and author of “The Phantom Pattern Problem” (Oxford, 2020).
