
from the long, but worthwhile article department

I have studied and watched artificial intelligence grow over the last forty years. Like many, back in 1968, I was inspired by HAL 9000 in Stanley Kubrick's 2001: A Space Odyssey. The question on my mind at the time was, "Is HAL 9000 possible?" Thus began my interest in learning all I could about computers and something called Artificial Intelligence.

By 1970, my excitement grew when I read a quote by AI pioneer Marvin Minsky (in Life magazine), "In three to eight years, we will have a machine with the general intelligence of an average human being." The future was so bright I was beyond inspired.

Death in the AI Winters

Background

Before I continue, I will state that I am very much pro-AI. There are many areas where AI approaches to problems work exceptionally well (some examples toward the end). My concern and cautious approach to AI stem from the overselling and hype fed to the general public and investment community about AI's current capabilities. There has been real progress with AI technology, and as the past has shown, over-promising any technology has negative consequences for the future. Finally, this article uses minimal technical jargon and makes few assumptions about the reader. Supporting documentation for much of what is discussed is provided as links to more technical resources.

Fifty-five years after Minsky (and others) made bold statements about the "general intelligence of an average human," we are still far from HAL 9000. Historically, there was some success; however, two "AI winters" developed from approximately 1974–1980 and 1987–2000. The winters were largely due to "AI approaches and ideas" not working as expected. For instance, from 1983 to 1993, the US DOD spent $1 billion on the Strategic Computing Initiative, which focused on AI. The initiative was a reaction, in part, to Japan's 5th Generation Computing Project, which basically scared the bejeebers out of the traditional software industry because part of the project was to automate software development using AI.

Despite the AI winters, work continued on topics such as expert systems, genetic algorithms, and artificial neural networks. Many of these AI methods required substantial computation that was not readily available at the time. In 1980, a new AI hardware company, Symbolics, produced a specialized computer system to run programs written in the Lisp AI programming language. Originally specified in the late 1950s, Lisp is one of the oldest high-level languages still in use today and was designed to manipulate symbols rather than numbers. The oldest high-level programming language is Fortran, which was developed to perform mathematical operations (the name Fortran is short for Formula Translation).

In 1983, a company called Thinking Machines began with the goal of providing large-scale parallel "connection machines" for running Lisp code. Thinking Machines pivoted to more High Performance Computing (HPC) oriented applications by introducing a Fortran compiler for their third-generation machine.


Thinking Machines Connection Machine CM-2 (Source: Wikipedia).

An interesting side note: on 15 March 1985, symbolics.com became the first registered .com domain name. BBN Technologies registered the second, and soon after, Thinking Machines registered the third (think.com).

Neither Symbolics nor Thinking Machines survived the AI winters, and many other AI companies folded along with them; Japan's 5th Generation Computing Project also came to an end.

AI Survivors

Research into AI continued through the AI winters. In 1986, David Rumelhart, Ronald J. Williams, and Geoffrey Hinton co-authored a paper that popularized the backpropagation algorithm for training multi-layer neural networks (networks that loosely mimic how the human brain works). This approach led to many AI successes, including breakthroughs in image recognition (e.g., recognizing things like handwritten text).

Skipping ahead, in 2017, Google researchers introduced an AI model called the transformer architecture in their landmark paper, "Attention Is All You Need." Transformer models can determine how each part of a text sequence influences and correlates with other parts of that sequence. This approach led the way to what are called Generative Pre-trained Transformers, or GPT for short. These tools allowed relationships between words in large amounts of text to be created. The idea is actually quite clever. Instead of trying to define a word formally, define it by how it is used in existing text. For instance, rather than defining a house cat as a domestic, fur-covered animal with four paws, claws, and a tail (which makes for entertaining web videos), a house cat is defined by how it relates to other words in a large amount of text. Thus, the sentence "the house cat dug its claws into the sofa" shows a relationship between a house cat, its claws, and the sofa. Those relationships get reinforced by other sentences like "the house cat scratched the carpet with its claws." Now consider a text corpus as large as the Internet, and the definition (relationships) of a "house cat" gets further refined by how it is used.
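To make the "defined by its company" idea concrete, here is a minimal sketch (in Python, using a made-up three-sentence corpus) that simply counts which words appear alongside "cat." Real transformers learn far richer, weighted relationships across enormous corpora; this toy count only illustrates the intuition.

```python
from collections import Counter

# Toy corpus echoing the "house cat" example above.
sentences = [
    "the house cat dug its claws into the sofa",
    "the house cat scratched the carpet with its claws",
    "the dog chased the house cat across the carpet",
]

# Count which words appear in the same sentence as "cat".
# A real transformer learns weighted relationships, not raw counts,
# but the intuition is the same: a word is characterized by its company.
neighbors = Counter()
for sentence in sentences:
    words = sentence.split()
    if "cat" in words:
        neighbors.update(w for w in words if w != "cat")

print(neighbors.most_common(5))
# [('the', 7), ('house', 3), ('its', 2), ('claws', 2), ('carpet', 2)]
```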

The Secret Sauce: LLMs

Transformers are used to create Large Language Models or LLMs. These are the computer programs you "converse with" on the web when you ask questions about cats or anything else. The most popular of these LLMs include OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude (there are many others).


Nvidia GPU PCIe add-in card (Image courtesy of NVIDIA).

These models are created by scanning the Internet for text data (and images). The scraped data are transformed into a numeric format that can be evaluated using something called linear algebra. It turns out devices called GPUs (Graphics Processing Units, like those from Nvidia and AMD) are particularly good at accelerating traditional computers for this kind of math. The amount of data used in this process is huge, so large that no single person or group could recall even a fraction of it. The training process uses linear algebra and massive numbers of GPUs to analyze the data for relationships, then produces "weights" that represent these relationships. After training is complete, when you ask these models about things like cats, the weights help steer the answer based on the computed probabilities. For instance, if you asked "what types of things do house cats scratch with their claws," the model weights would help direct the answer to things like sofas and carpets.
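As a rough picture of why GPUs matter here, training largely boils down to enormous numbers of matrix multiplications on numeric versions of the text. The sketch below uses tiny, made-up matrices; real models multiply matrices with billions of entries, which is exactly the math GPUs accelerate.

```python
import numpy as np

# Toy sizes and random values purely for illustration.
rng = np.random.default_rng(0)
word_vectors = rng.standard_normal((4, 8))   # 4 "words", each encoded as 8 numbers
weights = rng.standard_normal((8, 8))        # learned "relationship" weights

# One linear-algebra step: score how strongly each word relates to the others.
scores = word_vectors @ weights @ word_vectors.T
print(scores.shape)  # (4, 4) -- one relationship score per word pair
```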

The process of using the weights to determine a good answer (not necessarily the best answer) is called inference and can "correctly" predict the next word (sometimes called a token) in an answer to a question (or make videos). This feat is actually quite remarkable and offers significant benefits when applied to specific areas.
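A rough way to picture inference, using invented words and probabilities (nothing here reflects any real model's numbers): the trained weights end up assigning probabilities to possible next words, and the model picks from them.

```python
# "the house cat dug its claws into the ..."
# Hypothetical probabilities the weights might assign to the next word.
next_word_probs = {
    "sofa": 0.41,
    "carpet": 0.35,
    "curtains": 0.17,
    "refrigerator": 0.07,
}

# Pick the most probable continuation -- a probabilistic guess, not a verified fact.
prediction = max(next_word_probs, key=next_word_probs.get)
print(prediction)  # sofa
```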

Not Quite Right

Where modern LLMs go off the rails is when they "hallucinate," an industry euphemism for "get it wrong." The hallucinations are due to the statistical nature of the relationships between data. The models try to predict the next word based on the probabilities obtained from scanning the Internet and computing relationships. Mix in a little randomness (referred to as model temperature, illustrated in the short sketch below), and these predictions can yield answers ranging from a precise, thorough response to nonsense that sounds correct. A common example is the hallucination of citations in the legal or academic fields. These results often look correct, including subjects, authors' names, and journals, but are a total fabrication. They are the answers the LLM considers "probabilistically correct" because the training weights direct the response flow toward what the LLM has learned a citation should "look like." The LLM has no sense of whether the citation is correct or utter nonsense.

Thinking a little deeper about intelligence, consider that by the age of 4-5, children have become quite good at navigating their world. They have a grasp of many general concepts (spatial movement, language, relationships, etc.) without having to read the entire Internet. Through experience, children develop a usable worldview by playing and pretending. They continue to learn and refine their worldview into adulthood -- some even read articles like A Non-Technical AI Primer. Current AIs seem to lack a worldview or even a sense of how they exist in the world. This lack of "awareness" is why LLMs struggle to play chess or solve childhood puzzles.
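The "model temperature" knob mentioned above can be pictured with a small sketch. The vocabulary and scores are invented; the point is only that a low temperature makes the model repeat its most probable choice, while a high temperature makes the output wander.

```python
import math
import random

# Invented relationship scores for possible next words.
scores = {"sofa": 2.0, "carpet": 1.8, "curtains": 1.0, "refrigerator": 0.2}

def sample_next_word(scores, temperature):
    # Convert scores into probabilities (a softmax), then draw one word.
    exps = {word: math.exp(s / temperature) for word, s in scores.items()}
    total = sum(exps.values())
    words = list(exps)
    probs = [exps[word] / total for word in words]
    return random.choices(words, weights=probs)[0]

print([sample_next_word(scores, 0.2) for _ in range(5)])  # almost always "sofa"
print([sample_next_word(scores, 2.0) for _ in range(5)])  # far more varied answers
```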


By Alan Light - CC BY-SA 3.0, Wikimedia

It is important to remember that LLMs are computer programs that try to predict the next probable word based on all the "language" they have collected from the Internet (i.e., the "Language" in LLM). LLMs do not think as you do. They are particularly good at assembling responses based on relationships of which we may not be aware. This behavior is exciting because it may bring to light new ideas or relationships, but LLMs have trouble understanding whether the answer "makes sense." In addition, ask an AI the same question more than once, and you may get different answers.

Another important distinction is the scope of many AI systems. Most AI "success" in the news comes from Narrow AI applications that are limited to a single, narrowly defined task. Outside of their training domain, they do not work. A good example is Google DeepMind's AlphaFold, which predicts protein structures that would otherwise require substantial classical computation. There is no way to ask AlphaFold to recommend music or videos, another popular use of narrow AI. The quest for a more general AI drives much of current AI research and investment.

Many AI companies justify their massive investments in broad or general AI because they believe they can create an AGI (Artificial General Intelligence) with their LLMs. Note that there is no consensus as to what constitutes an AGI. According to many LLM researchers, all that was needed was to scale the data (and, accordingly, the number of GPUs) to reach their goal. This superintelligence would give them a huge advantage in pretty much everything. I consider this their Marvin Minsky moment because a 2025 AAAI (Association for the Advancement of Artificial Intelligence) report indicates that 76% of AI researchers believe that simply scaling up current LLM models is "unlikely" or "very unlikely" to achieve AGI. Note that many researchers believe AGI may be possible but do not think creating bigger LLMs will get us there. There is also the question of "is it safe to even go there," which is a topic for another time.

The big AI companies believe they can replace a portion of the workforce with LLM-based AI technology. AI LLMs have shown success in many areas; however, at this point, this goal seems a bit speculative and premature. While the latest AI models are available to researchers and developers, they are also under rapid, ongoing development. New research papers and approaches are posted daily. As has been reported, replacing a customer service representative with a state-of-the-art AI chatbot can have issues. More importantly, according to some scholars, we are still very much not there yet and may not get there on our current path. In addition, a recent academic study found that LLM-based AI agents performed poorly on standard Customer Relationship Management (CRM) benchmark tests and failed to grasp the need for customer confidentiality.

Gambling in the AI Casino

One way to view the current AI market is like a casino full of slot machines. Success receives a lot of attention, bells, and flashing lights, while losses are rather quiet, except for the mumbling of losing players. There are many cases where LLMs give nonsense answers to what most humans perceive as simple questions. One of my favorites was asking a popular LLM, "Is water frozen at 27 Fahrenheit?"

The response:

Water will not freeze at 27 degrees Fahrenheit because its freezing point is 32 degrees Fahrenheit; therefore, at 27 degrees, the water will still be liquid and would need to reach 32 degrees to begin freezing, with the time it takes to reach that point depending on the volume of water and the surrounding temperature.


Slot machines by Yamaguchi先生, CC BY-SA 3.0

Some logic and knowledge are missing from the response. The model has since been fixed to give the correct reply; however, there may be other "misconceptions" that result in "hallucinations" on other topics. Indeed, LLMs suffer from an "outlier problem." LLMs have difficulty generalizing knowledge and find it difficult to transfer relationships from known scenarios to new ones. A good example of this is basic multiplication. LLMs learn relationships between numbers when there are many simple multiplication problems in the training data. However, as the numbers get larger (more digits), wrong answers become more common because the LLM has not learned multiplication; it has memorized the relationships it found on the web for small numbers, which are far more frequent.
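A back-of-the-envelope sketch of why memorization breaks down (this illustrates the combinatorics only; it is not a measurement of any particular model): the number of distinct multiplication problems explodes as the numbers get longer, so most large problems simply never appear in any training text.

```python
# Count the distinct k-digit by k-digit multiplication problems for each k.
for digits in range(1, 7):
    low, high = 10 ** (digits - 1), 10 ** digits
    pairs = (high - low) ** 2
    print(f"{digits}-digit x {digits}-digit problems: {pairs:,}")
```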

By the way, if you gave the above "water is not frozen" answer in science class, you would be told you were wrong, not that you hallucinated the answer. When LLMs give correct responses, users gain confidence (a win at the AI slot machine). Correct responses can give the impression that because AIs are competent in several areas, they are competent in all areas (just because you win at the slot machine most of the time does not mean you win all of the time). The problem is that LLMs remain confident even when they produce incorrect answers, such as when they provide a justification for why water at 27 degrees Fahrenheit is not frozen.

In many cases, LLMs perform remarkably well within their training domain, which can be quite large. LLMs seem to do quite well at summarizing content -- a task that condenses text while retaining its overall meaning. When asked to expand on content, LLMs are given more opportunities for hallucinations. For instance, consider the prompt "write a long report on how water can freeze at less than 32 degrees Fahrenheit."

On one end of the spectrum, LLMs can be considered super-search engines that use an Internet full of data to answer a question (or perform a task) faster and more accurately than a team of experts; on the other end, a used car salesman who will never say "I don't know." The challenge is to know with "whom" you are talking.

Weird Vibes When Coding

LLMs are actually good at creating some computer programs or text that looks like computer programs. A new term, "vibe coding," is used to characterize the process of programming with an LLM. The laudable goal is to tell the LLM what you want the program to do, then have it generate the program and, ideally, a test suite to check it for errors.
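To make the workflow concrete, here is a hypothetical and deliberately simple example of the kind of function-plus-test an LLM might return for a one-sentence request. The prompt, function name, and test are invented for illustration; real generated code and tests vary widely in quality.

```python
# Hypothetical "vibe coding" exchange. Prompt given to an LLM (paraphrased):
#   "Write a function that converts Fahrenheit to Celsius, plus a test."

def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature in degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0

def test_fahrenheit_to_celsius():
    assert fahrenheit_to_celsius(32.0) == 0.0     # freezing point
    assert fahrenheit_to_celsius(212.0) == 100.0  # boiling point

test_fahrenheit_to_celsius()
print("tests passed")
```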

This process seems to work in simple cases but struggles with security, code complexity, debugging, and hallucinated functions. Indeed, a recent study of 16 experienced developers from large open-source repositories found that using LLMs actually slowed their expected completion times. When using AI tools, the developers took 19% longer to complete issues. Curiously, developers' perception was that using AI reduced completion time by 24%. Like many of the cases previously presented, there is a sense that AI coding tools are getting there, but there are still some issues to address.

AI Success in Science and Technology

As mentioned, I believe in the progress and pursuit of AI technology. Small-scale (narrow AI) business applications are showing success. The use of AI in science and technology, however, has delivered dramatic performance gains and increased capabilities to both High Performance Computing (HPC) and the scientific process.

Supercomputing systems are designed to run Model Simulation (ModSim) programs. These types of programs are used for weather forecasting, crash simulations, galaxy collisions, and other "rocket science" types of applications. In terms of computing, these applications, based on underlying physics and chemistry, perform extensive "number crunching" that can take days or weeks to reach an answer.

The advent of large-scale AI modeling has altered this tried-and-true HPC computation formula. Large AI models can be trained on ModSim results and actual observed data to produce "data models" that reproduce the results of traditional mathematical models in far less time, without the solution having to rely on underlying physics and chemistry. For example, Microsoft has developed an Atmosphere Model (weather) called Aurora that is 5,000 times faster than the traditional ModSim approach. The accuracy (as compared to other ModSim results and actual weather) is equal to or better than traditional numerical models.


Hurricane from Space courtesy NASA

There is no hardware or software presently available that can accelerate a traditional ModSim approach by even 100 times, let alone by a factor of 5,000. AI-based data models are showing similar results in other areas, like chemistry simulations.
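As a toy illustration of the data-model idea (nothing here resembles Aurora's actual architecture; the "simulation" is a made-up formula), a cheap model can be fitted to expensive simulation output and then queried almost instantly.

```python
import numpy as np

# Stand-in for days of physics-based number crunching.
def expensive_simulation(x):
    return np.sin(x) + 0.1 * x ** 2

x_train = np.linspace(0.0, 5.0, 50)
y_train = expensive_simulation(x_train)

# Fit a simple polynomial "data model" (surrogate) to the simulation results.
# Real surrogates use large neural networks trained on ModSim output and observations.
surrogate = np.polynomial.Polynomial.fit(x_train, y_train, deg=6)

# The surrogate now answers new questions almost instantly.
x_new = 2.345
print(expensive_simulation(x_new), surrogate(x_new))  # values should be very close
```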

Other remarkable progress has been made in creating an "AI-based Scientist" that can read research papers, plan experiments, and interpret results. Founded in 2023, FutureHouse is building AI-based scientists that work alongside human scientists to conduct science more efficiently and faster than was previously possible.

FutureHouse suggests on its webpage that its AI-based scientist can accomplish in a single day what human scientists do in six months: "A single researcher with our AI Scientist can accomplish what an entire team would accomplish previously."

Like many of these examples, AI in science and technology is creating a "virtuous cycle of discovery" that is accelerating progress. This trend and other advantages of AI in the sciences have led to the formation of the Trillion Parameter Consortium (TPC), which will hold its second worldwide meeting, TPC26, the first week of June 2026 in Washington, D.C.

AI Doom (and Robots)

The media likes to talk about a possible "AI takeover" in which a wayward (or the first) AGI system acquires the ability to override human decision-making through economic manipulation, infrastructure control, or direct intervention. Aside from the fact that this is a good explanation for the state of the world today, my prediction is "not any time soon." Recall that the "I'm going to scale up my computer until I create AGI" notion (explained above) is widely doubted by researchers -- although there are those who are still trying. Instead, I predict there will be damage caused by AI, just as there is by misconfigured or faulty computer systems in place today. Misuse of AI will break things, and the same goes for spreadsheets. Beyond the AGI takeover, AI is still a powerful technology, and legislation will be needed to limit nefarious uses.

One topic that does scare people is the combination of humanoid robots and AI. Besides being a popular science fiction trope, these killer robots are not here yet. Progress on humanoid robots has been quite rapid over the last five years. These robots can move in a human-like fashion, but all the fancy demos, including the dancing and karate moves you may see on the Internet, are not autonomous in-the-moment responses to the surrounding environment. Much of the complex movement has been pre-programmed to demonstrate the robot's capability. We are not at the point where these robots are cooking breakfast, washing dishes, or escaping to get a job at Westworld. Humanoid robots will continue to get better and learn to do more specific things (again, in a narrow AI sense), but when the dreaded uprising comes, just remember these two words, "battery life," and you will do just fine.

Navigating AI

Here are a few ways I currently navigate the growing AI ecosystem (some of which is sage advice I received before we all plugged into the Internet). As I mentioned at the beginning of this article, I am pro-AI and believe it is important to set realistic expectations for the current level of AI technology.
  1. AI technology is not going away. It will continue to influence and hopefully improve mankind. I believe in the AI future -- even if HAL will not open the pod bay door.
  2. Be a critical reader. Reports indicate that half of the content you read online is now AI-generated (referred to as "AI slop"). There are some ways to tell, but most people get a feel for overgeneralized "flowery" content that may seem repetitive and has no grammatical errors. Be a skeptical reader, which is good advice in any case. If you are a domain expert (e.g., lawyer, accountant, engineer, scientist), issues with LLM-generated text within your domain will become apparent. If you are unfamiliar with the answer content, consider multiple, non-LLM web searches (ignore the "AI Overview" results). LLMs are, after all, based on these internet resources.
    Remember, LLMs are providing a probabilistic answer to your question; those probabilities are based on the LLM training. They do not think logically about the answer, which has led researchers to consider "neurosymbolic" approaches (i.e., combining symbolic or logic processing methods with LLMs). Recall, AI started with symbolic computing using the Lisp programming language.
  3. Continue writing your own content. If you are a writer, augmenting your writing with LLMs will be a constant temptation. If you are not a writer and are tasked with writing content (e.g., report summaries), the LLMs will be more of a solution. The phrase "trust, but verify" is good advice. Finally, verified (with provenance) authentic content is going to become very valuable.
  4. Videos are becoming suspect. LLMs that create video have become so good that it is now almost impossible to tell the difference between original video and AI-generated video and images. In the very near future, all new online videos should be considered suspect. Again, be skeptical of videos that seem to have unlikely scenarios, and count the fingers. Like text content, verified (with provenance) and authentic video will become valuable.
  5. Know your friends. While it has always been possible to create a fake presence on the web (e.g., "catfishing"), AI bots can easily simulate human interaction. Again, verification is important. My rule is that unless I have physically met someone (or know someone who has), I keep the relationship at arm's length until some level of trust develops. As always, developing real relationships (i.e., investing in social capital) is the most valuable.
  6. AI will make us all stupid. This notion is not unfounded, but this argument has persisted with every new technology. Back in the day, handheld calculators were going to make you "not understand math" (well, to be precise, arithmetic). My experience is that those who were never going to understand mathematics are not going to suddenly understand exponents, square roots, and long division by taking away calculators. Indeed, calculators reduce arithmetic errors.
    The same case can be made for databases, spreadsheets, word processors, spell checkers, GPS maps, etc. Each of these technologies offers convenience and increased productivity, as will AI (at first, Narrow AI) when it demonstrates utility in areas where it works effectively.
    At issue is over-optimization or over-dependence on any technology, which is a consideration made at the organizational or personal level. There will be a killer AI application at some point.
  7. AI will replace my job. This concern is perhaps the biggest among most people. As mentioned above, the jury is still out on how much AI will replace workers. Many jobs are considered "AI-proof." A recent study by MIT and Oak Ridge National Lab suggests AI will replace up to 12% of the workforce; however, on balance, other research predicts AI will create a new equilibrium between automation and human workers. Keep in mind that all new technology both eliminates and creates jobs.
There are many more topics I could address, and what started as a few paragraphs explaining AI to a friend has turned into a 3,900-word essay (written by a human). AI is here to stay and will hopefully settle in areas that augment rather than manipulate society.

My invitation is to be skeptical, ask questions, mind the assumptions, and remember that HAL 9000 famously stated, "No 9000 computer has ever made a mistake or distorted information. We are all, by any practical definition of the words, foolproof and incapable of error." Until it wasn't.


HAL 9000 misdiagnosed the AE-35 unit in 2001: A Space Odyssey.