Generative AI

Advice in a rapidly changing world

Generative AI tools rely on mathematics. They create content based on probability and pattern recognition, offering the most likely response given the data they have been trained on. Generation is an iterative process: the model makes a series of small changes (usually additions), one after another. Thus a chatbot builds out its response one likely word at a time.
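
To make that word-at-a-time process concrete, here is a minimal sketch in Python. The "model" here is just a hand-written table of made-up probabilities; a real system learns billions of such relationships from its training data.

    import random

    # A toy "model": for each word, the probabilities of the word that
    # follows it. These values are invented purely for illustration.
    next_word_probs = {
        "the": {"cat": 0.5, "dog": 0.5},
        "cat": {"sat": 0.7, "ran": 0.3},
        "dog": {"sat": 0.4, "ran": 0.6},
        "sat": {"down": 1.0},
        "ran": {"away": 1.0},
    }

    def generate(start, max_words=5):
        words = [start]
        while words[-1] in next_word_probs and len(words) < max_words:
            options = next_word_probs[words[-1]]
            # Pick the next word according to its probability.
            choice = random.choices(list(options), weights=list(options.values()))[0]
            words.append(choice)
        return " ".join(words)

    print(generate("the"))  # e.g. "the cat sat down"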

The underlying models (which define the mathematical relationships) have been pre-trained on sample data and tuned (sometimes by people) until they produce the desired responses.

Some specialist AI models may undergo further training (e.g. processing large amounts of information about English case law, particular diseases, or a programming language) to provide more accurate subject-specific responses.

Generative AI tools may give the impression of understanding (especially if the interface is designed to portray them as helpful assistants), but in reality they always create their responses through mathematical analysis.

Types of Model

All of these models are built on neural networks.

Large Language Models

You may have come across the term Large Language Model (or the abbreviation LLM) in discussions about tools like Claude, Gemini and ChatGPT. Large Language Models are AI models that have been trained on enormous amounts of text (hence the name large). They use this text to map the relationships between words mathematically. This makes them very good at predicting the next word, and that prediction, repeated many times, can create large volumes of text.
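
One way to picture "mapping relationships mathematically": models represent each word as a list of numbers (a vector), and words used in similar contexts end up with similar numbers. The three-dimensional vectors below are hand-written purely for illustration; real models learn vectors with hundreds or thousands of dimensions.

    import math

    # Invented word vectors, for illustration only.
    vectors = {
        "king":  [0.9, 0.8, 0.1],
        "queen": [0.9, 0.7, 0.2],
        "apple": [0.1, 0.2, 0.9],
    }

    def similarity(a, b):
        # Cosine similarity: close to 1.0 when vectors point the same way.
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    print(similarity(vectors["king"], vectors["queen"]))  # high (~0.99)
    print(similarity(vectors["king"], vectors["apple"]))  # low (~0.30)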

Inside a Large Language Model is a Transformer, another term you may have heard. When people talk about generative AI, tools like ChatGPT are usually what come to mind.

The “T” in ChatGPT stands for Transformer. This is one type of generative AI model, a very common one, but there are others.

Find out more about these models:

Transformers

Great for dealing with large volumes of text.
Common examples are ChatGPT, Claude, Gemini and LLaMA.
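
The central calculation in a Transformer is "attention", which scores how relevant every word in the input is to every other word and mixes their information accordingly. A simplified sketch (real Transformers add learned projections, multiple attention "heads" and many stacked layers):

    import numpy as np

    def attention(Q, K, V):
        # Score how relevant each position is to every other position,
        # turn the scores into weights (softmax), then mix the values.
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    # Three "words", each represented by a 4-number vector. Random here;
    # a trained model would use learned, meaningful values.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(3, 4))
    print(attention(x, x, x))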

Diffusion Models

Designed for high-quality image (and, increasingly, video) generation, trained on pairs of images and their text descriptions. Common examples are Imagen, Midjourney, Stable Diffusion and, for video, Sora.
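
Diffusion models are trained by progressively adding noise to training images and learning to reverse that corruption; generation then starts from pure noise and "denoises" towards an image matching the prompt. A sketch of the forward (noising) half only, using a made-up 2x2 "image":

    import numpy as np

    rng = np.random.default_rng(0)
    image = np.array([[0.9, 0.1],
                      [0.2, 0.8]])  # a fake 2x2 greyscale image

    noised = image
    for step in range(5):
        # Each step adds a little random noise; after enough steps the
        # original image is effectively destroyed.
        noised = noised + rng.normal(scale=0.3, size=image.shape)

    print(noised)
    # Generation runs the learned process in reverse: noise -> image.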

Variational Autoencoders

Good for comparing images and looking for differences, or for generating new images by sampling from a learned probability distribution. VAEs are usually implemented using machine learning frameworks such as TensorFlow and PyTorch.
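
A minimal VAE sketch in PyTorch (the single-layer encoder and decoder and all sizes are arbitrary simplifications; real VAEs use deeper networks and train with a loss that balances reconstruction quality against the shape of the latent distribution):

    import torch
    from torch import nn

    class VAE(nn.Module):
        def __init__(self, input_dim=784, latent_dim=8):
            super().__init__()
            # Encoder outputs the mean and log-variance of a distribution.
            self.encoder = nn.Linear(input_dim, 2 * latent_dim)
            self.decoder = nn.Linear(latent_dim, input_dim)

        def forward(self, x):
            mean, log_var = self.encoder(x).chunk(2, dim=-1)
            # Sample from the learned distribution ("reparameterisation"),
            # keeping the operation trainable by gradient descent.
            z = mean + torch.exp(0.5 * log_var) * torch.randn_like(mean)
            return self.decoder(z), mean, log_var

    vae = VAE()
    reconstruction, mean, log_var = vae(torch.rand(1, 784))
    # New data is generated by decoding random points in the latent space:
    sample = vae.decoder(torch.randn(1, 8))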

Generative Adversarial Networks (GANs)

Helpful for making changes to existing content (e.g. converting images to a particular style, such as anime characters) or for generating sample data.
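
A minimal GAN sketch in PyTorch (all layer sizes invented for illustration). Two networks compete: a generator turns random noise into fake samples, while a discriminator tries to tell fakes from real data; training alternates between them until the fakes become convincing.

    import torch
    from torch import nn

    # Generator: random noise in, fake "image" out.
    generator = nn.Sequential(
        nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784))

    # Discriminator: "image" in, probability that it is real out.
    discriminator = nn.Sequential(
        nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())

    fake = generator(torch.randn(1, 16))
    print(discriminator(fake).item())  # untrained, so essentially a guess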

Just as with the human brain, many parts of this process are not well understood, and it is not possible to predict the exact output for a given input. As such, these tools are sometimes referred to as “black boxes”, meaning we do not know their internal workings.