When ChatGPT was released, people wanted to find ways to control the quality of the responses LLMs generated. We soon realised that changing the prompt changes the output. Prompt engineering was born, and a multitude of tools, cheatsheets and masterclasses were launched to capitalise on most people’s lack of questioning skills. We had all become used to Google, and our need to frame queries well had not really been tested. Now we had to raise our game.
I’m building my consulting and educational practice helping leaders get more out of AI. One of my beliefs is that if we have a slightly better model of what is happening, we’ll have a better sense of how to take and keep control. I believe we need to be able to explain ideas like prompt engineering to ourselves by finding metaphors that make sense.
Let me have a go at explaining why asking better, more focussed questions produces better answers. I’m going to return to my favourite detective metaphors. This time I want to imagine a courthouse with several different rooms. Let’s limit ourselves to the main courtroom, the jury deliberation room, a press gallery, a visitors’ cafe where witnesses and family grab a coffee, and the holding cells where suspects await their appearance before the judge.
Let’s imagine that today’s trial is a burglary in which jewels have been stolen. Three suspects have been apprehended and witnesses have been found. We want to know what happened and to solve the case.
If we take all the evidence and ask “what happened?”, we will get an answer, but if we go to different locations in our courthouse we will get different perspectives. A visit to the cells will move our answers towards the perspectives of the defendants, a trip to the jury room will reveal their emerging view of the case based on what they have heard, and a trip to the cafe full of family and witnesses will produce a different perspective again. The space we are in matters to the answers produced.
How does this relate to Large Language Models and prompt engineering?
An LLM like ChatGPT is a huge set of parameters that capture relationships between the language tokens in its training data. The model captures relationships between words so effectively that, when given a query, it can generate an answer one token at a time (often a better answer than Google returns). The captured relationships are not simply statistical models of which words follow each other, but of which words follow each other in specific contexts. The model even captures long-distance relationships between abstract and complicated ideas.
The model may appear to have no order or structure when we see it as nothing more than a huge matrix of numbers, but its structure actually captures these different contexts. This means that different regions of the model view tokens in a different light.
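To make that concrete, here is a deliberately toy sketch. The lookup table and scores below are invented purely for illustration; a real LLM learns billions of such relationships in its parameters rather than storing a table, but the point is the same: the likely next word depends on the surrounding context, not just the preceding words.

```python
# Toy illustration only: the "model" is a hand-written table, not a trained
# network, but it shows that the most likely next word changes with context.

toy_model = {
    # (context, preceding words) -> candidate next words with invented scores
    ("courtroom", "the jewels were"): {"stolen": 0.7, "recovered": 0.2, "paste": 0.1},
    ("jeweller's catalogue", "the jewels were"): {"flawless": 0.6, "certified": 0.3, "stolen": 0.1},
}

def next_word(context: str, prefix: str) -> str:
    """Pick the highest-scoring continuation for this prefix *in this context*."""
    candidates = toy_model[(context, prefix)]
    return max(candidates, key=candidates.get)

print(next_word("courtroom", "the jewels were"))             # stolen
print(next_word("jeweller's catalogue", "the jewels were"))  # flawless
```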
Our courtroom model, like the rest of the real world, has only three dimensions, with different rooms symbolising different contexts. By moving from one room to another we hear answers from these different perspectives.
A large language model has thousands of dimensions (impossible to visualise in an image), but the metaphor holds. Moving from one “part” of the model to another changes the context and the tokens generated. It is as impossible to label these regions as it is to visualise them, and there may be the equivalent of billions of different “spaces”. But simply knowing they exist lets us imagine moving our perspective into one of these rooms, a bit like moving from the courtroom to the cells or to the cafe. This is what prompt engineering aims to achieve.
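One rough way to glimpse these regions is to turn differently framed versions of the same question into vectors and compare where they land. The sketch below assumes the sentence-transformers library and the small all-MiniLM-L6-v2 model as one possible choice; the framings are my own examples, and this is an embedding-space analogy rather than a peek inside ChatGPT itself.

```python
# Sketch: embed several framings of the same question and compare their
# pairwise similarity. Vectors pointing in similar directions occupy similar
# "regions" of the space; extra framing moves the question around.
from sentence_transformers import SentenceTransformer, util  # pip install sentence-transformers

model = SentenceTransformer("all-MiniLM-L6-v2")  # one small, commonly used embedding model

framings = [
    "What happened?",
    "As the defendant, speaking privately in the cells: what happened?",
    "As a juror bound to fairness, weighing the evidence: what happened?",
    "As a worried relative in the visitors' cafe: what happened?",
]

vectors = model.encode(framings)
similarities = util.cos_sim(vectors, vectors)  # pairwise cosine similarity

print(similarities)  # the bare question sits at a measurable distance from each framed version
```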
By adding more “depth and colour” to our questions we select a region of the model where we might get appropriate answers.
I’ll look at some ideas for “depth and colour” in my next post, but for now imagine adding one of the following to our detective prompt “what happened?”: “you can tell us your side of the story here in the cells without anyone else knowing”, “here in the jury room, you must uphold the principles of justice and fairness at all times”, or “you must get my brother-in-law out of here, whatever it takes”.
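In code, those framings can simply be sent alongside the question. The sketch below is a minimal example assuming the OpenAI Python SDK and an API key in the environment; the model name is only a placeholder, and the framings are the ones from the paragraph above.

```python
# Minimal sketch, assuming the OpenAI Python SDK (pip install openai) and an
# OPENAI_API_KEY in the environment. The same user question is asked three
# times, with a different framing supplied each time as the system message.
from openai import OpenAI

client = OpenAI()

framings = [
    "You can tell us your side of the story here in the cells without anyone else knowing.",
    "Here in the jury room, you must uphold the principles of justice and fairness at all times.",
    "You must get my brother-in-law out of here, whatever it takes.",
]

for framing in framings:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name; swap for whatever you have access to
        messages=[
            {"role": "system", "content": framing},
            {"role": "user", "content": "What happened?"},
        ],
    )
    print(framing)
    print(response.choices[0].message.content)
    print("---")
```

The same question lands in a noticeably different “room” each time, which is exactly the shift the courthouse metaphor is trying to capture.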
Is this useful? Can you think of a better metaphor that explains how prompt engineering actually behaves? Are there other topics about AI you would like me to grapple with? Let me know.