Prompt Engineering In-Depth
Jan 25, 2024
Introduction
Most of us have already worked with generative AI systems such as ChatGPT or DALL-E. ChatGPT is powered by a Large Language Model (LLM), while DALL-E generates images; at the surface, both work in the same way: they take a piece of text as input, called a “prompt”, and produce an output depending on the instructions given.
The output is not always the same, since these AI models work in a probabilistic way. They look at all the words they’ve output so far and calculate a probability for each possible next word, then pick one. Depending on various factors, such as the sampling settings, the choice can differ on every run, even though the prompt is exactly the same.
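As a toy illustration of that sampling step (the vocabulary and the probabilities here are entirely made up for the sketch):

```python
import random

# Toy next-word distribution: probabilities a model might assign
# after the context "The weather today is". Purely illustrative numbers.
next_word_probs = {"sunny": 0.5, "rainy": 0.3, "cold": 0.15, "purple": 0.05}

def sample_next_word(probs, rng=random):
    """Pick the next word at random, weighted by its probability."""
    words = list(probs)
    weights = list(probs.values())
    return rng.choices(words, weights=weights, k=1)[0]

# Two runs with the same "prompt" can give different words.
print(sample_next_word(next_word_probs))
print(sample_next_word(next_word_probs))
```

Real models choose from tens of thousands of tokens rather than four words, but the principle is the same: identical input, probabilistic output.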
This makes “prompting” (i.e. the process of coming up with useful instructions for ChatGPT) a particularly intricate challenge. You might have found yourself iterating and improving on the same prompt over and over to get just the right output.
Let’s take a look at a few ways to write better prompts that are more reliable, and at how to make ChatGPT produce more of what you want and less noise.
What are AI Hallucinations?
Prompt engineering is the process of improving on prompts for a particular task in order to get the desired output from ChatGPT.
Because LLMs are not exact, they sometimes produce inaccurate results. An LLM doesn’t “think”, as it might appear, but produces text that sounds like the masses of other texts it was trained on. The text looks real, but in fact the model just regurgitates what seems to fit your input.
These occasionally misleading or wrong outputs are called “hallucinations”. We could argue that everything an LLM outputs is a hallucination, because the LLM doesn’t work any differently when it hallucinates.
Hallucinating is how it works in the first place. This is what allows it to produce outputs that haven’t existed yet, based on the large amounts of data it was trained on. But we’ll limit our definition of “hallucination” to the cases where the output distorts reality or is unusable.
What is Prompt Engineering?
The key to prompt engineering is restricting the output and creative possibilities of GPT in order to reduce LLM hallucinations. The more open the request, such as “create a report based on this spreadsheet data”, the higher the chances of wrong answers, because it’s too broad. Narrowing the scope and being more precise reduces errors: “As a data analyst, describe the process you would follow to analyse a dataset containing sales data for a retail store. Please include the steps to explore sales trends over time, identify top-selling products, and evaluate sales performance by region for the last quarter.”
In other words, the more context and rules you give the LLM, the better and more precise the output.
Think of it like a child that wants to do everything it sees. A parent restricts the set of actions the child can take to prevent it from harming itself (and others), and thus fosters behaviour that helps the child develop into a useful member of society. In the same way, the role of prompting is to direct the infinite possibilities of LLM outputs towards something useful for the prompter.
How Does Generative AI Learn?
The models that power ChatGPT or DALL-E are already trained. That means the heavy lifting has already been done for you. But you can – and should! – still “train” ChatGPT with your prompts.
Writing a prompt is nothing more than training ChatGPT to produce the output you want. You often give it auxiliary data to work on, either through copy-pasting passages of text, or through using custom GPTs and plug-ins.
You might have observed that giving it a few examples before asking it your actual question works better. This is called “few-shot learning”.
What is an example of prompt engineering?
Few-shot learning is a process that presents an AI model with a few examples of something it should do, so that it can apply this behaviour to new inputs. Here is a short example:
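The original example isn’t reproduced here, but a few-shot classification prompt, sketched in Python, could look like this (the example texts are illustrative; the labels follow the ones used below):

```python
# Build a few-shot classification prompt: the examples teach the model
# the task and the output format before it sees the real question.
examples = [
    ("I finally got the promotion I worked towards all year!", "HAPPY_SITUATION"),
    ("The deadline is tomorrow and half the report is missing.", "STRESSFUL_SITUATION"),
    ("We spent the whole weekend at the beach with friends.", "HAPPY_SITUATION"),
]

def build_prompt(new_text: str) -> str:
    lines = ["Classify each text as HAPPY_SITUATION or STRESSFUL_SITUATION.", ""]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The new input mirrors the examples, ending where the model should answer.
    lines.append(f"Text: {new_text}")
    lines.append("Label:")
    return "\n".join(lines)

print(build_prompt("My flight got cancelled and I have a meeting in two hours."))
```

Pasting the resulting text into ChatGPT typically yields just the label, in exactly the format the examples established.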
Here we gave ChatGPT a few examples of how to classify a text. We can specify the output we want, whether it be complete sentences, keywords or code words such as HAPPY_SITUATION or STRESSFUL_SITUATION. It will then answer precisely as we instructed it. You can read more about few-shot learning in this prompt engineering guide.
How to Think About Prompting
With prompts, it’s important to keep in mind the kinds of data ChatGPT was trained on and to write your prompts in a similar way. LLMs are autocomplete machines, nothing more. Give rules instead of examples. LLMs are very technical, and you can use this to your advantage. In fact, LLMs have proven most useful in the programming world, where everything is highly structured.
Think in Structures
The more you structure your prompt, the better. To get to the next level, you can start by providing ChatGPT your requests in a markup language of your choice.
A markup language is a way to enhance a text with annotations that don’t belong to the content itself, but give information about its structure and how it should be displayed. HTML is a very popular markup language; the letters stand for “HyperText Markup Language”. In HTML you use “tags” to make text appear in certain ways: for example, bold text goes between <b> and </b> tags, and paragraphs between <p> and </p>. Have you ever thought of formatting your prompts with a markup language?
Bringing Structure to Your Prompts With Markdown
Markdown is a very popular and easy-to-understand markup language. LLMs such as ChatGPT were trained on vast amounts of Markdown, so they understand text formatted in this language best.
If you want to learn the basics, this Markdown cheat sheet can prove useful. As an example, the prompt from above would be better formatted as:
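The original formatting example isn’t reproduced here, but the data analyst prompt from above could be rewritten in Markdown along these lines (the specific headings are illustrative):

```markdown
# Role
You are a data analyst at a retail store.

# Task
Describe the process you would follow to analyse a dataset containing sales data.

# Requirements
- Explore sales trends over time
- Identify the top-selling products
- Evaluate sales performance by region for the last quarter
```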
Looks more structured, doesn’t it? Now you can think about how to structure your own prompts in this way.
Examples First, Instructions Later
Many of us are inclined to write the instruction first and provide examples afterwards when prompting. However, it should be the other way around.
This ensures that the last sentences are those the LLM will follow. If we put our examples at the end instead, it’s more likely that the output will be biased towards them.
If we put our instructions last, they are “more recent” in the LLM’s memory, so to speak.
This is because LLMs compute the probability of the next word from everything that came before. So we want them to have the most important information coming last.
Why Programmers Tend to Be Better Prompters
Because LLMs respond so well to structured, mechanical input, programmers tend to be good at prompting.
You could even take it one step further and provide “proper” programming instructions, such as letting ChatGPT generate an outline based on React JSX instructions such as:
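The original snippet isn’t reproduced here; a sketch of what such a JSX-style outline request might look like (the component and prop names are illustrative, not a real React component):

```jsx
// Prompt: "Generate a blog post outline matching this structure."
<Outline topic="Prompt Engineering">
  <Heading title="Introduction" />
  <Heading title="What is Prompt Engineering?">
    <Subheading title="Few-Shot Learning" />
    <Subheading title="Structured Prompts" />
  </Heading>
  <Heading title="Conclusion" />
</Outline>
```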
The LLM doesn’t execute the code, but it understands it. It’s not fussy about the fact that we mixed different programming languages here—the intention matters! In this example it would output a proper list of headings and subheadings.
What are LLMs Not Good At? — The Code Generating Dilemma
There are some cases when LLMs produce unreliable output and where a proper programming solution is better. Or a combination of both.
In a recent client project, we needed to find the blog link on scraped websites, navigate there and find the title of the most recent article. For this, we had to look for a link tag with the appropriate class or id properties. Those familiar with web development know that these tags can have any id, class or custom property attached to them; there is no standardised way of finding what one is looking for.
We tried letting ChatGPT go at the task. Not surprisingly, the LLM generated some code in response to our prompt that it then executed on the provided website source file.
The problem was that in the coding approach, the LLM looked for a few specific variations of the word "blog" in the id and class properties. Of course this didn’t work. Nothing that’s hard-coded ever works in the real world!
We had to circumvent this by going one level deeper: we wrote the code ourselves and used the LLM’s probabilistic capabilities for what it’s truly good at, namely finding the link tag pointing to the blog, rather than writing code to find it.
So we wrote code that collected all link tags from the website’s source code and provided them to the LLM, asking it to decide which one pointed to the blog page. The question was formulated differently this time: the LLM didn’t write and execute code, but instead used its inference abilities to produce the right output.
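A minimal sketch of that two-step approach, using Python’s standard html.parser (the class names, the sample HTML and the prompt wording are illustrative, and the actual call to the LLM is left out):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect every <a> tag's href and visible text from a page."""
    def __init__(self):
        super().__init__()
        self.links = []          # list of (href, text) pairs
        self._current_href = None
        self._current_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._current_text = []

    def handle_data(self, data):
        if self._current_href is not None:
            self._current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            self.links.append((self._current_href, "".join(self._current_text).strip()))
            self._current_href = None

def build_link_prompt(html: str) -> str:
    """Deterministic part: gather all links, then ask the LLM to choose."""
    collector = LinkCollector()
    collector.feed(html)
    lines = ["Which of these links points to the blog page?", ""]
    for href, text in collector.links:
        lines.append(f"- {href} ({text})")
    return "\n".join(lines)

sample = '<a href="/about" class="nav">About us</a><a href="/news" id="x1">Latest articles</a>'
print(build_link_prompt(sample))
```

The deterministic code does the exact extraction work; the resulting short prompt leaves the LLM only the fuzzy decision it is actually good at.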
Have a Programmer Write Your Prompts—Or Hire One!
In order to make the most out of LLMs, one has to know these limitations. One has to know when to use an LLM and when to switch to code that feeds data into the LLM. These are decisions that come with a deep understanding of the functioning of both LLMs and traditional code. One is deterministic and the other is not.
Some subtasks require exact computations and answers—these lend themselves to be solved by coding an "exact" program for them. Other subtasks are hard to solve by a deterministic program, so we use the LLM for that.
In essence, if you want to unlock the true power of prompting, going beyond what everybody’s doing, you have to know when to use each tool.
Prompting at Next Operations
We’re excited about these possibilities at Next Operations and are curious to see your use-case.
We have deep tech, product, startup and AI experience and can help your organisation bring its operations to the next level. We have already eliminated countless hours of manual and repetitive work for our clients.
We are happy to train your workforce, especially copywriters, marketers and your sales teams, to prompt more effectively and save more time. You have the domain knowledge and we have the prompting expertise.
Don’t hesitate to get in touch!
Conclusion
In conclusion, the art of prompt engineering is pivotal in harnessing the full potential of generative AI. From understanding LLM hallucinations to the intricate nuances of crafting structured prompts, this field shapes how we interact with and benefit from AI technologies. The purpose of a well-structured prompt isn’t just about directing the AI towards desired outcomes; it’s about refining the interaction between human creativity and machine intelligence. As we’ve explored in this article, prompt engineering examples demonstrate that with the right approach, businesses can optimise their operations, enhancing efficiency and innovation. The integration of prompt engineering in AI represents a significant leap forward, marking a new era of human-AI collaboration where prompts in AI are not just questions, but gateways to endless possibilities.
Frequently Asked Questions (FAQ)
What is a prompt in generative AI?
A prompt in generative AI is an input command or instruction that guides the AI in generating specific outputs. It’s akin to asking a question or setting a task for the AI, where the prompt’s structure and content significantly influence the quality and relevance of the response.
What are prompts in AI?
Prompts in AI are essentially instructions or triggers used to activate and guide AI models like ChatGPT or DALL-E. They are critical in defining the context and setting the direction for the AI’s response, playing a key role in the effectiveness of AI-generated content.
How does prompting work?
Prompting works by providing AI models with a specific set of instructions or context. The AI then processes these prompts, drawing on its training and algorithms, to generate relevant and coherent responses. The clarity and specificity of the prompt directly influence the AI’s output.
What is the purpose of a structured prompt?
The purpose of a structured prompt is to provide clear, unambiguous guidance to an AI, reducing the likelihood of LLM hallucinations and irrelevant outputs. It helps in narrowing down the AI’s focus, leading to more accurate, useful, and context-specific responses.
Can you give some prompt engineering examples?
Certainly! One example of prompt engineering is providing a scenario followed by a specific question, like asking an AI to analyze sales data trends. Another example is instructing an AI to generate a list of marketing strategies for a particular product, where the prompt outlines the product features and target audience. These examples illustrate how varying the prompt can yield different, tailored outputs.