And at very different levels of quality.
It can be one of those typically lukewarm model responses with lots of caveats, long-winded explanations, or half-hallucinated knowledge from the day before yesterday.
Or the first section of a best-selling novel, a proposal for a perfect brand name, the most powerful and elegant Python function for a tricky problem.
Or much more of that, over and over again: When you are building an application, the model can reliably and accurately answer millions of questions from your customers, fully process insurance claims, or tirelessly rummage every day through freshly filed patent applications in search of conflicts with older ones.
There is one if - and it is a big if: this only works if you specify your wish in a perfect prompt.
And that doesn't just mean avoiding ambiguous instructions, like all those fairy-tale characters who waste their three wishes. No, it is much harder than that. To realize the full potential of our AI fairies and genies, you need to craft your prompts masterfully and meticulously.
When you are using models for everyday chores, this engineering work may be helpful.
If you are building applications with AI, it is a must.
Our Prompt Engineering Cheat Sheet is a condensed (PDF) book of spells for this somewhat arcane and often tricky discipline of machine learning.
Whether you're a seasoned user or just starting your AI journey, this cheat sheet should serve as a pocket dictionary for many areas of communication with large language models.
The contents:
- the AUTOMAT and the CO-STAR frameworks
- output format definition
- few-shot learning
- chain of thought
- prompt templates
- RAG, retrieval-augmented generation
- formatting and delimiters and
- the multi-prompt approach.
Here is a handy PDF version of the cheat sheet to take with you.
The AUTOMAT Framework
The AUTOMAT Framework describes the key ingredients of a perfect prompt instruction: What you need, how to write it.
And how not to write it.
The acronym AUTOMAT stands for
Act as a …
User Persona & Audience
Targeted Action
Output Definition
Mode / Tonality / Style
Atypical Cases
Topic Whitelisting
By considering each element, you can guide the LLM towards the desired outcome. Imagine you're writing a script for a chatbot. You define its role (A), who it's interacting with (U), the goal of the interaction (T), what information it should provide (O), how it should communicate (M), how to handle edge cases (A), and what topics are relevant (T). This structure ensures clear, consistent communication for your LLM.
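As an illustration, here is a minimal sketch of an AUTOMAT-style prompt for an insurance chatbot, assembled as a Python string; the company name and all wording are made-up assumptions, not part of the framework itself.

```python
# A hypothetical AUTOMAT-style prompt; names and wording are illustrative.
automat_prompt = "\n".join([
    "Act as a friendly support agent for ACME Insurance.",                      # A: Act as a ...
    "You are talking to existing customers with little insurance knowledge.",   # U: User persona & audience
    "Help them find out whether a damage is covered by their policy.",          # T: Targeted action
    "Answer in at most three short sentences.",                                 # O: Output definition
    "Keep the tone polite, simple, and reassuring.",                            # M: Mode / tonality / style
    "If you are unsure or the case is unusual, hand over to a human agent.",    # A: Atypical cases
    "Only discuss ACME insurance products; politely decline all other topics.", # T: Topic whitelisting
])
```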
The CO-STAR Framework
A similar approach, but with the focus set a little differently:
- Context: Set the scene! Provide background details for the LLM to understand the situation.
- Objective: What do you want to achieve? Clearly define the task for focused results.
- Style & Tone: Dress it up! Specify the desired writing style and emotional tone for your LLM's response.
- Audience: Know your reader. Identify who you're targeting to tailor the LLM's output.
- Response: Pick your format. Define the output format (text, code, etc.) for the LLM's response.
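A minimal sketch of a CO-STAR-style prompt, again as a Python string; the shop, the product, and all constraints are invented for illustration.

```python
# A hypothetical CO-STAR-style prompt; every detail is an illustrative assumption.
costar_prompt = """
Context: You work for ACME Outdoor, an online shop for hiking gear.
Objective: Write a product description for a new ultralight tent.
Style & Tone: Enthusiastic but factual; avoid superlatives.
Audience: Experienced hikers who compare technical specs carefully.
Response: Plain text, at most 120 words.
"""
```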
Output Format
The definition of the output format tells the model how to provide the response. Even better than telling is showing. Provide a real-life example of an output.
The model can mimic almost every conceivable output, any existing format, as well as the structures you define for a specific task. Providing an answer in an easy-to-parse format like JSON greatly simplifies building applications and autonomous AI workers.
Further structure the output by
- specifying allowed values and ranges for the AI's response
- instructing the AI what to do if data is unavailable or values are missing (see the sketch below)
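For example, an output definition with allowed values and a rule for missing data could look like this; the field names and categories are made-up assumptions.

```python
# A sketch of an output-format instruction; field names and allowed
# values are illustrative assumptions.
output_format = """
Return your answer as JSON with exactly these fields:
  "category": one of "damage", "theft", "liability"
  "amount_eur": a number, or null if no amount is mentioned
  "summary": a string of at most 30 words
If a value is missing in the input, set it to null instead of guessing.
"""
```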
Few-Shot Learning
Few-shot learning in prompts shows the model a few practical problems and solutions before it starts with the real job:
- Standard cases: Weave in a few examples of how the model should map input to output
- Special cases: Show the model how to handle edge cases and outliers: how should it answer if data is missing, if off-topic questions are asked, or if a user goes rogue? (See the sketch below.)
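Put together, a minimal few-shot block for a hypothetical claim-classification task could look like this; the inputs, outputs, and fields are invented for illustration.

```python
# Two standard examples and one edge case (off-topic input); all made up.
few_shot_examples = """
Input: "My cellar flooded after the storm last night."
Output: {"category": "damage", "on_topic": true}

Input: "Someone broke into my car and stole my laptop."
Output: {"category": "theft", "on_topic": true}

Input: "What do you think about the election?"
Output: {"category": null, "on_topic": false}
"""
```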
Chain of thought
If we force the model to think aloud and reason through the problem before it gives the final answer, the results get better. Here, our AI is not at all different from 6th graders solving math problems. Or, let's say, humans in general.
Not my idea, by the way, but that of the Google Brain Team.
To separate the reasoning from the final answer, again, use JSON in the output.
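A small sketch of such a chain-of-thought instruction; the example task, the numbers, and the field names are invented.

```python
# A chain-of-thought prompt that separates reasoning from the answer
# via JSON; the example task and field names are illustrative.
cot_prompt = """
Think step by step before answering.
Return JSON with two fields:
  "reasoning": your step-by-step considerations
  "answer": the final result only

Question: A policy covers 80% of the damage above a 500 EUR deductible.
The claimed damage is 2,000 EUR. How much is reimbursed?
"""
# A well-behaved response might look like:
# {"reasoning": "2000 - 500 = 1500; 80% of 1500 = 1200", "answer": "1200 EUR"}
```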
Prompt Templates
When building AI applications, you will almost never use a constant prompt.
You will use a template containing variables that are set depending on the situation: Specific user questions, document chunks, API outputs, the current time, content from the internet, etc.
In each step or invocation of the template, the variables are then replaced by actual content.
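A minimal template sketch using plain Python string formatting; the variable names and the filled-in values are illustrative assumptions.

```python
# A prompt template with placeholders that are filled per request.
template = """
Act as a support agent for {company}.
Today is {current_date}.
Use only the following context to answer:
{context}

Customer question: {question}
"""

# Fill the variables with the actual content for one invocation.
prompt = template.format(
    company="ACME Insurance",
    current_date="2024-05-01",
    context="Policy X covers water damage up to 10,000 EUR.",
    question="Is my flooded cellar covered?",
)
```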
RAG — Retrieval augmented generation
RAG is maybe the most important technique developed in the field of LLMs in the last two years. This technique lets LLMs access your data or documents to answer a question, overcoming limitations like the knowledge cutoff in the pre-training data. RAG allows us to tap into an extremely broad base of content, comprising megabytes and gigabytes of data, leading to more comprehensive and up-to-date responses from LLMs.
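Stripped down to its core, the RAG pattern is retrieve-then-prompt. Here is a minimal sketch; retrieve() and call_llm() are placeholders for whatever vector search and model client you actually use.

```python
# A minimal sketch of the RAG pattern. retrieve() and call_llm() are
# placeholders for your own retrieval and model-call functions.
def answer_with_rag(question: str, retrieve, call_llm, top_k: int = 3) -> str:
    # 1. Retrieve the document chunks most relevant to the question.
    chunks = retrieve(question, top_k=top_k)
    # 2. Put the retrieved chunks into the prompt as context.
    context = "\n\n".join(chunks)
    prompt = (
        "Answer the question using only the context below.\n"
        "If the context does not contain the answer, say so.\n\n"
        f"### Context\n{context}\n\n### Question\n{question}"
    )
    # 3. Let the model generate an answer grounded in the retrieved content.
    return call_llm(prompt)
```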
Formatting and delimiters
Models don't read your prompt twice.
Either they understand at once what purpose a piece of information serves - is it an example, an instruction, or is it context information?
Or they don't. Then their response will probably be wrong.
So, make sure that the model can grasp the structure of your prompt. Unlike the privileged authors of Medium stories, you are limited to characters only; you cannot use graphic highlighting such as headings, bold, or italics. When structuring the individual sections, make friends with hashes, quotation marks and line breaks.
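One of many possible ways to do this, sketched as a Python string; the section names are my own choice, not a standard.

```python
# A prompt whose sections are separated by hashes, quotation marks,
# and blank lines; {email_text} is a template variable filled per request.
structured_prompt = """
### Instruction
Summarize the customer email below in one sentence.

### Example
Email: "The delivery arrived two weeks late and the box was damaged."
Summary: "Complaint about a late and damaged delivery."

### Email
"{email_text}"

### Summary
"""
```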
Assemble the parts
Here is a real-world example of how to put it all together. I am using the prompt components and delimiters we discussed above to give the prompt a structure. In order of appearance, start with the core instruction, then examples, data, the output format, and finally the interaction history.
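As a condensed sketch, with short placeholders standing in for the sections built earlier:

```python
# Placeholders standing in for the parts built in the previous sections.
instruction = "Act as a support agent for ACME Insurance. ..."
few_shot_examples = "Input: ... Output: ..."
retrieved_context = "Policy X covers water damage up to 10,000 EUR."
output_format = "Return JSON with the fields category, amount_eur, summary."
chat_history = "User: Is my cellar covered?"

# Assemble the prompt in the suggested order, separated by blank lines.
full_prompt = "\n\n".join([
    instruction,        # 1. core instruction
    few_shot_examples,  # 2. examples, including edge cases
    retrieved_context,  # 3. data: document chunks, API outputs, ...
    output_format,      # 4. output definition
    chat_history,       # 5. interaction history
])
```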
And yes, prompts with examples and context information can get quite long. That's why the model providers keep offering ever larger context windows for a single inference - in AI lingo, the maximum length of input the model can take for one answer generation.
Multi-Prompt Approach
When building a complex application like an autonomous AI worker that handles insurance claims, one prompt is often not enough. You could build a single prompt that categorizes a claim, checks the coverage, considers the customer's insurance policy, and calculates the reimbursement value.
But this is not the best approach. It is both easier and more accurate to split the work into smaller single-task prompts and build chains of model requests. Usually, you categorize the input data first and then select a specific chain that processes the data with models and deterministic functions.
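A sketch of such a chain for claims handling; call_llm() is a placeholder for your model client, and the categories and steps are invented for illustration.

```python
# A multi-prompt chain: categorize first, then route to small single-task
# prompts. call_llm() is a placeholder; categories and steps are illustrative.
def handle_claim(claim_text: str, call_llm) -> str:
    # Step 1: one small prompt does nothing but categorize the claim.
    category = call_llm(
        "Classify this insurance claim as 'damage', 'theft', or 'other'. "
        f"Answer with one word only.\n\nClaim: {claim_text}"
    ).strip().lower()

    # Step 2: deterministic routing into a category-specific chain.
    if category == "damage":
        coverage = call_llm(f"Check the policy coverage for this claim: {claim_text}")
        return call_llm(f"Calculate the reimbursement based on: {coverage}")
    if category == "theft":
        return call_llm(f"Process this theft claim step by step: {claim_text}")
    return "Forwarded to a human agent."
```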
Dear fellow prompt magicians and acolytes, I wish you all the very best for your AI projects and hope that you are successful in your search for the perfect prompt.
I will try to keep the cheat sheet up-to-date. Follow me on Medium (⇈) or LinkedIn to get updates.
Big, big thanks for help with this cheat sheet to Ellen John, Timon Gemmer, Almudena Pereira and Josip Krinjski!