Effective ChatGPT Prompting for software developers
Introduction to the guiding principles of prompting
Introduction
The popularity of Large Language Models (LLMs) has boomed over recent months since the release of ChatGPT to a broader audience.
With LLMs, software developers (and anyone else who works in a field where AI is useful) benefit from enhanced capabilities to accomplish a series of tasks that would require entire teams and a huge amount of hours before the general appearance of these models.
This, in turn, allows people to accelerate certain tasks, so they can focus on different, more valuable, and time-worthy tasks.
The most popular tool right now in this space is ChatGPT from OpenAI, and, we will be exploring its capabilities through the web client, available here.
We will look at what are the guiding principles for effective prompting and how when “you know what to prompt for”, you can obtain absolutely amazing results on a varied set of tasks that are likely to pop up in your daily work, specifically, as a software engineer.
Before we dive in, it’s important to clarify that presently there are two main classes of LLMs:
Base LLMs: these predict the next word, based on text training data; in a sense these LLMs offer “raw” predictions, simply making predictions based on the training data that was used;
Instruction-tuned LLMs: ChatGPT falls into this category. An instruction-tuned LLM is a large language model that has been specifically trained to try to follow instructions.
Such a model starts off from a base LLM and is further fine-tuned with a series of specific instructions and their expected actions.
A famous technique called Reinforcement Learning from Human Feedback (RLHF) is a way to further fine-tune an LLM to make sure that its outputs are better than what a more “raw” LLM would produce and ensure that the model is helpful, honest, harmless and less likely to produce toxic or problematic outputs.
A good way to look at the effects of fine-tuning LLMs and also on the further enhancement with RLHF is the following image:
Shoggoth with Smiley Face. Courtesy of twitter.com/anthrupad
What this means is that LLMs require a lot of work, both via leveraging fine-tuning on higher quality data and incorporating RLHF into the process in order to make them safe to use by the end users.
However, even leveraging these highly optimized and trained models requires some level of skill in order to extract meaningful information, not only what we want, but also how we want it.
This is what a very recent, emergent field, called prompt engineering, is all about, and in the remainder of this blog post, we will actually apply some guiding principles that are foundational to prompt engineering to see how we can extract the maximum potential from tools like ChatGPT and similar.
The main points to retain are two: firstly, prompt engineering is called engineering precisely because we can follow a methodical approach, as we will see, to extract as much value as possible, and it’s a discipline that is very experimental in its nature: it’s all about performing experiments, to measure the obtained results, understanding them and if they match our expectations and then iterating and further refining our understanding of exactly what we want, and iterate if we aren’t satisfied with the results.
The second point serves as an important ethical reminder: when interacting with LLMs, bear in mind to only use them to build useful and helpful tools which can be used for good.
Much of the existing conversations and interactions with the existing deployed models can likely be used for further training and fine-tuning future models, so we all bear a collective responsibility of using them ethically and responsibly.
Principles of prompting
The current way of interacting with LLMs, and, particularly, with ChatGPT is by either leveraging the web UI or using the API client.
In both of these cases, the main way of interacting with the models is via inputting text and letting the model process that text and give us back output in response.
When we input text to an LLM, for example, via the web UI, what we write and send to the model is called a prompt. Essentially, we are prompting something from the model.
Just like between ourselves when engaging in human interaction, knowing what to ask for and how to ask for it, play a fundamental role in the results we get back from our interactions. If you’d be rude to someone, you can expect a much different reaction than if you ask for something in a nicer, polite way.
Contrast: “Give me that bread” with “Could I please have the last loaf of bread on that shelf behind you? Thanks a lot!”.
In a certain sense, both of these two prompts desire the same output (i.e., having the last loaf of bread for sale at the bakery) but, one of them is much more likely to get it than the other one.
This kind of principle in particular can be translated very well into the world of ChatGPT and crafting text prompts for interacting with LLMs: depending on how you prompt for certain information, you may get different results.
Knowing what and how to prompt becomes a key skill when leveraging LLMs.
The two main principles we will analyze are:
Write clear and specific instructions;
Give the model time to “think”;
Each of these two main principles has a lot of useful information to unpack, so, we will go over them in more detail below.
Principle 1 – Write clear and specific instructions
The first principle we’re going to look at is what it means exactly to write clear and specific instructions.
It’s important to realize that in this context, clear doesn’t mean short. In fact, when interacting with LLMs it’s often times good and even advised to be descriptive and write longer sets of instructions to get to a desired result.
Let’s suppose that we want to get a list of names formatted as JSON, and, additionally, let’s say that, according to our business requirements, we need the names to be Portuguese names, exclusively.
Below is an example of a bad prompt:
An example of a bad prompt when requiring detailed name information
The prompt shown above is a bad example of a prompt because it’s vague and not clear. We know from our business requirements that the names need to be Portuguese and, even more, we know that we need it formatted as a JSON, presumably, let’s say we want to use it in a web application we are building.
In these cases, following the principle of writing very clear and specific instructions can dramatically increase the quality of the output ChatGPT gives us:
This a much better example, of how being very specific and clear can greatly improve the output of the model
As we can see, by being very specific and encoding the business requirements that we know we needed to comply with within our prompt, we get a much better, much more useful response, that is actually aligned with what we want and need.
Another important concept that is important to leverage is a combination of delimiters and structured output to get the results we want.
We saw above how asking for more structured output gave us a much more usable result.
We will now see how we can combine the use of delimiters when offering the input and leverage that to guide our outputs.
Using delimiters and structured output
A critical aspect to be aware of, especially when leveraging the API of a model like ChatGPT, is that web applications that allow for a prompt to be input by a user are subjected to what is known as “prompt injection”.
Prompt injection is what happens when a user adds something to a prompt in an attempt to divert the model from its expected behavior to instead do what the user wants, rather than what it would actually do if the prompt would be followed verbatim, which can be exploited by users with malicious intent. This is better illustrated and more significant with usages conducted through the API, but, we can still observe it even using the simpler web UI.
Let’s say we want the model to summarize a snippet of text, but, we will tell ChatGPT to accept the input directly as it is within triple quotes, simulating an example of how prompt injection could happen in a real-world scenario:
Simulating prompt injection in the ChatGPT web UI. When we ask to summarize a piece of text, afterward ask the model to ignore the request and instead give us details about the best soccer player in the world
As we can see, without proper mechanisms to defend ourselves against prompt injection, our simple text summarizer example could be “hijacked” by a user with malicious intent and instead of a summary, we’d get instead details about who the best soccer player in the world is.
Using delimiters within our usage of the APIs allows us to defend ourselves against prompt injection and ensure that the LLMs are working towards what we want them to do instead of being diverted by a malicious user with intents different from ours and the ones we want the model to follow. Here’s an example of leveraging delimiters to be safe:
Proper use of prompt delimiters to ensure the input guides itself as we want and not as a user with malicious intent would want it to.
Now, we can see that when we leverage the triple backticks as delimiters, we can actually avoid a possible prompt injection that would divert the model from its originally intended use of summarizing text. Now, it simply also summarizes the text that could potentially be a prompt injection that would divert the model from its originally intended usage.
While there can be much more complex and sophisticated prompt injections, it’s good to be aware of these flaws in these LLMs, especially when working with the APIs and expecting free-text user input.
Leveraging delimiters to stay safe and keep the model within its intended use cases and boundaries is extremely useful.
Next, we will talk about structured output in a bit more detail.
Structured output is a great way to make sure that the model is guided towards what we want it to do and even better, we can actually control its output by detailing it in its input prompt.
Let’s refocus our attention on the example with which we began this section: getting a list of names, more specifically, Portuguese names, formatted as a JSON list.
Now, suppose that the requirements have changed, and we actually need the names to be JSON objects themselves of the form:
{“firstName: …., “lastName”: ….}
And that the other requirements are still the same (20 Portuguese names).
We can actually instruct ChatGPT that the list needs to be formatted this way:
Guiding ChatGPT’s output via the input prompt
As we can observe, the model does really well in following instructions when we guide him towards a specific output format that we desire, and we can obtain great results by carefully refining our prompts and enriching them with the output we want to have.
Few-shot prompting
A very useful technique when interacting with LLMs is what is called “few-shot prompting”.
The basic idea behind this technique is that when we want the model to perform a given task for which we can offer related examples to it, we first give the model in its input prompt a series of successful short examples of completing certain tasks and then ask the model to complete our original task.
Few-shot prompting usually ties itself very well with the usage of structured output we saw in the previous subsection because the few short examples which are offered as examples of successfully completing a task can be leveraged by the model as indications of how to format the output.
Let’s say we wanted to give an extra cheerful tone to some sentences which feel very plain.
The way we do it with few-shot prompting is by first offering a few examples of how this can be done, and then asking the model to do it itself.
In this way, the model will base itself on the presented examples and will produce results that are much more geared toward the output we want instead of being more generic in nature.
In essence, we embed some particular examples that showcase our desired output as part of the input prompt for the LLM, which can then perform very well on “unseen” examples. Let’s have a look:
Example of using few-shot prompting to steer the model towards outputs with high similarity
As we can see, the effectiveness of few-shot prompting is clearly noticeable!
By first showing the model exactly what we mean when we say we want to rephrase some sentences in a cheerful way and, most importantly, when we do it via offering a structured output for the “translation” (note how the model itself uses “becomes” immediately after the effective input sentence is written), we get a result that is exactly in line with the examples we have provided the model beforehand.
This can be used for more complex scenarios such as adjusting the delivery or tone of a piece of text, and guiding towards a given written format, amongst other options! Experimentation and exploration are your best allies in uncovering the potential of these models!
Principle 2 – Give the model time to “think”
The other principle we will look at briefly is how we can give the model time to “think” things through.
Drawing an analogy with human reasoning and interactions, it’s easy to understand that even we, as humans, when confronted with a very complex task and given a short amount of time to work through things, are bound to make some mistakes in our reasoning and assumptions.
When an LLM, like ChatGPT, makes reasoning errors by rushing to an incorrect conclusion, often we can rephrase the query and apply a technique known as “Chain of Reasoning”.
Essentially, this technique boils down to requesting the model to make a series (or chain) of relevant reasoning before providing the final answer, which can, very often, increase the model’s accuracy in providing correct answers.
Cooked example that can be fixed by applying “Chain of Reasoning”
For the example shown above, I first instructed ChatGPT to assume the statement I’d give it as an irrefutable truth.
Then, when further prompted with input that challenged the claim, by correctly stating that not all birds can fly, and by asking the model to apply a step-by-step analysis of a given prompt and reassess its conclusions, we can observe that it can then derive the correct conclusion.
When given instructions that make it analyze and confront its previous responses logically, reasoning about the validity of its assumptions, like, validating certain claims, numbers, or facts, we give it a chance of putting all the validated conclusions together, which makes it such that it can reach the right conclusions after “taking its time”.
Much like with humans, we see that it can be made much more accurate when certain logical steps are taken one at a time, which gives the model more “anchors” upon which to base its conclusions on, as opposed to taking an entire, more complex problem as a whole and “biting off more than what it can chew”.
Great use cases of this can include validation of complex math problems, where often asking ChatGPT to “first solve the problem” and only then attempt to evaluate the presented (incorrect) solution will often lead the model in the right direction and allow it to make the correct reasoning.
To wrap up this section, two techniques that are extremely useful in “giving a model time to think”:
1. Specify the steps to perform a certain task in the input prompt (Step 1 -...., Step 2 -...., etc.). When the model receives its input in a step-by-step fashion, it has a much easier time in digesting and breaking down a complex problem, as it can reason about each individual step which helps to reach the right conclusion.
2. Tell the model to work out its own solution before rushing to a conclusion.
Just like the example presented above, by first asking the model to solve a given problem by itself, we can “steer” the model in the right direction by ensuring that it tries to work out through the problem before attempting to parse and evaluate an already presented solution which may be flawed.
Conclusion
In this introductory post, we looked at two of the most important principles of prompting LLMs like ChatGPT:
- Writing clear and specific instructions;
- Giving the model time to think;
We saw how manipulating the input prompts by varying the structure and detail level of the requests we make can greatly improve and change the usefulness of the output we receive.
We saw the dangers of prompt injection and how the clever use of delimiters can be a great way to prevent misuse.
We looked at how describing the output formats can give us great downstream usage of our prompt’s outputs: for e.g. by requesting things in a list format, JSON format, etc., the output we receive can be directly applied to other workflows, like making a request to an API, embed a JSON in a web application, and more.
We learned about “few-shot prompting” as a clever technique for guiding the output into the desired style and format we want and how it can influence the model’s output to a very considerable extent.
To wrap up, we also briefly looked at the second guiding principle: what does it mean to give a model “time to think”?
Essentially, it all boils down to simplifying complex tasks by first breaking them down into smaller, easier-to-follow steps and having the model follow a step-by-step approach which increases its chances for success.
As we also saw, we can also tell the model to work out its own solution before rushing to a conclusion, which can also greatly increase its chances of reaching the correct conclusions!
Armed with this newfound knowledge, one parting remark is left…. Happy Prompting!!
This is an incredibly insightful article on effective prompting for ChatGPT, especially for software developers. The author does a fantastic job of breaking down the principles of clear and specific instructions, as well as the importance of giving the model time to "think" about it. The exploration of "few-shot prompting" and the use of delimiters to prevent prompt injection were particularly enlightening. It's clear that mastering these techniques can significantly enhance our interactions with large language models like ChatGPT. Looking forward to applying these principles in my own work with AI.
#ChatGPT #AI #SoftwareDevelopment