Article summary
Lately, I have been working on a software development project that uses OpenAI’s “gpt-3.5-turbo” model to answer questions against our own backend database. To steer our agent in the right direction and expand its ability to answer more complex questions, we spent a lot of time on prompt engineering and state management. Through trial and error at the start, I came up with some useful strategies for making the AI prompts fed to the agent more reliable, as well as some things to avoid!
Two terms come up most often when looking for ways to improve a model’s output: “fine-tuning,” which is a separate process managed on OpenAI’s web platform, and “prompt engineering,” which is what I’ve been doing. So far, I haven’t used any fine-tuning at all and have still found success.
As a quick overview, an AI prompt is the text fed into the agent’s request to get a more specific response in return. Wording and formatting are largely the determining factors here, and they can make a big difference!
Here’s what helped.
Sectioning the prompt, however long it is, helped me organize my thoughts as well as improve performance. For the most part, I chose Markdown-style headers for the sectioning. Within the same prompt, I’d create sections like “Instructions”, “Context”, and “Question”. Something like this:
const createPrompt = ({ question }) => `\
### Instructions ###
You are a talented data analyst tasked with determining how to respond to questions …
### Context ###
Any relevant context to the types of questions expected. This can also be a list of statements!
### Question ###
${question}`;
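To show where a prompt like this ends up, here is a minimal sketch of sending the generated string to the model. It assumes the official openai Node SDK with an OPENAI_API_KEY in the environment; the askQuestion wrapper and the single user message are just illustrative, not the exact code from my project.

import OpenAI from "openai";

// The client reads OPENAI_API_KEY from the environment by default.
const openai = new OpenAI();

const askQuestion = async (question) => {
  const completion = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      // The whole sectioned prompt goes in as one user message here;
      // it could also be split into separate system and user messages.
      { role: "user", content: createPrompt({ question }) },
    ],
  });
  return completion.choices[0].message.content;
};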
That sectioned format can be followed for a lot of things. Since my setup requires SQL queries as output, though, I learned to add an “Examples” section, too. This was monumental in getting the queries right. Here’s an example:
### Examples ###
**Question:** How much X does Y use to relate to Z?
**Output:**
SELECT SUM(X) FROM Y_Z_table
WHERE X = Y AND X = Z;
# // Dividing line for multiple examples
**Question:** …
**Output:** …
# // Dividing line for multiple examples
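Because these examples are just text inside the prompt, I find it handy to keep them in an array and render the “Examples” section from it. The helper below is a hypothetical sketch of that idea (the names examples and formatExamples are made up for illustration), and it uses a lone “#” line as the divider between examples, matching the format above.

// Hypothetical question/query pairs; real ones mirror questions users actually ask.
const examples = [
  {
    question: "How much X does Y use to relate to Z?",
    output: "SELECT SUM(X) FROM Y_Z_table\nWHERE X = Y AND X = Z;",
  },
  // ...more { question, output } pairs
];

// Renders an "Examples" section in the format shown above,
// with a lone "#" line dividing the individual examples.
const formatExamples = (pairs) =>
  "### Examples ###\n" +
  pairs
    .map(({ question, output }) => `**Question:** ${question}\n**Output:**\n${output}`)
    .join("\n#\n");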
Another trick I learned, especially when it came to wording things concisely and appropriately, was to lean on something like ChatGPT when I was really stuck. Since ChatGPT uses OpenAI models, I would sometimes ask something along the lines of “write me a system prompt to do …”. In a way, I see this as asking the AI how best to talk to itself! So far, it’s been working.
This is what hurt.
One thing that actually hurt the accuracy of the answers to my questions was really long prompt instructions. When asking the AI model to do something, at first I thought more was better.
I’ve come to realize that these models are actually pretty good at inferring things from a limited number of examples. This was especially true when giving the agent a question I’d commonly ask along with the expected SQL query as the response. Loading the prompt up with too many of these pairs would sometimes dilute the meaning.
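A simple way to act on that, sketched below, is to cap how many question/query pairs get rendered into the prompt, reusing the formatExamples helper from the earlier sketch. The cap of three and the names here are purely illustrative, not tuned values from my project.

// Keep the prompt lean: include only a handful of examples instead of every pair on hand.
const MAX_EXAMPLES = 3; // purely illustrative cap, not a tuned value

const buildExamplesSection = (allPairs) =>
  formatExamples(allPairs.slice(0, MAX_EXAMPLES));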
Another thing to note is that the instructions themselves shouldn’t be too verbose, either. If they’re too complicated and filled with unnecessary verbiage, the agent can get confused. For me, getting around this hurdle meant making things really concise while still retaining the meaning of what I wanted the agent to do.
Better AI Prompts
I hope these insights help you on your next project! If there’s anything you’ve personally found useful, feel free to leave a comment below. I’d love to hear what has worked for others. At the beginning of this project, it was a little difficult to find resources, since everyone combines AI with their own code in such a custom way. I hope this article helps bridge that gap.