Version 1: Summarize the main idea of the article in 1-5 sentences.
Version 2: If this article was very short (1-5 sentences), how would you summarize its main idea?
Version 3: Create a short summary of each paragraph in the article. Connect the summaries to create a summary of the entire article. Do not exceed 5 sentences.
...
A = Enumerate 10 things humans do better than machines.
B = List 10 things humans do better than machines.
Testing and scoring prompts with promptfoo
npm install -g promptfoo
promptfoo init
Your first prompt goes here
---
Next prompt goes here. You can substitute variables like this: {{var1}} {{var2}} {{var3}}
---
This is the next prompt. These prompts are Nunjucks templates, so you can use logic like this:
{% if var1 %}{{ var1 }}{% endif %}
---
[{"role": "system", "content": "This is another prompt. JSON is supported."},{"role": "user", "content": "Using this format, you may construct multi-shot OpenAI prompts"}{"role": "user", "content": "Variable substitution still works: {{ var3 }}"}]
---
If you prefer, you can break prompts into multiple files (make sure to edit promptfooconfig.yaml accordingly)
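To make the prompt-file format above concrete, here is a minimal Python sketch of a loader that splits on `---` and recognizes JSON chat prompts. `load_prompts` is a made-up helper for illustration, not promptfoo's actual implementation:

```python
import json

def load_prompts(text: str):
    """Split a promptfoo-style prompt file on '---' separators.

    Each chunk is either a plain-text prompt or, if it parses as JSON,
    a list of chat messages (simplified sketch of the format above).
    """
    prompts = []
    for chunk in text.split("\n---\n"):
        chunk = chunk.strip()
        if not chunk:
            continue
        try:
            prompts.append(json.loads(chunk))  # JSON chat-message prompt
        except json.JSONDecodeError:
            prompts.append(chunk)              # plain-text prompt
    return prompts

sample = 'First prompt with {{var1}}\n---\n[{"role": "user", "content": "hi"}]'
parsed = load_prompts(sample)
print(len(parsed))       # 2
print(type(parsed[1]))   # <class 'list'>
```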
# This configuration runs each prompt through a series of example inputs and checks if they meet requirements.
prompts: [prompts.txt]
providers: [openai:gpt-3.5-turbo-0613]
tests:
  - description: First test case - automatic review
    vars:
      var1: first variable's value
      var2: another value
      var3: some other value
    assert:
      - type: equals
        value: expected LLM output goes here
      - type: contains
        value: some text
      - type: javascript
        value: 1 / (output.length + 1) # prefer shorter outputs

  - description: Second test case - manual review
    # Test cases don't need assertions if you prefer to manually review the output
    vars:
      var1: new value
      var2: another value
      var3: third value

  - description: Third test case - other types of automatic review
    vars:
      var1: yet another value
      var2: and another
      var3: dear llm, please output your response in json format
    assert:
      - type: contains-json
      - type: similar
        value: ensures that output is semantically similar to this text
      - type: llm-rubric
        value: ensure that output contains a reference to X
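To build intuition for what these assertion types check, here is a rough Python sketch of equivalent logic. The function names are invented for illustration; promptfoo's real grading is more involved:

```python
def check_equals(output: str, value: str) -> bool:
    # `type: equals` - exact match against the expected output
    return output == value

def check_contains(output: str, value: str) -> bool:
    # `type: contains` - substring match
    return value in output

def score_javascript(output: str) -> float:
    # Mirrors the `javascript` assertion above: 1 / (output.length + 1),
    # so shorter outputs earn a higher score.
    return 1 / (len(output) + 1)

print(check_contains("some text appears here", "some text"))  # True
print(score_javascript("ab") > score_javascript("abcd"))      # True
```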
export OPENAI_API_KEY=<your-api-key>
npx promptfoo eval
npx promptfoo view
Server listening at http://localhost:15500
Do you want to open the browser to the URL? (y/N):
Press Ctrl+C to stop the server
promptfoo: using variables
You are an {{ role }}. Write a short ad for the following product: Online course for learning how to write, test and deploy prompts.
cat << EOF > prompts.txt
When I was 10 years old, my sister was half my age. Now I am 30 years old. How old is my sister?
---
user: When Harry was 4 years old, his sister was half of his age. Harry is now 50 years old. How old is his sister?
ai: When Harry was 4 years old, his sister was 4/2 years old. This means there are 2 years difference between them. Harry is now 50 years old. His sister is 50 - 2 = 48 years old.
user: When I was 10 years old, my sister was half my age. Now I am 30 years old. How old is my sister?
ai:
EOF
cat << EOF > promptfooconfig.yaml
prompts: [prompts.txt]
providers: [openai:completion:text-davinci-002]
tests:
  - description: Test 1
    assert:
      - type: contains
        value: 25 years old
EOF
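The `contains` assertion expects "25 years old": at age 10 the sister was half that (5), a constant 5-year gap, so at 30 she is 25. The arithmetic as a quick check:

```python
my_age_then, my_age_now = 10, 30
sister_then = my_age_then // 2    # half my age at the time: 5
gap = my_age_then - sister_then   # constant 5-year age difference
sister_now = my_age_now - gap
print(sister_now)  # 25
```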
promptfoo integration with LangChain
cat << EOF > lc.py
import sys
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)

# This is the template for the prompt
template = """You are a helpful assistant that translates from {from_lang} to {to_lang}.
Your output should be in JSON format.
Examples:
user: translate(Hello, en, es)
ai: {{ "sentence": "Hello", "translation": "Hola", "from_lang": "en", "to_lang": "es" }}
user: translate(Would you like to play a game?, en, es)
ai: {{ "sentence": "Would you like to play a game?", "translation": "¿Te gustaría jugar un juego?", "from_lang": "en", "to_lang": "es" }}
A user will pass in the sentence to translate, and your output should ONLY return the translation in the JSON format above, and nothing more."""
# System message prompt template. This message is not shown to the user.
system_message_prompt = SystemMessagePromptTemplate.from_template(template)

# The text template that the user will use to send a message to the system.
human_template = "translate({sentence}, {from_lang}, {to_lang})"
# Human message prompt template.
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)

# Chat prompt template = system message prompt + human message prompt
chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

# Import the ChatOpenAI class and the LLMChain class
from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain

# Create a new chain
chain = LLMChain(
    llm=ChatOpenAI(),
    prompt=chat_prompt,
)

# Read and parse the user input
def read_parse_user_input(input):
    # parse the input
    input = input.replace("translate(", "")
    input = input.replace(")", "")
    input = input.split(",")
    sentence = input[0].strip()
    from_lang = input[1].strip()
    to_lang = input[2].strip()
    return sentence, from_lang, to_lang

# Get the user input
user_input = sys.argv[1]
sentence, from_lang, to_lang = read_parse_user_input(user_input)

# Run the chain and print the output
output = chain.run(sentence=sentence, from_lang=from_lang, to_lang=to_lang)
print(output)
EOF
python lc.py "translate(Hello there, en, tr)"
python lc.py "translate(What is the weather like today?, en, lt)"
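Since lc.py is supposed to emit only JSON, a caller can validate its output before using it. A minimal sketch; `parse_translation` is a made-up helper, and the expected keys follow the prompt template above:

```python
import json

def parse_translation(raw: str) -> dict:
    """Parse lc.py's stdout and verify the keys the prompt promises."""
    data = json.loads(raw)
    for key in ("sentence", "translation", "from_lang", "to_lang"):
        if key not in data:
            raise ValueError(f"missing key: {key}")
    return data

# Example stdout from lc.py, matching the few-shot examples in the prompt:
raw = '{"sentence": "Hello", "translation": "Hola", "from_lang": "en", "to_lang": "es"}'
result = parse_translation(raw)
print(result["translation"])  # Hola
```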
assertionTemplates:
  isJson:
    type: is-json
  containsTranslation:
    type: contains
    value: "\"translation\""
  containsFromLang:
    type: contains
    value: "\"from_lang\""
  containsToLang:
    type: contains
    value: "\"to_lang\""
  containsSentence:
    type: contains
    value: "\"sentence\""
prompts: prompts.txt
providers: exec:python lc.py
tests:
  - vars:
      fn: translate(Hi, en, dk)
    assert:
      - $ref: "#/assertionTemplates/isJson"
      - $ref: "#/assertionTemplates/containsTranslation"
      - $ref: "#/assertionTemplates/containsFromLang"
      - $ref: "#/assertionTemplates/containsToLang"
      - $ref: "#/assertionTemplates/containsSentence"
  - vars:
      fn: translate(Hi, en, ar)
    assert:
      - $ref: "#/assertionTemplates/isJson"
      - $ref: "#/assertionTemplates/containsTranslation"
      - $ref: "#/assertionTemplates/containsFromLang"
      - $ref: "#/assertionTemplates/containsToLang"
      - $ref: "#/assertionTemplates/containsSentence"
# ... repeat the same structure for other languages like es, fr, de, and it.
promptfoo scenarios and streamlining the test
cat << EOF > prompts.txt
[{"role": "system", "content": "Provide a weather-related recommendation based on the user's input. context: the user is going for a hike"},{"role": "user", "content": "Sunny"},{"role": "assistant", "content": "It's a great day for outdoor activities! Temperature: 25°C, Humidity: 40%"}]
---
[{"role": "system", "content": "Provide a weather-related recommendation based on the user's input. context: the user is going for a hike"},{"role": "user", "content": "Rainy"},{"role": "assistant", "content": "Don't forget your umbrella! Temperature: 18°C, Humidity: 90%"}]
---
[{"role": "system", "content": "Provide a weather-related recommendation based on the user's input. context: the user is going for a hike"},{"role": "user", "content": "Cloudy"},{"role": "assistant", "content": "You might need a jacket today. Temperature: 20°C, Humidity: 60%"}]EOF
cat << EOF > promptfooconfig.yaml
prompts: [prompts.txt]
providers: [openai:gpt-3.5-turbo-0613]
scenarios:
  - config:
      - vars:
          weatherCondition: Sunny
          expectedAdvice: "If you are going for a hike, bring sunscreen."
          expectedTemperature: "25°C"
          expectedHumidity: "40%"
      - vars:
          weatherCondition: Rainy
          expectedAdvice: "If you are going for a hike, bring an umbrella."
          expectedTemperature: "18°C"
          expectedHumidity: "90%"
      - vars:
          weatherCondition: Cloudy
          expectedAdvice: "If you are going for a hike, bring a jacket."
          expectedTemperature: "20°C"
          expectedHumidity: "60%"
    tests:
      - description: Forecast Advice based on Weather Condition
        vars:
          input: '{{weatherCondition}}'
        assert:
          - type: similar
            value: '{{expectedAdvice}}'
            threshold: 0.8
      - description: Forecast Temperature based on Weather Condition
        vars:
          input: '{{weatherCondition}}'
        assert:
          - type: similar
            value: '{{expectedTemperature}}'
            threshold: 0.8
      - description: Forecast Humidity based on Weather Condition
        vars:
          input: '{{weatherCondition}}'
        assert:
          - type: similar
            value: '{{expectedHumidity}}'
            threshold: 0.8
EOF
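The `similar` assertions pass when the output is semantically close enough to the expected value, measured as cosine similarity between embedding vectors. A sketch of that threshold check using toy vectors in place of real embeddings (in practice, promptfoo computes this over embeddings from an embedding model):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def similar_assertion(output_vec, expected_vec, threshold=0.8):
    # Passes when cosine similarity clears the threshold,
    # mirroring `type: similar` with `threshold: 0.8` above.
    return cosine_similarity(output_vec, expected_vec) >= threshold

# Toy vectors standing in for real embeddings:
print(similar_assertion([1.0, 0.0, 1.0], [0.9, 0.1, 1.0]))  # True
print(similar_assertion([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # False
```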