promptfooconfig.yaml 2.9 KB

# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
description: Lambda Labs
# To use this example:
# 1. Set your LAMBDA_API_KEY environment variable
# 2. Run: promptfoo eval

prompts:
  - |
    You are a creative AI assistant who loves making absurd but somewhat plausible estimations.
    Please answer the following ridiculous question with step-by-step reasoning:
    {{question}}
    Make your answer entertaining but try to use reasonable assumptions and calculations.
    Your response should be at least 100 words and state the final answer at the end.

providers:
  - id: lambdalabs:chat:llama-4-maverick-17b-128e-instruct-fp8
    config:
      temperature: 0.7
      max_tokens: 2048
  - id: lambdalabs:chat:llama3.3-70b-instruct-fp8
    config:
      temperature: 0.7
      max_tokens: 2048

defaultTest:
  options:
    # Use Lambda Labs' Llama 4 model for grading too!
    provider:
      id: lambdalabs:chat:llama-4-maverick-17b-128e-instruct-fp8
      config:
        temperature: 0.2 # Lower temperature for more consistent grading
        max_tokens: 2048
  assert:
    - type: javascript
      value: output.length > 100
    - type: llm-rubric
      value: Does this response attempt to answer the question with step-by-step reasoning and plausible calculations?

tests:
  - vars:
      question: How many people would it take typing simultaneously to recreate Wikipedia from scratch within one hour?
  - vars:
      question: What's the total length of toothpaste squeezed out worldwide each morning? Could it circle Earth?
  - vars:
      question: If everyone jumped at exactly the same time, would global internet latency noticeably spike due to social media posting?
  - vars:
      question: How long would it take to fill the Grand Canyon if everyone in the U.S. poured a cup of coffee into it every day?
  - vars:
      question: If all humans suddenly started walking west at once, how long would it take Earth's rotation to noticeably slow down?
  - vars:
      question: How tall would a pile of all receipts printed globally each year be? Would it reach the moon?
  - vars:
      question: How many miles of spaghetti does humanity eat per second, and how quickly could that spaghetti stretch from New York to Los Angeles?
  - vars:
      question: Could every microwave on Earth simultaneously popping popcorn affect weather patterns?
  - vars:
      question: If you could somehow assemble all lost keys in history into a single pile, how high would the mountain of keys rise?
  - vars:
      question: If all the world's smartphones were stacked on top of each other, how many times could they reach the International Space Station?
  - vars:
      question: How many balloons would it take to lift an average-sized house like in the movie "Up", and how much helium would that require?
  - vars:
      question: If all the data stored in cloud servers was printed on paper, would the resulting paper stack reach Mars?
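
# Note on the length check in defaultTest: `output.length > 100` counts
# characters, while the prompt asks for at least 100 words. The commented
# sketch below is a hypothetical variant (not part of the original example)
# that counts whitespace-separated words instead. It assumes `output` is a
# plain string, as it is for chat completions. To try it, swap it in for the
# existing javascript assertion under defaultTest.assert:
#
#   - type: javascript
#     value: output.trim().split(/\s+/).length >= 100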