promptfooconfig.reasoning.yaml 2.7 KB

# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
description: Advanced reasoning capabilities testing
prompts:
  - '{{prompt}}'
providers:
  - id: hyperbolic:deepseek-ai/DeepSeek-V3
    label: DeepSeek V3
    config:
      temperature: 0.1
      max_tokens: 2048
  - id: hyperbolic:meta-llama/Meta-Llama-3.1-70B-Instruct
    label: Llama 3.1-70B
    config:
      temperature: 0.3
      max_tokens: 1024
tests:
  # Classic puzzle with a twist
  - vars:
      prompt: 'Solve this step by step: A time-traveling farmer needs to transport a robot fox, a cyborg chicken, and quantum grain across a dimensional rift. The portal can only carry the farmer and one item. If left alone, the robot fox hacks the cyborg chicken, and the cyborg chicken deletes the quantum grain. How can the farmer transport all three safely across dimensions?'
    assert:
      - type: contains-all
        value: ['farmer', 'chicken', 'fox', 'grain']
      - type: llm-rubric
        value: 'The solution correctly solves the river crossing puzzle with proper step-by-step reasoning'
  # Tricky weighing problem
  - vars:
      prompt: 'Solve this step by step: You have 8 identical magic orbs, but one contains a tiny dragon that makes it slightly heavier. Using an enchanted balance scale only twice, how do you find the orb with the dragon?'
    assert:
      - type: llm-rubric
        value: 'The solution correctly identifies how to find the heavier orb in exactly 2 weighings with clear logic'
  # Mathematical brain teaser
  - vars:
      prompt: 'Explain this using mathematical proof: A mathematician walks into a café and claims that 0.999... (repeating) equals exactly 1. The barista is skeptical. How would you convince the barista using mathematical proof?'
    assert:
      - type: llm-rubric
        value: 'The explanation provides a clear and convincing mathematical proof that 0.999... = 1'
  # Word problem with a catch
  - vars:
      prompt: 'Solve this step by step: At a magical sports shop, an enchanted bat and a golden ball cost $1.10 total. The bat costs exactly $1 more than the ball. What does each item cost? (Warning: your first instinct might be wrong!)'
    assert:
      - type: contains
        value: '$0.05'
      - type: contains
        value: '$1.05'
  # Programming puzzle
  - vars:
      prompt: 'Write a recursive function to calculate the nth Fibonacci number, then explain why it would make a computer cry and how to make it happy again with optimization.'
    assert:
      - type: contains
        value: 'def'
      - type: contains-any
        value: ['O(2^n)', 'exponential']
      - type: contains-any
        value: ['memoization', 'dynamic programming', 'iterative']
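  # Illustrative sketch, not part of the original file: the `contains` checks in
  # the bat-and-ball test match exact, case-sensitive substrings, so a correct
  # answer phrased as "5 cents" would fail them. Assuming promptfoo's `regex`
  # assertion type, a more tolerant variant of that test might look like the
  # entry below. The patterns are assumptions about likely answer phrasings.
  - vars:
      prompt: 'Solve this step by step: At a magical sports shop, an enchanted bat and a golden ball cost $1.10 total. The bat costs exactly $1 more than the ball. What does each item cost? (Warning: your first instinct might be wrong!)'
    assert:
      # Accept "$0.05", "$.05", or "5 cents" for the ball
      - type: regex
        value: '(\$0?\.05|5 cents)'
      # Accept "$1.05" for the bat
      - type: regex
        value: '\$1\.05'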