promptfooconfig.yaml 2.9 KB

# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
description: OpenAI MCP tool integration

prompts:
  - 'What are the transport protocols supported in the MCP specification for {{repo}}?'
  - 'Can you search for information about {{topic}} in the {{repo}} repository and summarize the key points?'
  - 'What are the main features of {{repo}}? Please provide a detailed overview.'

providers:
  - id: openai:responses:gpt-4.1-2025-04-14
    config:
      tools:
        - type: mcp
          server_label: deepwiki
          server_url: https://mcp.deepwiki.com/mcp
          require_approval: never
          allowed_tools: ['ask_question', 'read_wiki_structure']
      max_output_tokens: 1500
      temperature: 0.3
      instructions: 'You are a helpful research assistant. Use the available MCP tools to search for accurate information about repositories and provide comprehensive answers.'

tests:
  - vars:
      repo: modelcontextprotocol/modelcontextprotocol
      topic: transport protocols
    assert:
      # Validate MCP tool execution was successful
      - type: is-valid-openai-tools-call
        weight: 0.3
      # Check for specific content in the response
      - type: contains
        value: 'transport'
        weight: 0.2
      - type: contains
        value: 'protocol'
        weight: 0.2
      # Ensure MCP tool was actually used (check for tool result)
      - type: contains
        value: 'MCP Tool Result'
        weight: 0.1
      # Validate the quality of the response
      - type: llm-rubric
        value: 'The response mentions transport protocols or MCP specification details'
        weight: 0.2

  - vars:
      repo: facebook/react
      topic: hooks
    assert:
      # Comprehensive MCP validation
      - type: is-valid-openai-tools-call
      - type: contains
        value: 'React'
      # Verify MCP integration worked
      - type: contains
        value: 'MCP Tool Result'
      - type: llm-rubric
        value: 'The response explains React functionality or features'

  - vars:
      repo: microsoft/typescript
      topic: type system
    assert:
      # Test both success and content validation
      - type: is-valid-openai-tools-call
      - type: contains-any
        value: ['TypeScript', 'type']
      # Ensure no MCP errors occurred
      - type: not-contains
        value: 'MCP Tool Error'
      - type: llm-rubric
        value: 'The response describes TypeScript features or type system'

  - vars:
      repo: openai/openai-python
      topic: API client
    assert:
      # Multi-layered validation approach
      - type: is-valid-openai-tools-call
        metric: mcp_tool_success
      - type: contains-any
        value: ['API', 'client', 'Python']
      # Check that MCP tools were discovered and used
      - type: contains
        value: 'MCP Tool Result'
        metric: mcp_tool_used
        weight: 0
      - type: llm-rubric
        value: 'The response describes the OpenAI Python client library or API features'
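
# Sketch only, not part of the original config: every test above repeats the
# is-valid-openai-tools-call assertion. promptfoo's defaultTest key can apply
# shared assertions to all tests, so the common check could be declared once.
# This assumes the per-test weight (0.3) and metric (mcp_tool_success) on that
# assertion are not needed; hoisting it as shown would drop them. The block is
# left commented out so the behavior of this file is unchanged.
#
# defaultTest:
#   assert:
#     - type: is-valid-openai-tools-call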