You have to be logged in to leave a comment.

sidebar_label
vllm

vllm

vllm's OpenAI-compatible server offers access to many supported models for local inference from Huggingface Transformers.

In order to use vllm in your eval, set the apiBaseUrl variable to http://localhost:8080 (or wherever you're hosting vllm).

Here's an example config that uses Mixtral-8x7b for text completions:

providers:
  - id: openai:completion:mistralai/Mixtral-8x7B-v0.1
    config:
      apiBaseUrl: http://localhost:8080/v1

If desired, you can instead use the OPENAI_BASE_URL environment variable instead of the apiBaseUrl config.

Tip!

Press p or to see the previous file or, n or to see the next file

Specify your S3 bucket

Bucket name cannot be the same as the repository name. Please change one of them.

Bucket url and prefix

Region

Endpoint Url

Disable SSL verification

vllm.md 757 B

Permalink History Raw

vllm

Comments

Use AWS S3 as storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use Google Cloud Storage!

Specify your Google Storage bucket

Service Account Key

Congratulations!

Use Azure Cloud Storage!

Specify your Azure Storage bucket

Access key (If needed)

Congratulations!

Use any S3 compatible storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

nirbarazida / promptfoo mirror of https://github.com/promptfoo/promptfoo

vllm.md 757 B Permalink History Raw

vllm

Comments

Use AWS S3 as storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use Google Cloud Storage!

Specify your Google Storage bucket

Service Account Key

Congratulations!

Use Azure Cloud Storage!

Specify your Azure Storage bucket

Access key (If needed)

Congratulations!

Use any S3 compatible storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

nirbarazida
/
promptfoo
mirror of https://github.com/promptfoo/promptfoo

vllm.md 757 B

Permalink History Raw