---
sidebar_label: IBM BAM
---

# IBM BAM

The `bam` provider integrates with IBM's BAM API, allowing access to models such as `meta-llama/llama-2-70b-chat` and `ibm/granite-13b-chat-v2`.

## Setup

This provider requires you to install the IBM SDK:

```sh
npm install @ibm-generative-ai/node-sdk
```

## Configuration

Configure the BAM provider by specifying the model and generation parameters. For example, in your configuration file:

```yaml
providers:
  - id: bam:chat:meta-llama/llama-2-70b-chat
    config:
      temperature: 0.01
      max_new_tokens: 1024
      prompt:
        prefix: '[INST] '
        suffix: '[/INST] '
  - id: bam:chat:ibm/granite-13b-chat-v2
    config:
      temperature: 0.01
      max_new_tokens: 1024
      prompt:
        prefix: '[INST] '
        suffix: '[/INST] '
```

## Authentication

To use the BAM provider, set the `BAM_API_KEY` environment variable or specify `apiKey` directly in the provider configuration. The key can also be read from a custom environment variable named in the `apiKeyEnvar` field of the configuration.

```sh
export BAM_API_KEY='your-bam-api-key'
```
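Alternatively, the key can be supplied in the provider config itself. A minimal sketch, where the inline key and the `MY_BAM_API_KEY` variable name are placeholders:

```yaml
providers:
  - id: bam:chat:ibm/granite-13b-chat-v2
    config:
      # Option 1: inline key (avoid committing real keys to source control)
      apiKey: your-bam-api-key
  - id: bam:chat:meta-llama/llama-2-70b-chat
    config:
      # Option 2: read the key from a custom environment variable
      apiKeyEnvar: MY_BAM_API_KEY
```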

## API Client Initialization

The BAM provider initializes an API client using the IBM Generative AI Node SDK, with the endpoint set to `https://bam-api.res.ibm.com/`.

## Generation Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `top_k` | number | Controls diversity via random sampling; lower values make sampling more deterministic. |
| `top_p` | number | Nucleus sampling; higher values cause the model to consider more candidates. |
| `typical_p` | number | Controls the "typicality" during sampling, balancing between `top_k` and `top_p`. |
| `beam_width` | number | Sets the beam width for beam search decoding, controlling the breadth of the search. |
| `time_limit` | number | Maximum time in milliseconds the model should take to generate a response. |
| `random_seed` | number | Seed for the random number generator, ensuring reproducibility of the output. |
| `temperature` | number | Controls randomness; lower values make the model more deterministic. |
| `length_penalty` | object | Adjusts the length of the generated output. Includes `start_index` and `decay_factor`. |
| `max_new_tokens` | number | Maximum number of new tokens to generate. |
| `min_new_tokens` | number | Minimum number of new tokens to generate. |
| `return_options` | object | Options for additional information to return with the output, such as token probabilities. |
| `stop_sequences` | string[] | Strings that, if generated, will stop the generation. |
| `decoding_method` | string | Specifies the decoding method, e.g. `'greedy'` or `'sample'`. |
| `repetition_penalty` | number | Penalty applied to discourage repetition in the output. |
| `include_stop_sequence` | boolean | Whether to include stop sequences in the output. |
| `truncate_input_tokens` | number | Maximum number of tokens to consider from the input text. |
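For instance, a sampling-based configuration might combine several of these parameters. The values below are illustrative, not recommendations:

```yaml
providers:
  - id: bam:chat:ibm/granite-13b-chat-v2
    config:
      decoding_method: sample
      temperature: 0.7
      top_k: 50
      top_p: 0.95
      max_new_tokens: 512
      min_new_tokens: 1
      repetition_penalty: 1.1
      random_seed: 42
      stop_sequences:
        - '</s>'
```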

## Moderation Parameters

Moderation settings can also be specified to manage content safety and compliance:

| Parameter | Type | Description |
| --- | --- | --- |
| `hap` | object | Settings for handling hate speech. Can be enabled/disabled and configured with thresholds. |
| `stigma` | object | Settings for handling stigmatizing content. Configured the same way as `hap`. |
| `implicit_hate` | object | Settings for managing implicitly hateful content. |

Each moderation parameter can include the following sub-parameters to customize the moderation behavior: `input`, `output`, `threshold`, and `send_tokens`.

Here's an example:

```yaml
providers:
  - id: bam:chat:ibm/granite-13b-chat-v2
    config:
      moderations:
        hap:
          input: true
          output: true
          threshold: 0.9
```
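The same structure extends to the other categories. A sketch that also sets `send_tokens` and enables `stigma`, assuming both follow the shape described above:

```yaml
providers:
  - id: bam:chat:ibm/granite-13b-chat-v2
    config:
      moderations:
        hap:
          input: true
          output: true
          threshold: 0.9
          send_tokens: true
        stigma:
          input: true
          output: true
          threshold: 0.8
```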