📚 API Usage Examples

⚠️ Security Notice: The API keys shown in these examples are masked. Replace *************************** with your actual API key before running them.
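
One way to keep a real key out of source files is to read it from the environment. This is a minimal sketch, not something the service prescribes: the variable name `MAAS_API_KEY` and the helpers `load_api_key` and `auth_headers` are hypothetical names chosen for illustration.

```python
import os


def load_api_key(env_var: str = "MAAS_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    MAAS_API_KEY is a hypothetical name chosen for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    # Reject a missing key or a masked placeholder made only of asterisks.
    if not key or set(key) == {"*"}:
        raise RuntimeError(f"Set {env_var} to your actual API key")
    return key


def auth_headers(api_key: str) -> dict:
    """Build the Authorization header used by the examples on this page."""
    return {"Authorization": "Bearer " + api_key}
```

With the variable exported in your shell, `auth_headers(load_api_key())` could be passed as the `headers` argument of the `requests.post` call in the raw Python example.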

💬 Text Generation (Mistral, Granite,...)

🔧 Using cURL

curl -X 'POST' \
    'https://granite-3-3-8b-instruct-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443/v1/completions' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer ***************************' \
    -d '{
    "model": "granite-3-3-8b-instruct",
    "prompt": "San Francisco is a",
    "max_tokens": 15,
    "temperature": 0
}'

🐍 Using raw Python

import requests

API_URL = "https://granite-3-3-8b-instruct-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443"
API_KEY = "***************************"  # Replace with your actual API key

completion = requests.post(
    url=API_URL+'/v1/completions',
    json={
      "model": "granite-3-3-8b-instruct",
      "prompt": "San Francisco is a",
      "max_tokens": 15,
      "temperature": 0
    },
    headers={'Authorization': 'Bearer '+API_KEY}
).json()

print(completion)
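
To pull only the generated text out of the response, something like the following may work. It assumes the endpoint returns an OpenAI-style completions payload (a `choices` list whose items carry a `text` field), which the example above does not confirm. `extract_text` is a hypothetical helper, and the sample payload is illustrative, not real model output.

```python
def extract_text(completion: dict) -> str:
    """Return the first generated text from a completions response.

    Assumes an OpenAI-style payload: {"choices": [{"text": "..."}], ...}.
    """
    # Error payloads usually lack "choices"; surface them instead of crashing later.
    choices = completion.get("choices")
    if not choices:
        raise ValueError(f"Unexpected response: {completion}")
    return choices[0].get("text", "")


# Illustrative payload, not real model output.
sample = {"choices": [{"text": " example continuation"}]}
print(extract_text(sample))
```

In the raw Python example, `extract_text(completion)` would replace the final `print(completion)` argument if the assumption about the payload shape holds.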

🔗 Using LangChain

📦 Prerequisites: pip install langchain==0.3.25 langchain-openai==0.3.22

import json

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompts.chat import SystemMessagePromptTemplate, HumanMessagePromptTemplate
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# Chat model served through the OpenAI-compatible /v1 endpoint
llm = ChatOpenAI(
    openai_api_key="*************************",   # Replace with your actual API key
    openai_api_base="https://granite-3-3-8b-instruct-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443/v1",
    model_name="granite-3-3-8b-instruct",
    temperature=0.01,
    max_tokens=512,
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],  # Print tokens to stdout as they arrive
    top_p=0.9,
    presence_penalty=0.5,
    model_kwargs={
        "stream_options": {"include_usage": True}  # Request token usage in the stream
    }
)

template = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(
        """You are a helpful, respectful, and honest assistant.
        Answer each question clearly and concisely in a single response only.
        Do not continue the conversation or simulate dialogue unless explicitly asked.
        Never include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.
        Ensure that your responses are socially unbiased and positive in nature.
        If a question does not make sense or is not factually coherent, explain why instead of trying to answer.
        If you don't know the answer to a question, say "I don't know".
        """),
    HumanMessagePromptTemplate.from_template("{input}"),
])

query = "What is Artificial Intelligence?"
prompt = template.invoke({"input": query})
response = llm.invoke(input=prompt)
print()  # Newline after the streamed answer
print(json.dumps(response.usage_metadata, indent=2))

💻 Connecting Continue.dev to Granite-Code-Instruct

Configuration in .continue/config.json

{
  ...
  "models": [
    {
      "title": "Granite-8B-Instruct",
      "provider": "openai",
      "model": "granite-3-3-8b-instruct",
      "apiBase": "https://granite-3-3-8b-instruct-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443/v1/",
      "apiKey": "************************",
      "completionOptions": {
        "temperature": 0.1,
        "topK": 1,
        "topP": 1,
        "presencePenalty": 0,
        "frequencyPenalty": 0
      }
    }
  ],
  ...
  "tabAutocompleteModel": {
    "title": "Granite-8B-Instruct",
    "provider": "openai",
    "model": "granite-3-3-8b-instruct",
    "apiBase": "https://granite-3-3-8b-instruct-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443/v1/",
    "apiKey": "****************************",
    "completionOptions": {
      "temperature": 0.1,
      "topK": 1,
      "topP": 1,
      "presencePenalty": 0,
      "frequencyPenalty": 0
    }
  },
  "tabAutocompleteOptions": {
    "useCopyBuffer": false,
    "maxPromptTokens": 1024,
    "prefixPercentage": 0.5
  },
  ...
}

🔤 Embeddings (Granite Embedding, Nomic-Embed-Text,...)