Last updated October 23, 2025
LiteLLM is an open-source Python library that acts as a unified interface to call large language models (LLMs). You can use LiteLLM to simplify model access, spend tracking, and fallback management across various LLMs. This integration enables you to use AI models deployed on Heroku’s infrastructure with LiteLLM.
Setup and Configuration
To use Heroku with LiteLLM:
1. Create an app in Heroku:

   ```shell
   heroku create example-app
   ```

2. Create and attach a chat model to your app:

   ```shell
   heroku ai:models:create -a example-app claude-3-5-haiku
   ```

3. Export configuration variables to save them in your current environment:

   ```shell
   export INFERENCE_KEY=$(heroku config:get INFERENCE_KEY -a example-app)
   export INFERENCE_MODEL_ID=$(heroku config:get INFERENCE_MODEL_ID -a example-app)
   export INFERENCE_URL=$(heroku config:get INFERENCE_URL -a example-app)
   ```

   `INFERENCE_KEY` and `INFERENCE_URL` are required to make calls to your model.
To learn more about Heroku config variables, see Model Resource Config Vars.
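Because both config vars are required, it can help to fail fast before making any LiteLLM calls. A minimal sketch (the helper name `require_heroku_config` is our own, not part of LiteLLM or the Heroku CLI):

```python
import os

def require_heroku_config():
    """Return (key, url) from the environment, raising if either is unset."""
    missing = [name for name in ("INFERENCE_KEY", "INFERENCE_URL")
               if not os.environ.get(name)]
    if missing:
        raise RuntimeError("Missing Heroku config vars: " + ", ".join(missing))
    return os.environ["INFERENCE_KEY"], os.environ["INFERENCE_URL"]
```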
Using the Integration
Available Models
The Heroku provider in LiteLLM supports these chat models:
| Model | Region |
|---|---|
| `heroku/claude-4-sonnet` | US, EU |
| `heroku/claude-3-7-sonnet` | US, EU |
| `heroku/claude-3-5-sonnet-latest` | US |
| `heroku/claude-3-5-haiku` | US |
| `heroku/claude-3-haiku` | EU |
The `heroku/` prefix in the model name tells LiteLLM to use Heroku as the model provider.
Using Config Variables
Heroku uses the following LiteLLM API config variables:
- `HEROKU_API_KEY`: Corresponds to LiteLLM's `api_key` param. Set this variable to the value of Heroku's `INFERENCE_KEY` config variable.
- `HEROKU_API_BASE`: Corresponds to LiteLLM's `api_base` param. Set this variable to the value of Heroku's `INFERENCE_URL` config variable.
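Assuming the `export` commands from Setup and Configuration ran, one way to bridge Heroku's config var names to the names LiteLLM reads is to copy them in code:

```python
import os

# Copy Heroku's config vars onto the names LiteLLM looks for.
# The fallbacks are placeholders so the snippet runs standalone;
# in a real app the INFERENCE_* vars come from `heroku config:get`.
os.environ["HEROKU_API_KEY"] = os.environ.get("INFERENCE_KEY", "fake-heroku-key")
os.environ["HEROKU_API_BASE"] = os.environ.get(
    "INFERENCE_URL", "https://us.inference.heroku.com"
)
```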
In this example, we don’t explicitly pass the `api_key` and `api_base` params. Instead, we set the environment variables, which LiteLLM reads automatically:
```python
import os
from litellm import completion

os.environ["HEROKU_API_BASE"] = "https://us.inference.heroku.com"
os.environ["HEROKU_API_KEY"] = "fake-heroku-key"

response = completion(
    model="heroku/claude-3-5-haiku",
    messages=[
        {"role": "user", "content": "write code for saying hey from LiteLLM"}
    ]
)
print(response)
```
Explicitly Setting api_key and api_base
```python
from litellm import completion

response = completion(
    model="heroku/claude-4-sonnet",
    api_key="fake-heroku-key",
    api_base="https://us.inference.heroku.com",
    messages=[
        {"role": "user", "content": "write code for saying hey from LiteLLM"}
    ],
)
print(response)
```