Managed Inference and Agents API /v1/agents/heroku

Last updated May 15, 2025

Table of Contents

  • Request Body Parameters
  • tools Array of Objects
  • messages Array of Objects
  • Request Headers
  • Response Format
  • Example Request
  • Example Response

The /v1/agents/heroku endpoint lets you interact with an agentic system powered by large language models (LLMs) that can autonomously invoke tools based on your messages. Unlike /v1/chat/completions, which generates a single model response, the /v1/agents/heroku endpoint supports automatic tool execution and multistep workflows.

Request Body Parameters

Use the following parameters to manage the behavior of the agent and which tools it can use.

Required Parameters

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| model | string | model used for inference, typically the value of your INFERENCE_MODEL_ID config var | |
| messages | array | array of messages used by the agent to determine its response and next actions | [{"role": "user", "content": "Check my database schema."}] |
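
As a minimal sketch of the two required fields, the following Python builds the smallest request body the table describes. The helper name build_request_body is hypothetical, and rejecting an empty messages list is an assumption rather than documented behavior.

```python
import json
import os


def build_request_body(messages, model=None):
    """Return a minimal JSON-serializable body for /v1/agents/heroku."""
    # Fall back to the INFERENCE_MODEL_ID config var when no model is passed.
    model = model or os.environ.get("INFERENCE_MODEL_ID")
    if not model:
        raise ValueError("model is required")
    # Assumption: an empty messages array is treated as an error here.
    if not messages:
        raise ValueError("messages must contain at least one message")
    return {"model": model, "messages": list(messages)}


body = build_request_body(
    [{"role": "user", "content": "Check my database schema."}],
    model="claude-3-7-sonnet",
)
print(json.dumps(body, indent=2))
```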

Optional Parameters

| Field | Type | Description | Default | Example |
| --- | --- | --- | --- | --- |
| max_tokens_per_inference_request | integer | max number of tokens the model can generate during each underlying inference request before stopping (a single call to /v1/agents/heroku can include multiple underlying inference requests); max value: 4096 for Haiku models, 8192 for Sonnet models | varies | 1024 |
| stop | array | list of strings that stop the model from generating further tokens if any of the strings are in the response (for example, ["foo"] causes the model to stop generating output only if it generated the string "foo") | null | ["foo"] |
| temperature | float | controls randomness of the response: values closer to 0 make responses more focused by favoring high-probability tokens, while values closer to 1.0 encourage more diverse responses by sampling from a broader range of possibilities for each generated token; range: 0.0 to 1.0 | 1.0 | 0.2 |
| tools | array | list of tools the agent is allowed to use | null | see tools field in the example request |
| top_p | float | specifies the proportion of tokens to consider when generating the next token, in terms of cumulative probability; range: 0 to 1.0 | 0.999 | 0.95 |
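
The documented ranges can be checked on the client before a request is sent. The sketch below is illustrative only: validate_optional_params and max_tokens_ceiling are hypothetical helpers, the range endpoints are assumed to be inclusive, and detecting the model family by looking for "haiku" or "sonnet" in the model ID is an assumption.

```python
# Documented per-family ceilings for max_tokens_per_inference_request.
MAX_TOKENS_BY_FAMILY = {"haiku": 4096, "sonnet": 8192}


def max_tokens_ceiling(model):
    """Return the documented ceiling for the model's family, or None if unknown."""
    # Assumption: the family name appears somewhere in the model ID.
    for family, ceiling in MAX_TOKENS_BY_FAMILY.items():
        if family in model.lower():
            return ceiling
    return None


def validate_optional_params(params, model):
    """Raise ValueError if an optional parameter is outside its documented range."""
    for name in ("temperature", "top_p"):
        value = params.get(name)
        # Assumption: both endpoints of the range are allowed.
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    stop = params.get("stop")
    if stop is not None and not all(isinstance(s, str) for s in stop):
        raise ValueError("stop must be a list of strings")
    max_tokens = params.get("max_tokens_per_inference_request")
    ceiling = max_tokens_ceiling(model)
    if max_tokens is not None and ceiling is not None and max_tokens > ceiling:
        raise ValueError(f"max_tokens_per_inference_request exceeds {ceiling}")
    return params


options = validate_optional_params(
    {"temperature": 0.2, "top_p": 0.95, "stop": ["foo"],
     "max_tokens_per_inference_request": 1024},
    model="claude-3-7-sonnet",
)
```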

tools Array of Objects

Each tool in the array allows the agent to call an action on your behalf. Heroku automatically executes tool calls via one-off dynos. The /v1/agents/heroku endpoint currently supports two types of tools:

  • heroku_tool: first-party tools that Heroku Managed Inference and Agents natively supports
  • MCP tools: custom MCP tools you deploy to Heroku, which Heroku automatically runs when called by your model. To learn how to deploy your own custom MCP tools to Heroku, see Heroku MCP Tools.

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| type | enum<string> | type of tool; one of: heroku_tool, mcp | "heroku_tool" |
| name | string | name of tool (see Heroku Tools for available tools) | "code_exec_ruby" |
| description | string | (optional) hint text to inform the model when to use this tool | "Runs SQL query on a Heroku database" |
| runtime_params | object | configuration to control automatic execution of Heroku Tools and mcp tools (see runtime parameters) | |

Runtime Parameters

The /v1/agents/heroku endpoint passes certain settings to the specified mcp or heroku_tool tools at runtime. The model can’t modify these settings.

| Field | Type | Description | Default | Example |
| --- | --- | --- | --- | --- |
| target_app_name | string | (required) name of Heroku app to run the tool in | | "my-heroku-app" |
| dyno_size | string | dyno size to use when running the tool | "standard-1x" | "standard-1x" |
| ttl_seconds | integer | max seconds a dyno is allowed to run; max: 120 | 120 | 10 |
| max_calls | integer | max number of times this tool can be called during the agent loop | 3 | 1 |
| tool_params | object | additional parameters for tool (for example, cmd, db_attachment) (see tool-specific docs) | (varies) | {} |

  • Tools of type mcp optionally accept ttl_seconds, max_calls, and dyno_size. The other parameters aren’t supported.
  • Tools of type heroku_tool require or allow certain parameters depending on the tool itself. See tool-specific docs for more information.
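
The two tool types and their runtime parameters could be assembled as in the following sketch. The helper names heroku_tool and mcp_tool, the MCP tool name my_mcp_tool, and the lower bound on ttl_seconds are assumptions made for illustration; the tool_params values are copied from the example request.

```python
MCP_RUNTIME_KEYS = {"ttl_seconds", "max_calls", "dyno_size"}
MAX_TTL_SECONDS = 120  # documented maximum for ttl_seconds


def _check_ttl(runtime_params):
    ttl = runtime_params.get("ttl_seconds")
    # Assumption: ttl_seconds must be positive; only the max of 120 is documented.
    if ttl is not None and not 0 < ttl <= MAX_TTL_SECONDS:
        raise ValueError(f"ttl_seconds must be between 1 and {MAX_TTL_SECONDS}")


def heroku_tool(name, target_app_name, tool_params=None, **runtime_params):
    """Build a heroku_tool entry; target_app_name is required."""
    params = {"target_app_name": target_app_name, **runtime_params}
    if tool_params is not None:
        params["tool_params"] = tool_params
    _check_ttl(params)
    return {"type": "heroku_tool", "name": name, "runtime_params": params}


def mcp_tool(name, **runtime_params):
    """Build an mcp entry; only ttl_seconds, max_calls, and dyno_size are accepted."""
    unsupported = set(runtime_params) - MCP_RUNTIME_KEYS
    if unsupported:
        raise ValueError(f"unsupported for mcp tools: {sorted(unsupported)}")
    _check_ttl(runtime_params)
    return {"type": "mcp", "name": name, "runtime_params": dict(runtime_params)}


tools = [
    heroku_tool(
        "dyno_run_command",
        "my-heroku-app",
        tool_params={
            "cmd": "echo hello && date",
            "description": "Runs `echo hello && date` on one-off dyno.",
            "parameters": {"type": "object", "properties": {}, "required": []},
        },
        ttl_seconds=10,
        max_calls=1,
    ),
    mcp_tool("my_mcp_tool", max_calls=1),  # hypothetical MCP tool name
]
```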

messages Array of Objects

The messages field is an array of message objects.

Each message must specify a role field that determines the message’s schema. Currently, the supported roles are user, assistant, system, and tool.

If the most recent message uses the assistant role, the model continues its answer starting from the content in that message.

role=user message

user messages are the primary way to send queries to your model and prompt it to respond.

| Field | Type | Description | Required | Example |
| --- | --- | --- | --- | --- |
| role | string | role of message; always: "user" | yes | "user" |
| content | string | contents of user message | yes | "What is the weather?" |

role=assistant message

Typically, the model only generates assistant messages. However, you can create or prefill a partially completed assistant response to influence the content a model generates on its next turn.

| Field | Type | Description | Required | Example |
| --- | --- | --- | --- | --- |
| role | string | role of message; always: "assistant" | yes | "assistant" |
| content | string | contents of assistant message | yes, unless tool_calls is specified | "Here is the information" |
| refusal | string or null | refusal message by assistant | no | "I cannot answer that" |
| tool_calls | array | array of tool call request objects | no | [{"id": "tool_call_12345", "type": "function", "function": {"name": "my_cool_tool", "arguments": "{\"some_input\": 123}"}}] |

Tool Call Object

Represents the model’s request to execute a specific tool.

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| id | string | unique ID for the tool call | "tooluse_abc123" |
| type | string | type of call; always: "function" | "function" |
| function | object | function call details | see tool call example |

Tool Call Example

"tool_calls": [
  {
    "id": "tooluse_abc123",
    "type": "function",
    "function": {
      "name": "dyno_run_command",
      "arguments": "{}"
    }
  }
]

Function Object

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| name | string | name of tool to invoke | "dyno_run_command" |
| arguments | string | JSON-encoded string of tool arguments | "{}" |
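
Because arguments is a JSON-encoded string rather than a nested object, a client has to decode it a second time after parsing the surrounding JSON. A minimal sketch, using a hypothetical helper named decode_tool_call:

```python
import json


def decode_tool_call(tool_call):
    """Return (tool name, decoded arguments) for one tool call object."""
    function = tool_call["function"]
    # arguments arrives as a JSON string, so it needs its own json.loads call.
    return function["name"], json.loads(function["arguments"])


tool_call = {
    "id": "tooluse_abc123",
    "type": "function",
    "function": {"name": "dyno_run_command", "arguments": "{}"},
}
name, arguments = decode_tool_call(tool_call)
print(name, arguments)
```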

role=system message

A system message is a special prompt given to the model to guide its responses.

| Field | Type | Description | Required | Example |
| --- | --- | --- | --- | --- |
| role | string | role of message; always: "system" | yes | "system" |
| content | string | contents of system message | yes | "You are a helpful assistant. You favor brevity and avoid hedging. You readily admit when you don't know an answer." |

role=tool message

A tool message represents the output of a specified tool call.

| Field | Type | Description | Required | Example |
| --- | --- | --- | --- | --- |
| role | string | role of message; always: "tool" | yes | "tool" |
| content | string | output of tool call | yes | "Rainy and 84°" |
| tool_call_id | string | tool call the message is responding to | yes | "toolu_02F9GXvY5MZAq8Lw3PTNQyJK" |
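
The per-role requirements in the tables above can be summarized as a small client-side check. This is only a sketch: check_message is a hypothetical helper, the sample values are taken from the tables, and the API itself may enforce more or fewer rules than shown here.

```python
SUPPORTED_ROLES = {"user", "assistant", "system", "tool"}


def check_message(message):
    """Raise ValueError if a message lacks a field its role requires."""
    role = message.get("role")
    if role not in SUPPORTED_ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    has_content = isinstance(message.get("content"), str)
    if role == "assistant":
        # content is required unless tool_calls is specified.
        if not has_content and not message.get("tool_calls"):
            raise ValueError("assistant message needs content or tool_calls")
    elif not has_content:
        raise ValueError(f"{role} message needs string content")
    if role == "tool" and not message.get("tool_call_id"):
        raise ValueError("tool message needs tool_call_id")
    return message


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the weather?"},
    {
        "role": "assistant",
        "tool_calls": [
            {
                "id": "tooluse_abc123",
                "type": "function",
                "function": {"name": "dyno_run_command", "arguments": "{}"},
            }
        ],
    },
    {"role": "tool", "content": "Rainy and 84°", "tool_call_id": "tooluse_abc123"},
]
for message in conversation:
    check_message(message)
```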

Request Headers

| Header | Type | Description |
| --- | --- | --- |
| Authorization | string | Bearer token containing your Heroku Inference API key |

All /v1/agents/heroku requests must include the following header:

-H "Authorization: Bearer $INFERENCE_KEY"
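
For clients that don't use curl, the same header can be attached with Python's standard library. The sketch below only constructs the request object and doesn't send it; build_agent_request is a hypothetical helper, and it assumes INFERENCE_URL and INFERENCE_KEY are available as environment variables when no explicit values are passed.

```python
import json
import os
import urllib.request


def build_agent_request(body, base_url=None, api_key=None):
    """Return an unsent POST request for /v1/agents/heroku."""
    base_url = base_url or os.environ["INFERENCE_URL"]
    api_key = api_key or os.environ["INFERENCE_KEY"]
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/agents/heroku",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_agent_request(
    {"model": "claude-3-7-sonnet",
     "messages": [{"role": "user", "content": "What is the weather?"}]},
    base_url="https://example.invalid",  # placeholder, not a real endpoint
    api_key="test-key",                  # placeholder, not a real key
)
print(request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it; the response body is the event stream described under Response Format.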

Response Format

Agent responses are streamed back over Server-Sent Events (SSE). Each event: message includes a JSON payload representing a completion. The final event is event: done with the data [DONE].

Completion Object

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| id | string | unique ID for agent session | "chatcmpl-abc123" |
| object | enum<string> | type of completion; one of: chat.completion, tool.completion | "tool.completion" |
| created | integer | unix timestamp when chunk was created | 1746546550 |
| model | string | model ID used to generate the message | "claude-3-7-sonnet" |
| system_fingerprint | string | fingerprint of system generating output | "heroku-inf-abc123" |
| choices | array of objects | array of length 1 containing a single choice object | see example response |
| usage | object | token usage statistics, empty for tool completions (no tokens consumed) | {"prompt_tokens":15,"completion_tokens":13,"total_tokens":28} |

Choice Object

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| index | integer | index of the choice; always: 0 | 0 |
| message | object | message content (response messages will always be of role assistant or tool) | see example response |
| finish_reason | enum<string> | reason model stopped; one of: stop, length, tool_calls, "" | "tool_calls" |

Usage Object

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| prompt_tokens | integer | tokens used in prompt | 397 |
| completion_tokens | integer | tokens used in response | 65 |
| total_tokens | integer | sum of prompt and completion tokens | 462 |

Each event: message streamed over SSE contains a single completion object: either a chat.completion or a tool.completion.
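
One way to consume this stream is to group lines into events and decode each data payload as JSON. The following is a minimal sketch rather than a complete SSE client: iter_completions is a hypothetical helper, it handles only the event and data fields, and the sample payloads are abbreviated to a single key.

```python
import json


def iter_completions(lines):
    """Yield completion objects from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if event == "done":
                return
            if event == "message" and data:
                yield json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE allows one optional space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
    # Flush a final message that was not followed by a blank line.
    if event == "message" and data:
        yield json.loads("\n".join(data))


sample = [
    "event:message",
    'data:{"object":"chat.completion"}',
    "",
    "event:message",
    'data:{"object":"tool.completion"}',
    "",
    "event:done",
    "data:[DONE]",
    "",
]
print([completion["object"] for completion in iter_completions(sample)])
```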

Example Request

# The backticks in the description are escaped so that the shell does not
# run them as a command substitution inside the unquoted heredoc.
curl --location "$INFERENCE_URL/v1/agents/heroku" \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $INFERENCE_KEY" \
  --data @- <<EOF
{
  "model": "$INFERENCE_MODEL_ID",
  "messages": [
    {
      "role": "user",
      "content": "What is the current time and date?"
    }
  ],
  "tools": [
    {
      "type": "heroku_tool",
      "name": "dyno_run_command",
      "runtime_params": {
        "target_app_name": "$APP_NAME",
        "tool_params": {
          "cmd": "echo hello && date",
          "description": "Runs \`echo hello && date\` on one-off dyno.",
          "parameters": {
            "type": "object",
            "properties": {},
            "required": []
          }
        }
      }
    }
  ]
}
EOF

Example Response

event:message
data:{"id":"chatcmpl-183de038cafa9c3b09d8e","object":"chat.completion","created":1746798767,"model":"claude-3-7-sonnet","system_fingerprint":"heroku-inf-np7w0x","choices":[{"index":0,"message":{"role":"assistant","content":"I can help you find the current time and date by running a command on the system. Let me do that for you.","refusal":null,"tool_calls":[{"id":"tooluse_lgp6wvphSU-tz_8Ljp42Kg","type":"function","function":{"name":"dyno_run_command","arguments":"{}"}}]},"finish_reason":"tool_calls"}],"usage":{"prompt_tokens":397,"completion_tokens":65,"total_tokens":462}}

event:message
data:{"id":"chatcmpl-183de038cafa9c3b09d8e","object":"tool.completion","created":1746798768,"system_fingerprint":"heroku-inf-np7w0x","choices":[{"index":0,"message":{"role":"tool","content":"Tool 'dyno_run_command' returned result: hello\nFri May  9 13:52:48 UTC 2025","refusal":null,"tool_call_id":"tooluse_lgp6wvphSU-tz_8Ljp42Kg","name":"dyno_run_command"},"finish_reason":""}],"usage":{}}

event:message
data:{"id":"chatcmpl-183de038cafa9c3b09d8e","object":"chat.completion","created":1746798771,"model":"claude-3-7-sonnet","system_fingerprint":"heroku-inf-np7w0x","choices":[{"index":0,"message":{"role":"assistant","content":"The current time and date is:\nFriday, May 9, 2025, 13:52:48 UTC (Coordinated Universal Time)\n\nThis corresponds to:\n- 6:52:48 AM PDT (Pacific Daylight Time)\n- 9:52:48 AM EDT (Eastern Daylight Time)\n\nNote that the actual current time in your local timezone may differ depending on where you are located.","refusal":null},"finish_reason":"stop"}],"usage":{"prompt_tokens":509,"completion_tokens":99,"total_tokens":608}}

event:done
data:[DONE]
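
Once the events in a response like this one are decoded, a client typically wants the final assistant answer and the total token usage across the agent loop. The sketch below assumes the completions are already parsed into dictionaries; summarize is a hypothetical helper, and the content strings are shortened versions of those in the example response.

```python
def summarize(completions):
    """Return (final assistant text, summed token usage) for one agent run."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    final_text = None
    for completion in completions:
        # usage is empty for tool completions, so missing keys count as 0.
        usage = completion.get("usage") or {}
        for key in totals:
            totals[key] += usage.get(key, 0)
        choice = completion["choices"][0]
        if completion["object"] == "chat.completion" and choice["finish_reason"] == "stop":
            final_text = choice["message"]["content"]
    return final_text, totals


completions = [
    {"object": "chat.completion",
     "choices": [{"index": 0, "finish_reason": "tool_calls",
                  "message": {"role": "assistant", "content": "Let me do that for you."}}],
     "usage": {"prompt_tokens": 397, "completion_tokens": 65, "total_tokens": 462}},
    {"object": "tool.completion",
     "choices": [{"index": 0, "finish_reason": "",
                  "message": {"role": "tool", "content": "hello"}}],
     "usage": {}},
    {"object": "chat.completion",
     "choices": [{"index": 0, "finish_reason": "stop",
                  "message": {"role": "assistant", "content": "The current time and date is:"}}],
     "usage": {"prompt_tokens": 509, "completion_tokens": 99, "total_tokens": 608}},
]
final_text, totals = summarize(completions)
print(final_text)
print(totals)
```

Summing usage per event reflects that a single call to /v1/agents/heroku can include multiple underlying inference requests, each reporting its own token counts.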
