Body
Model name
Chat history as an array of message objects (each with a role and content)
Optional list of function tools the model may call during the chat
Format to return the response in. Accepts "json" or a JSON schema object for structured output
Runtime options that control text generation (for example, temperature)
When true, returns the model's thinking as separate output in addition to the content. Accepts a boolean (true/false) or, for supported models, an effort-level string ("high", "medium", "low")
Model keep-alive duration (for example 5m or 0 to unload immediately)
Whether to return log probabilities of the output tokens
Number of most likely tokens to return at each token position when logprobs are enabled
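The request-body fields above can be sketched as a JSON payload. This is a minimal illustration, not the definitive schema: the field names (`model`, `messages`, `tools`, `format`, `options`, `think`, `keep_alive`, `logprobs`, `top_logprobs`) and the model name are assumptions based on a typical chat API, so verify them against the actual endpoint before use.

```python
import json

# Sketch of a chat request body built from the fields described above.
# Field names and the model name are assumptions; check the real schema.
body = {
    "model": "llama3.2",                 # model name (assumed example)
    "messages": [                        # chat history: role + content pairs
        {"role": "user", "content": "Why is the sky blue?"},
    ],
    "format": "json",                    # or a JSON schema object
    "options": {"temperature": 0.7},     # runtime generation options
    "think": True,                       # request separate thinking output
    "keep_alive": "5m",                  # keep the model loaded for 5 minutes
    "logprobs": True,                    # return token log probabilities
    "top_logprobs": 3,                   # top alternatives per token position
}

payload = json.dumps(body)
print(payload)
```

Note that `top_logprobs` only has an effect when `logprobs` is enabled, per the field descriptions above.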
Response
Chat response
Model name used to generate this message
Timestamp of response creation (ISO 8601)
Indicates whether the chat response has finished
Reason the response finished
Total time spent generating in nanoseconds
Time spent loading the model in nanoseconds
Number of tokens in the prompt
Time spent evaluating the prompt in nanoseconds
Number of tokens generated in the response
Time spent generating tokens in nanoseconds
Log probability information for the generated tokens when logprobs are enabled
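Since the duration fields above are reported in nanoseconds alongside token counts, a common use is deriving generation throughput. The sketch below uses made-up response values (the field names mirror the descriptions above but are assumptions about the actual response keys).

```python
# Hypothetical response dict; values are illustrative, and the key names
# are assumptions based on the field descriptions above.
response = {
    "model": "llama3.2",
    "done": True,
    "done_reason": "stop",
    "total_duration": 5_000_000_000,      # total generation time in ns
    "load_duration": 1_000_000_000,       # model load time in ns
    "prompt_eval_count": 26,              # tokens in the prompt
    "prompt_eval_duration": 500_000_000,  # prompt evaluation time in ns
    "eval_count": 100,                    # tokens generated
    "eval_duration": 2_000_000_000,       # token generation time in ns
}

NS_PER_S = 1_000_000_000

# Tokens per second = generated tokens / generation time in seconds.
tokens_per_second = response["eval_count"] / (response["eval_duration"] / NS_PER_S)
print(tokens_per_second)  # 100 tokens / 2 s -> 50.0
```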