Understanding Tool Calls

Why tool calls exist. They hand off work to real systems when the model can’t know the answer.
What you get. Accurate, traceable results aligned with systems you already trust.
What this page covers. When to use a tool vs. a text-only response, and how to encode arguments reliably.

We cover the request lifecycle, how structured responses are marshalled in Rust, and what telemetry you can collect to monitor usage.

Example: Asking “What is the price of Bitcoin?” without Tools

When you omit the tools array, the model must answer from what it learned in training, so for live data such as prices it usually declines or gives a generic response. The example below uses the same granite4:tiny-h model referenced in the Ollama quick start guide:

curl http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{
        "model": "granite4:tiny-h",
        "stream": false,
        "messages": [
          {
            "role": "system", 
            "content": "You are a precise assistant that admits uncertainty."
          },
          {
            "role": "user", 
            "content": "What is the price of Bitcoin today in USD?"
          }
        ]
      }'

The message in a typical response looks like:

{
  "role": "assistant",
  "content": "I’m sorry, but I don’t have access to real‑time data, so I can’t provide today’s Bitcoin price. For the most up‑to‑date figure, please check a reliable financial source such as a cryptocurrency exchange (e.g., Coinbase, Binance), a market data site (e.g., CoinMarketCap, CoinGecko), or a financial news outlet. If you’d like, I can share information about how Bitcoin’s price has historically behaved or explain the factors that influence its value. Let me know how I can help!"
}

Example: Asking “What is the price of Bitcoin?” with Tools

The chat endpoint in Ollama can emit structured tool invocations when the user's request needs data or logic the model does not have. Reusing the granite4:tiny-h model, expose a get_bitcoin_price_usd tool so the assistant can call into deterministic logic:

curl http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "granite4:tiny-h",
    "stream": false,
    "messages": [
      {"role": "system", "content": "You are a precise assistant that admits uncertainty."},
      {"role": "user", "content": "What is the price of Bitcoin today in USD?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_bitcoin_price_usd",
          "description": "Returns the current price of Bitcoin in USD.",
          "parameters": {
            "type": "object",
            "properties": {},
            "additionalProperties": false
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
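The same request can be assembled in code, which makes the tool schema easier to reuse and test. The following is a minimal Python sketch using only the standard library, not a definitive client. The helper names function_tool, build_chat_request, and post_chat are hypothetical; the endpoint and the model, messages, stream, and tools fields are taken from the curl example above.

```python
import json
import urllib.request
from typing import Optional

# Same endpoint as the curl example above.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def function_tool(name: str, description: str,
                  properties: Optional[dict] = None) -> dict:
    """Build one entry of the `tools` array (hypothetical helper)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties or {},
                # Reject arguments that the schema does not declare.
                "additionalProperties": False,
            },
        },
    }


def build_chat_request(model: str, system: str, user: str,
                       tools: Optional[list] = None) -> dict:
    """Return the JSON body for a non-streaming chat request."""
    payload = {
        "model": model,
        "stream": False,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }
    if tools:
        # Omitting the key entirely reproduces the text-only request.
        payload["tools"] = tools
    return payload


def post_chat(payload: dict, url: str = OLLAMA_CHAT_URL) -> dict:
    """POST the payload and decode the JSON response (needs a running server)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Passing tools=None to build_chat_request yields the text-only request from the first example, so one code path can produce both variants for comparison.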

Ollama returns either a natural-language answer or a tool payload. When a tool is selected, the message looks like:

{
  "role": "assistant",
  "content": "",
  "tool_calls": [
    {
      "function": {
        "name": "get_bitcoin_price_usd",
        "arguments": {}
      }
    }
  ]
}
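Because the reply is either text or a tool payload, calling code needs to branch on which one it received. Here is one way that check might look in Python. The function name parse_reply is hypothetical, and the sketch deliberately accepts a call with name and arguments either at the top level or nested under a function key, and arguments either as a JSON string or as an already-parsed object, since the exact shape can vary by server version and compatibility layer.

```python
import json


def parse_reply(message: dict):
    """Classify an assistant message.

    Returns ("tool_calls", [(name, args), ...]) when the model asked for
    tools, otherwise ("text", content).
    """
    calls = []
    for call in message.get("tool_calls") or []:
        # Accept both {"function": {...}} and a flat {"name": ..., ...} entry.
        fn = call.get("function", call)
        args = fn.get("arguments") or {}
        if isinstance(args, str):
            # Some servers encode the arguments as a JSON string.
            args = json.loads(args)
        if not isinstance(args, dict):
            raise ValueError(f"arguments for {fn['name']} must be a JSON object")
        calls.append((fn["name"], args))
    if calls:
        return "tool_calls", calls
    return "text", message.get("content", "")
```

Normalizing the arguments to a dict at this boundary means the rest of the program never has to care which encoding the server used.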

Tool payload = executable intent. It names the tool and carries the JSON arguments your code should call it with.
Run → respond. Execute the tool, send the result back in a follow-up request, and the assistant returns a natural-language answer.
Why tools matter. With a tool, you get the real Bitcoin price; without one, you get apologies or guesses.
Teaching point. Use this contrast to justify deterministic code behind every real-world capability.
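The run → respond step can be sketched as follows. This is an illustrative Python outline, not a definitive implementation: run_tool_calls, next_messages, and the registry mapping are hypothetical names, and the result is sent back as a message with role "tool" plus a tool_name field, which is an assumption based on Ollama's chat API documentation that you should verify against your server version. No price source is included; you supply the real function in the registry.

```python
import json
from typing import Any, Callable, Dict, List

# Maps a tool name to the deterministic function that implements it.
ToolRegistry = Dict[str, Callable[..., Any]]


def run_tool_calls(assistant_message: dict, registry: ToolRegistry) -> List[dict]:
    """Execute each requested tool and return the messages to send back."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = call.get("function", call)  # nested or flat call shape
        name = fn["name"]
        args = fn.get("arguments") or {}
        if isinstance(args, str):
            args = json.loads(args)
        if name in registry:
            content = json.dumps(registry[name](**args))
        else:
            # Report the problem to the model instead of raising.
            content = json.dumps({"error": f"unknown tool: {name}"})
        # Assumed message shape: role "tool" with an optional tool_name.
        results.append({"role": "tool", "tool_name": name, "content": content})
    return results


def next_messages(history: List[dict], assistant_message: dict,
                  registry: ToolRegistry) -> List[dict]:
    """Build the message list for the follow-up request."""
    return [*history, assistant_message,
            *run_tool_calls(assistant_message, registry)]
```

Sending the list returned by next_messages in a second chat request, with the same tools array, lets the model phrase its final answer around the value your code produced. Returning an error payload for unknown tools keeps the loop running when the model asks for a tool that was never registered.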