Documentation

The Infrence API has one production endpoint, POST /v1/research, plus a few supporting routes for polling, streaming events, and cancelling jobs.

Quickstart

  1. Sign up and grab your default API key from API keys.
  2. POST a question to /v1/research with your bearer token.
  3. Read the brief and citations from the response.

curl https://api.infrence.ai/v1/research \
  -H 'Authorization: Bearer inf_live_…' \
  -H 'Content-Type: application/json' \
  -d '{
  "question": "Compare the top vector databases",
  "mode": "standard"
}'

Authentication

All /v1/* endpoints require a bearer token in the Authorization header: Authorization: Bearer inf_live_…. Keys are issued from the dashboard and can be reset or revoked at any time.

POST /v1/research

Request body

{
  "question": "string",          // required
  "mode": "lite|standard|pro|max", // default: standard
  "max_credits": 60,             // optional cap; required for max
  "max_wallclock_secs": 180,     // optional override
  "response_schema": {...},      // optional JSON Schema
  "include_sources": true,
  "async": false,
  "webhook_url": null
}
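
The two constraints noted in the comments above (the mode enum, and max_credits being required for max) can be checked client-side before a request is sent. The following is a minimal sketch; build_request is a hypothetical helper, not part of any official SDK, and it enforces only those two rules plus a non-empty question.

```python
VALID_MODES = {"lite", "standard", "pro", "max"}


def build_request(question, mode="standard", max_credits=None, **extra):
    """Assemble a /v1/research request body (hypothetical helper)."""
    if not question:
        raise ValueError("question is required")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if mode == "max" and max_credits is None:
        raise ValueError("max_credits is required when mode is 'max'")

    body = {"question": question, "mode": mode}
    if max_credits is not None:
        body["max_credits"] = max_credits
    # Pass the other documented optional fields through unchanged,
    # e.g. response_schema, include_sources, async, webhook_url.
    body.update(extra)
    return body
```

Because async is a reserved word in Python, that field has to be passed as build_request("…", **{"async": True}) rather than as a plain keyword argument.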

Response (sync)

{
  "id": "9b0a…",
  "status": "succeeded",
  "question": "Compare the top vector databases",
  "brief": "Pinecone, Weaviate, and Qdrant lead the hosted-vector market…",
  "typed": null,
  "sources": [
    { "url": "https://…", "title": "…", "domain": "…", "summary": "…" }
  ],
  "wave_count": 2,
  "credits_charged": 60,
  "cost_usd": 0.36,
  "latency_ms": 12480,
  "created_at": "2026-05-04T10:11:12Z",
  "completed_at": "2026-05-04T10:11:24Z"
}
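
As a sketch of step 3 of the quickstart, the snippet below pulls the brief and the citations out of a decoded sync response. The summarize helper is hypothetical, the sample values are placeholders, and treating a missing or null sources field as an empty list is an assumption about what happens when include_sources is false.

```python
def summarize(response):
    """Return the brief plus a list of (title, url) citation pairs."""
    if response.get("status") != "succeeded":
        raise RuntimeError(f"job not finished: {response.get('status')}")
    # Assumption: sources may be absent or null when include_sources is false.
    citations = [(s.get("title"), s.get("url")) for s in response.get("sources") or []]
    return response["brief"], citations


# Sample shaped like the response above; the values are placeholders.
sample = {
    "status": "succeeded",
    "brief": "Example brief",
    "typed": None,
    "sources": [
        {"url": "https://example.com/a", "title": "A", "domain": "example.com", "summary": "s"}
    ],
    "credits_charged": 60,
    "cost_usd": 0.36,
}

brief, citations = summarize(sample)
print(brief)
for title, url in citations:
    print(f"- {title}: {url}")
```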

Response (async, 202)

{
  "id": "9b0a…",
  "status": "pending",
  "status_url": "https://api.infrence.ai/v1/research/9b0a…",
  "events_url": "https://api.infrence.ai/v1/research/9b0a…/events"
}
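
One way to consume the 202 response is to poll status_url until the job leaves the pending state. The sketch below takes the HTTP call as a parameter so it does not depend on a particular client library. The function name, the polling interval, and the timeout are assumptions, and "pending" is the only in-progress status that appears in this documentation.

```python
import time


def wait_for_job(fetch_status, status_url, interval_secs=2.0, timeout_secs=600.0,
                 in_progress=("pending",)):
    """Poll status_url until the job leaves an in-progress status.

    fetch_status: callable taking a URL and returning the decoded JSON dict,
    for example lambda u: requests.get(u, headers=headers).json().
    in_progress: statuses treated as "not done yet". Only "pending" is
    documented; add others if the API reports them.
    """
    deadline = time.monotonic() + timeout_secs
    while True:
        job = fetch_status(status_url)
        if job.get("status") not in in_progress:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job.get('status')} after {timeout_secs}s")
        time.sleep(interval_secs)
```

The returned dict is whatever the status endpoint sends back, so the caller still needs to check whether the final status is "succeeded".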

Modes

Mode      Credits  Max cost          Wallclock  Sources
Lite      15       $0.15             60s        5
Standard  60       $0.60             180s       20
Pro       200      $2.00             360s       60
Max       metered  your max_credits  600s       120
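
The mode limits can be kept client-side as a lookup table, for example to pick the smallest mode that fits a job. This sketch makes two assumptions: the Sources column is read as an upper bound on sources per job, and the modes are ordered cheapest first as lite, standard, pro, max. The cheapest_mode helper is hypothetical.

```python
# Limits copied from the modes table. "max" is metered, so its credit and
# cost ceilings depend on the max_credits you send and are left as None.
MODES = {
    "lite":     {"credits": 15,   "max_cost_usd": 0.15, "wallclock_secs": 60,  "sources": 5},
    "standard": {"credits": 60,   "max_cost_usd": 0.60, "wallclock_secs": 180, "sources": 20},
    "pro":      {"credits": 200,  "max_cost_usd": 2.00, "wallclock_secs": 360, "sources": 60},
    "max":      {"credits": None, "max_cost_usd": None, "wallclock_secs": 600, "sources": 120},
}


def cheapest_mode(min_sources, max_wallclock_secs=None):
    """Return the first mode (assumed cheapest first) that meets the constraints."""
    for name in ("lite", "standard", "pro", "max"):
        limits = MODES[name]
        if limits["sources"] < min_sources:
            continue
        if max_wallclock_secs is not None and limits["wallclock_secs"] > max_wallclock_secs:
            continue
        return name
    return None  # no mode satisfies both constraints
```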

JSON Schema mode

Pass any JSON Schema in response_schema. The brief is re-cast into a typed object that conforms to the schema. If validation fails, your credits are refunded.

"response_schema": {
  "type": "object",
  "properties": {
    "competitors": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name":     { "type": "string" },
          "url":      { "type": "string" },
          "pricing":  { "type": "string" }
        },
        "required": ["name", "url"]
      }
    }
  },
  "required": ["competitors"]
}
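
A client may want to re-check the typed object it receives against the same schema. The sketch below is a deliberately small, standard-library-only checker that covers just the keywords used in the example (type, properties, required, items); it is not a full JSON Schema validator. That the typed object arrives in the response's typed field is an assumption based on the sync response shape.

```python
def conforms(value, schema):
    """Check value against a small subset of JSON Schema.

    Handles only "type" (object, array, string), "properties", "required",
    and "items". Use a full validator library for anything beyond that.
    """
    kind = schema.get("type")
    if kind == "object":
        if not isinstance(value, dict):
            return False
        if any(key not in value for key in schema.get("required", [])):
            return False
        return all(
            conforms(value[key], sub)
            for key, sub in schema.get("properties", {}).items()
            if key in value
        )
    if kind == "array":
        if not isinstance(value, list):
            return False
        item_schema = schema.get("items", {})
        return all(conforms(item, item_schema) for item in value)
    if kind == "string":
        return isinstance(value, str)
    return True  # types outside this sketch are not checked


# The example schema from this section.
schema = {
    "type": "object",
    "properties": {
        "competitors": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "url": {"type": "string"},
                    "pricing": {"type": "string"},
                },
                "required": ["name", "url"],
            },
        }
    },
    "required": ["competitors"],
}
```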

Async, webhooks, cancel

Long jobs (anything over 60s wallclock, or any request with a webhook_url) return 202 with a job id. Poll GET /v1/research/{id}, or stream events from GET /v1/research/{id}/events (SSE). Cancel with DELETE /v1/research/{id}; the unspent reservation is refunded.
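
The events endpoint is described as SSE, so its output can be read with a generic text/event-stream parser. The sketch below implements the standard framing (event: and data: fields, a blank line ending each event) over any iterable of text lines, such as the lines of a streamed HTTP response. The event names and JSON payloads in the sample stream are invented for illustration; the actual event format is not specified here.

```python
import json


def iter_sse_events(lines):
    """Yield (event_name, data) tuples from an iterable of SSE text lines.

    Comment lines (starting with ":") and fields other than "event" and
    "data" are skipped. An event without a trailing blank line is dropped.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)


# Hypothetical stream: event names and payloads are made up.
stream = [
    "event: status\n",
    'data: {"status": "pending"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"status": "succeeded"}\n',
    "\n",
]
events = [(name, json.loads(data)) for name, data in iter_sse_events(stream)]
```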

Idempotency

Pass an Idempotency-Key: <your-key> header. Repeating the same key within 24h returns the original job instead of starting a new one, so retries from a flaky client are safe.
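
A retry loop that takes advantage of this generates one key per logical request and reuses it on every attempt. The following is a sketch under assumptions: post_with_retry and its send callback are hypothetical, a UUID is one reasonable key format among many, and transport failures are assumed to surface as OSError (Python's built-in connection errors derive from it).

```python
import uuid


def post_with_retry(send, body, attempts=3):
    """Call send(body, headers) up to `attempts` times with one Idempotency-Key.

    send: callable that performs the POST and returns the response,
    raising OSError (or a subclass) on a transport failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    # One key for the whole logical request, reused on every attempt.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error = None
    for _ in range(attempts):
        try:
            return send(body, headers)
        except OSError as exc:
            last_error = exc  # retry with the same key, so no duplicate job
    raise last_error
```

The send callback is expected to merge the Authorization and Content-Type headers with the one passed in.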

Errors

  • 400 — invalid request body or impossible budget.
  • 401 — missing or revoked API key.
  • 402 — insufficient credits. Top up or upgrade.
  • 404 — unknown job id (or owned by another user).
  • 5xx — internal error; the job is auto-refunded.
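
The status codes above map naturally onto a small dispatch function in client code. This is a sketch only: the action labels are illustrative names, and treating 5xx as something to retry later is an assumption rather than documented guidance.

```python
def classify_status(status_code):
    """Map an HTTP status code to a short action label (labels are illustrative)."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 400:
        return "fix_request"    # invalid request body or impossible budget
    if status_code == 401:
        return "check_api_key"  # missing or revoked API key
    if status_code == 402:
        return "add_credits"    # insufficient credits
    if status_code == 404:
        return "unknown_job"    # unknown job id, or owned by another user
    if 500 <= status_code < 600:
        return "retry_later"    # internal error; the job is auto-refunded
    return "unexpected"
```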

Code samples

cURL

curl https://api.infrence.ai/v1/research \
  -H 'Authorization: Bearer inf_live_…' \
  -H 'Content-Type: application/json' \
  -d '{
  "question": "Compare the top vector databases",
  "mode": "standard"
}'

JavaScript

const res = await fetch('https://api.infrence.ai/v1/research', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer inf_live_…',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    question: 'Compare the top vector databases',
    mode: 'standard',
  }),
});
const data = await res.json();

Python

import requests

res = requests.post(
    'https://api.infrence.ai/v1/research',
    headers={'Authorization': 'Bearer inf_live_…'},
    json={
        'question': 'Compare the top vector databases',
        'mode': 'standard',
    },
)
data = res.json()

Go

// Requires imports: "log", "net/http", "strings".
body := `{
  "question": "Compare the top vector databases",
  "mode": "standard"
}`
req, err := http.NewRequest("POST", "https://api.infrence.ai/v1/research", strings.NewReader(body))
if err != nil {
	log.Fatal(err)
}
req.Header.Set("Authorization", "Bearer inf_live_…")
req.Header.Set("Content-Type", "application/json")
resp, err := http.DefaultClient.Do(req)
if err != nil {
	log.Fatal(err)
}
defer resp.Body.Close()

Rust

let body = serde_json::json!({
    "question": "Compare the top vector databases",
    "mode": "standard"
});
let res = reqwest::Client::new()
    .post("https://api.infrence.ai/v1/research")
    .bearer_auth("inf_live_…")
    .json(&body)
    .send()
    .await?;