Getting Started
HiveIQ is an intelligent AI compute layer that routes every request to the optimal provider — cutting costs by up to 86% while improving reliability. One endpoint replaces six provider integrations.
Base URL
https://hiveiq.xyz
Quick example
Send your first request. Replace YOUR_API_KEY with the key you generate from your dashboard.
curl https://hiveiq.xyz/tasks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_type": "chat_completion",
    "messages": [
      {"role": "user", "content": "Explain TEEs in one sentence."}
    ],
    "provider": "auto",
    "max_tokens": 128
  }'
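For clients that prefer Python over curl, the same request can be sketched with the standard library alone. This is a minimal illustration, not the official SDK: the helper name build_task_request is hypothetical, and the block only builds the request so it can be inspected without a real key.

```python
import json
import urllib.request

BASE_URL = "https://hiveiq.xyz"  # base URL given on this page


def build_task_request(api_key: str, prompt: str, max_tokens: int = 128) -> urllib.request.Request:
    """Build (but do not send) a POST /tasks request mirroring the curl call."""
    body = {
        "task_type": "chat_completion",
        "messages": [{"role": "user", "content": prompt}],
        "provider": "auto",
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/tasks",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_task_request("YOUR_API_KEY", "Explain TEEs in one sentence.")
# To actually send it (needs a real key and network access):
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(json.load(resp)["output"])
```

Building the request separately from sending it keeps the payload easy to log or unit-test before any quota is spent.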
Authentication
HiveIQ accepts three authentication methods on protected endpoints. Pick whichever matches your client.
1. User API Key (recommended for production)
Long-lived hiq_* keys you create from the dashboard. Each request is attributed to your account and counted against your tier's quota.
Authorization: Bearer hiq_<your-key>
2. User JWT (for browser sessions)
Short-lived (24h default) JWTs returned by POST /auth/login. Each token carries the claims sub, email, jti, iat, and exp, and is revoked on /auth/logout via a Redis blacklist.
{
"email": "user@example.com",
"password": "at-least-8-chars"
}
{
"token": "eyJhbGciOiJIUzI1NiIs…",
"user_id": "3a4f-…",
"email": "user@example.com"
}
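Because the JWT carries an exp claim, a browser-style client can check how long a token has left before re-authenticating. The sketch below decodes the payload segment only; it does not verify the signature, so it is suitable for scheduling a refresh and nothing more. The function names are illustrative and not part of HiveIQ.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> float:
    """Seconds left before the exp claim; negative once the token has expired."""
    now = time.time() if now is None else now
    return decode_jwt_claims(token)["exp"] - now
```

A client might call seconds_until_expiry(token) before each request and log in again when the value drops below some margin of its choosing.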
3. Operator key (for self-hosted dashboards)
The HIVEIQ_API_KEY environment variable on the server. Sent the same way as a user key — no per-user attribution, unlimited rate. Intended for internal dashboards and integration tests.
Authorization: Bearer <HIVEIQ_API_KEY>
/health, /status, /, /dashboard, /docs, /auth/register, /auth/login, and the A2A discovery endpoint /.well-known/agent.json are public. Everything under /tasks, /analytics, and /payments requires a Bearer token.
Tasks API
The core endpoint. Submit a chat completion and HiveIQ picks the optimal provider based on cost, latency, privacy, and reputation.
Submit a chat completion. Returns a single response with provider attribution, token counts, cost, and latency.
Request body
| Field | Type | Description |
|---|---|---|
| messages | array | Required. List of {role, content} objects. role is one of system, user, assistant. |
| task_type | string | Currently only chat_completion. Default: chat_completion. |
| provider | string | auto (default), or one of openai, anthropic, runpod, replicate, ionet, phala. |
| routing | string | auto (deterministic cheapest-first, default), marketplace (competitive bidding), or direct (requires explicit provider). |
| privacy_tier | string | public (default), semi-private, or tee. Stronger tiers satisfy weaker requests but not vice versa. |
| data_residency | string | any (default), eu_only, or us_only. Filters providers by region. |
| model | string | Optional explicit model id. When omitted, the provider's default is used. auto + workload_hint picks the best model from the registry. |
| workload_hint | string | chat (default), code, or reasoning. Used by the model registry to pick a benchmark-appropriate model when model="auto". |
| required_capabilities | string[] | Hard filter for marketplace bidding. Only providers whose capabilities are a superset will bid. Example: ["vision","function_calling"]. |
| max_tokens | int | Default 512. Max 8192. |
| temperature | float | Default 0.7. Range 0.0–2.0. |
{
"task_type": "chat_completion",
"messages": [
{"role": "system", "content": "You are concise."},
{"role": "user", "content": "Explain CAP theorem in two sentences."}
],
"provider": "auto",
"routing": "auto",
"privacy_tier": "public",
"max_tokens": 256,
"temperature": 0.5
}
{
"id": "7f5d3a91-…",
"provider": "openai",
"model": "gpt-4o-mini",
"output": "CAP says any distributed datastore can guarantee at most two of consistency, availability, and partition tolerance simultaneously. In practice you pick CP or AP because partitions are unavoidable on real networks.",
"input_tokens": 22,
"output_tokens": 48,
"estimated_cost_usd": 0.0000321,
"latency_ms": 812
}
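The privacy_tier rule ("stronger tiers satisfy weaker requests but not vice versa") amounts to an ordering check. The sketch below assumes the ordering public < semi-private < tee implied by the field description; the provider names and their tiers are made up for illustration and say nothing about real providers.

```python
from typing import Dict, List

# Assumed ordering, weakest to strongest, taken from the privacy_tier description.
TIER_RANK = {"public": 0, "semi-private": 1, "tee": 2}


def tier_satisfies(provider_tier: str, requested_tier: str) -> bool:
    """True when a provider's tier is at least as strong as the requested one."""
    return TIER_RANK[provider_tier] >= TIER_RANK[requested_tier]


def eligible_providers(providers: Dict[str, str], requested_tier: str) -> List[str]:
    """Names of providers whose tier satisfies the request, sorted for stable output."""
    return sorted(
        name for name, tier in providers.items() if tier_satisfies(tier, requested_tier)
    )


# Hypothetical provider-to-tier mapping, for illustration only.
example = {"provider_a": "public", "provider_b": "semi-private", "provider_c": "tee"}
print(eligible_providers(example, "semi-private"))
```

If no provider passes this filter, the status code table on this page indicates the request is rejected with 400.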
Same body as POST /tasks. Returns a Server-Sent Events stream of tokens as the underlying provider produces them.
# Each chunk arrives as an SSE message:
event: token
data: {"delta": "CAP "}

event: token
data: {"delta": "says "}

# Final terminator with full metadata:
event: done
data: {"provider": "openai", "cost_usd": 0.0000321, "latency_ms": 812}
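A client consuming this stream needs a small Server-Sent Events parser. The following is a minimal sketch that handles only the event: and data: fields shown above and assumes every data payload is JSON; it works on any iterable of text lines, so it can be fed from a socket, a file, or a test fixture.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event, data) pairs from SSE text lines; data is decoded as JSON."""
    event, data_lines = "message", []  # type: str, List[str]
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):           # SSE comment line, ignored
            continue
        if line == "":                     # a blank line terminates one event
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, a single leading space after the colon is dropped.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    if data_lines:                         # stream ended without a trailing blank line
        yield event, json.loads("\n".join(data_lines))


stream = [
    'event: token', 'data: {"delta": "CAP "}', '',
    'event: token', 'data: {"delta": "says "}', '',
    'event: done', 'data: {"provider": "openai", "cost_usd": 0.0000321, "latency_ms": 812}', '',
]
text = "".join(data["delta"] for name, data in parse_sse(stream) if name == "token")
print(text)
```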
Recent task log for the calling user. Default limit 20, max 200.
Aggregate stats for the calling user: total_tasks, total_cost_usd, avg_latency_ms, tasks_by_provider, cost_by_provider, tasks_last_24h, cost_last_24h, cost_saved_vs_gpt4o.
Marketplace API
A competitive bidding layer over the provider mesh. Set "routing": "marketplace" on a task and providers compete on cost, latency, and reputation.
How bidding works
- HiveIQ loads every active provider whose capabilities satisfy required_capabilities and whose privacy tier satisfies the request.
- Providers without a verified NAC identity are excluded.
- Each remaining provider produces a bid in parallel within a 200ms window.
- Bids are scored: score = w_cost · cost_norm + w_latency · latency_norm + w_reputation · rep_norm. Default weights are {cost: 0.4, latency: 0.3, reputation: 0.3}; override with the MARKETPLACE_WEIGHTS env var.
- The highest-scoring bid wins. Outcome (success / failure / actual latency vs. estimate) is recorded and feeds back into the provider's reputation with α = 0.1 smoothing.
- NAC level 4 grants a +10% reputation multiplier at scoring time, so high-tier identities win closer auctions.
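The scoring and feedback steps above can be sketched in a few lines. Several details are assumptions, since the text does not specify them: each *_norm value is taken to lie in [0, 1] with higher meaning better, the level 4 bonus is applied to the reputation term only, and the α = 0.1 smoothing is modeled as an exponential moving average. The function and field names are illustrative.

```python
from typing import Dict, List, Optional

DEFAULT_WEIGHTS = {"cost": 0.4, "latency": 0.3, "reputation": 0.3}  # defaults from the text
ALPHA = 0.1  # smoothing factor from the text


def score_bid(cost_norm: float, latency_norm: float, rep_norm: float,
              nac_level: int, weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted score; assumes normalized inputs where higher is better."""
    w = weights or DEFAULT_WEIGHTS
    rep = rep_norm * (1.10 if nac_level == 4 else 1.0)  # +10% for NAC level 4
    return w["cost"] * cost_norm + w["latency"] * latency_norm + w["reputation"] * rep


def pick_winner(bids: List[dict]) -> dict:
    """Return the highest-scoring bid."""
    return max(bids, key=lambda b: score_bid(
        b["cost_norm"], b["latency_norm"], b["rep_norm"], b["nac_level"]))


def update_reputation(old: float, outcome: float, alpha: float = ALPHA) -> float:
    """Assumed EMA update: outcome is 1.0 for success, 0.0 for failure."""
    return (1 - alpha) * old + alpha * outcome


bids = [
    {"provider": "a", "cost_norm": 0.8, "latency_norm": 0.5, "rep_norm": 0.9, "nac_level": 1},
    {"provider": "b", "cost_norm": 0.8, "latency_norm": 0.5, "rep_norm": 0.9, "nac_level": 4},
]
print(pick_winner(bids)["provider"])
```

With otherwise identical bids, the level 4 provider wins because only its reputation term is scaled up, which matches the "win closer auctions" remark.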
All active marketplace listings, sorted by reputation desc. Optional ?capability=<tag> filter.
{
"success": true,
"data": [
{
"provider_id": "openai",
"capabilities": ["chat", "reasoning", "code", "vision"],
"base_price_per_token": 2.5e-7,
"reputation_score": 0.987,
"is_active": true,
"created_at": "2026-04-10T17:26:31Z"
}
],
"meta": {"count": 6}
}
Last N bids across all tasks (default 20, max 500). Useful for live dashboards.
Every bid for a specific task. Includes the winner and losers, sorted by score desc. meta.winner_provider and meta.winner_score identify the winning bid.
Header counters: active_listings, avg_reputation, bids_24h, wins_24h, win_rate_leader, leader_wins.
Third-party self-registration. Stub. Returns 501 Not Implemented until the Vitreus testnet launches and we can verify staking transactions on-chain. Existing GET /marketplace/listings already shows pre-registered providers.
Multi-Agent Sessions
Decompose a complex task into 2–4 specialist sub-tasks, run them across the provider mesh in parallel where independent, and synthesize a coherent final answer via Claude.
Synchronously calls Claude to decompose the task, persists the session and steps, kicks off background execution, and returns the full plan plus a stream URL.
{
"task": "Compare PostgreSQL and Redis as caching layers, then recommend one for a high-write fintech workload.",
"orchestrator": "claude"
}
{
"success": true,
"data": {
"session_id": "7678cc6e-d08f-425b-af5b-7c0f2e3a635d",
"orchestrator_model": "claude-3-5-sonnet",
"status": "active",
"task_description": "Compare PostgreSQL and Redis…",
"decomposition_plan": [
{"agent_role": "analyst", "prompt": "…", "recommended_provider": "anthropic", "depends_on": []},
{"agent_role": "analyst", "prompt": "…", "recommended_provider": "anthropic", "depends_on": []},
{"agent_role": "writer", "prompt": "…", "recommended_provider": "anthropic", "depends_on": [0, 1]}
],
"steps": [/* 3 step records, all status="pending" initially */],
"total_cost_usd": 0,
"total_latency_ms": 0
},
"meta": {
"step_count": 3,
"stream_url": "/agents/sessions/7678cc6e-…/stream"
}
}
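The depends_on arrays in decomposition_plan describe which steps can run in parallel. One way to read them, shown here as a sketch rather than HiveIQ's actual executor, is to group step indices into waves where every step in a wave has all of its dependencies completed by earlier waves.

```python
from typing import Dict, List, Set


def execution_waves(plan: List[dict]) -> List[List[int]]:
    """Group step indices into waves; each wave depends only on earlier waves."""
    remaining: Dict[int, Set[int]] = {
        i: set(step["depends_on"]) for i, step in enumerate(plan)
    }
    done: Set[int] = set()
    waves: List[List[int]] = []
    while remaining:
        ready = sorted(i for i, deps in remaining.items() if deps <= done)
        if not ready:  # nothing can run: a cycle or a reference to a missing step
            raise ValueError("cyclic or dangling depends_on")
        waves.append(ready)
        done.update(ready)
        for i in ready:
            del remaining[i]
    return waves


plan = [
    {"agent_role": "analyst", "depends_on": []},
    {"agent_role": "analyst", "depends_on": []},
    {"agent_role": "writer", "depends_on": [0, 1]},
]
print(execution_waves(plan))
```

For the plan in the response above, the two analyst steps form the first wave and the writer step runs alone in the second.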
Recent sessions for the calling user, newest first.
Full session record including every step's current state, response, cost, latency, and the final synthesized result once the session completes.
Server-Sent Events stream of live progress. The connection opens with a snapshot event of current state, then streams events as the executor produces them. Closes on a terminal event.
Event types
| Event | Payload |
|---|---|
| snapshot | Current state of every step. Replayed on every connect so a late subscriber sees what happened before they joined. |
| step_started | {session_id, step_id, step_index, agent_role, provider_id} |
| step_completed | {session_id, step_id, step_index, provider_id, model, cost_usd, latency_ms} |
| step_failed | {session_id, step_id, step_index, error} |
| synthesis_started | Emitted right before the final Claude synthesis pass. |
| session_completed | {session_id, status, total_cost_usd, total_latency_ms, completed_steps, failed_steps} (terminal) |
| session_failed | Same shape as above with status = "failed" (terminal) |
event: snapshot
data: {"type":"snapshot","step_count":3,"steps":[…]}
event: step_completed
data: {"step_index":0,"provider_id":"anthropic","cost_usd":0.001008,"latency_ms":2456}
event: synthesis_started
data: {"session_id":"7678cc6e-…"}
event: session_completed
data: {"status":"completed","total_cost_usd":0.017839,"total_latency_ms":33813}
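A subscriber typically folds these events into a running summary and stops at the first terminal event. The sketch below takes already-parsed (event, payload) pairs, so it is independent of how the stream is read; the shape of the summary dict is an illustrative choice, not something HiveIQ defines.

```python
from typing import Iterable, Tuple

TERMINAL_EVENTS = {"session_completed", "session_failed"}  # terminal per the event table


def fold_events(events: Iterable[Tuple[str, dict]]) -> dict:
    """Fold (event, payload) pairs into a progress summary, stopping at a terminal event."""
    summary = {"completed": [], "failed": [], "status": "active", "total_cost_usd": None}
    for name, payload in events:
        if name == "step_completed":
            summary["completed"].append(payload["step_index"])
        elif name == "step_failed":
            summary["failed"].append(payload["step_index"])
        elif name in TERMINAL_EVENTS:
            fallback = "failed" if name == "session_failed" else "completed"
            summary["status"] = payload.get("status", fallback)
            summary["total_cost_usd"] = payload.get("total_cost_usd")
            break  # the server closes the stream after a terminal event
    return summary


events = [
    ("snapshot", {"type": "snapshot", "step_count": 3, "steps": []}),
    ("step_completed", {"step_index": 0, "provider_id": "anthropic",
                        "cost_usd": 0.001008, "latency_ms": 2456}),
    ("synthesis_started", {"session_id": "example"}),
    ("session_completed", {"status": "completed", "total_cost_usd": 0.017839,
                           "total_latency_ms": 33813}),
]
print(fold_events(events))
```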
Model Registry
A curated catalog of every model HiveIQ can route to, with benchmark scores, parameter counts, context windows, and live cost-per-1k figures.
Filterable, sortable model catalog.
Query parameters
| Parameter | Type | Description |
|---|---|---|
| family | string | Model family: gpt, claude, llama, mistral, etc. |
| capability | string | Required capability tag: chat, reasoning, code, vision, function_calling. |
| sort | string | cost_asc (default), cost_desc, mmlu_desc, params_desc. |
Full record for one model: parameters, context window, capabilities, benchmark scores (MMLU, HumanEval, GSM8K), and per-1k pricing.
Header counters: total_models, families, avg_mmlu, cheapest_model, top_mmlu_model, usage_24h.
Pick the best model for a workload hint. Mirrors the logic the routing engine uses internally when model="auto".
{
"workload_hint": "reasoning",
"max_cost_per_1k": 0.005,
"required_capabilities": ["chat", "reasoning"]
}
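To make the request fields concrete, here is one possible reading of how such a recommendation could be computed client-side. Everything beyond the three request fields is an assumption: the mapping from workload hint to benchmark, the catalog field names, and the catalog entries (model-a, model-b, model-c) are hypothetical and do not describe the real registry or the routing engine's actual logic.

```python
from typing import Iterable, List, Optional

# Assumed mapping from workload hint to the benchmark used for ranking.
HINT_BENCHMARK = {"chat": "mmlu", "code": "humaneval", "reasoning": "gsm8k"}


def recommend(catalog: List[dict], workload_hint: str = "chat",
              max_cost_per_1k: Optional[float] = None,
              required_capabilities: Iterable[str] = ()) -> Optional[dict]:
    """Best-scoring model that has every required capability and fits the cost cap."""
    metric = HINT_BENCHMARK[workload_hint]
    needed = set(required_capabilities)
    candidates = [
        m for m in catalog
        if needed <= set(m["capabilities"])
        and (max_cost_per_1k is None or m["cost_per_1k"] <= max_cost_per_1k)
    ]
    return max(candidates, key=lambda m: m[metric], default=None)


# Hypothetical catalog entries, for illustration only.
catalog = [
    {"id": "model-a", "capabilities": ["chat", "reasoning"], "cost_per_1k": 0.004,
     "mmlu": 80.0, "humaneval": 70.0, "gsm8k": 85.0},
    {"id": "model-b", "capabilities": ["chat", "reasoning"], "cost_per_1k": 0.010,
     "mmlu": 88.0, "humaneval": 85.0, "gsm8k": 95.0},
    {"id": "model-c", "capabilities": ["chat"], "cost_per_1k": 0.001,
     "mmlu": 70.0, "humaneval": 50.0, "gsm8k": 60.0},
]
choice = recommend(catalog, "reasoning", 0.005, ["chat", "reasoning"])
print(choice["id"])
```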
Provider Identity
Every HiveIQ provider has an on-chain identity backed by a Non-Fungible Access Certificate (NAC) issued via the Vitreus pallet_nac_managing. Identity gates marketplace participation and influences scoring.
NAC level tiers
| Level | Meaning | Marketplace effect |
|---|---|---|
| L0 | Unverified — no certificate | Cannot participate in bidding |
| L1 | Basic identity verified | Allowed to bid · ×1.00 reputation |
| L2 | Reputation-staked | Allowed to bid · ×1.00 reputation |
| L3 | Compliance-attested | Allowed to bid · ×1.00 reputation |
| L4 | Enterprise / institutional | ×1.10 reputation multiplier |
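The tier table reduces to a small lookup. This sketch encodes only what the table states: level 0 cannot bid, levels 1 to 3 bid at ×1.00, and level 4 bids at ×1.10. Returning None as the multiplier for level 0 is an illustrative choice, since the table gives no multiplier for providers that cannot bid.

```python
from typing import Optional, Tuple


def marketplace_effect(nac_level: int) -> Tuple[bool, Optional[float]]:
    """Return (can_bid, reputation_multiplier) for a NAC level from 0 to 4."""
    if not 0 <= nac_level <= 4:
        raise ValueError(f"unknown NAC level: {nac_level}")
    if nac_level == 0:
        return False, None   # unverified: excluded from bidding
    if nac_level == 4:
        return True, 1.10    # enterprise tier: +10% reputation at scoring time
    return True, 1.00        # levels 1 to 3: allowed to bid, no bonus


for level in range(5):
    print(level, marketplace_effect(level))
```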
The current verify_nac() implementation simulates nacManaging.checkNacLevel() with deterministic per-provider levels. Once the Vitreus testnet RPC is available, the call will route to the real chain; the response shape stays identical.
All provider identities, sorted by NAC level desc. Set only_verified=true to filter out L0 records.
{
"success": true,
"data": [
{
"identity_id": "10294a48-…",
"provider_id": "openai",
"nac_level": 4,
"nac_account": "nac::openai",
"vtrs_address": "0x5f0e9aF2cD3a89A11d4f1c9d6FB76eD09a4cB10E",
"verified": true,
"reputation_multiplier": 1.1,
"issued_at": "2026-04-11T03:50:54Z",
"last_verified_at": "2026-04-11T03:51:13Z"
}
],
"meta": {"count": 6, "verified_count": 6, "source": "vitreus_pallet_nac_managing (simulated)"}
}
Single provider identity record. Returns 404 if no record exists.
Trigger an NAC verification check. Auto-registers the provider if no record exists, then calls (simulated) nacManaging.checkNacLevel() and updates the record with the new level + last_verified_at timestamp. meta.extrinsic identifies the underlying chain call.
A2A Protocol
HiveIQ implements Google's Agent2Agent (A2A) protocol so any A2A-compatible orchestrator can discover HiveIQ and submit tasks without integration work.
A2A discovery document. Public, unauthenticated. Returns the AgentCard with name, version, capabilities, default input/output modes, and the list of skills HiveIQ exposes.
{
"name": "HiveIQ",
"description": "AI compute orchestration with on-chain reputation, NAC-backed provider identity, and multi-provider routing.",
"url": "https://hiveiq.xyz/a2a",
"version": "0.5.0",
"provider": {"organization": "HiveIQ × Vitreus", "url": "https://vitreus.io"},
"capabilities": {"streaming": false, "pushNotifications": false, "stateTransitionHistory": true},
"defaultInputModes": ["text"],
"defaultOutputModes": ["text"],
"skills": [
{"id": "hiveiq.chat-completion", "name": "HiveIQ Chat Completion", "…": "…"},
{"id": "hiveiq.multi-agent-session", "name": "HiveIQ Multi-Agent Session", "…": "…"}
]
}
Synchronous A2A tasks/send equivalent. Translates the A2A envelope into a HiveIQ TaskRequest, executes it via the routing engine, and returns an A2A Task envelope with state: completed and a metadata.hiveiq object containing provider, model, tokens, and cost.
{
"id": "task-1",
"sessionId": "chat-9f",
"input": {
"message": {
"role": "user",
"parts": [
{"type": "text", "text": "Say hello in haiku form."}
]
}
},
"metadata": {"max_tokens": 200, "temperature": 0.5}
}
{
"id": "task-1",
"sessionId": "chat-9f",
"status": {"state": "completed", "timestamp": "2026-04-11T04:15:42Z"},
"input": {"message": {"role": "user", "parts": [{"type": "text", "text": ""}]}},
"output": [
{"type": "text", "text": "Gentle morning light, / Whispers of the dawn arise, / Hello, world anew."}
],
"metadata": {
"hiveiq": {
"provider": "openai",
"model": "gpt-4o-mini",
"input_tokens": 14,
"output_tokens": 19,
"estimated_cost_usd": 1.35e-05
}
},
"createdAt": "2026-04-11T04:15:40Z",
"completedAt": "2026-04-11T04:15:42Z"
}
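The translation step from an A2A envelope to a Tasks API body might look like the sketch below. The exact mapping HiveIQ performs is not documented here, so this assumes that text parts are concatenated into a single message and that max_tokens and temperature are copied from metadata when present.

```python
def a2a_to_task_request(envelope: dict) -> dict:
    """Map an A2A request envelope onto a POST /tasks body (assumed mapping)."""
    message = envelope["input"]["message"]
    # Join all text parts; non-text parts are ignored in this sketch.
    text = "".join(p["text"] for p in message["parts"] if p.get("type") == "text")
    request = {
        "task_type": "chat_completion",
        "messages": [{"role": message["role"], "content": text}],
    }
    metadata = envelope.get("metadata", {})
    for key in ("max_tokens", "temperature"):
        if key in metadata:
            request[key] = metadata[key]
    return request


envelope = {
    "id": "task-1",
    "sessionId": "chat-9f",
    "input": {"message": {"role": "user",
                          "parts": [{"type": "text", "text": "Say hello in haiku form."}]}},
    "metadata": {"max_tokens": 200, "temperature": 0.5},
}
print(a2a_to_task_request(envelope))
```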
Spawn a HiveIQ multi-agent session from an A2A orchestrator request. Same body shape as /a2a/tasks; the input message is treated as the task description. Returns immediately with state: submitted and a metadata.hiveiq.session_id the caller polls or streams.
{
"id": "task-1",
"status": {"state": "submitted", "timestamp": "2026-04-11T04:17:50Z"},
"metadata": {
"hiveiq": {
"session_id": "7678cc6e-d08f-425b-af5b-7c0f2e3a635d",
"step_count": 3,
"stream_url": "/agents/sessions/7678cc6e-…/stream",
"poll_url": "/agents/sessions/7678cc6e-…"
}
}
}
Provider mesh in an A2A-friendly schema. HiveIQ-specific extension — useful for clients that want to enumerate which underlying providers can satisfy a skill before submitting.
A2A task status states
| State | Meaning |
|---|---|
| submitted | Accepted by HiveIQ, not yet running. |
| working | Provider is producing output (streaming-only state). |
| input-required | Agent is asking for additional input before continuing. |
| completed | Terminal: output is in output[]. |
| failed | Terminal: see status.message. |
| canceled | Terminal: caller canceled before completion. |
Polling an async session
For sessions created via POST /a2a/sessions, poll the HiveIQ-native endpoint GET /agents/sessions/{session_id} using the session_id from metadata.hiveiq, or subscribe to the SSE stream at {stream_url} for live updates. The session terminates with status: completed or status: failed.
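A polling loop for this pattern can be sketched as follows. The fetch argument is a hypothetical callable that performs GET /agents/sessions/{session_id} and returns the session record as a dict with a status field; injecting it, along with the sleep and clock functions, keeps the loop testable without network access.

```python
import time
from typing import Callable

TERMINAL_STATUSES = ("completed", "failed")  # terminal statuses named in the text


def poll_session(fetch: Callable[[str], dict], session_id: str,
                 timeout: float = 120.0, interval: float = 2.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> dict:
    """Call fetch(session_id) until the session reaches a terminal status."""
    deadline = clock() + timeout
    while True:
        session = fetch(session_id)
        if session["status"] in TERMINAL_STATUSES:
            return session
        if clock() >= deadline:
            raise TimeoutError(f"session {session_id} still {session['status']}")
        sleep(interval)


# Demo with a fake fetch that completes on the third poll.
responses = iter([{"status": "active"}, {"status": "active"}, {"status": "completed"}])
final = poll_session(lambda sid: next(responses), "demo-session", sleep=lambda s: None)
print(final["status"])
```

For long-running sessions the SSE stream avoids repeated requests; polling is the simpler option when a persistent connection is impractical.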
Python SDK
Zero-dependency stdlib client. No requests, no httpx, no transitive deps to vet.
The SDK lives in sdk/hiveiq.py in the GitHub repo. Install it directly from GitHub with pip while the PyPI release is being prepared.
# Install from GitHub (recommended for now):
pip install git+https://github.com/Bison1330/hiveiq-core.git#subdirectory=sdk
# Or grab the single-file SDK directly:
curl -O https://raw.githubusercontent.com/Bison1330/hiveiq-core/main/sdk/hiveiq.py
# Once PyPI publishing completes:
pip install hiveiq
Quickstart
from sdk.hiveiq import HiveIQ, HiveIQAPIError

client = HiveIQ(
    api_key="hiq_...",
    base_url="https://hiveiq.xyz",
)

# Submit a task in auto routing mode
result = client.execute(
    messages=[{"role": "user", "content": "Summarize the CAP theorem."}],
    provider="auto",
    privacy_tier="public",
    max_tokens=256,
)
print(result["output"])
print(f"via {result['provider']} — {result['estimated_cost_usd']:.6f}")
Streaming
for chunk in client.stream(
    messages=[{"role": "user", "content": "Tell me a story."}],
):
    print(chunk["delta"], end="", flush=True)
Marketplace bidding
result = client.execute(
    messages=[{"role": "user", "content": "Generate a function to validate IBANs."}],
    routing="marketplace",
    required_capabilities=["chat", "code"],
)
print(f"winner: {result['provider']}")
Multi-agent session
session = client.create_session(
    task="Compare PostgreSQL vs Redis as a fintech cache layer, then recommend one.",
    orchestrator="claude",
)
print("session_id:", session["session_id"])

# Poll until complete
final = client.wait_for_session(session["session_id"], timeout=120)
print(final["final_result"])
Other helpers
- client.providers(): list every registered provider with availability + pricing
- client.analytics(): aggregate stats for the calling user
- client.history(limit=20): recent task log
Rate Limits
HiveIQ rate-limits per IP for anonymous traffic and per user for authenticated traffic. Counters live in Redis so they survive restarts and apply across workers.
| Tier | Tasks (/tasks) | Auth (login/register) | API Keys (max active) |
|---|---|---|---|
| Anonymous | 10/min · 50/day | 5–10/hour per IP | — |
| Authenticated | 50/min · 500/day | 5/hour per user | 10 |
| Operator key | unlimited | — | — |
Limit headers
Every rate-limited response carries the standard slowapi headers so clients can back off gracefully:
X-RateLimit-Limit: 50
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1712842631
Retry-After: 38
What happens at the limit
Requests over the limit return 429 Too Many Requests. The body includes a human-readable hint specific to the endpoint:
{
"detail": "Demo rate limit reached. Contact the HiveIQ team for full API access."
}
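A client that receives a 429 can derive its wait time from the limit headers. The sketch below assumes Retry-After is given in seconds and X-RateLimit-Reset is a Unix timestamp, as the sample values on this page suggest; the function name and the one-second default are illustrative.

```python
import time
from typing import Mapping, Optional


def backoff_seconds(headers: Mapping[str, str], now: Optional[float] = None,
                    default: float = 1.0) -> float:
    """Seconds to wait after a 429: Retry-After first, then X-RateLimit-Reset."""
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    if "retry-after" in h:
        return max(0.0, float(h["retry-after"]))
    if "x-ratelimit-reset" in h:
        now = time.time() if now is None else now
        return max(0.0, float(h["x-ratelimit-reset"]) - now)
    return default


print(backoff_seconds({"Retry-After": "38"}))
```

Clamping at zero avoids a negative sleep when the reset time has already passed by the time the response is processed.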
Error Reference
Phase 6 endpoints (/marketplace/*, /models/*, /identity/*, /agents/sessions/*) use a unified envelope. Legacy endpoints (/tasks/*, /auth/*) keep a flat {detail} shape for SDK back-compat.
Envelope format
{
"success": false,
"data": null,
"error": {"code": "not_found", "message": "Session 'abc' not found"},
"meta": {"ts": "2026-04-11T04:15:40Z"}
}
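Since two error shapes coexist (the unified envelope and the flat {detail} form), a client may want to normalize both into one exception type. The class below is a hypothetical stand-in and is not the SDK's HiveIQAPIError; the synthetic http_<status> code for flat errors is likewise an illustrative convention.

```python
class HiveIQError(Exception):
    """Hypothetical client-side error carrying status, code, and message."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def error_from_response(status: int, body: dict) -> HiveIQError:
    """Normalize either error shape into a HiveIQError."""
    error = body.get("error")
    if isinstance(error, dict):            # unified envelope
        return HiveIQError(status, error.get("code", "unknown"), error.get("message", ""))
    # Flat shape: detail may be a string or a list of field errors.
    return HiveIQError(status, f"http_{status}", str(body.get("detail", "")))


err = error_from_response(404, {
    "success": False,
    "data": None,
    "error": {"code": "not_found", "message": "Session 'abc' not found"},
    "meta": {"ts": "2026-04-11T04:15:40Z"},
})
print(err)
```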
Status codes
| Code | Meaning | Common cause / fix |
|---|---|---|
| 400 | Bad Request | Privacy tier can't be satisfied by any available provider, or the request body is structurally valid but semantically invalid. |
| 401 | Unauthorized | Missing or invalid Authorization header. JWT may be expired, revoked, or issued before a password change. |
| 403 | Forbidden | Authenticated but the resource isn't owned by the calling user. |
| 404 | Not Found | The session_id, model_id, or provider_id doesn't exist. Check the path. |
| 409 | Conflict | Typically the API key cap (10 active keys per user) — revoke an existing key first. |
| 422 | Unprocessable Entity | Request body failed Pydantic validation. The response body lists each invalid field. |
| 429 | Too Many Requests | Rate limit exceeded. Honor the Retry-After header. |
| 500 | Internal Server Error | Unhandled exception inside HiveIQ. Captured by Sentry; the request id is in the response headers as X-Request-Id. |
| 501 | Not Implemented | Endpoint is reserved but not yet active — currently only POST /marketplace/listings. |
| 502 | Bad Gateway | Every eligible provider failed (AllProvidersFailedError) or the upstream LLM API returned an unrecoverable error. |
| 503 | Service Unavailable | /health returned degraded — DB, cache, or all providers offline. Check GET /health. |