Vexi Docs

Structured Business Data for AI Agents

Vexi returns a complete Agent Business Object (ABO) for any business — structured, confidence-scored, and ready for your agent to act on. One API call replaces hours of web scraping.

⚡

Fast

p95 latency under 200 ms for cached ABOs

🔒

Typed

Predictable JSON schema, every time

🤖

Agent-ready

MCP server included, works with Claude, Cursor, LangChain

🌍

Global

40+ countries, all business types

Base URL: https://api.getvexi.dev/v1

Quick Start

Get your first ABO in under 2 minutes.

1. Get your API key

Sign up at getvexi.dev and create a key in your dashboard. You'll get a vxi_test_ key by default — switch to vxi_live_ when you're ready for production traffic.

2. Make your first request

Pass your key as a Bearer token. This searches for the top three CRM platforms in the US:

curl "https://api.getvexi.dev/v1/search?q=CRM+software&location=US&limit=3" \
  -H "Authorization: Bearer vxi_test_••••••••"

import requests

resp = requests.get(
    "https://api.getvexi.dev/v1/search",
    params={"q": "CRM software", "location": "US", "limit": 3},
    headers={"Authorization": "Bearer vxi_test_••••••••"},
)
data = resp.json()

const resp = await fetch(
  "https://api.getvexi.dev/v1/search?q=CRM+software&location=US&limit=3",
  { headers: { Authorization: "Bearer vxi_test_••••••••" } }
);
const data = await resp.json();

3. Read the response

You'll get back a JSON object whose results array holds the matching ABOs, each with a confidence score, canonical fields, and a stable slug for follow-up calls.

200 OK

{
  "query_id": "q_8f2c1a",
  "count": 3,
  "credits_used": 3,
  "results": [
    {
      "slug": "hubspot",
      "name": "HubSpot",
      "business_type": "saas",
      "category": "crm",
      "location": {
        "city": "Cambridge",
        "state": "MA",
        "country": "US"
      },
      "contact": {
        "website": "https://hubspot.com"
      },
      "completeness": 96,
      "confidence": 0.98,
      "last_verified": "2026-05-21T08:11:00Z"
    },
    // salesforce, pipedrive…
  ]
}
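
A minimal sketch of reading that response in Python, assuming the field names shown in the sample above. `summarize_results` is a hypothetical helper written for illustration; it is not part of any Vexi SDK.

```python
from typing import Any


def summarize_results(data: dict[str, Any]) -> list[str]:
    """Return one human-readable line per ABO in a search response."""
    lines = []
    for abo in data.get("results", []):
        lines.append(
            f"{abo['name']} ({abo['slug']}): "
            f"completeness {abo.get('completeness')}, "
            f"confidence {abo.get('confidence')}"
        )
    return lines


# Trimmed copy of the sample response above.
sample = {
    "query_id": "q_8f2c1a",
    "count": 3,
    "credits_used": 3,
    "results": [
        {"slug": "hubspot", "name": "HubSpot", "completeness": 96, "confidence": 0.98},
    ],
}

for line in summarize_results(sample):
    print(line)
```

Keeping the slug alongside the name matters because the slug is what follow-up calls to /v1/business/{slug} expect.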

Authentication

All requests are authenticated with a Bearer token in the Authorization header. Keys are created in your dashboard and can be rotated at any time without downtime.

Authorization: Bearer vxi_live_••••••••••••••••••••••••

Key format

Prefix	Type
vxi_live_	Production key — counts against your monthly credits, returns real data.
vxi_test_	Test key — 100 free calls/day, returns deterministic fixture ABOs.

Never expose your API key in client-side code or public repositories. If a key leaks, rotate it from your dashboard immediately — the old key is invalidated within seconds.
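
One way to keep the key out of source code is to read it from the environment at runtime. The sketch below assumes the key is stored in a `VEXI_KEY` environment variable (the name used in the curl examples); `auth_headers` and `is_test_key` are hypothetical helpers, not Vexi APIs.

```python
import os


def auth_headers(env_var: str = "VEXI_KEY") -> dict[str, str]:
    """Build the Authorization header from an environment variable."""
    key = os.environ.get(env_var, "")
    # Both documented prefixes are accepted; anything else is rejected early.
    if not key.startswith(("vxi_live_", "vxi_test_")):
        raise ValueError(f"{env_var} must hold a vxi_live_ or vxi_test_ key")
    return {"Authorization": f"Bearer {key}"}


def is_test_key(key: str) -> bool:
    """Test keys return fixture ABOs, so production code may want to refuse them."""
    return key.startswith("vxi_test_")
```

Checking the prefix locally catches a missing or mistyped variable before the request is sent, rather than surfacing it as a 401.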

API Reference

Five endpoints. All return JSON. All authenticated with the same Bearer token.

GET /v1/search

Search for businesses by query, location, and category. Returns ranked ABO matches with confidence scores.

Query parameters

Parameter	Type	Required	Description
q	string	Required	Free-text search query. Business name, brand, or descriptive phrase.
location	string	optional	ISO 3166-1 alpha-2 country code. Examples: `US`, `BR`, `GB`.
business_type	string	optional	L1 category slug from `/v1/categories`.
limit	integer	optional	Max results. Free: 10, Starter: 25, Growth: 50. Default 10.
min_completeness	integer	optional	Minimum ABO completeness score, 0–100. Filters thin records.
wait	boolean	optional	Starter plan and above. If `true`, blocks until any pending crawl completes (max 30s).
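
The parameter table above can be mirrored in a small client-side validator. This is a sketch under stated assumptions: `build_search_params` is a hypothetical helper, and rejecting an over-cap `limit` locally is a design choice made here; the docs do not say whether the server rejects or clamps it.

```python
from typing import Any, Optional

# Per-plan caps copied from the `limit` row above.
PLAN_LIMIT_CAPS = {"free": 10, "starter": 25, "growth": 50}


def build_search_params(
    q: str,
    plan: str = "free",
    location: Optional[str] = None,
    business_type: Optional[str] = None,
    limit: int = 10,
    min_completeness: Optional[int] = None,
) -> dict[str, Any]:
    """Validate arguments locally and return a query-parameter dict."""
    if not q.strip():
        raise ValueError("q is required")
    cap = PLAN_LIMIT_CAPS[plan]
    if not 1 <= limit <= cap:
        raise ValueError(f"limit must be between 1 and {cap} on the {plan} plan")
    if min_completeness is not None and not 0 <= min_completeness <= 100:
        raise ValueError("min_completeness must be between 0 and 100")

    # Only include optional parameters that were actually supplied.
    params: dict[str, Any] = {"q": q, "limit": limit}
    if location is not None:
        params["location"] = location.upper()
    if business_type is not None:
        params["business_type"] = business_type
    if min_completeness is not None:
        params["min_completeness"] = min_completeness
    return params
```

The returned dict can be passed directly as `params=` to `requests.get`, which handles URL encoding.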

Request

curl "https://api.getvexi.dev/v1/search" \
  --data-urlencode "q=notion" \
  --data-urlencode "location=US" \
  --data-urlencode "business_type=saas" \
  --data-urlencode "limit=25" \
  -G -H "Authorization: Bearer $VEXI_KEY"

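import os
import requests

resp = requests.get(
    "https://api.getvexi.dev/v1/search",
    params={"q": "notion", "location": "US", "business_type": "saas", "limit": 25},
    # Read the key from the environment; Python does not expand "$VEXI_KEY".
    headers={"Authorization": f"Bearer {os.environ['VEXI_KEY']}"},
)
data = resp.json()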

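const resp = await fetch(
  "https://api.getvexi.dev/v1/search?q=notion&location=US&business_type=saas&limit=25",
  // Node.js: read the key from the environment; JS strings do not expand "$VEXI_KEY".
  { headers: { Authorization: `Bearer ${process.env.VEXI_KEY}` } }
);
const data = await resp.json();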

Response

{
  "query_id": "q_2d9f01",
  "count": 4,
  "credits_used": 3,
  "cache_hit": true,
  "results": [
    {
      "slug": "notion",
      "name": "Notion",
      "business_type": "saas",
      "category": "project-management",
      "location": { "country": "US", "city": "San Francisco" },
      "completeness": 94,
      "confidence": 0.97
    }
  ]
}

Response fields

Field	Type	Description
query_id	string	Stable id for this search; usable for analytics and replay.
count	integer	Total matches before limit cap.
credits_used	integer	How many credits this call consumed.
cache_hit	boolean	`true` when served from the ABO cache.
results[]	ABO[]	Ranked array of Agent Business Objects. See ABO Schema.

GET /v1/business/{slug}

Retrieve a single ABO by its stable slug. Returns the full record including hours, social links, and verification metadata.

Parameters

Parameter	Type	In	Description
slug	string	path	The ABO slug returned by `search`.
required_fields	string[]	query	Starter plan and above. Comma-separated field paths. If any are missing, Vexi auto-crawls before responding.

Request

curl "https://api.getvexi.dev/v1/business/hubspot?required_fields=offerings.pricing,trust.certifications" \
  -H "Authorization: Bearer $VEXI_KEY"

import os
import requests

resp = requests.get(
    "https://api.getvexi.dev/v1/business/hubspot",
    params={"required_fields": "offerings.pricing,trust.certifications"},
    # Read the key from the environment; Python does not expand "$VEXI_KEY".
    headers={"Authorization": f"Bearer {os.environ['VEXI_KEY']}"},
)
data = resp.json()

const resp = await fetch(
  "https://api.getvexi.dev/v1/business/hubspot?required_fields=offerings.pricing,trust.certifications",
  // Node.js: read the key from the environment; JS strings do not expand "$VEXI_KEY".
  { headers: { Authorization: `Bearer ${process.env.VEXI_KEY}` } }
);
const data = await resp.json();

Response

ABO (full)

{
  "slug": "hubspot",
  "schema_version": "1.0",
  "identity": {
    "name": "HubSpot",
    "legal_name": "HubSpot, Inc.",
    "founded_year": 2006,
    "category": "crm",
    "business_type": "saas",
    "description_short": "All-in-one CRM, marketing, and sales platform for scaling teams."
  },
  "offerings": {
    "primary_offering": "Customer Platform (CRM + Marketing + Sales hubs)",
    "pricing": {
      "model": "freemium",
      "starting_from": "$0/mo",
      "free_tier": true
    }
  },
  "location": {
    "is_remote": true,
    "is_international": true,
    "addresses": [{
      "city": "Cambridge", "state": "MA", "country": "US", "is_headquarters": true
    }]
  },
  "contact": {
    "website": "https://hubspot.com",
    "linkedin": "https://linkedin.com/company/hubspot"
  },
  "trust": {
    "certifications": ["SOC 2 Type II", "ISO 27001", "GDPR"],
    "years_in_market": 20
  },
  "quality": {
    "completeness_score": 96,
    "confidence": "high"
  },
  "generated_at": "2026-05-21T08:11:00Z"
}
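
The dot-notation paths used by `required_fields` (for example `offerings.pricing`) can also be resolved client-side to check what actually came back. A minimal sketch, assuming the nested-object layout of the sample ABO above; `get_path` and `missing_paths` are hypothetical helpers.

```python
from typing import Any


def get_path(abo: dict[str, Any], path: str, default: Any = None) -> Any:
    """Walk a dot-notation path such as 'offerings.pricing' through nested dicts."""
    node: Any = abo
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def missing_paths(abo: dict[str, Any], paths: list[str]) -> list[str]:
    """Return the paths that are absent or explicitly null in the ABO."""
    return [p for p in paths if get_path(abo, p) is None]


# Trimmed copy of the sample ABO above.
abo = {
    "slug": "hubspot",
    "offerings": {"pricing": {"model": "freemium", "free_tier": True}},
    "trust": {"certifications": ["SOC 2 Type II", "ISO 27001", "GDPR"]},
}

print(get_path(abo, "offerings.pricing.model"))
print(missing_paths(abo, ["offerings.pricing", "contact.email"]))
```

Treating a null value the same as an absent key matches the schema's use of explicit nulls for unpopulated fields.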

POST /v1/crawl

Trigger a live crawl for a business URL. Returns immediately with a job id, or — when wait: true — blocks until the ABO is ready.

Request body

Field	Type	Required	Description
url	string	Required	Public URL of the business homepage.
wait	boolean	optional	Block until crawl completes (max 30s). Default `false`.

Request

curl "https://api.getvexi.dev/v1/crawl" \
  -X POST \
  -H "Authorization: Bearer $VEXI_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://hubspot.com","wait":false}'

import os
import requests

resp = requests.post(
    "https://api.getvexi.dev/v1/crawl",
    json={"url": "https://hubspot.com", "wait": False},
    # Read the key from the environment; Python does not expand "$VEXI_KEY".
    headers={"Authorization": f"Bearer {os.environ['VEXI_KEY']}"},
)
data = resp.json()

const resp = await fetch("https://api.getvexi.dev/v1/crawl", {
  method: "POST",
  headers: {
    // Node.js: read the key from the environment; JS strings do not expand "$VEXI_KEY".
    Authorization: `Bearer ${process.env.VEXI_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ url: "https://hubspot.com", wait: false }),
});
const data = await resp.json();

Response

{
  "status": "queued",
  "job_id": "job_01HXY24KMP9Z3",
  "eta_seconds": 14,
  "poll_url": "/v1/jobs/job_01HXY24KMP9Z3"
}

GET /v1/jobs/{job_id}

Poll a crawl job for completion. Returns the final ABO once status is complete.

Prefer webhooks over polling for production. Configure a webhook URL in your dashboard and Vexi will POST the completed ABO directly to your endpoint.
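
If you do poll, a bounded loop keeps a stuck job from blocking forever. The sketch below is deliberately transport-agnostic: `wait_for_job` is a hypothetical helper that takes any callable returning the parsed job JSON. It assumes only that the payload carries a `status` field that becomes `"complete"`, as described above; the rest of the job payload is not documented here, so the function returns it untouched.

```python
import time
from typing import Any, Callable


def wait_for_job(
    fetch_job: Callable[[], dict[str, Any]],
    timeout: float = 60.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call fetch_job until it reports status 'complete' or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if job.get("status") == "complete":
            return job
        # Stop if another sleep would carry us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout}s")
        sleep(interval)
```

In practice `fetch_job` would wrap a GET against the `poll_url` returned by POST /v1/crawl; injecting it as a callable also makes the loop easy to test without network access.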

MCP Server

Vexi ships a native Model Context Protocol (MCP) server, letting any MCP-compatible agent — Claude, Cursor, Windsurf, or your own — query structured business data without writing a single line of API code.

MCP (Model Context Protocol) is an open standard that lets AI models call external tools in a structured, permission-safe way. When you connect Vexi's MCP server, your agent can search, look up, and crawl businesses as naturally as asking a question.

Works out of the box with Claude Desktop, Cursor, Windsurf, and any MCP-compatible client.

Setup

Pick your client. You'll need a vxi_live_ key from your dashboard and Node 18+ on your machine.

Claude Desktop

macOS · Windows · Linux

1. Open Settings → Developer → Edit Config.
2. Add the snippet below to claude_desktop_config.json.
3. Restart Claude Desktop. Look for the 🔌 icon in the chat input.

Cursor

macOS · Windows · Linux

1. Open Cursor Settings → MCP → Add new server.
2. Set the name to Vexi, the command to npx -y @vexi/mcp, and the env to VEXI_API_KEY=vxi_live_….
3. Restart Cursor. Vexi tools appear in the agent panel.

claude_desktop_config.json

{
  "mcpServers": {
    "vexi": {
      "command": "npx",
      "args": ["-y", "@vexi/mcp"],
      "env": {
        "VEXI_API_KEY": "vxi_live_••••••••"
      }
    }
  }
}
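
If your config file already lists other MCP servers, the Vexi entry should be merged in rather than pasted over the file. A minimal sketch of that merge, using only the keys shown in the snippet above; `add_vexi_server` is a hypothetical helper and the `other` server entry is invented for the example.

```python
import copy
import json
from typing import Any


def add_vexi_server(config: dict[str, Any], api_key: str) -> dict[str, Any]:
    """Return a copy of the config with the Vexi MCP server entry added."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    merged.setdefault("mcpServers", {})["vexi"] = {
        "command": "npx",
        "args": ["-y", "@vexi/mcp"],
        "env": {"VEXI_API_KEY": api_key},
    }
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_vexi_server(existing, "vxi_live_REPLACE_ME"), indent=2))
```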

Available Tools

Once connected, your agent has four typed tools available. Each maps 1:1 to a REST endpoint.

Tool	Description	Key parameters
search_businesses	Search businesses by query and location.	`query` (required) · `location` · `business_type` · `limit`
get_business	Retrieve a full ABO by slug.	`slug` (required) · `required_fields`
crawl_business	Queue a new business for crawling.	`url` (required) · `wait`
list_categories	List all available business categories.	—

Example

A real conversation with Claude Desktop, Vexi connected. The model picks the right tool, runs it, and reasons over typed ABOs:

claude · vexi mcp connected

You

Find me the top 3 CRM platforms used by startups in the US.

Claude

Let me search Vexi for that — querying CRM platforms in the US.

search_businesses(query="CRM platforms startups", location="US", limit=3)

3 results · 412 ms · ok

HubSpot · crm · saas · 96

Salesforce · crm · enterprise-saas · 94

Pipedrive · crm · saas · 89

Based on Vexi data, the three most-used CRM platforms for US startups are HubSpot (free tier, marketing-first), Salesforce (enterprise standard, deepest customization), and Pipedrive (sales-pipeline focused, lowest learning curve). Want me to pull the full ABO for any of them?

ABO Schema

An Agent Business Object (ABO) is Vexi's core data structure — a normalized, confidence-scored JSON document that describes everything an AI agent needs to know about a business.

Every ABO follows the same predictable schema, so your agent never needs to handle parsing logic for different sources. Fields are typed, nullable values are explicit, and a completeness_score tells you how much of the schema was populated.

ABO · top-level

{
  "slug": "hubspot",
  "schema_version": "1.0",
  "generated_at": "2026-05-22T08:11:00Z",
  "source_url": "https://hubspot.com",
  "pages_scraped": 4,
  "identity":        { … core facts },
  "offerings":       { … products & services },
  "location":        { … physical & geographic },
  "contact":         { … channels },
  "trust":           { … credibility },
  "operations":      { … service ops },
  "agent_interface": { … agent-ready CTAs },
  "quality":         { … data quality }
}

Fields Reference

Eight top-level objects. Each is independently populated and confidence-scored.

identity (object)

Core facts about the business: name, age, category, and a pair of agent-optimized descriptions.

Field	Type	Description
name	string	Legal or trade name.
legal_name	string | null	Full legal entity name.
founded_year	number | null	Year founded.
category	string	L1 business type (e.g. "saas").
subcategory	string | null	L2 category slug.
business_type	string	One of 20 canonical types.
description_short	string	One-sentence summary, agent-optimized.
description_long	string	Full paragraph description.
languages	string[]	ISO language codes.
markets_served	string[]	Geographic markets.

offerings (object)

Products and services the business provides.

Field	Type	Description
primary_offering	string | null	Main product/service in plain language.
services	Service[]	Named services with name, description, category.
products	Product[]	Products with name, description, price_range, category.
pricing	object	{ model, starting_from, free_tier, pricing_url }.

location (object)

Physical and geographic presence.

Field	Type	Description
is_physical	boolean	Has physical location(s).
is_remote	boolean	Operates remotely.
is_national	boolean	Serves entire country.
is_international	boolean	Operates globally.
addresses	Address[]	{ street, city, state, country, zip, is_headquarters }.
regions_served	string[]	Named regions or countries.

contact (object)

All contact channels: web, social, and direct.

Field	Type	Description
website	string	Primary URL.
email	string | null	Contact email.
phone	string | null	Phone number.
whatsapp	string | null	WhatsApp number.
instagram	string | null	Instagram handle.
linkedin	string | null	LinkedIn URL.
other_channels	Channel[]	Additional channels.

trust (object)

Social proof and credibility signals.

Field	Type	Description
differentiators	string[]	Key competitive advantages.
certifications	string[]	ISO, SOC 2, HIPAA, etc.
awards	string[]	Named awards and recognitions.
clients_served	string | null	Notable clients or client count.
years_in_market	number | null	Years in operation.
ratings	Rating[]	{ platform, score, review_count }.
social_proof	string[]	Testimonials and notable quotes.

operations (object)

Operational details for service businesses.

Field	Type	Description
operating_hours	string | null	Business hours.
response_time	string | null	Typical response SLA.
service_area	string | null	Geographic service area.
appointment_required	boolean | null	Whether appointments are required.
online_booking	boolean | null	Whether online booking is supported.

agent_interface (object)

Structured data optimized for agent decision-making: actions, intents, and tags.

Field	Type	Description
actions	Action[]	{ name, description, url, method }. CTAs the agent can take.
intents_matched	string[]	Use cases this business satisfies.
tags	string[]	Searchable keywords.
keywords	string[]	SEO and search keywords.

quality (object)

Data quality metadata. Never use a field without checking this.

Field	Type	Description
completeness_score	number	0–100, percentage of fields populated.
confidence	"low" | "medium" | "high"	Extraction confidence.
missing_fields	string[]	Dot-notation paths of empty fields.
crawl_notes	string	Notes from the crawl process.
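
The advice to check `quality` before using a field can be wrapped in a small gate. This is a sketch, assuming `missing_fields` lists exact dot-notation paths and that the three confidence labels are ordered low < medium < high; `field_is_trustworthy` and its default threshold are illustrative choices, not Vexi recommendations.

```python
from typing import Any

# Assumed ordering of the documented confidence labels.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def field_is_trustworthy(
    abo: dict[str, Any], path: str, min_confidence: str = "medium"
) -> bool:
    """Return True if `path` is populated and the ABO meets the confidence bar."""
    quality = abo.get("quality", {})
    if path in quality.get("missing_fields", []):
        return False
    # Unknown or absent labels are treated as the lowest confidence.
    level = quality.get("confidence", "low")
    return CONFIDENCE_RANK.get(level, 0) >= CONFIDENCE_RANK[min_confidence]


abo = {
    "slug": "hubspot",
    "quality": {
        "completeness_score": 96,
        "confidence": "high",
        "missing_fields": ["contact.whatsapp"],
    },
}

print(field_is_trustworthy(abo, "offerings.pricing"))
print(field_is_trustworthy(abo, "contact.whatsapp"))
```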

Rate Limits & Credits

Every call costs a small number of credits, deducted from your monthly bucket. Credits reset on the first of each calendar month, UTC.

Credit cost per operation

Operation	Credits
GET /business (cache hit)	1
GET /categories	0
GET /search (cache hit)	3
GET /search (new crawl)	10
POST /crawl	10

Plan limits

Plan	Credits / month	Keys	Search results
Free ($0)	500	1	10
Starter ($49.90/mo)	10,000	5	25
Growth ($99.90/mo)	20,000	Unlimited	50
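
The two tables above are enough to estimate monthly spend before picking a plan. A minimal sketch of the arithmetic; the operation keys, `estimate_credits`, and `smallest_plan` are hypothetical names, and the costs and budgets are copied from the tables.

```python
from typing import Optional

# Credits per operation, from the cost table above.
CREDIT_COSTS = {
    "business_cache_hit": 1,
    "categories": 0,
    "search_cache_hit": 3,
    "search_new_crawl": 10,
    "crawl": 10,
}

# Monthly credit budgets, listed from smallest to largest plan.
PLAN_CREDITS = {"free": 500, "starter": 10_000, "growth": 20_000}


def estimate_credits(calls: dict[str, int]) -> int:
    """Total credits for a dict of {operation: number_of_calls}."""
    return sum(CREDIT_COSTS[op] * n for op, n in calls.items())


def smallest_plan(credits: int) -> Optional[str]:
    """Smallest plan whose monthly budget covers `credits`, or None if none does."""
    for plan, budget in PLAN_CREDITS.items():
        if credits <= budget:
            return plan
    return None


monthly = {"search_cache_hit": 1000, "crawl": 200, "business_cache_hit": 500}
total = estimate_credits(monthly)  # 1000*3 + 200*10 + 500*1 = 5500
print(total, smallest_plan(total))
```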

Handling 429s

When you hit your limit, Vexi returns 429 Too Many Requests with a Retry-After header. Back off, then retry:

import time, requests

def call_with_retry(url, headers, max_retries=5):
    for attempt in range(max_retries):
        r = requests.get(url, headers=headers)
        if r.status_code != 429:
            return r
        delay = int(r.headers.get("Retry-After", 2))
        time.sleep(delay * (2 ** attempt))
    raise RuntimeError("rate limited")

async function callWithRetry(url, headers, maxRetries = 5) {
  for (let i = 0; i < maxRetries; i++) {
    const r = await fetch(url, { headers });
    if (r.status !== 429) return r;
    const delay = Number(r.headers.get("Retry-After")) || 2;
    await new Promise(res => setTimeout(res, delay * 1000 * (2 ** i)));
  }
  throw new Error("rate limited");
}

Error Handling

Vexi uses standard HTTP status codes. Every error returns a JSON body with a stable code, a human-readable message, and a request_id you can quote when contacting support.

Error codes

Code	Meaning
400	Bad request — malformed parameters or invalid body.
401	Invalid API key — missing, expired, or rotated.
403	Feature not available on your plan.
404	Business not found for that slug.
429	Rate limit exceeded — back off using `Retry-After`.
500	Server error. Safe to retry; incidents are reported on our status page.

Error response shape

{
  "error": {
    "code": "feature_not_available",
    "message": "The 'wait' parameter requires the Starter plan or higher.",
    "docs_url": "https://docs.getvexi.dev/errors/feature_not_available",
    "request_id": "req_01HXY27Q44LDC"
  }
}

5xx responses are safe to retry with backoff. POST /crawl deduplicates by URL inside a 60-second window, so retries within that window won't double-charge credits.
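
The error shape and retry guidance above can be combined into one small layer on the client. A sketch under stated assumptions: `VexiError`, `error_from_response`, and `is_retryable` are hypothetical names, and the parser relies only on the `code`, `message`, and `request_id` fields shown in the sample error body.

```python
from typing import Any, Optional


class VexiError(Exception):
    """Client-side wrapper for the documented error body."""

    def __init__(
        self, status: int, code: str, message: str, request_id: Optional[str] = None
    ) -> None:
        super().__init__(f"{status} {code}: {message} (request_id={request_id})")
        self.status = status
        self.code = code
        self.message = message
        self.request_id = request_id


def error_from_response(status: int, body: dict[str, Any]) -> VexiError:
    """Build a VexiError from an HTTP status and the parsed JSON body."""
    err = body.get("error", {})
    return VexiError(
        status,
        err.get("code", "unknown"),
        err.get("message", ""),
        err.get("request_id"),
    )


def is_retryable(status: int) -> bool:
    """429 and 5xx are worth retrying; other 4xx codes need a changed request."""
    return status == 429 or 500 <= status < 600
```

Keeping `request_id` on the exception means it ends up in logs and tracebacks, ready to quote when contacting support.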