Deep Research

Overview

Deep Research takes a free-text brief about one research target — a company, a person, or a team — and produces a cited report. A reasoning agent plans the research, fans out research agents that gather evidence from the web and Extruct’s data sources, reviews what came back, and synthesizes the result. It is implemented through the deep_research_tasks endpoints in API Reference.

This Path Works Best When

You want a deep, sourced report on one target: account planning, buyer research, initiative summaries, diligence.
You are researching a person before outreach or a meeting: who they are, what their team owns, and the best angle to open a conversation.
You need every claim backed by citations, or structured output that downstream systems can parse.
You are willing to wait minutes for an asynchronous task in exchange for depth.

Choose Another Path If

You want to discover many companies matching criteria. Use Deep Search.
You want repeatable enrichment across a list of companies. Use AI Tables.
You need one company’s profile facts instantly. Use Company Lookup.

Prerequisites

Deep Research requires a Pro plan, and starting a task requires the full depth budget in available credits (see Costs below).

export EXTRUCT_API_TOKEN="YOUR_API_TOKEN"

Generate tokens in Dashboard API Tokens. For full setup, see Authentication.

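Endpoints used

POST https://api.extruct.ai/v1/deep_research_tasks: create a task.
GET https://api.extruct.ai/v1/deep_research_tasks/${TASK_ID}: poll status and read the report.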

Workflow

1) Create a task

Write the brief like a request to a research analyst: a detailed paragraph naming the target, your own context (what you sell or research, who your buyer is), and the decision the report should support. Detailed, specific briefs reliably produce better reports than one-liners — “Help me break into Shell” is a bad brief; the example below is a good one.

TASK_RESPONSE=$(curl -sS -X POST "https://api.extruct.ai/v1/deep_research_tasks" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${EXTRUCT_API_TOKEN}" \
  -d '{
    "brief": "We sell a cloud cost-optimization platform to large enterprises; typical buyers are VPs of Infrastructure and FinOps leads. I am preparing outreach to Shell. Research how Shell'"'"'s IT and digital organization is structured, who owns cloud infrastructure and FinOps decisions, which cloud, data, or efficiency initiatives they announced in the last 18 months, and which vendors or system integrators they already work with. I want practical conversation angles tied to live initiatives, plus any signals of cost-cutting programs or budget pressure.",
    "depth": "medium"
  }')

TASK_ID=$(echo "${TASK_RESPONSE}" | jq -r '.id')
echo "${TASK_ID}"

Requires jq. If unavailable, copy id manually from the response.

People are first-class targets too. Include your own context and the profile to research:

curl -sS -X POST "https://api.extruct.ai/v1/deep_research_tasks" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${EXTRUCT_API_TOKEN}" \
  -d '{
    "brief": "Here is my company: example.com. We sell AI-powered sales-enablement software to mid-market B2B teams. Research this person and the team they work with: https://www.linkedin.com/in/example-profile. I want their role and scope, what their team owns, recent initiatives or public statements, tools they already use, and the best angle to open a conversation.",
    "depth": "medium"
  }'

2) Poll until done

curl --get "https://api.extruct.ai/v1/deep_research_tasks/${TASK_ID}" \
  -H "Authorization: Bearer ${EXTRUCT_API_TOKEN}"

Poll every 10–15 seconds. While running, the counters show progress: iterations (analysis steps), agents (research agents completed), sources (unique sources collected). The task is finished when status is done or failed.
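The polling rule above can be wrapped in a small loop. This is a minimal sketch, not an official client: it assumes status is a top-level field of the task JSON (like id), and the function names, the POLL_INTERVAL and MAX_POLLS variables, and the return codes are hypothetical conventions of this example.

```shell
# Fetch the current task JSON with the same GET request shown above.
fetch_task() {
  curl -sS --get "https://api.extruct.ai/v1/deep_research_tasks/$1" \
    -H "Authorization: Bearer ${EXTRUCT_API_TOKEN}"
}

# Poll until status is done or failed, then print the final JSON to stdout.
# Assumption: status is a top-level field. POLL_INTERVAL and MAX_POLLS are
# knobs of this sketch, not API parameters.
# Returns 0 on done, 2 on failed, 3 if the poll limit is reached.
wait_for_task() {
  local task_id="$1" interval="${POLL_INTERVAL:-15}" max="${MAX_POLLS:-120}"
  local i=0 body status
  while [ "$i" -lt "$max" ]; do
    body=$(fetch_task "$task_id") || return 1
    status=$(printf '%s' "$body" | jq -r '.status')
    echo "poll $((i + 1)): status=${status}" >&2
    case "$status" in
      "done")   printf '%s\n' "$body"; return 0 ;;
      "failed") printf '%s\n' "$body"; return 2 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "gave up after ${max} polls" >&2
  return 3
}
```

For example, `wait_for_task "${TASK_ID}" > task.json` writes the final task JSON to a file and sends progress lines to stderr.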

3) Read the report

For a markdown task, report.kind is "markdown": the markdown cites sources with bracketed ids like [1] that resolve against report.sources (id → URL).

curl -sS --get "https://api.extruct.ai/v1/deep_research_tasks/${TASK_ID}" \
  -H "Authorization: Bearer ${EXTRUCT_API_TOKEN}" | jq -r '.report.markdown'
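To see which source ids a markdown report actually cites before looking them up in report.sources, you can extract the bracketed ids from the text. This is a minimal sketch; cited_ids is a hypothetical helper name, and it treats any bracketed integer such as [1] as a citation.

```shell
# Read report markdown on stdin and print each cited id once, in numeric order.
# Any bracketed integer such as [1] is treated as a citation.
cited_ids() {
  grep -oE '\[[0-9]+\]' | grep -oE '[0-9]+' | sort -n -u
}
```

Pipe the output of the jq command above into cited_ids to get the list of ids to resolve.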

Always check report.degradation_reasons. It lists, in plain language, anything that reduced coverage, such as research stopping at its budget or step limit, or some research agents failing. It is empty for a clean run. Show these notes to your users alongside the report.

If status is failed, read failure_reason: a rejected brief includes suggestions for fixing it, and failed tasks refund all their charges.
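These checks can be folded into one step that runs on the task JSON. This is a sketch under assumptions: it treats status and failure_reason as top-level fields and report.degradation_reasons as an array of strings, none of which is spelled out on this page. print_task_notes is a hypothetical helper name.

```shell
# Read a task JSON document on stdin and print the notes a user should see.
# Assumptions: status and failure_reason are top-level fields, and
# report.degradation_reasons is an array of strings.
print_task_notes() {
  jq -r '
    if .status == "failed" then
      "FAILED: \(.failure_reason // "no failure_reason given")"
    else
      (.report.degradation_reasons // [])
      | if length == 0 then "clean run"
        else .[] | "degraded: \(.)" end
    end'
}
```

If failure_reason turns out to be an object rather than a string, the interpolation prints it as JSON text.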

Structured reports (schema mode)

Pass an output_schema and the report comes back as JSON matching your schema instead of markdown. The schema must be a JSON Schema object ("type": "object"). We validate it the moment you create the task, so a bad schema fails right away with a 422 rather than wasting a run. Keep it small: at most 5 levels of nesting and 50 properties, with shared definitions placed at the root and referenced as #/$defs/<name>. A handful of decision-relevant fields beats a sprawling schema, because extra depth and breadth push the model to pad fields instead of grounding them in evidence.

curl -sS -X POST "https://api.extruct.ai/v1/deep_research_tasks" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${EXTRUCT_API_TOKEN}" \
  -d '{
    "brief": "We provide fraud-prevention APIs for fintech platforms and are building an account plan for Stripe. Summarize Stripe'"'"'s enterprise product initiatives from the last 12 months, identify concrete product or partnership angles where a fraud-prevention vendor could plug in, and flag risks that could stall a deal.",
    "depth": "high",
    "output_schema": {
      "type": "object",
      "properties": {
        "summary": {"type": "string"},
        "recommended_angles": {"type": "array", "items": {"type": "string"}},
        "risks": {"type": "array", "items": {"type": "string"}}
      },
      "required": ["summary", "recommended_angles", "risks"]
    }
  }'
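Before sending a request like the one above, a rough local check can catch the most obvious schema problems. This is only a sketch and not the server's validator: check_schema is a hypothetical helper, and it assumes the 50-property limit counts every key under every "properties" object. It does not check nesting depth, because this page does not say how levels are counted.

```shell
# Read a JSON Schema on stdin. Succeeds if the root type is "object" and the
# total number of declared properties is at most 50.
# Assumption: the limit counts keys under every "properties" object.
check_schema() {
  jq -e '
    def prop_count:
      [.. | objects | select(.properties | type == "object") | .properties | length]
      | add // 0;
    .type == "object" and prop_count <= 50
  ' >/dev/null
}
```

A passing local check does not guarantee acceptance; the 422 response from task creation remains the authoritative result.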

When the task is done, report.fields holds your values, guaranteed to validate against the schema. There is no separate citations object to join back to your fields. If you want sources, ask for them in the schema (below).
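Reading a single value out of report.fields follows the same pattern as reading report.markdown. A minimal sketch, where report_field is a hypothetical helper that takes the task JSON on stdin and a field name from your own schema:

```shell
# Print one field from report.fields as compact JSON.
# $1 is a property name from your own output_schema, e.g. recommended_angles.
report_field() {
  jq -c --arg name "$1" '.report.fields[$name]'
}
```

For example, piping the GET response for the task above into `report_field risks` prints the risks array on one line.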

Sources and reasoning, inline

Wrap any value in an Evidenced object to get its provenance back beside it. It has three properties: value (your schema for the data), sources (the source URLs), and reasoning (one sentence). We fill sources and reasoning from the research and keep only URLs we actually fetched — an invented source is dropped. Nothing to parse or join: the evidence sits next to the value. Wrap at the level you want cited — a whole object for one citation per entity, each array item for one per record, or a single field. Unwrapped fields come back as plain values.

{
  "type": "object",
  "properties": {
    "people": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["value", "sources", "reasoning"],
        "properties": {
          "value": {
            "type": "object",
            "properties": {
              "name": {"type": "string"},
              "role": {"type": "string"}
            }
          },
          "sources": {"type": "array", "items": {"type": "string"}},
          "reasoning": {"type": "string"}
        }
      }
    }
  }
}

The evidence comes back inline with each value:

{
  "people": [
    {
      "value": {"name": "Juan Perez", "role": "CEO"},
      "sources": ["https://www.linkedin.com/in/juan-perez"],
      "reasoning": "LinkedIn lists them as the current CEO."
    }
  ]
}
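Because the evidence sits beside each value, flattening it for a spreadsheet or CRM import takes one jq expression. The sketch below assumes the people schema from this section; evidenced_people_tsv is a hypothetical helper name.

```shell
# Read the fields JSON shown above on stdin and print one tab-separated row
# per person: name, role, comma-joined source URLs, reasoning.
evidenced_people_tsv() {
  jq -r '.people[]
    | [.value.name, .value.role, (.sources | join(",")), .reasoning]
    | @tsv'
}
```

In a real run the same JSON lives under report.fields, so select that object first (for example with `jq '.report.fields'`) and pipe the result into the helper.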

Costs

depth sets the research-agent budget: medium = 5, high = 15, xhigh = 25.
Each research agent that runs costs 2 credits. Starting a task reserves the full budget in available credits (medium = 10, high = 30, xhigh = 50), but you are billed only for agents that actually run. A focused brief often finishes well under budget.
Failed tasks — rejected briefs, mid-run credit exhaustion, internal errors — refund all their charges.
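The arithmetic above can be written down as a few shell helpers, which is handy for checking available credits before starting a task. The function names are hypothetical; the budgets and the 2-credit price are the ones stated in this section.

```shell
# Research-agent budget for a given depth.
agent_budget() {
  case "$1" in
    medium) echo 5 ;;
    high)   echo 15 ;;
    xhigh)  echo 25 ;;
    *)      echo "unknown depth: $1" >&2; return 1 ;;
  esac
}

# Credits that must be available to start a task: full budget x 2 credits.
reserved_credits() {
  local budget
  budget=$(agent_budget "$1") || return 1
  echo $(( budget * 2 ))
}

# Credits billed for a finished task, given how many agents actually ran.
billed_credits() {
  echo $(( $1 * 2 ))
}
```

For example, `reserved_credits high` prints 30, while a high task in which only 9 agents ran is billed `billed_credits 9`, which prints 18.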
