Data Provider Guide

This guide explains how data providers publish signal catalogs via adagents.json, enabling AI agents to discover, verify authorization, and activate signals for advertising campaigns.

The Problem

Data providers (Pinnacle Data, Meridian Analytics, Apex Segments, etc.) own valuable audience and contextual data, but integrating with the growing ecosystem of AI-powered advertising agents presents challenges:
  • Discovery is fragmented. Each signals agent (Luminary Data, Nova DSP, etc.) needs custom integrations to know what signals you offer. There’s no standard way for an AI agent to ask “what automotive purchase intent signals does Pinnacle Data have?”
  • Authorization is opaque. When a buyer receives a signal from a signals agent, they can’t verify that the agent is actually authorized to resell it. They have to trust the intermediary.
  • Signal semantics are inconsistent. Without standardized definitions, an AI agent can’t know whether “auto_intenders” is a binary segment, a propensity score, or a multi-value category, which makes it impossible to construct proper targeting expressions.
  • Scaling requires N×M integrations. Every data provider needs custom integrations with every signals agent. This doesn’t scale.

The Solution

Signal Catalogs solve these problems by letting data providers publish a machine-readable catalog of their signals at a well-known URL. This enables:
  • Discovery: AI agents can find signals via natural language (“find automotive purchase intent signals”) or structured lookup
  • Authorization verification: Buyers can verify authorization by checking the data provider’s domain directly
  • Typed targeting: Signal definitions include value types (binary, categorical, numeric) so agents can construct correct targeting expressions
  • Scalable partnerships: Authorize agents once in your catalog; as you add signals, authorized agents automatically have access

Overview

Data providers own audience and contextual data (purchase intent, demographics, behavioral segments). The Signal Catalog feature lets you publish your signals in a standardized format that:
  • Enables discovery via natural language queries
  • Provides authorization verification for agents
  • Describes signal characteristics (binary, categorical, numeric)
  • Supports tag-based grouping for efficient authorization
This follows the same pattern as publishers declaring properties: instead of “what ad placements exist,” you’re declaring “what signals exist.”

The Parallel Pattern

| Publishers | Data Providers |
| --- | --- |
| Declare properties (websites, apps) | Declare signals (audiences, segments) |
| Authorize agents to sell inventory | Authorize agents to resell signals |
| Use property_ids / property_tags | Use signal_ids / signal_tags |
| Buyers verify via publisher_domain | Buyers verify via data_provider_domain |
Both use /.well-known/adagents.json as the publishing mechanism. A single adagents.json file can declare both properties and signals simultaneously — see Unified declaration model.

File Location

Data providers host their signal catalog at:
https://your-domain.com/.well-known/adagents.json
This path follows the well-known URI convention defined in RFC 8615.
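
As a rough illustration of the file location above, the following Python sketch builds the catalog URL for a domain and downloads the file. The helper names catalog_url and fetch_catalog are hypothetical and not part of any official client; the sketch also assumes the file is served over HTTPS as plain JSON.

```python
import json
from urllib.request import urlopen

# Path taken from the guide; everything else here is an illustrative assumption.
WELL_KNOWN_PATH = "/.well-known/adagents.json"


def catalog_url(domain: str) -> str:
    """Build the well-known catalog URL for a bare domain such as 'example.com'."""
    return f"https://{domain.strip().strip('/')}{WELL_KNOWN_PATH}"


def fetch_catalog(domain: str, timeout: float = 10.0) -> dict:
    """Download and parse a provider's adagents.json (performs a network request)."""
    with urlopen(catalog_url(domain), timeout=timeout) as response:
        return json.load(response)
```

A buyer-side tool might call fetch_catalog("pinnacle-auto-data.com") once and cache the result; error handling for missing files or invalid JSON is left out to keep the sketch short.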

Basic Structure

{
  "$schema": "https://adcontextprotocol.org/schemas/v3/adagents.json",
  "contact": {
    "name": "Pinnacle Auto Data",
    "email": "partnerships@pinnacle-auto-data.com",
    "domain": "pinnacle-auto-data.com"
  },
  "signals": [
    {
      "id": "likely_ev_buyers",
      "name": "Likely EV Buyers",
      "description": "Consumers modeled as likely to purchase an electric vehicle in the next 12 months",
      "value_type": "binary",
      "tags": ["automotive", "green"]
    }
  ],
  "signal_tags": {
    "automotive": {
      "name": "Automotive Signals",
      "description": "Vehicle-related audience segments"
    },
    "green": {
      "name": "Green/Sustainability",
      "description": "Environmentally-conscious consumer segments"
    }
  },
  "authorized_agents": [
    {
      "url": "https://signals-agent.example.com",
      "authorized_for": "All automotive signals",
      "authorization_type": "signal_tags",
      "signal_tags": ["automotive"]
    }
  ],
  "last_updated": "2025-01-15T10:00:00Z"
}

Signal Definition

Each signal in the signals array describes a targetable segment:

Required Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier within your catalog. Pattern: ^[a-zA-Z0-9_-]+$ |
| name | string | Human-readable signal name |
| value_type | enum | Data type: binary, categorical, or numeric |

Optional Fields

| Field | Type | Description |
| --- | --- | --- |
| description | string | Detailed description of what this signal represents |
| tags | array | Tags for grouping (lowercase, alphanumeric: ^[a-z0-9_-]+$) |
| allowed_values | array | For categorical signals: valid values |
| range | object | For numeric signals: { min, max, unit } |
| restricted_attributes | array | Restricted attribute categories this signal touches (e.g., ["health_data"]). Enables structural governance matching. |
| policy_categories | array | Policy categories this signal is sensitive for (e.g., ["children_directed"]). Enables structural governance matching. |
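
The field rules above can be checked locally before publishing. The sketch below is a minimal, unofficial check of a single signal definition; validate_signal is a hypothetical helper, and it covers only the required fields, the two patterns, and the value_type enum listed in the tables.

```python
import re

# Patterns and enum values copied from the field tables above.
ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]+")
TAG_PATTERN = re.compile(r"[a-z0-9_-]+")
VALUE_TYPES = {"binary", "categorical", "numeric"}


def validate_signal(signal: dict) -> list[str]:
    """Return a list of problems found in one signal definition (empty if none)."""
    errors = []
    # Required fields must be present and be non-empty strings.
    for field in ("id", "name", "value_type"):
        value = signal.get(field)
        if not isinstance(value, str) or not value:
            errors.append(f"missing or empty required field: {field}")

    signal_id = signal.get("id")
    if isinstance(signal_id, str) and signal_id and not ID_PATTERN.fullmatch(signal_id):
        errors.append(f"id does not match the ID pattern: {signal_id!r}")

    value_type = signal.get("value_type")
    if isinstance(value_type, str) and value_type and value_type not in VALUE_TYPES:
        errors.append(f"unknown value_type: {value_type!r}")

    # Tags are optional, but each one must match the lowercase tag pattern.
    for tag in signal.get("tags", []):
        if not isinstance(tag, str) or not TAG_PATTERN.fullmatch(tag):
            errors.append(f"invalid tag: {tag!r}")
    return errors
```

An empty list means the definition passed these particular checks; it does not guarantee the hosted validator will accept the file.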

Signal Value Types

Binary Signals

A user either matches the signal or doesn’t. This is the most common type.
{
  "id": "likely_ev_buyers",
  "name": "Likely EV Buyers",
  "value_type": "binary",
  "tags": ["automotive", "purchase_intent"]
}
Targeting: Include or exclude users matching this signal.

Categorical Signals

User has one of several possible values.
{
  "id": "vehicle_ownership",
  "name": "Current Vehicle Ownership",
  "value_type": "categorical",
  "allowed_values": ["luxury_ev", "luxury_non_ev", "mid_range", "economy", "none"]
}
Targeting: Target users with specific values (e.g., “users who own a luxury EV or luxury non-EV”).

Numeric Signals

User has a score or measurement within a range.
{
  "id": "purchase_propensity",
  "name": "Auto Purchase Propensity",
  "value_type": "numeric",
  "range": {
    "min": 0,
    "max": 1,
    "unit": "score"
  }
}
Targeting: Target users within a value range (e.g., “propensity score > 0.7”).

Authorization Patterns

Pattern 1: Signal IDs (Direct References)

Authorize specific signals by ID:
{
  "authorized_agents": [
    {
      "url": "https://premium-agent.example.com",
      "authorized_for": "Premium automotive signals only",
      "authorization_type": "signal_ids",
      "signal_ids": ["likely_ev_buyers", "luxury_auto_intenders"]
    }
  ]
}
Best for: Specific, limited signal sets. Fine-grained control.

Pattern 2: Signal Tags (Efficient Grouping)

Authorize all signals with certain tags:
{
  "authorized_agents": [
    {
      "url": "https://full-catalog-agent.example.com",
      "authorized_for": "All automotive signals",
      "authorization_type": "signal_tags",
      "signal_tags": ["automotive"]
    }
  ]
}
Best for: Large catalogs. As you add signals with the tag, agents automatically get access.
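
To make the two patterns concrete, here is a small Python sketch that resolves which signal IDs a given agent is authorized for. The function name authorized_signal_ids is hypothetical. It assumes agents are matched by exact URL string and that a tag-based entry covers any signal carrying at least one of the listed tags; the guide does not spell out either rule.

```python
def authorized_signal_ids(catalog: dict, agent_url: str) -> set[str]:
    """Resolve the signal IDs one agent may resell, across both patterns."""
    signals = catalog.get("signals", [])
    known_ids = {s["id"] for s in signals}
    granted: set[str] = set()

    for entry in catalog.get("authorized_agents", []):
        # Assumption: agents are identified by exact URL match.
        if entry.get("url") != agent_url:
            continue
        auth_type = entry.get("authorization_type")
        if auth_type == "signal_ids":
            # Pattern 1: direct references; ignore IDs missing from the catalog.
            granted |= known_ids & set(entry.get("signal_ids", []))
        elif auth_type == "signal_tags":
            # Pattern 2: any signal sharing at least one authorized tag.
            wanted = set(entry.get("signal_tags", []))
            granted |= {s["id"] for s in signals if wanted & set(s.get("tags", []))}
    return granted
```

Because the tag pattern is resolved against the current signals array, a newly added signal with an authorized tag shows up in the result without any change to authorized_agents.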

Signal Tags

The signal_tags object provides metadata for tags used in signals:
{
  "signal_tags": {
    "automotive": {
      "name": "Automotive Signals",
      "description": "Vehicle ownership, purchase intent, and service signals"
    },
    "premium": {
      "name": "Premium Signals",
      "description": "High-value segments with enhanced pricing"
    }
  }
}
Why define tags?
  • Human-readable context for buyers exploring your catalog
  • Enables efficient authorization (“all premium signals”)
  • Groups related signals for easier discovery

How Buyers Use Your Catalog

1. Discovery

Buyers call get_signals on a signals agent. The agent may use your catalog for:
  • Natural language matching (“find automotive purchase intent signals”)
  • Structured lookup by signal_id

2. Authorization Verification

When a buyer receives a signal, it carries a signal_id that names the data provider, which lets the buyer verify authorization:
{
  "signal_id": {
    "source": "catalog",
    "data_provider_domain": "pinnacle-auto-data.com",
    "id": "likely_ev_buyers"
  }
}
The buyer fetches https://pinnacle-auto-data.com/.well-known/adagents.json and checks:
  1. Does the signal exist in the signals array?
  2. Is the signals agent in authorized_agents?
  3. Does the authorization cover this signal (by ID or tag)?
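
The three checks above can be sketched as a single function. This is an illustrative outline rather than a reference implementation: verify_authorization is a hypothetical name, the catalog is assumed to be already fetched and parsed, and agents are assumed to be matched by exact URL.

```python
def verify_authorization(catalog: dict, agent_url: str, signal_id: str) -> tuple[bool, str]:
    """Run the three buyer-side checks and return (authorized, reason)."""
    # Check 1: the signal must exist in the signals array.
    signal = next((s for s in catalog.get("signals", []) if s.get("id") == signal_id), None)
    if signal is None:
        return False, "signal not found in catalog"

    # Check 2: the agent must appear in authorized_agents (exact URL match assumed).
    entries = [a for a in catalog.get("authorized_agents", []) if a.get("url") == agent_url]
    if not entries:
        return False, "agent not listed in authorized_agents"

    # Check 3: at least one entry must cover this signal, by ID or by tag.
    tags = set(signal.get("tags", []))
    for entry in entries:
        auth_type = entry.get("authorization_type")
        if auth_type == "signal_ids" and signal_id in entry.get("signal_ids", []):
            return True, "authorized by signal ID"
        if auth_type == "signal_tags" and tags & set(entry.get("signal_tags", [])):
            return True, "authorized by signal tag"
    return False, "agent is listed but not authorized for this signal"
```

Returning a reason alongside the boolean makes it easier for a buyer to log why a signal was rejected.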

3. Targeting

Based on value_type, buyers construct targeting expressions:
// Binary targeting
{
  "signal_id": { "source": "catalog", "data_provider_domain": "pinnacle-auto-data.com", "id": "likely_ev_buyers" },
  "value_type": "binary",
  "value": true
}

// Categorical targeting
{
  "signal_id": { "source": "catalog", "data_provider_domain": "pinnacle-auto-data.com", "id": "vehicle_ownership" },
  "value_type": "categorical",
  "values": ["luxury_ev", "luxury_non_ev"]
}

// Numeric targeting
{
  "signal_id": { "source": "catalog", "data_provider_domain": "pinnacle-auto-data.com", "id": "purchase_propensity" },
  "value_type": "numeric",
  "min_value": 0.7
}
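
A buyer could derive expressions like the ones above from a signal definition. The following Python sketch shows one way to do that; build_targeting is a hypothetical helper, and max_value is an assumed counterpart to min_value that does not appear in this guide.

```python
from typing import Optional


def build_targeting(domain: str, signal: dict, *, value: bool = True,
                    values: Optional[list] = None,
                    min_value: Optional[float] = None,
                    max_value: Optional[float] = None) -> dict:
    """Build a targeting expression whose shape depends on the signal's value_type."""
    value_type = signal["value_type"]
    expression = {
        "signal_id": {"source": "catalog", "data_provider_domain": domain, "id": signal["id"]},
        "value_type": value_type,
    }
    if value_type == "binary":
        expression["value"] = bool(value)
    elif value_type == "categorical":
        chosen = list(values or [])
        if not chosen:
            raise ValueError("categorical targeting needs at least one value")
        allowed = signal.get("allowed_values")
        if allowed is not None and any(v not in allowed for v in chosen):
            raise ValueError("values must come from allowed_values")
        expression["values"] = chosen
    elif value_type == "numeric":
        if min_value is None and max_value is None:
            raise ValueError("numeric targeting needs min_value or max_value")
        bounds = signal.get("range", {})
        low, high = bounds.get("min"), bounds.get("max")
        # max_value is an assumption: the guide only shows min_value.
        for key, bound in (("min_value", min_value), ("max_value", max_value)):
            if bound is None:
                continue
            if (low is not None and bound < low) or (high is not None and bound > high):
                raise ValueError(f"{key}={bound} is outside the declared range")
            expression[key] = bound
    else:
        raise ValueError(f"unknown value_type: {value_type!r}")
    return expression
```

Validating against allowed_values and range at construction time catches mistakes such as a misspelled category before the expression reaches a signals agent.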

Agent-Native Signals

Not all signals come from data provider catalogs. Signals agents may also offer agent-native signals: custom signals they’ve created themselves (proprietary models, first-party data, etc.).

Signal ID Structure

Signal IDs use source as a discriminator:
| Source | Fields | Verification |
| --- | --- | --- |
| catalog | data_provider_domain + id | Verifiable via data provider’s adagents.json |
| agent | agent_url + id | Trust-based: the buyer trusts the agent |

Example: Agent-Native Signal

{
  "signal_id": {
    "source": "agent",
    "agent_url": "https://luminary-data.com/.well-known/adcp/signals",
    "id": "custom_auto_intenders"
  },
  "value_type": "binary",
  "value": true
}

When to Use Each

Use source: "catalog" when:
  • Signal comes from an external data provider (Pinnacle Data, Meridian Analytics, etc.)
  • Authorization verification is important
  • You want to reference the canonical signal definition
Use source: "agent" when:
  • Signal is proprietary to the signals agent
  • No external data provider to verify against
  • Agent has created custom models or first-party segments

Complete Example

A full signal catalog for an automotive data provider:
{
  "$schema": "https://adcontextprotocol.org/schemas/v3/adagents.json",
  "contact": {
    "name": "Pinnacle Auto Data",
    "email": "partnerships@pinnacle-auto-data.com",
    "domain": "pinnacle-auto-data.com"
  },
  "signals": [
    {
      "id": "likely_ev_buyers",
      "name": "Likely EV Buyers",
      "description": "Consumers modeled as likely to purchase an electric vehicle in the next 12 months based on vehicle registration, financial, and behavioral data",
      "value_type": "binary",
      "tags": ["automotive", "premium"]
    },
    {
      "id": "vehicle_ownership",
      "name": "Current Vehicle Ownership",
      "description": "Current vehicle category owned by the consumer",
      "value_type": "categorical",
      "allowed_values": ["luxury_ev", "luxury_non_ev", "mid_range", "economy", "none"],
      "tags": ["automotive"]
    },
    {
      "id": "purchase_propensity",
      "name": "Auto Purchase Propensity",
      "description": "Likelihood score of purchasing any new vehicle in the next 6 months",
      "value_type": "numeric",
      "range": { "min": 0, "max": 1, "unit": "score" },
      "tags": ["automotive"]
    }
  ],
  "signal_tags": {
    "automotive": {
      "name": "Automotive Signals",
      "description": "Vehicle-related audience segments"
    },
    "premium": {
      "name": "Premium Signals",
      "description": "High-value premium audience segments with enhanced pricing"
    }
  },
  "authorized_agents": [
    {
      "url": "https://luminary-data.com/.well-known/adcp/signals",
      "authorized_for": "All Pinnacle automotive signals via Luminary Data",
      "authorization_type": "signal_tags",
      "signal_tags": ["automotive"]
    },
    {
      "url": "https://nova-dsp.com/.well-known/adcp/signals",
      "authorized_for": "Pinnacle premium signals only",
      "authorization_type": "signal_ids",
      "signal_ids": ["likely_ev_buyers"]
    }
  ],
  "last_updated": "2025-01-15T10:00:00Z"
}

Location data provider example

A geo/mobility provider’s signal catalog uses the same structure but with location-specific signals. Here’s the signals array for a provider publishing foot traffic and mobility data:
{
  "signals": [
    {
      "id": "store_visitors",
      "name": "Store Visitors",
      "description": "Consumers who visited a specified retail location in the past 30 days based on opted-in mobile device data",
      "value_type": "binary",
      "tags": ["geo", "foot_traffic"]
    },
    {
      "id": "visit_frequency",
      "name": "Location Visit Frequency",
      "description": "Monthly visit count to a specified location category",
      "value_type": "numeric",
      "range": { "min": 0, "max": 30, "unit": "visits_per_month" },
      "tags": ["geo", "frequency"]
    },
    {
      "id": "commute_pattern",
      "name": "Commute Pattern",
      "description": "Categorized daily commute behavior based on observed travel patterns",
      "value_type": "categorical",
      "allowed_values": ["urban_transit", "suburban_driver", "remote_worker", "hybrid"],
      "tags": ["geo", "behavioral"]
    }
  ]
}
Note how the three value types map to different geo concepts: binary for yes/no store visitation, numeric for visit frequency with a meaningful range, and categorical for classified mobility behavior.

Identity / demographic provider example

An identity company’s signal catalog publishes consumer segments derived from financial records, surveys, and public data. Note: these are targeting segments, not raw data. Credit-derived signals may carry regulatory obligations (FCRA) — consult your compliance team before publishing.
{
  "signals": [
    {
      "id": "household_income",
      "name": "Household Income Tier",
      "description": "Modeled household income bracket based on financial and demographic indicators",
      "value_type": "categorical",
      "allowed_values": ["under_50k", "50k_75k", "75k_100k", "100k_150k", "150k_250k", "over_250k"],
      "tags": ["demographic", "income"]
    },
    {
      "id": "life_stage",
      "name": "Life Stage",
      "description": "Life stage classification derived from demographic and behavioral indicators",
      "value_type": "categorical",
      "allowed_values": ["young_adult", "early_career", "established_family", "empty_nester", "retired"],
      "tags": ["demographic", "life_stage"]
    },
    {
      "id": "credit_active",
      "name": "Active Credit Seeker",
      "description": "Consumer has actively applied for new credit products in the past 90 days",
      "value_type": "binary",
      "tags": ["financial", "in_market", "credit"]
    }
  ]
}
Identity companies often also provide cross-device identity graphs, but identity resolution as a service (matching Device A to Person B) is not yet part of the AdCP protocol. See the signals ecosystem guide for more on this boundary.

Retail media provider example

Retailers have first-party purchase data that doubles as high-value targeting signals. A retail media network can publish signals alongside its properties in the same adagents.json:
{
  "signals": [
    {
      "id": "category_buyer",
      "name": "Category Buyer",
      "description": "Purchased in the specified product category within the past 90 days",
      "value_type": "categorical",
      "allowed_values": ["electronics", "home", "beauty", "grocery", "fashion"],
      "tags": ["retail", "purchase"]
    },
    {
      "id": "purchase_frequency",
      "name": "Monthly Purchase Frequency",
      "description": "Number of purchases in a product category over the trailing 90 days",
      "value_type": "numeric",
      "range": { "min": 0, "max": 50, "unit": "purchases" },
      "tags": ["retail", "frequency"]
    },
    {
      "id": "new_to_brand",
      "name": "New to Brand",
      "description": "Consumer has no prior purchase history with the specified brand in the trailing 12 months",
      "value_type": "binary",
      "tags": ["retail", "conquest"]
    }
  ]
}
Retail signals are especially valuable because they’re deterministic — based on actual purchases, not modeled behavior. See the signals ecosystem guide for the dual-role pattern (publisher + data provider).

Validation

Use the AdAgents.json Builder to validate your signal catalog, or validate programmatically:
curl -X POST https://adcontextprotocol.org/api/adagents/validate \
  -H "Content-Type: application/json" \
  -d '{"domain": "your-domain.com"}' | jq '.data.validation'
The validator checks:
  • Required fields (id, name, value_type for each signal)
  • ID patterns (alphanumeric with underscores/hyphens)
  • Tag consistency (tags used in signals should be defined in signal_tags)
  • Authorization references (signal_ids/signal_tags should reference existing signals/tags)
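
The last two checks are cross-references within the file, so they are easy to run locally before uploading. The following Python sketch is an approximation of those checks, not the hosted validator's actual logic; check_references is a hypothetical name, and it treats an authorized tag as valid if it is either defined in signal_tags or used by at least one signal.

```python
def check_references(catalog: dict) -> list[str]:
    """Report tags and authorization entries that point at nothing in the catalog."""
    problems = []
    signals = catalog.get("signals", [])
    signal_ids = {s.get("id") for s in signals}
    defined_tags = set(catalog.get("signal_tags", {}))
    used_tags = {tag for s in signals for tag in s.get("tags", [])}

    # Tag consistency: every tag used on a signal should be defined in signal_tags.
    for tag in sorted(used_tags - defined_tags):
        problems.append(f"tag used but not defined in signal_tags: {tag}")

    # Authorization references: IDs and tags should point at existing entries.
    known_tags = defined_tags | used_tags  # assumption: either counts as existing
    for agent in catalog.get("authorized_agents", []):
        url = agent.get("url", "<no url>")
        for sid in agent.get("signal_ids", []):
            if sid not in signal_ids:
                problems.append(f"{url}: unknown signal id {sid}")
        for tag in agent.get("signal_tags", []):
            if tag not in known_tags:
                problems.append(f"{url}: unknown signal tag {tag}")
    return problems
```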

Best Practices

1. Use Descriptive IDs

// Good
{ "id": "likely_ev_buyers" }
{ "id": "household_income_150k_plus" }

// Avoid
{ "id": "seg_12345" }
{ "id": "a1b2c3" }

2. Provide Complete Metadata

Include description so buyers understand what each signal represents.

3. Use Tags for Scalability

As your catalog grows, tags enable efficient authorization without listing every signal ID.

4. Document Value Types Clearly

For categorical signals, always include allowed_values. For numeric signals, include range with unit.

5. Keep Files Updated

Update the last_updated timestamp whenever signals change. Buyers cache these files, so stale data causes authorization failures.

Declaring governance metadata

Signal definitions support two optional fields that enable structural governance matching: restricted_attributes and policy_categories. When declared, governance agents can match signals against a campaign plan’s restrictions deterministically instead of relying on semantic inference from signal names.

restricted_attributes

Declare which GDPR Article 9 special categories of personal data a signal touches. Values: racial_ethnic_origin, political_opinions, religious_beliefs, trade_union_membership, health_data, sex_life_sexual_orientation, genetic_data, biometric_data.
{
  "id": "chronic_condition_hh",
  "name": "Chronic Condition Households",
  "description": "Households with modeled indicators of chronic health conditions",
  "value_type": "binary",
  "tags": ["health", "demographic"],
  "restricted_attributes": ["health_data"]
}
When a campaign plan declares restricted_attributes: ["health_data"], a governance agent blocks this signal without needing to interpret the description.
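
The matching described above reduces to a set intersection. Here is a minimal sketch, assuming the plan's restricted attributes arrive as a plain list of strings; blocked_signals is a hypothetical name, and a real governance agent would likely apply further rules.

```python
def blocked_signals(signals: list[dict], plan_restricted_attributes: list[str]) -> list[str]:
    """Return IDs of signals that declare any attribute the campaign plan restricts."""
    restricted = set(plan_restricted_attributes)
    return [
        signal["id"]
        for signal in signals
        # Signals without the field declare nothing, so they never match here.
        if restricted & set(signal.get("restricted_attributes", []))
    ]
```

Note that a signal with no restricted_attributes field passes this check by construction, which is why declaring the metadata matters for sensitive segments.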

policy_categories

Declare which policy categories a signal is sensitive for. Policy categories group related regulatory regimes — children_directed covers COPPA, UK AADC, and GDPR Article 8. Values are registry-defined category IDs.
{
  "id": "kids_cartoon_fans",
  "name": "Kids Cartoon Fans",
  "description": "Children aged 6-12 who watch animated content",
  "value_type": "binary",
  "tags": ["entertainment", "children"],
  "policy_categories": ["children_directed"]
}

Combining both fields

A signal can declare both when it touches restricted personal data and is relevant to a specific regulatory regime:
{
  "id": "fertility_intent",
  "name": "Fertility Intent",
  "description": "Consumers researching fertility treatments",
  "value_type": "binary",
  "tags": ["health", "life_stage"],
  "restricted_attributes": ["health_data"],
  "policy_categories": ["pharmaceutical_advertising"]
}
Without governance metadata, a governance agent must infer sensitivity from signal names — this is fragile and produces false positives. Declared attributes enable deterministic matching.

Relationship to the Policy Registry

Signal definitions declare policy_categories and restricted_attributes using the same vocabulary as the Policy Registry. These fields enable governance agents to match signal metadata against policy entries during campaign validation.
| Signal field | Registry equivalent | Purpose |
| --- | --- | --- |
| policy_categories | policy_categories on policy entries | Declares which regulatory regimes the signal touches (e.g., children_directed, health_wellness) |
| restricted_attributes | restricted_attributes on policy categories | Declares which GDPR Article 9 special categories the signal touches (e.g., health_data, racial_ethnic_origin) |
Values MUST match the canonical definitions in the Policy Registry. See policy category definitions for the full list of valid policy_categories values and restricted attribute definitions for valid restricted_attributes values.

Integration with get_adcp_capabilities

Signals agents advertise available data providers via get_adcp_capabilities:
{
  "signals": {
    "data_provider_domains": ["pinnacle-auto-data.com", "meridian-analytics.com", "apex-segments.com"]
  }
}
This tells buyers which data providers’ catalogs the agent can access.
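
A buyer that talks to several signals agents could use this field to find the agents that can access a particular provider's catalog. The Python sketch below assumes the capabilities responses have already been collected into a dictionary keyed by agent URL; both function names are hypothetical.

```python
def provider_domains(capabilities: dict) -> list[str]:
    """Extract data_provider_domains from one get_adcp_capabilities response."""
    return list(capabilities.get("signals", {}).get("data_provider_domains", []))


def agents_for_provider(capabilities_by_agent: dict[str, dict], domain: str) -> list[str]:
    """List agent URLs whose capabilities include the given provider domain."""
    return sorted(
        agent_url
        for agent_url, capabilities in capabilities_by_agent.items()
        if domain in provider_domains(capabilities)
    )
```

Advertising a domain here only says the agent can access that catalog; authorization for a specific signal still has to be checked against the provider's adagents.json.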

Next Steps

  1. Create your adagents.json with your signal catalog
  2. Host at /.well-known/adagents.json on your domain
  3. Validate using the AdAgents.json Builder
  4. Partner with signals agents who will resell your data
  5. Add agents to authorized_agents as partnerships are established