Overview
The Marketing Science Gateway is a stateless HTTP API. There is no SDK to install and no persistent session to maintain. Every request is independent: send data in, get results back.
The API uses MCP (Model Context Protocol) — a lightweight JSON-RPC 2.0 envelope over HTTP. You do not need an MCP library to use it. All examples on this page use plain HTTP.
A typical workflow has two steps:
- Discover — describe what you need in plain English; get back the best-matching algorithms and their required inputs.
- Execute — call the algorithm with your data; get back structured results.
You can skip discovery entirely if you already know which algorithm you want — go straight to Execute.
The endpoint
All requests go to a single URL provided when your API key is issued. The base URL takes the form:
POST https://api.yourdatastories.com/mcp
| Property | Value |
|---|---|
| Method | POST — all requests use POST |
| Content-Type | application/json |
| Accept | application/json, text/event-stream |
| Response format | Server-Sent Events (SSE) — one data: line containing JSON |
| State | Stateless — each request is independent; no persistent session required |
| Protocol | JSON-RPC 2.0 (MCP Streamable HTTP transport) |
Authentication
All requests require an API key passed as a bearer token in the Authorization header:
Authorization: Bearer YOUR_API_KEY
Your API key and endpoint URL are issued together. To request access, email admin@yourdatastories.com.
Quickstart
A Bayesian A/B test in two steps. Replace YOUR_API_KEY and YOUR_ENDPOINT with the values from your API key confirmation.
curl
# Step 1: open a session and capture the session ID
SESSION_ID=$(curl -s -D - -o /dev/null \
  -X POST YOUR_ENDPOINT \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -d '{"jsonrpc":"2.0","method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"my-client","version":"1.0"}},"id":0}' \
  | grep -i 'mcp-session-id' | awk '{print $2}' | tr -d '\r')

# Step 2: call the algorithm
curl -s -X POST YOUR_ENDPOINT \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H "Mcp-Session-Id: $SESSION_ID" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "bayesian_ab_testing",
      "arguments": {
        "control_successes": 1200,
        "control_trials": 8400,
        "test_successes": 1380,
        "test_trials": 8400
      }
    },
    "id": 1
  }'
Python
import json
import os

import requests

ENDPOINT = os.environ["YDS_ENDPOINT"]  # e.g. https://api.yourdatastories.com/mcp
API_KEY = os.environ["YDS_API_KEY"]

HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
    "Authorization": f"Bearer {API_KEY}",
}


def get_session_id():
    """Send initialize and return the session ID from the response headers."""
    r = requests.post(ENDPOINT, headers=HEADERS, json={
        "jsonrpc": "2.0",
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "my-client", "version": "1.0"},
        },
        "id": 0,
    }, timeout=8)
    # requests matches header names case-insensitively, so one lookup is enough
    return r.headers.get("Mcp-Session-Id")


def call_algorithm(session_id, tool_name, arguments):
    """Call one algorithm and return its parsed output."""
    headers = {**HEADERS, "Mcp-Session-Id": session_id}
    r = requests.post(ENDPOINT, headers=headers, json={
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
        "id": 1,
    }, timeout=30)
    # Parse the SSE envelope: take the first data: line
    raw = r.text.strip()
    for line in raw.splitlines():
        if line.strip().startswith("data:"):
            raw = line.strip()[5:].strip()
            break
    rpc = json.loads(raw)
    # The algorithm output is a JSON string nested inside the envelope
    return json.loads(rpc["result"]["content"][0]["text"])


sid = get_session_id()
result = call_algorithm(sid, "bayesian_ab_testing", {
    "control_successes": 1200,
    "control_trials": 8400,
    "test_successes": 1380,
    "test_trials": 8400,
})
print(json.dumps(result, indent=2))
Discovering algorithms
find_marketing_algorithm is the gateway's built-in search tool. Describe what you want to analyse in plain English — it returns the top matching algorithms with their names, purpose, and required inputs.
Use it when you do not know which algorithm to call, or when you want to route a user's natural language question automatically.
matches = call_algorithm(sid, "find_marketing_algorithm", {
    "query": "segment my customers by purchase behaviour"
})

# Returns up to 5 ranked results under "matched_algorithms":
# [{"tool_name": "rfm_segmentation", "purpose": "...", "required_inputs": [...], "score": 0.87}]
for algo in matches["matched_algorithms"]:
    print(algo["tool_name"], "-", algo["required_inputs"])
Discovery uses semantic search (sentence-transformers) with a keyword fallback. Plain synonyms and marketing domain terminology both work — you do not need to know the exact algorithm name.
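When routing a user's question automatically, it can help to accept a match only above some confidence level. The sketch below assumes the response shape shown in the example above (`matched_algorithms`, `tool_name`, `score`). The helper name `pick_algorithm`, the 0.5 threshold, and the sample values are illustrative assumptions, not part of the gateway.

```python
from typing import Optional


def pick_algorithm(matches: dict, min_score: float = 0.5) -> Optional[dict]:
    """Return the highest-scoring match at or above min_score, or None."""
    candidates = matches.get("matched_algorithms", [])
    eligible = [m for m in candidates if m.get("score", 0.0) >= min_score]
    if not eligible:
        return None
    return max(eligible, key=lambda m: m.get("score", 0.0))


# Sample response shaped like the discovery output above (values are illustrative).
sample = {
    "matched_algorithms": [
        {"tool_name": "rfm_segmentation", "purpose": "...",
         "required_inputs": ["records"], "score": 0.87},
        {"tool_name": "churn_scoring", "purpose": "...",
         "required_inputs": ["records"], "score": 0.41},
    ]
}

best = pick_algorithm(sample)
print(best["tool_name"] if best else "no confident match")
```

Returning `None` instead of the weakest match lets the caller ask the user to rephrase rather than run an unrelated algorithm.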
Calling an algorithm
Once you know the algorithm name and its required inputs, call it directly. Browse all 143 algorithms and their input schemas in the algorithm library.
Request structure
{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "algorithm_name",
    "arguments": { /* algorithm inputs */ }
  },
  "id": 1
}
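The envelope above can be built by a small helper so that every call gets a fresh `id` and is checked for JSON-serialisability before it is sent. This is a minimal sketch: `build_tools_call` and the module-level counter are hypothetical names, and the argument values are the ones from the Quickstart.

```python
import itertools
import json

# JSON-RPC ids only need to be unique among requests that are in flight together.
_ids = itertools.count(1)


def build_tools_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 tools/call envelope for one algorithm call."""
    if not name:
        raise ValueError("algorithm name must be a non-empty string")
    payload = {
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
        "id": next(_ids),
    }
    json.dumps(payload)  # raises TypeError early if arguments hold non-JSON types
    return payload


payload = build_tools_call("bayesian_ab_testing", {
    "control_successes": 1200,
    "control_trials": 8400,
    "test_successes": 1380,
    "test_trials": 8400,
})
print(json.dumps(payload, indent=2))
```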
Example — RFM segmentation
result = call_algorithm(sid, "rfm_segmentation", {
    "records": [
        {"customer_id": "C001", "order_date": "2024-11-15", "revenue": 250},
        {"customer_id": "C002", "order_date": "2025-01-03", "revenue": 85},
        {"customer_id": "C001", "order_date": "2025-02-20", "revenue": 190},
    ],
    "customer_id_col": "customer_id",
    "date_col": "order_date",
    "revenue_col": "revenue",
})
Data preparation
All data is passed as inline JSON in the request body. No file uploads or external storage is required.
| Property | Detail |
|---|---|
| Format | Array of row objects — [{"col": value, ...}, ...] |
| Row limit | ~2,000–5,000 rows per request. Above this, response times increase. Contact us if you need to process larger datasets. |
| Numbers | Pass as numeric types, not strings. Strip currency symbols and thousands separators before sending — $3,200 → 3200 |
| Dates | ISO 8601 strings: YYYY-MM-DD or YYYY-MM-DD HH:MM:SS |
| Column names | Match exactly what required_inputs specifies for the algorithm |
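The rules in the table can be applied in one pass before sending. The sketch below uses only the standard library. The helper names and the `MAX_ROWS` value (the upper end of the guidance above) are assumptions, and the number cleaning assumes US-style formatting such as `$3,200.50`, not `3.200,50`.

```python
import re
from datetime import date, datetime

MAX_ROWS = 5000  # assumption: upper end of the row guidance above

_NUMERIC_NOISE = re.compile(r"[^\d.\-]")  # anything except digits, dot, minus


def to_number(value):
    """Convert '$3,200' style strings to int or float; pass real numbers through."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value
    cleaned = _NUMERIC_NOISE.sub("", str(value))
    number = float(cleaned)  # raises ValueError if nothing numeric remains
    return int(number) if number.is_integer() else number


def to_iso_date(value):
    """Format date/datetime objects as ISO 8601 strings; leave strings unchanged."""
    if isinstance(value, datetime):  # check datetime first: it subclasses date
        return value.strftime("%Y-%m-%d %H:%M:%S")
    if isinstance(value, date):
        return value.isoformat()
    return str(value)


def prepare_rows(rows, numeric_cols=(), date_cols=()):
    """Return cleaned copies of the rows, ready to embed in a request body."""
    if len(rows) > MAX_ROWS:
        raise ValueError(f"{len(rows)} rows exceeds the {MAX_ROWS}-row guidance")
    prepared = []
    for row in rows:
        clean = dict(row)
        for col in numeric_cols:
            clean[col] = to_number(clean[col])
        for col in date_cols:
            clean[col] = to_iso_date(clean[col])
        prepared.append(clean)
    return prepared


records = prepare_rows(
    [{"customer_id": "C001", "order_date": date(2024, 11, 15), "revenue": "$250"}],
    numeric_cols=["revenue"],
    date_cols=["order_date"],
)
print(records)
```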
Parsing the response
Responses use Server-Sent Events (SSE) format. The body looks like this:
event: message
data: {"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"{...output...}"}]}}
The algorithm output is a JSON string nested inside result.content[0].text. Extract it like this:
raw = response.text.strip()
for line in raw.splitlines():
    if line.strip().startswith("data:"):
        raw = line.strip()[5:].strip()
        break
rpc = json.loads(raw)
output = json.loads(rpc["result"]["content"][0]["text"])
Output types
| Type in algorithm | Serialised as |
|---|---|
| DataFrame | {"columns": [...], "data": [...]} (split orient) |
| numpy array | JSON array |
| matplotlib Figure | Base64-encoded PNG string |
| numpy scalar | Python primitive (int or float) |
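Two of these serialised forms usually need a decoding step on the client. The sketch below turns a split-orient table into a list of row dicts and decodes a base64 PNG string into bytes. It assumes `data` is a list of row lists aligned with `columns`, as in pandas' split orientation. The function names and sample values are illustrative.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def split_to_records(frame: dict) -> list:
    """Turn {"columns": [...], "data": [[...], ...]} into a list of row dicts."""
    columns = frame["columns"]
    return [dict(zip(columns, row)) for row in frame["data"]]


def decode_png(encoded: str) -> bytes:
    """Decode a base64 PNG string and verify the PNG file signature."""
    raw = base64.b64decode(encoded)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded bytes do not start with the PNG signature")
    return raw


# Illustrative table; real column names depend on the algorithm.
frame = {
    "columns": ["customer_id", "segment"],
    "data": [["C001", "A"], ["C002", "B"]],
}
for row in split_to_records(frame):
    print(row)
```

The bytes returned by `decode_png` can be written to a file opened in binary mode, for example `open("chart.png", "wb")`.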
Error handling
Input validation errors
When required inputs are missing or the wrong type, the outer JSON envelope is valid but content[0].text contains a plain-text error message rather than JSON. Always wrap the inner parse in a try/except:
text = rpc["result"]["content"][0]["text"]
try:
    output = json.loads(text)
except json.JSONDecodeError:
    raise ValueError(f"Validation error: {text}")
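The SSE parsing and the validation check can be combined into one function that takes the raw response body. This is a sketch under stated assumptions: `GatewayError` and `parse_gateway_response` are hypothetical names, and the `error` branch assumes the gateway reports protocol-level failures with the standard JSON-RPC 2.0 `error` object, which this page does not confirm.

```python
import json


class GatewayError(Exception):
    """Raised when the gateway returns an error instead of algorithm output."""


def parse_gateway_response(body: str):
    """Extract algorithm output from an SSE or plain JSON response body."""
    payload = body.strip()
    # SSE form: use the first "data:" line; plain JSON bodies pass through unchanged.
    for line in payload.splitlines():
        if line.strip().startswith("data:"):
            payload = line.strip()[5:].strip()
            break
    rpc = json.loads(payload)
    if "error" in rpc:  # assumption: standard JSON-RPC 2.0 error object
        err = rpc["error"]
        raise GatewayError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")
    text = rpc["result"]["content"][0]["text"]
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Validation failures arrive as plain text inside a valid envelope.
        raise GatewayError(f"Validation error: {text}") from None


# Build a body shaped like the SSE example in "Parsing the response".
envelope = {"jsonrpc": "2.0", "id": 1,
            "result": {"content": [{"type": "text", "text": json.dumps({"ok": True})}]}}
body = "event: message\ndata: " + json.dumps(envelope)
print(parse_gateway_response(body))
```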
Timeouts
Most algorithms return in under 3 seconds. Model-training algorithms (churn_scoring, propensity_model, customer_lifetime_value_predictive) can take 10–20 seconds on larger payloads. Set your HTTP client timeout to at least 30 seconds.
Retries
The API is stateless — all requests are safe to retry. On timeout or 5xx, retry with standard exponential backoff (1s, 2s, 4s). Do not retry 400-level validation errors — fix the input first.
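The retry policy above can be sketched as a wrapper around whatever function sends the request. `RetryableError`, `is_retryable_status`, and `with_retries` are hypothetical names. The caller is assumed to raise `RetryableError` on a timeout or 5xx response and any other exception for 4xx errors. The `sleep` parameter is injectable so the demo runs without waiting.

```python
import time


class RetryableError(Exception):
    """Raised by the send function on a timeout or a 5xx response."""


def is_retryable_status(status: int) -> bool:
    """5xx responses are worth retrying; 4xx responses need a fixed input."""
    return 500 <= status <= 599


def with_retries(send, delays=(1, 2, 4), sleep=time.sleep):
    """Call send(); on RetryableError wait 1s, 2s, 4s between attempts."""
    for delay in delays:
        try:
            return send()
        except RetryableError:
            sleep(delay)
    return send()  # final attempt: let any error propagate to the caller


# Demo with a fake sender that fails twice, then succeeds.
attempts = {"count": 0}
waited = []


def flaky_send():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RetryableError("simulated timeout")
    return "ok"


print(with_retries(flaky_send, sleep=waited.append), waited)
```

With `requests`, the send function would typically translate `requests.exceptions.Timeout` and any status for which `is_retryable_status` returns true into `RetryableError`.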
Agent framework integration
The gateway is a standards-compliant MCP server. Any MCP-compatible framework can connect to it directly — all 143 algorithms register as callable tools automatically.
Generic MCP server config
{
  "mcpServers": {
    "marketing-science": {
      "url": "YOUR_ENDPOINT",
      "transport": "http",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      }
    }
  }
}
OpenAI Agents SDK
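import asyncio
import os

from agents import Agent, Runner
from agents.mcp import MCPServerStreamableHttp


async def main():
    # Connect over MCP Streamable HTTP; the context manager opens and closes the connection
    async with MCPServerStreamableHttp(
        name="marketing-science",
        params={
            "url": os.environ["YDS_ENDPOINT"],
            "headers": {"Authorization": f"Bearer {os.environ['YDS_API_KEY']}"},
        },
    ) as gateway:
        agent = Agent(
            name="Marketing Analyst",
            instructions="Use the marketing science tools to analyse data and return results.",
            mcp_servers=[gateway],
        )
        result = await Runner.run(agent, "Run RFM segmentation on this customer data: ...")
        print(result.final_output)


asyncio.run(main())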
LangChain
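import asyncio
import os

from langchain_mcp_adapters.client import MultiServerMCPClient


async def main():
    client = MultiServerMCPClient({
        "marketing-science": {
            "url": os.environ["YDS_ENDPOINT"],
            "transport": "streamable_http",
            "headers": {"Authorization": f"Bearer {os.environ['YDS_API_KEY']}"},
        }
    })
    # langchain-mcp-adapters 0.1.0 and later: get_tools() is awaited and the
    # client is no longer used as an async context manager
    tools = await client.get_tools()
    # pass tools to your LangChain agent as normal


asyncio.run(main())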
Use find_marketing_algorithm as a routing tool in your agent rather than exposing all 143 algorithms directly. It keeps token counts low and lets the model focus on a small set of relevant candidates. This is the same approach used in our own reference agent.
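One way to apply this routing approach is to filter the tool list before handing it to the model: expose only the routing tool at first, then add the algorithms it matched. The sketch below is an assumption about how such a filter could look, not the reference agent's implementation. `tools_for_turn` is a hypothetical helper, and tool objects are assumed to expose a `.name` attribute, as LangChain tools do.

```python
from types import SimpleNamespace

ROUTER = "find_marketing_algorithm"


def tools_for_turn(all_tools, matched_names=()):
    """Keep the routing tool plus any algorithms it has already matched."""
    wanted = {ROUTER, *matched_names}
    return [tool for tool in all_tools if tool.name in wanted]


# Stand-ins for framework tool objects; only the .name attribute is used here.
all_tools = [
    SimpleNamespace(name="find_marketing_algorithm"),
    SimpleNamespace(name="rfm_segmentation"),
    SimpleNamespace(name="churn_scoring"),
]

first_turn = tools_for_turn(all_tools)
second_turn = tools_for_turn(all_tools, matched_names=["rfm_segmentation"])
print([t.name for t in first_turn])
print([t.name for t in second_turn])
```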
Algorithm reference
The full library — 143 algorithms across 13 sections — is at:
www.yourdatastories.com/algorithms
Each entry shows the algorithm name (the exact value to pass as "name" in tools/call), its dependencies, and what it computes. For required input keys, call find_marketing_algorithm with a plain English description — the response includes required_inputs for each match.
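Since the discovery response lists `required_inputs`, a client can check its arguments before calling an algorithm. The sketch below assumes `required_inputs` is a list of argument key names, which the truncated example on this page does not confirm. `missing_inputs` is a hypothetical helper, and the key list mirrors the RFM segmentation example.

```python
def missing_inputs(required_inputs, arguments):
    """Return the required keys that are absent from arguments, in order."""
    return [key for key in required_inputs if key not in arguments]


# Assumed shape: key names taken from the RFM segmentation example on this page.
required = ["records", "customer_id_col", "date_col", "revenue_col"]
arguments = {
    "records": [{"customer_id": "C001", "order_date": "2024-11-15", "revenue": 250}],
    "customer_id_col": "customer_id",
    "date_col": "order_date",
}

gaps = missing_inputs(required, arguments)
if gaps:
    print("Missing before calling the algorithm:", gaps)
```

Catching a missing key locally avoids a round trip that would only return a plain-text validation error.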