About ko.io
Financial Data Infrastructure
ko.io provides clean, structured financial data for AI agents and developers. We collect, verify, and serve data from SEC EDGAR, Federal Reserve, Treasury, and other public sources — so you don't have to.
The Problem
SEC EDGAR is powerful.
Using it shouldn't be this hard.
SEC EDGAR is the single source of truth for institutional capital flows in the United States. Every 13F institutional position and every insider trade is filed there; congressional stock transactions are disclosed separately under the STOCK Act through the House and Senate. Together these are among the most important public financial datasets in the world.
But EDGAR data is raw XML and SGML, inconsistently formatted, and scattered across dozens of filing types. Amendments overwrite originals without clear versioning. CUSIPs have no direct ticker mapping. Financial statements use different XBRL taxonomies across companies and time periods.
Building a reliable pipeline from scratch takes months of engineering. Parsing XBRL, mapping CUSIPs to tickers, handling amendments, deduplicating across quarterly filings, normalizing financial metrics — each problem is solvable, but solving all of them together is a full-time infrastructure project.
Existing commercial solutions charge $500+/month, often cover only a subset of filings, and provide limited programmatic access. Many lack the granularity that serious quantitative work demands.
ko.io exists to solve this. Clean, structured, cross-referenced financial data — accessible to AI agents and developers in milliseconds, not months.
Data Coverage
Every dataset. One API.
Every dataset is ingested from SEC EDGAR and federal sources, parsed, cross-referenced, and enriched with computed fields. Updated daily via 19 automated pipelines.
13F Institutional Holdings
74.4M records
12,176 institutions, quarterly filings since 2014. Pre-calculated QoQ changes, action labels (NEW_BUY, ADD, REDUCE, SOLD_ALL), and portfolio weights.
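As a rough illustration of how action labels and portfolio weights like these can be derived from two consecutive quarters, here is a minimal Python sketch. The labeling rules, function names, and the handling of unchanged positions are assumptions made for illustration, not ko.io's actual logic.

```python
from typing import Optional


def label_action(prev_shares: int, curr_shares: int) -> Optional[str]:
    """Classify a quarter-over-quarter position change (assumed rules)."""
    if prev_shares == 0 and curr_shares > 0:
        return "NEW_BUY"   # position did not exist last quarter
    if prev_shares > 0 and curr_shares == 0:
        return "SOLD_ALL"  # position fully exited
    if curr_shares > prev_shares:
        return "ADD"
    if curr_shares < prev_shares:
        return "REDUCE"
    return None  # unchanged: the page names no label for this case


def portfolio_weights(values: dict[str, float]) -> dict[str, float]:
    """Return each holding's share of total reported portfolio value."""
    total = sum(values.values())
    if total == 0:
        return {ticker: 0.0 for ticker in values}
    return {ticker: value / total for ticker, value in values.items()}
```

Comparing share counts rather than dollar values keeps the label independent of price moves between quarters, which is one reasonable design choice among several.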
Form 4 Insider Trades
109K+ transactions
6,600+ companies. Officer and director trades with role identification, dollar values, and 10b5-1 plan detection.
Congressional Trading
21K+ trades
STOCK Act disclosures from House and Senate. Transaction types, dollar ranges, disclosure delays, and committee assignments.
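A disclosure delay is simply the gap between the transaction date and the date the report was filed. The sketch below shows one way to compute it in Python; the function names are hypothetical, and the 45-day threshold reflects the STOCK Act's outer reporting deadline as commonly summarized, so treat it as an assumption to verify against the statute.

```python
from datetime import date

# Assumed outer deadline: reports are due no later than 45 days after the trade.
STOCK_ACT_DEADLINE_DAYS = 45


def disclosure_delay_days(transaction_date: date, disclosure_date: date) -> int:
    """Number of calendar days between the trade and its public disclosure."""
    return (disclosure_date - transaction_date).days


def is_late(transaction_date: date, disclosure_date: date) -> bool:
    """True when the disclosure arrived after the assumed 45-day deadline."""
    delay = disclosure_delay_days(transaction_date, disclosure_date)
    return delay > STOCK_ACT_DEADLINE_DAYS
```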
Company Financials
449K+ records
10-K and 10-Q filings. 30+ standardized metrics per company per period, from revenue to free cash flow.
Daily Stock Prices
13.5M records
15-year history across 7,400+ tickers. OHLCV plus adjusted close, split-adjusted.
And More
24 additional datasets
Treasury yields, Fed rates and balance sheet, CPI/PPI/NFP, FINRA short volume, CFTC futures, Form 144, SC 13D/G, 8-K buybacks, N-PORT fund holdings, and more.
Infrastructure
Built for reliability at scale
Two continents, two database replicas, automated pipelines, and edge-routed traffic. Every component is designed for fault tolerance and low latency.
Dual-Region Architecture
Both regions run identical application and database instances for true active-active redundancy.
Geo-Routed API
Cloudflare Worker routes every request to the nearest origin based on geographic proximity. Automatic failover between regions.
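The routing decision can be pictured as "pick the closest healthy origin". The Python sketch below illustrates that logic only; it is not the Worker's code, and the region names and coordinates are hypothetical placeholders.

```python
import math

# Hypothetical origins as (latitude, longitude); real locations are not published here.
ORIGINS = {
    "region_a": (50.11, 8.68),
    "region_b": (39.04, -77.49),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def pick_origin(client: tuple[float, float], healthy: set[str]) -> str:
    """Return the nearest origin that is currently healthy (failover otherwise)."""
    candidates = [name for name in ORIGINS if name in healthy]
    if not candidates:
        raise RuntimeError("no healthy origin available")
    return min(candidates, key=lambda name: haversine_km(client, ORIGINS[name]))
```

If the nearest region is marked unhealthy, the same call falls through to the remaining region, which is the failover behavior described above.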
Analytical Database
Sub-10ms query times across 200M+ rows. Column-oriented storage with 3x compression, optimized for analytical workloads at scale.
19 Automated Data Pipelines
Automated ingestion, parsing, and cross-referencing. 2,000+ filings processed daily. CUSIP-to-ticker mapping, amendment handling, deduplication.
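One common way to handle amendments and duplicates is to keep only the most recently filed document for each filer and reporting period. The following Python sketch shows that idea under simplifying assumptions: the `Filing` fields are hypothetical, and a later filing is assumed to fully supersede an earlier one, which is not true of every amendment type.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Filing:
    accession: str   # unique filing identifier (hypothetical field name)
    filer_id: str    # who filed it
    period: str      # reporting period, e.g. "2024Q1"
    filed_on: date   # date the filing was submitted


def latest_per_period(filings: list[Filing]) -> list[Filing]:
    """Keep one filing per (filer, period): the one filed last."""
    latest: dict[tuple[str, str], Filing] = {}
    for filing in filings:
        key = (filing.filer_id, filing.period)
        current = latest.get(key)
        # A repeated or older filing never replaces a newer one.
        if current is None or filing.filed_on > current.filed_on:
            latest[key] = filing
    return list(latest.values())
```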
100% Data Consistency
Both regions are synchronized in real time via DAG callbacks, with a 30-minute cron job as a failsafe. Automated verification scripts confirm row-level parity across every dataset.
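Row-level parity between two copies of a table can be checked by hashing each row and comparing the resulting multisets. The Python sketch below is a minimal illustration of that technique, assuming rows are available as plain tuples; it is not the verification script used in production.

```python
import hashlib
from collections import Counter
from typing import Iterable


def row_digest(row: tuple) -> str:
    """Stable fingerprint of one row, based on its repr."""
    return hashlib.sha256(repr(row).encode("utf-8")).hexdigest()


def parity_report(primary: Iterable[tuple], replica: Iterable[tuple]) -> dict[str, int]:
    """Count rows present in one copy but not the other (duplicates respected)."""
    a = Counter(row_digest(row) for row in primary)
    b = Counter(row_digest(row) for row in replica)
    return {
        "missing_in_replica": sum((a - b).values()),
        "missing_in_primary": sum((b - a).values()),
    }
```

Using a `Counter` rather than a set means a row that appears twice in one region and once in the other is still reported as a mismatch.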
Zero-Downtime Deployments
Docker containers with health checks and automatic failover. CI/CD deploys to both regions in parallel. No maintenance windows.
Access
Three ways to query the data
Whether you prefer raw HTTP, AI tooling, or natural language — the same dataset is available through every interface.
REST API
56 endpoints. JSON responses with pagination, filtering, and sorting. Authenticate with an API key and start querying in seconds.
View Documentation →
MCP Server
18 tools via @ko-io/mcp-sec-data on npm. Works with Claude, Cursor, ChatGPT, and other AI platforms. Query SEC data in natural language.
Get Started →
API Playground
Test API calls directly from your browser. Build queries, see responses, and export results — no code required.
Try It Now →
Principles
Open and transparent
Public data, properly structured
All data is sourced from public records: SEC EDGAR filings, Federal Reserve and Treasury releases, and other public sources. We do not estimate, interpolate, or editorialize. Every record traces back to a specific filing or release.
No proprietary black boxes
No proprietary data, no sentiment scores, no AI-generated opinions. Just facts extracted from regulatory filings, cleaned, and served fast.
Transparent pricing
Start with the free tier — 100 API calls per day, permanently. Upgrade when you need higher rate limits. All plans include the full dataset.
Get in touch
Questions, partnership inquiries, or enterprise needs — we respond within 24 hours.
admin@ko.io