Build on the world's most comprehensive classification graph
1,000+ classification systems, 1.2M+ nodes, and 321K+ crosswalk edges - available via REST API, MCP server, packaged AI skills (Claude Code, Anthropic, ChatGPT Custom GPT, portable), or directly from the open-source repo.
Open Source
Fork, self-host, or contribute. Full source for the API, ingesters, and frontend.
colaberry/WorldOfTaxonomy
REST API
50 endpoints - search, browse, translate codes, and explore crosswalks. No SDK needed.
GET /api/v1/search?q=physician
MCP Server
Connect Claude, Cursor, VS Code, or any MCP client. 22 tools for search, translation, and hierarchy navigation.
python3 -m world_of_taxonomy mcp
AI Skills
Drop-in skills for Claude Code, Anthropic, ChatGPT, and any LLM agent. Four bundles, same backend.
skills/
Add a System
Contribute a classification system using our TDD guide. Three paths: NACE-derived, ISIC-derived, or standalone.
ingest/my_system.py
GitHub
Open source - fork, contribute, or self-host
Repository
colaberry/WorldOfTaxonomy
License
Open Source
Stack
Python + Next.js + PostgreSQL
GitHub Stars
2
Quick start
git clone https://github.com/colaberry/WorldOfTaxonomy.git
cd WorldOfTaxonomy
# Install backend dependencies
pip install -r requirements.txt
# Configure database (copy .env.example and fill in DATABASE_URL)
cp .env.example .env
# Run the API
python3 -m uvicorn world_of_taxonomy.api.app:create_app --factory --port 8000
# Run the frontend (separate terminal)
cd frontend && npm install && npm run dev
Crosswalk Explorer
Interactive graph visualization of crosswalk relationships
Explore how classification systems connect through 321K+ crosswalk edges. The system-level graph shows all connected systems grouped by category. Click any edge to drill into the code-level view with individual mappings.
System view
Systems grouped by category, edges = crosswalks
Code view
Individual codes with exact/partial/broad edges
Powered by
Cytoscape.js graph library
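The relationship between the two views can be sketched as a simple aggregation: code-level mappings are rolled up into one system-level edge per pair of systems, and clicking an edge filters back down to the mappings behind it. The tuple layout, system ids, and codes below are hypothetical placeholders, not the project's actual schema or data.

```python
from collections import Counter

# Hypothetical code-level crosswalk edges:
# (source_system, source_code, target_system, target_code, match_type)
CODE_EDGES = [
    ("sys_a", "A01", "sys_b", "X1", "exact"),
    ("sys_a", "A02", "sys_b", "X1", "partial"),
    ("sys_b", "X1", "sys_c", "Q9", "broad"),
]


def system_level_edges(code_edges):
    """Count code-level mappings per (source_system, target_system) pair."""
    return Counter((src, dst) for src, _, dst, _, _ in code_edges)


def code_level_edges(code_edges, src_system, dst_system):
    """Return the individual mappings behind one system-level edge."""
    return [e for e in code_edges if e[0] == src_system and e[2] == dst_system]
```

In this sketch the system view would draw one edge per key of `system_level_edges(CODE_EDGES)`, and the code view would render the rows returned by `code_level_edges`.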
REST API
HTTP JSON API - 50 endpoints, no SDK needed
Base URL
/api/v1
Auth
Bearer token or API key
Rate limits
30/min anon, 1,000/min auth
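As a rough illustration of how the base URL and bearer-token auth fit together, the sketch below builds a search request using only the Python standard library. It assumes a local instance on port 8000 (as in the quick start) and a standard `Authorization: Bearer` header; the host, the helper names, and the shape of the JSON response are assumptions, not documented behavior.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a self-hosted instance started as in the quick start.
BASE_URL = "http://localhost:8000/api/v1"


def build_search_request(term, token=None):
    """Build a GET request for /search, optionally with a bearer token."""
    query = urllib.parse.urlencode({"q": term})
    request = urllib.request.Request(f"{BASE_URL}/search?{query}", method="GET")
    if token:
        # Assumed header format for the "Bearer token" auth option.
        request.add_header("Authorization", f"Bearer {token}")
    return request


def search(term, token=None, timeout=10):
    """Send the request and decode the JSON body (response shape not assumed)."""
    with urllib.request.urlopen(build_search_request(term, token), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `search("physician")` without a token would count against the anonymous rate limit; passing a token should move the caller to the authenticated limit.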
Popular endpoints
/api/v1/search?q={term} - Full-text search across 1.2M+ nodes
/api/v1/systems/{id}/nodes/{code}/equivalences - Crosswalk mappings to other systems
/api/v1/classify - Classify free-text against all systems (Pro+)
/api/v1/countries/{code} - Country taxonomy profile
/api/v1/export/systems.jsonl - Bulk export as JSONL (Pro+)
Guides
Curated knowledge to use the data effectively
API + MCP quickstart, auth, rate limits
How 321K+ edges connect classification systems
Which system to use by country and purpose
ICD-10 vs ICD-11 vs MeSH vs LOINC compared
How HS, CPC, UNSPSC, and SITC relate
System design, data flows, and diagrams
MCP Server
Works with Claude, Cursor, VS Code, Windsurf, and any MCP client
The MCP (Model Context Protocol) server lets AI assistants like Claude query the taxonomy graph directly - searching codes, translating between systems, navigating hierarchies, and exploring country profiles - all from within a conversation.
Protocol
JSON-RPC over stdio
Transport
stdin / stdout
Tools
22 available
Quick start
# From the repo root (requires DATABASE_URL in environment)
python3 -m world_of_taxonomy mcp
Popular tools
search_classifications - Full-text search across all nodes
translate_code - Convert a code from one system to another
classify_business - Classify free-text against taxonomy systems
explore_industry_tree - Interactive hierarchy exploration
get_country_taxonomy_profile - Full taxonomy profile for a country
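MCP clients normally handle the wire format, but it can help to see what a tool invocation looks like over stdio. The sketch below builds a JSON-RPC 2.0 `tools/call` request as a single newline-terminated line, which is how MCP's stdio transport frames messages. The argument names and values passed to `translate_code` are hypothetical; the real parameter names come from the server's `tools/list` response.

```python
import json


def tool_call_message(request_id, tool_name, arguments):
    """Serialize an MCP tools/call request as one newline-terminated JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps emits no raw newlines, so the message stays on one line.
    return json.dumps(message) + "\n"


# Hypothetical arguments, for illustration only.
line = tool_call_message(
    1,
    "translate_code",
    {"source_system": "sys_a", "code": "A01", "target_system": "sys_b"},
)
```

A real session would first complete the MCP `initialize` handshake before any `tools/call` request is written to the server's stdin.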
AI Skills
Drop-in skill bundles for Claude, ChatGPT, and any LLM agent
Four packaged integrations, all backed by the same REST API and MCP server. Pick the one that matches your agent runtime. Source lives in the /skills directory of the repo.
Claude Code Skill
Markdown skill file with frontmatter. Drop into ~/.claude/skills/ or reference from the repo. Auto-activates on classification, translation, and hierarchy queries.
skills/claude-code/worldoftaxonomy.md
Anthropic Claude Skill
Self-contained SKILL.md bundle for claude.ai agent skills. Includes auth, endpoints, response guidance, and invocation triggers.
skills/anthropic/SKILL.md
ChatGPT Custom GPT
OpenAPI Action schema + system prompt for ChatGPT. Includes an export script that trims the spec to the 10 endpoints a Custom GPT needs.
skills/openapi/
Portable LLM Skill
Plain markdown system prompt + JSON tool schemas. Works with Gemini, Llama, LangChain, LlamaIndex, or any function-calling agent.
skills/portable/
Shared capabilities
- Classify free text (business, product, occupation, document) under standard codes across all systems
- Translate codes between systems (NAICS -> ISIC, ICD-10-CM -> ICD-10-GM, SOC -> ISCO, HS -> CPC)
- Walk hierarchies (children, ancestors, siblings) and audit crosswalk coverage between any two systems
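The hierarchy walks listed above reduce to lookups on parent pointers. The following in-memory sketch uses the same (code, title, description, parent_code) tuple layout as the ingester template later on this page; the sample nodes and helper names are hypothetical, and the hosted service presumably answers these queries from the database rather than from a Python dict.

```python
# Hypothetical nodes in (code, title, description, parent_code) layout.
NODES = [
    ("A", "Section A", "Agriculture", None),
    ("A01", "Crop production", "", "A"),
    ("A02", "Forestry", "", "A"),
    ("A011", "Cereals", "", "A01"),
]

# Map each code to its parent code (None for roots).
PARENT = {code: parent for code, _, _, parent in NODES}


def children(code):
    """Direct children of a code, in insertion order."""
    return [c for c, p in PARENT.items() if p == code]


def ancestors(code):
    """Ancestors from the nearest parent up to the root."""
    chain = []
    parent = PARENT.get(code)
    while parent is not None:
        chain.append(parent)
        parent = PARENT.get(parent)
    return chain


def siblings(code):
    """Other codes that share the same parent."""
    parent = PARENT.get(code)
    return [c for c, p in PARENT.items() if p == parent and c != code]
```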
Adding a New System
Contribute a classification system in ~10 steps using TDD
Every system follows the same TDD cycle: write a failing test first, implement the ingester to make it green, wire it into the CLI, then run the full suite to confirm no regressions. The detailed SOP lives in docs/adding-a-new-system.md.
New file
world_of_taxonomy/ingest/<system>.py
Test file
tests/test_ingest_<system>.py
Wire up
world_of_taxonomy/__main__.py
10-step checklist
- 1. Write a failing test (test_ingest_<system>.py) - confirm it is red before continuing
- 2. Create the ingester (ingest/<system>.py) - parse source data, build SYSTEM + NODES dicts
- 3. Set is_leaf correctly - use codes_with_children = {parent for ... if parent} pattern
- 4. Implement ingest(conn) - upsert system row, upsert nodes in dependency order
- 5. Run the test green - minimum code to pass, nothing more
- 6. Add crosswalk edges if a concordance table exists (ingest/crosswalk_<system>.py)
- 7. Write a test for equivalences - confirm bidirectional edges are created
- 8. Wire into __main__.py ingest command (add system id to the dispatch table)
- 9. Run the full test suite - python3 -m pytest tests/ -v - all green before committing
- 10. Update CLAUDE.md system table with name, region, and node count
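Steps 6 and 7 can be sketched without a database: expand each concordance row into two directed edges, then assert that both directions exist. The function name, edge tuple layout, and system ids below are hypothetical, and reusing the same match label in both directions is a simplification; the actual crosswalk ingesters may store edges differently.

```python
def bidirectional_edges(system_a, system_b, concordance):
    """Expand (code_a, code_b, match_type) rows into edges in both directions."""
    edges = set()
    for code_a, code_b, match_type in concordance:
        edges.add((system_a, code_a, system_b, code_b, match_type))
        edges.add((system_b, code_b, system_a, code_a, match_type))
    return edges


def test_equivalences_are_bidirectional():
    """pytest-style check corresponding to step 7 of the checklist."""
    rows = [("A01", "X1", "exact")]
    edges = bidirectional_edges("my_system_2024", "other_system", rows)
    assert ("my_system_2024", "A01", "other_system", "X1", "exact") in edges
    assert ("other_system", "X1", "my_system_2024", "A01", "exact") in edges
```

Following the TDD cycle, the test function would be written first and seen to fail before `bidirectional_edges` is implemented.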
Three implementation paths
Path A - NACE-derived
System shares all NACE Rev 2 codes (WZ, ONACE, NOGA, ATECO, NAF, PKD, SBI, etc.). Copy nodes from NACE and create 1:1 equivalence edges. ~15 lines of code.
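The copy-and-link idea behind a derived system can be illustrated in a few lines. This is a minimal sketch under assumed names and tuple layouts, not the contents of nace_derived.py: it copies the base nodes unchanged and emits one "exact" edge per code from the new system back to its source.

```python
def derive_system(base_system_id, base_nodes, new_system_id):
    """Copy base nodes into a new system and link each code 1:1 to its source.

    base_nodes uses the (code, title, description, parent_code) layout.
    Returns (nodes, edges); edges are hypothetical
    (source_system, source_code, target_system, target_code, match_type) tuples.
    """
    nodes = list(base_nodes)  # same codes, titles, and parents as the base
    edges = [
        (new_system_id, code, base_system_id, code, "exact")
        for code, _, _, _ in base_nodes
    ]
    return nodes, edges
```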
see: nace_derived.py
Path B - ISIC-derived
System is a national adaptation of ISIC Rev 4 (CIIU, VSIC, BSIC, etc.). Copy ISIC nodes and create equivalences. Add country-specific codes if the source deviates.
see: isic_derived.py
Path C - Standalone
System has its own source file (CSV, XLSX, JSON, XML, PDF). Parse source, build hierarchy from parent codes, detect leaves via codes_with_children, upsert independently.
see: naics.py, loinc.py
Minimal standalone ingester template
# world_of_taxonomy/ingest/my_system.py
SYSTEM = {
    "id": "my_system_2024",
    "name": "My Classification System 2024",
    "authority": "Issuing Body",
    "region": "Global",
    "version": "2024",
    "description": "...",
}

# (code, title, description, parent_code)
NODES = [
    ("A", "Section A", "Agriculture", None),
    ("A01", "Crop production", "...", "A"),
    ...
]

async def ingest(conn) -> None:
    await conn.execute("""
        INSERT INTO classification_system (...) VALUES (...)
        ON CONFLICT (id) DO UPDATE SET ...
    """, *SYSTEM.values())

    # Compute leaf flags dynamically - never hard-code level == N
    codes_with_children = {parent for (_, _, _, parent) in NODES if parent is not None}

    for code, title, desc, parent in NODES:
        is_leaf = code not in codes_with_children
        await conn.execute("""
            INSERT INTO classification_node (...) VALUES (...)
            ON CONFLICT (system_id, code) DO UPDATE SET ...
        """, SYSTEM["id"], code, title, desc, parent, is_leaf)
Pricing
Free, Pro, and Enterprise plans available
The full knowledge graph is available on every plan. Paid tiers add higher limits, bulk export, classification API, and dedicated support.
Contact Sales
Interested in Enterprise? Tell us about your use case and we'll get back to you.
Questions or contributions?
Open an issue or pull request on GitHub - all feedback welcome.