Karpathy Just Described the Future of Knowledge Delivery. We Already Built It.

Karpathy's LLM Knowledge Base thesis maps to the Knowledge Delivery System we've been building. Here's what a productized version requires beyond scripts and Obsidian.

Yesterday, Andrej Karpathy posted a breakdown of how he's using LLMs to build "LLM Knowledge Bases." Raw documents go into a directory. An LLM compiles them into structured, interlinked markdown. The LLM maintains the knowledge base. Health checks catch inconsistencies. Queries get filed back. Knowledge compounds.

Then he said: "I think there is room here for an incredible new product instead of a hacky collection of scripts."

I've been building that product. It's called Skill Refinery. Here's the technical story of what it takes to go from personal wiki to production platform.

Same Thesis, Different Architecture

Karpathy's insight: LLMs are knowledge compilers, not answer machines. Use them to process raw source material into structured knowledge they can reason over.

Our extraction engine does exactly this. Books, courses, training recordings, coaching frameworks, technical manuals — any format. The engine processes raw expert IP into structured skill cards. But going from Obsidian on one machine to a multi-tenant, multi-platform production system introduces five architectural problems a personal wiki never faces.

Problem 1: IP Protection at the Architecture Level

Karpathy's wiki contains public research. Nobody's livelihood depends on it. When you extract knowledge from published authors or proprietary enterprise IP, the architecture has to enforce protection by design.

Our approach: source documents are processed by the extraction engine to produce skill cards. After extraction, source files are done — they never re-enter an AI context window during delivery. Skill cards enter context transiently during MCP calls but are excluded from model training under enterprise API contracts.

This isn't a policy. It's an architectural constraint. The system physically cannot serve source material after extraction is complete.

Problem 2: Multi-Platform Delivery via MCP

Karpathy's output is markdown files in Obsidian. For a product, knowledge needs to reach users wherever they work.

We deliver through MCP (Model Context Protocol) — the open standard supported by Anthropic, OpenAI, and Microsoft. One extraction produces skill cards that work inside Claude, ChatGPT, and Microsoft Copilot. The MCP server handles authentication, skill resolution, and access control per request.

Internal members authenticate with sk_org_-prefixed keys; external clients use static client_-prefixed keys. No OAuth sessions are needed for internal delivery: the key alone resolves identity and permissions on every call.
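A rough sketch of stateless prefix-based resolution, using only the two key prefixes named above (the return shape and error type are illustrative assumptions, not the production API):

```python
def resolve_identity(api_key: str) -> dict:
    """Resolve caller identity from the key prefix on every MCP call.
    No session state: the key alone determines who is asking."""
    if api_key.startswith("sk_org_"):
        # Internal member: key maps to an org identity and its permissions.
        return {"kind": "internal", "token": api_key.removeprefix("sk_org_")}
    if api_key.startswith("client_"):
        # External client: static key with a narrower access scope.
        return {"kind": "external", "token": api_key.removeprefix("client_")}
    raise PermissionError("unrecognized key format")
```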

Problem 3: Multi-Tenant Access Control

One user, one wiki, no permissions needed. An enterprise with seven business units, each with different knowledge scopes, needs granular access.

We just shipped division-level skill visibility. Business unit assignments on skill cards control which teams see which knowledge. Changes take effect on the next MCP call — no reconnect required. The middleware resolves the requesting user's organization and division, then filters the skill response accordingly.
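The filtering step described above might look like the following sketch (field names such as `org_id` and `divisions` are assumptions; the convention that an empty division list means org-wide visibility is also assumed). Because the filter runs on every request, a changed assignment applies on the very next call:

```python
def filter_skills(cards: list[dict], org_id: str, division: str) -> list[dict]:
    """Per-request visibility filter: a card with no division list is
    org-wide; otherwise the caller's division must appear in it."""
    return [
        card for card in cards
        if card["org_id"] == org_id
        and (not card.get("divisions") or division in card["divisions"])
    ]
```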

Problem 4: Monetization Infrastructure

Expert knowledge has value. Karpathy's system is free because it's personal. A platform needs payment rails.

We use Stripe Connect with the platform as merchant of record. Experts set pricing on Creator Storefronts. Subscribers pay through the platform. Revenue splits are automated. The skill card itself carries the monetization metadata — tier, pricing, access level — so the MCP server can enforce paid vs. free at the protocol level.
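A minimal sketch of protocol-level enforcement driven by the card's own metadata. The tier names and their ordering here are invented for illustration; the point is only that the serve path can answer paid-vs-free from the card itself, with no separate entitlement lookup:

```python
TIER_ORDER = ["free", "pro", "enterprise"]  # assumed tier names, lowest to highest


def can_access(card: dict, subscriber_tier: str) -> bool:
    """Paywall check at serve time: the card's metadata decides. A subscriber
    at or above the card's required tier gets the skill."""
    required = card.get("tier", "free")
    return TIER_ORDER.index(subscriber_tier) >= TIER_ORDER.index(required)
```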

Problem 5: Extraction That Scales

Karpathy manually feeds documents into his raw/ directory and prompts the LLM to compile. That works for 100 articles.

At 50 experts with hundreds of skill cards each, you need a pipeline. Our extraction engine processes source material in under 10 minutes per source. The pipeline identifies frameworks, decision trees, and expert reasoning patterns, then structures them into modular skill cards with metadata for classification, audience tagging, and tier assignment.
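The staged shape of such a pipeline can be sketched as composable passes over a document. Everything below is a toy: each stage stands in for what would be an LLM pass in production, and the document fields (`sections`, `frameworks`, `cards`) are invented for illustration.

```python
from typing import Callable

Stage = Callable[[dict], dict]


def identify_frameworks(doc: dict) -> dict:
    # Stand-in for an LLM pass that pulls out frameworks and decision trees.
    doc["frameworks"] = [s for s in doc["sections"] if "framework" in s.lower()]
    return doc


def to_skill_cards(doc: dict) -> dict:
    # Emit one modular card per framework, with metadata slots left to fill
    # (classification, audience tags, tier assignment).
    doc["cards"] = [
        {"title": name, "tags": [], "tier": None} for name in doc["frameworks"]
    ]
    return doc


def run_pipeline(doc: dict, stages: list[Stage]) -> dict:
    for stage in stages:
        doc = stage(doc)
    return doc
```

Keeping the stages as plain functions is what lets each extraction improve the pipeline: a refined pass swaps in without touching the others.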

The extraction methodology itself is institutional knowledge — refined through processing 125+ books from a single expert. Each extraction improves the pipeline.

The Category

I coined the term Knowledge Delivery System (KDS) months ago. The LMS manages courses. The KMS manages documents. The KDS delivers answers.

Karpathy just validated the thesis from the research side. The technical community is going to build toward this whether we exist or not. The question is whether the infrastructure is purpose-built or cobbled together from scripts.

If you're interested in the architecture: skillrefinery.ai

Full KDS framework: The LMS Is Dead. Long Live the KDS.
