Skip to content

Open Knowledge Format — Google Cloud's Vendor-Neutral Spec for LLM-Friendly Knowledge Repositories

Source: How the Open Knowledge Format Can Improve Data Sharing \ Date Published: June 12, 2026 \ Authors: Sam McVeety & Amir Hormati (Google Cloud)


TL;DR

Google Cloud has introduced the Open Knowledge Format (OKF) v0.1 — an open, vendor-neutral specification that formalizes what has become known as the "LLM-wiki" pattern for knowledge management. At its core, OKF is remarkably simple: "Just markdown, just files, just YAML frontmatter." Knowledge is represented as a directory of markdown files where the file path serves as the concept's identity. The format draws inspiration from Andrej Karpathy's widely circulated LLM Wiki pattern. Three design principles guide the spec: (1) minimally opinionated (only a type field is required in frontmatter), (2) producer/consumer independence, and (3) format-before-platform philosophy. Google also ships an Enrichment Agent (auto-drafts OKF from BigQuery), a Static HTML Visualizer, and Sample Bundles (GA4 e-commerce, Stack Overflow, Bitcoin).

The LLM-Wiki Pattern, Standardized

The OKF formalizes a pattern that has emerged organically across dozens of projects: keep knowledge in plain markdown files, organized by directory structure, with metadata in YAML frontmatter. The file path IS the identity of the concept — there is no separate database, no proprietary index, no API dependency.

This pattern became popular largely because LLMs consume markdown naturally — it is the closest thing to a universal knowledge format that both humans and models read comfortably.

Three Design Principles

1. Minimally Opinionated

OKF imposes very few requirements. The only mandatory field in the YAML frontmatter is type — everything else is optional and extensible. This means producers can emit knowledge in OKF without needing to agree on a shared schema. A dataset of Stack Overflow questions and a collection of GA4 e-commerce metrics can both be valid OKF repositories with completely different metadata structures.

2. Producer/Consumer Independence

Producers write knowledge in OKF without knowing who will consume it. Consumers read OKF without needing to negotiate with producers. The format is the contract. This decoupling is essential for knowledge sharing across organizational boundaries — a team at Google can publish OKF that a startup in São Paulo can consume without any coordination.

3. Format, Not Platform

OKF is emphatically not a Google Cloud product. It is a specification. Anyone can implement OKF-compatible tools in any language on any platform. Google just happens to be shipping the first reference implementations. This principle protects against vendor lock-in and ensures that knowledge stored in OKF remains accessible regardless of what happens to any particular company or product.

What Ships With v0.1

Google released several companion components alongside the spec:

  • Enrichment Agent: A tool that automatically drafts OKF knowledge repositories from BigQuery data. It infers the directory structure, generates markdown summaries, and produces YAML frontmatter from SQL schemas.
  • Static HTML Visualizer: A browsable HTML renderer for OKF repositories. No server required — it generates a static site from the markdown tree.
  • Sample Bundles: Pre-built OKF repositories for common use cases:
  • GA4 e-commerce schema
  • Stack Overflow Q&A dataset
  • Bitcoin blockchain metadata

These samples serve both as demonstrations and as templates for teams building their own OKF repositories.

Why This Matters

The OKF addresses a growing pain point: as organizations accumulate AI-generated knowledge bases, vector stores, RAG pipelines, and agent memories, they increasingly rely on proprietary or bespoke formats that are invisible, non-portable, and unsearchable outside the tool that created them. OKF proposes a lowest-common-denominator format that is:

  • Human-readable (just markdown)
  • Machine-parseable (YAML frontmatter)
  • File-system-native (no database required)
  • Version-controllable (git-friendly)
  • Portable (no vendor lock-in)

Whether OKF achieves widespread adoption depends on whether the ecosystem of tools around it grows. But the specification itself is a useful contribution — it codifies a pattern that was already emerging and gives it a shared vocabulary.

Key Takeaways

  1. OKF v0.1 formalizes the "LLM-wiki" pattern: knowledge as a directory of markdown files with YAML frontmatter.
  2. The file path IS the concept's identity — no separate index or database required.
  3. Three design principles: minimally opinionated (only type required), producer/consumer independence, format-before-platform.
  4. OKF is explicitly vendor-neutral — it is not a Google Cloud product, it is a specification anyone can implement.
  5. Google ships companion tools: an Enrichment Agent (OKF from BigQuery), a Static HTML Visualizer, and Sample Bundles.
  6. OKF solves the growing problem of invisible, non-portable, proprietary knowledge formats in the age of AI agents.