ADR Export Guide

Complete guide for exporting Architecture Decision Records (ADRs) in JSON/JSONL format.

Overview
Format Comparison
Filter Options
Output Schema
Integration Patterns
Performance Considerations
Troubleshooting

Overview

The adr export command exports ADRs as knowledge fragments: RAG-optimized representations designed for semantic search, machine learning, and knowledge graph applications.

Key Features

Knowledge Fragments: Not raw ADR dumps, but semantic extractions optimized for LLM ingestion
Filtering: Status, project, classification, date range
Streaming: JSONL format for constant-memory processing
Metadata: Embedding priority, content hashing, relations graph
FAISS-Ready: Compatible with vector databases and semantic search engines

Format Comparison

JSON Format

Use Case: Interactive exploration, small datasets (<100 ADRs), human-readable output

Characteristics:
- Single JSON array
- Pretty-printed by default (unless --compact)
- Requires loading the entire file into memory
- Easy to manipulate with jq

Example:

adr export adr/accepted --format json > decisions.json

Output Structure:

[
  {
    "id": "ADR-0001",
    "type": "architecture_decision",
    ...
  },
  {
    "id": "ADR-0002",
    ...
  }
]

JSONL Format

Use Case: Pipeline ingestion, large datasets (>100 ADRs), streaming processing

Characteristics:
- One JSON object per line
- Streamable (constant memory)
- Line-by-line processing
- Ideal for ETL pipelines

Example:

adr export adr/accepted --format jsonl --compact > decisions.jsonl

Output Structure:

{"id": "ADR-0001", "type": "architecture_decision", ...}
{"id": "ADR-0002", "type": "architecture_decision", ...}
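The practical difference between the two formats shows up on the consumer side. The following is a minimal Python sketch, assuming exports produced as in the examples above; the helper names load_json_export and iter_jsonl_export are illustrative and not part of the adr tool.

```python
import json
from typing import Iterator


def load_json_export(path: str) -> list:
    """Read a --format json export: the whole array is parsed into memory at once."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def iter_jsonl_export(path: str) -> Iterator[dict]:
    """Read a --format jsonl export: yield one fragment at a time."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)
```

Because iter_jsonl_export is a generator, a loop over it holds only the current fragment in memory, which is the property the JSONL format is meant to provide.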

Compact vs Pretty

Option	Size	Use Case
Default (pretty)	Larger	Development, debugging, human reading
`--compact`	~60% smaller	Production, pipelines, storage

Example Comparison:

# Pretty (3.2MB)
adr export adr --format json > pretty.json

# Compact (1.2MB)
adr export adr --format json --compact > compact.json
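The size difference comes purely from whitespace. As a rough illustration (how the adr tool serializes internally is not specified here, so the json.dumps options below are assumptions), the same object can be encoded both ways and compared:

```python
import json

# A small stand-in for a knowledge fragment; real fragments carry more fields.
fragment = {
    "id": "ADR-0001",
    "type": "architecture_decision",
    "scope": {"projects": ["CEREBRO", "PHANTOM"], "layers": ["infrastructure"]},
}

pretty = json.dumps(fragment, indent=2)                 # indented, one key per line
compact = json.dumps(fragment, separators=(",", ":"))   # no optional whitespace

# Both strings decode to the same object; only whitespace differs.
assert json.loads(pretty) == json.loads(compact)

saving = 1 - len(compact) / len(pretty)
print(f"pretty={len(pretty)} chars, compact={len(compact)} chars, saving={saving:.0%}")
```

The exact saving depends on nesting depth and string lengths, so measure on your own export rather than relying on a fixed percentage.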

Filter Options

Status Filter

Filter by ADR lifecycle status.

Syntax:

adr export <path> --filter-status <status>

Valid Values: proposed, accepted, rejected, deprecated, superseded

Multiple Status:

adr export adr --filter-status accepted --filter-status superseded

Use Cases:
- Export only accepted decisions for production knowledge base
- Analyze rejected alternatives for research
- Track deprecated decisions for migration planning

Project Filter

Filter by project scope (OR logic).

Syntax:

adr export <path> --filter-project <project>

Common Projects: CEREBRO, PHANTOM, SPECTRE, NEUTRON, GLOBAL

Multiple Projects (OR logic):

adr export adr --filter-project CEREBRO --filter-project PHANTOM

This returns ADRs that belong to CEREBRO OR PHANTOM (not necessarily both).

Use Cases:
- Export PHANTOM-specific decisions for ML ingestion
- Generate project-specific documentation
- Audit cross-project dependencies

Classification Filter

Filter by decision classification (severity/impact).

Syntax:

adr export <path> --filter-classification <class>

Valid Values: critical, major, minor, patch

Use Cases:
- Export critical decisions for security review
- Generate change logs by classification
- Prioritize knowledge base updates

Date Range Filter

Filter by decision date (YYYY-MM-DD format).

Syntax:

adr export <path> --since YYYY-MM-DD --until YYYY-MM-DD

Examples:

# Decisions from 2026
adr export adr --since 2026-01-01 --until 2026-12-31

# Recent decisions (last 30 days; the -d flag requires GNU date)
adr export adr --since "$(date -d '30 days ago' +%Y-%m-%d)"

# Future decisions (proposed for next quarter)
adr export adr --filter-status proposed --since 2026-04-01

Use Cases:
- Generate quarterly architecture reports
- Track decision velocity over time
- Export recent changes for onboarding
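The relative-date example relies on date -d, which is a GNU extension and is not available in the BSD date shipped with macOS. A portable alternative is to compute the date in Python and pass it to the CLI. This is a sketch; since_days_ago and export_args are hypothetical helper names.

```python
from datetime import date, timedelta
from typing import List, Optional


def since_days_ago(days: int, today: Optional[date] = None) -> str:
    """Return the YYYY-MM-DD date `days` days before `today` (default: current date)."""
    today = today or date.today()
    return (today - timedelta(days=days)).isoformat()


def export_args(path: str, days: int, today: Optional[date] = None) -> List[str]:
    """Build the argument list for: adr export <path> --since <date>."""
    return ["adr", "export", path, "--since", since_days_ago(days, today)]


# The list can be handed to subprocess.run(...) to invoke the CLI.
print(export_args("adr", 30))
```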

Combined Filters

All filters use AND logic when combined.

Example:

adr export adr \
  --filter-status accepted \
  --filter-project CEREBRO \
  --filter-classification major \
  --since 2026-01-01 \
  --format jsonl --compact

This returns ADRs that match ALL criteria:
- Status is accepted AND
- Project includes CEREBRO AND
- Classification is major AND
- Date >= 2026-01-01
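The filter semantics can be restated as a predicate over exported fragments, which is handy for re-filtering an existing export without re-running the CLI. This sketch makes two assumptions the guide does not state: repeated --filter-status and --filter-classification values are treated as OR (like projects), and --until is inclusive. The function name matches is illustrative.

```python
from typing import Iterable, Optional


def matches(
    fragment: dict,
    statuses: Iterable[str] = (),
    projects: Iterable[str] = (),
    classifications: Iterable[str] = (),
    since: Optional[str] = None,
    until: Optional[str] = None,
) -> bool:
    """Return True if the fragment passes every filter that was supplied."""
    statuses, projects, classifications = set(statuses), set(projects), set(classifications)
    day = fragment["metadata"]["date"]  # YYYY-MM-DD strings sort chronologically
    return (
        # OR within one filter: any listed value is enough.
        (not statuses or fragment["status"] in statuses)
        and (not projects or bool(projects & set(fragment["scope"]["projects"])))
        and (not classifications or fragment["governance"]["classification"] in classifications)
        # AND across filters: the date bounds must also hold.
        and (since is None or day >= since)
        and (until is None or day <= until)
    )
```

An empty filter (no values supplied) never excludes a fragment, which mirrors omitting the corresponding flag.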

Output Schema

Knowledge Fragment Structure

interface KnowledgeFragment {
  // Identity
  id: string;                    // ADR-0001
  type: "architecture_decision";
  title: string;
  status: "proposed" | "accepted" | "rejected" | "deprecated" | "superseded";
  summary: string;               // One-line summary

  // Scope
  scope: {
    projects: string[];          // ["CEREBRO", "PHANTOM"]
    layers: string[];            // ["infrastructure", "ml"]
  };

  // Knowledge (RAG-Optimized)
  knowledge: {
    what: string;                // Decision text
    why: string;                 // Context/rationale
    implications: {
      positive: string[];
      negative: string[];
    };
    alternatives_rejected: string[];
  };

  // Semantic Enrichment
  questions: string[];           // Questions this ADR answers
  keywords: string[];            // Searchable keywords
  concepts: string[];            // High-level concepts

  // Relations (Knowledge Graph)
  relations: {
    supersedes: string[];        // ADRs this replaces
    related: string[];           // Related decisions
    enables: string[];           // ADRs this enables
  };

  // Governance
  governance: {
    classification: "critical" | "major" | "minor" | "patch";
    compliance: string[];        // ["LGPD", "SOC2"]
  };

  // Metadata
  metadata: {
    date: string;                // YYYY-MM-DD
    version: number;
    hash: string;                // Content hash (change detection)
    embedding_priority: "low" | "normal" | "high";
  };
}
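Consumers that ingest exports may want a quick sanity check before indexing. The sketch below checks only the top-level fields and enumerated values listed in the interface above; validate_fragment is a hypothetical helper, not something the adr tool provides.

```python
from typing import List

STATUSES = {"proposed", "accepted", "rejected", "deprecated", "superseded"}
CLASSIFICATIONS = {"critical", "major", "minor", "patch"}
PRIORITIES = {"low", "normal", "high"}
REQUIRED = (
    "id", "type", "title", "status", "summary", "scope", "knowledge",
    "questions", "keywords", "concepts", "relations", "governance", "metadata",
)


def validate_fragment(fragment: dict) -> List[str]:
    """Return a list of problems; an empty list means the fragment looks well-formed."""
    problems = [f"missing field: {key}" for key in REQUIRED if key not in fragment]
    if problems:
        return problems  # nested checks below need the top-level fields
    if fragment["type"] != "architecture_decision":
        problems.append(f"unexpected type: {fragment['type']}")
    if fragment["status"] not in STATUSES:
        problems.append(f"unknown status: {fragment['status']}")
    if fragment["governance"].get("classification") not in CLASSIFICATIONS:
        problems.append("unknown governance.classification")
    if fragment["metadata"].get("embedding_priority") not in PRIORITIES:
        problems.append("unknown metadata.embedding_priority")
    return problems
```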

Field Descriptions

embedding_priority: Hints for semantic search engines
- high: Critical decisions, frequently queried
- normal: Standard decisions
- low: Historical, rarely queried

hash: SHA256 prefix for change detection
- Computed from context + decision fields
- Use to detect updates without full comparison

questions: Natural language queries this ADR answers
- Used for query-to-document matching
- Example: "Why did we choose NixOS?" → ADR-0001

relations: Knowledge graph edges
- supersedes: This ADR replaces older decisions
- related: Cross-references to related decisions
- enables: This ADR is a prerequisite for others
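To make the hash field concrete, here is one way a SHA-256 prefix over the context and decision text could be computed. The prefix length, the field order, and the separator are assumptions made for illustration, so values produced by this sketch should not be expected to match the hashes the adr tool emits.

```python
import hashlib


def content_hash(context: str, decision: str, length: int = 12) -> str:
    """Return the first `length` hex characters of SHA-256 over context + decision."""
    digest = hashlib.sha256()
    digest.update(context.encode("utf-8"))
    digest.update(b"\n")  # separator so ("ab", "c") and ("a", "bc") hash differently
    digest.update(decision.encode("utf-8"))
    return digest.hexdigest()[:length]
```

Any edit to either field changes the prefix with overwhelming probability, which is why comparing hashes is enough to detect updates without diffing full documents.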

Integration Patterns

PHANTOM (Data Sanitizer)

PHANTOM processes ADRs into semantic chunks for FAISS indexing.

Export:

adr export adr/accepted --format jsonl --compact > /tmp/adr.jsonl

Ingestion:

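import json

from phantom import CortexProcessor, FAISSVectorStore

# Assumption: both classes can be constructed without arguments;
# adjust to match your PHANTOM configuration.
cortex = CortexProcessor()
faiss_store = FAISSVectorStore()

# Load JSONL (one knowledge fragment per line; skip blank lines)
with open('/tmp/adr.jsonl', encoding='utf-8') as f:
    adrs = [json.loads(line) for line in f if line.strip()]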

# Process each ADR
for adr in adrs:
    # Chunk semantic sections
    chunks = [
        {"text": adr["summary"], "type": "summary", "priority": "high"},
        {"text": adr["knowledge"]["why"], "type": "context", "priority": adr["metadata"]["embedding_priority"]},
        {"text": adr["knowledge"]["what"], "type": "decision", "priority": "critical"}
    ]

    # Embed
    embeddings = cortex.embed([c["text"] for c in chunks])

    # Index with metadata
    faiss_store.add_documents(
        chunks=chunks,
        embeddings=embeddings,
        metadata={
            "id": adr["id"],
            "status": adr["status"],
            "projects": adr["scope"]["projects"],
            "classification": adr["governance"]["classification"],
            "date": adr["metadata"]["date"]
        }
    )

# Save index
faiss_store.save("adr_ledger.faiss")

Query:

# Semantic search with metadata filtering
results = faiss_store.search(
    query="Why NixOS?",
    top_k=5,
    filter={"status": "accepted", "projects": ["NEUTRON"]}
)

CEREBRO (Air-Gapped Knowledge Vault)

CEREBRO maintains isolated, RAG-optimized knowledge bases.

Export:

adr export adr/accepted --format json --filter-status accepted > cerebro_kb.json

Ingestion:

import json

from cerebro import KnowledgeVault

vault = KnowledgeVault.load_or_create("adr_vault")

# Import ADRs
with open('cerebro_kb.json') as f:
    fragments = json.load(f)

for fragment in fragments:
    vault.add_decision(
        id=fragment["id"],
        content=fragment["knowledge"],
        metadata=fragment["metadata"],
        graph=fragment["relations"]
    )

# Query with graph traversal
response = vault.query("Explain our authentication strategy")
# Returns: Relevant ADRs + related decisions via graph

External Tools

PostgreSQL + pgvector

# Export
adr export adr/accepted --format jsonl --compact > adr.jsonl

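# Import to PostgreSQL (assumes adr_knowledge.data is a json/jsonb column).
# Control characters serve as the CSV quote and delimiter so that the quotes
# and commas inside each JSON line pass through untouched.
jq -c '{id: .id, content: .knowledge, meta: .metadata}' adr.jsonl | \
  psql -c "COPY adr_knowledge (data) FROM STDIN WITH (FORMAT csv, QUOTE e'\x01', DELIMITER e'\x02');"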

Elasticsearch

# Bulk index
cat adr.jsonl | \
  jq -c '{index: {_index: "adrs", _id: .id}}, .' | \
  curl -X POST http://localhost:9200/_bulk -H 'Content-Type: application/x-ndjson' --data-binary @-
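The jq expression above emits two lines per ADR: an action line naming the index and document id, followed by the document itself. The same request body can be built in Python when jq is not available. This is a sketch using only the standard library; bulk_lines and bulk_body are illustrative names, and the index name "adrs" is taken from the command above.

```python
import json
from typing import Iterable, Iterator


def bulk_lines(fragments: Iterable[dict], index: str = "adrs") -> Iterator[str]:
    """Yield the action line and the source line for each fragment, in Bulk API order."""
    for fragment in fragments:
        yield json.dumps({"index": {"_index": index, "_id": fragment["id"]}})
        yield json.dumps(fragment)


def bulk_body(fragments: Iterable[dict], index: str = "adrs") -> str:
    """Join the lines with newlines; the Bulk API requires a trailing newline."""
    return "".join(line + "\n" for line in bulk_lines(fragments, index))
```

The resulting string can be sent as the request body with the application/x-ndjson content type, as in the curl call.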

Neo4j (Knowledge Graph)

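# Export "supersedes" relations as Cypher statements.
# cypher-shell reads Cypher from stdin, so each edge is emitted as a MERGE.
# The :ADR label and SUPERSEDES relationship type are illustrative names.
adr export adr/accepted --format json | \
  jq -r '.[] | .id as $id | .relations.supersedes[] |
    "MERGE (a:ADR {id: \"\($id)\"}) MERGE (b:ADR {id: \"\(.)\"}) MERGE (a)-[:SUPERSEDES]->(b);"' | \
  cypher-shell --format plain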
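Before loading into any graph store, it can help to flatten the relations object into plain edge triples that are easy to inspect or convert. A small sketch, assuming fragments shaped as in the Output Schema section (relation_edges is a hypothetical helper):

```python
from typing import Iterable, Iterator, Tuple

RELATION_TYPES = ("supersedes", "related", "enables")


def relation_edges(fragments: Iterable[dict]) -> Iterator[Tuple[str, str, str]]:
    """Yield (source_id, relation, target_id) for every entry in each fragment's relations."""
    for fragment in fragments:
        for relation in RELATION_TYPES:
            # A missing or empty list simply contributes no edges.
            for target in fragment["relations"].get(relation, []):
                yield fragment["id"], relation, target
```

Each triple maps directly onto one relationship in a graph database, or one row in an edge table.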

Performance Considerations

Memory Usage

Operation	JSON	JSONL
Parse 100 ADRs	~5MB	~5MB
Export 100 ADRs	~10MB (array overhead)	~2MB (streaming)
Large datasets (1000+)	Consider chunking	Constant memory

Export Speed

Benchmarks (12-core, 32GB RAM, NVMe):
- Parse 100 ADRs: ~2 seconds
- Export 100 ADRs (JSON): ~0.5 seconds
- Export 100 ADRs (JSONL): ~0.3 seconds
- Filter 1000 ADRs by status: ~0.1 seconds

Optimization Strategies

1. Use JSONL for large datasets:

# Memory-efficient streaming
adr export adr --format jsonl --compact | \
  while IFS= read -r line; do
    process_adr "$line"
  done

2. Filter early:

# Export only what you need
adr export adr/accepted --filter-status accepted --filter-project PHANTOM

3. Parallel processing:

# Split by project and process in parallel
for project in CEREBRO PHANTOM SPECTRE; do
  adr export adr --filter-project "$project" --format jsonl --compact > "${project}.jsonl" &
done
wait

Troubleshooting

Issue: Empty Result Set

Symptom:

adr export adr --filter-status accepted --filter-project NONEXISTENT
# Returns: []

Solution:
- Check filter values (case-sensitive)
- Verify projects exist in ADRs:

adr export adr --format json | jq -r '.[].scope.projects[]' | sort -u

Issue: Parse Errors

Symptom:

Warning: Failed to parse adr/proposed/ADR-0017.md: 'str' object has no attribute 'get'

Cause: ADR uses simplified YAML format (strings instead of objects)

Solution: The parser now supports both formats automatically. Update to the latest version.

Issue: Invalid JSONL

Symptom:

cat output.jsonl | jq .
# jq: parse error: Invalid numeric literal at line 2

Cause: Log messages mixed with JSON output

Solution: Redirect stderr:

adr export adr --format jsonl 2>/dev/null
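When a file has already been written with stray log lines in it, it helps to locate exactly which lines are not JSON before re-exporting. A minimal sketch (find_invalid_lines is an illustrative helper; it treats any non-blank line that is not a JSON object as invalid):

```python
import json
from typing import Iterable, List, Tuple


def find_invalid_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every non-blank line that is not a JSON object."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue
        try:
            value = json.loads(text)
        except json.JSONDecodeError:
            bad.append((number, text))
            continue
        if not isinstance(value, dict):  # e.g. a bare number or string
            bad.append((number, text))
    return bad
```

For example, opening output.jsonl and passing the file object to find_invalid_lines returns the offending line numbers together with their text.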

Issue: Slow Export

Symptom: Export takes >30 seconds for 100 ADRs

Possible Causes:
1. Parsing overhead: Use --format jsonl instead of json
2. Large files: ADRs with huge markdown content
3. Disk I/O: Slow filesystem

Solution:

# Profile with time
time adr export adr --format jsonl --compact > /dev/null

# Use tmpfs for temp files
export TMPDIR=/dev/shm

Issue: Character Encoding

Symptom: Unicode characters garbled

Solution: Ensure UTF-8:

export LC_ALL=en_US.UTF-8
adr export adr --format json > output.json

Advanced Use Cases

Incremental Updates

Track ADR changes with content hashing:

import json

# Load previous export
with open('prev_export.json') as f:
    prev = {adr['id']: adr['metadata']['hash'] for adr in json.load(f)}

# Load current export
with open('curr_export.json') as f:
    curr = {adr['id']: adr['metadata']['hash'] for adr in json.load(f)}

# Find changes (adr_id avoids shadowing the built-in id)
new_ids = set(curr.keys()) - set(prev.keys())
changed_ids = {adr_id for adr_id in curr if adr_id in prev and curr[adr_id] != prev[adr_id]}
deleted_ids = set(prev.keys()) - set(curr.keys())

print(f"New: {len(new_ids)}, Changed: {len(changed_ids)}, Deleted: {len(deleted_ids)}")

Custom Transformations

Extract specific fields with jq:

# Extract decision timeline
adr export adr/accepted --format json | \
  jq '.[] | {id, date: .metadata.date, status, projects: .scope.projects}' | \
  jq -s 'sort_by(.date)'

# Generate Markdown table
adr export adr/accepted --format json | \
  jq -r '.[] | "| \(.id) | \(.title) | \(.status) | \(.metadata.date) |"'

Compliance Reporting

Generate compliance reports:

# LGPD-tagged decisions
adr export adr --format json --filter-status accepted | \
  jq '[.[] | select(.governance.compliance | contains(["LGPD"]))] |
      {count: length, decisions: [.[] | {id, title}]}'

# Critical decisions by month (groups on the YYYY-MM prefix of metadata.date)
adr export adr --filter-classification critical --format json | \
  jq 'group_by(.metadata.date[0:7]) |
      map({month: .[0].metadata.date[0:7], count: length})'
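The jq pipeline above buckets on the YYYY-MM prefix of metadata.date, that is, by calendar month. If a true quarterly rollup is needed, the month can be mapped to a quarter first. A Python sketch over an already exported list of fragments (quarter_of and count_by_quarter are illustrative names):

```python
from collections import Counter
from typing import Dict, Iterable


def quarter_of(date_str: str) -> str:
    """Map a YYYY-MM-DD date to a label such as 2026-Q1."""
    year, month = date_str[:4], int(date_str[5:7])
    return f"{year}-Q{(month - 1) // 3 + 1}"


def count_by_quarter(fragments: Iterable[dict]) -> Dict[str, int]:
    """Count fragments per quarter, ordered chronologically."""
    counts = Counter(quarter_of(f["metadata"]["date"]) for f in fragments)
    return dict(sorted(counts.items()))
```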

Best Practices

Use JSONL for pipelines: Streaming, constant memory
Filter early: Export only what you need
Version exports: Tag exports with timestamps
Validate output: Always pipe to jq for validation
Compress for storage: gzip JSONL files (70% reduction)
Monitor hash changes: Track ADR evolution via metadata.hash
Automate exports: Use git hooks for auto-sync
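Two of these practices, versioning exports and compressing them, can be combined in a few lines. The sketch below uses only the Python standard library; the file naming scheme and the helper names are assumptions made for illustration.

```python
import gzip
import json
from datetime import datetime, timezone
from typing import Iterable, Iterator, Optional


def timestamped_name(prefix: str = "adr", now: Optional[datetime] = None) -> str:
    """Build a file name such as adr_20260131T120000Z.jsonl.gz (UTC timestamp)."""
    now = now or datetime.now(timezone.utc)
    return f"{prefix}_{now.strftime('%Y%m%dT%H%M%SZ')}.jsonl.gz"


def write_jsonl_gz(path: str, fragments: Iterable[dict]) -> None:
    """Write fragments as gzip-compressed JSONL, one object per line."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for fragment in fragments:
            f.write(json.dumps(fragment) + "\n")


def read_jsonl_gz(path: str) -> Iterator[dict]:
    """Stream fragments back without decompressing the file to disk first."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

Reading through gzip.open keeps the streaming property of JSONL, so compressed archives can be processed line by line as well.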

Examples Gallery

Daily Knowledge Sync

#!/bin/bash
# Sync accepted ADRs to CEREBRO daily

DATE=$(date +%Y-%m-%d)
adr export adr/accepted \
  --format jsonl \
  --filter-status accepted \
  --compact > /tmp/adr_${DATE}.jsonl

# Upload to CEREBRO
cerebro-cli sync /tmp/adr_${DATE}.jsonl

# Cleanup old exports (keep 7 days)
find /tmp -name 'adr_*.jsonl' -mtime +7 -delete

Project Documentation

# Generate project-specific docs
for project in CEREBRO PHANTOM SPECTRE NEUTRON; do
  mkdir -p "docs/${project}"  # ensure the target directory exists
  adr export adr/accepted \
    --filter-project "$project" \
    --filter-status accepted \
    --format json > "docs/${project}/decisions.json"
done

Architecture Review

# Export critical decisions for review
adr export adr \
  --filter-classification critical \
  --since 2026-01-01 \
  --format json | \
  jq '[.[] | {id, title, date: .metadata.date, projects: .scope.projects}]' > review.json

For more examples, see the documentation home or run adr export --help.
