docc2json

Turn Apple DocC output into a web-friendly SDK JSON schema

2026-01-03

Product Introduction

  1. Definition: docc2json is a Go-based CLI tool specializing in DocC JSON transformation. It processes Apple's native DocC documentation output (.doccarchive or docs/data folders) into a structured JSON schema optimized for web deployment, IDE integration, and SDK analysis.
  2. Core Value Proposition: DocC documentation built in Xcode is hard to reuse outside Apple's own renderer. docc2json closes that gap by converting Xcode-built docs into a web-friendly, machine-readable JSON format for custom documentation sites, tooling, and automated analysis.

Main Features

  1. DocC JSON Parsing & Transformation:
    • How it works: Scans DocC output directories (38k+ files tested), parses JSON files concurrently using Go goroutines, and restructures data into a custom SDK schema. Extracts symbols (types, protocols, enums), declarations, relationships, and full documentation.
    • Technologies: Utilizes Go's encoding/json for parsing, custom token reconstruction for method signatures, and cross-reference resolution.
  2. Comprehensive Documentation Extraction:
    • How it works: Parses DocC comment structures to extract abstracts, discussions, parameters, return values, throws clauses, and code examples (marked with ## Discussion, ## Example). Handles Swift markup accurately.
    • Technologies: Processes DocC Abstract, DiscussionSections, and Content nodes, converting inline content (text, code listings) into clean JSON.
  3. Protocol Conformance Processing & Filtering:
    • How it works: Automatically unmangles Swift/Obj-C protocol names (e.g., s8SendablePSendable, ScANSSecureCoding). Filters out common protocols (Equatable, Hashable, Sendable, View, etc.) by default. Supports custom exclusion lists (--filter-protocols) or full retention (--keep-all-conformances).
    • Technologies: Implements a protocol unmangling dictionary and configurable filtering logic.
  4. Flexible Type Grouping:
    • How it works: Groups related symbols (types, enums, protocols) using prefix-based patterns (--group-by-prefix) or a custom YAML configuration (--group-config). Groups define patterns, explicit types/protocols/enums, and descriptions. Ungrouped items go into "Other".
    • Technologies: Uses Go's regexp for pattern matching and YAML parsing for custom configs.
  5. Access Control Filtering & Performance:
    • How it works: Filters symbols by access level (--public-only flag). Processes large datasets (38k+ files) in under one second via parallel parsing. Outputs formatted JSON, or compact JSON with --compact.
    • Technologies: Leverages Go concurrency (goroutines, sync.WaitGroup), efficient JSON marshaling, and structured logging (zap).
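
The parallel parsing described in the features above can be sketched roughly as follows. This is a minimal illustration, not docc2json's actual code: the doc struct, the parseAll function, and the in-memory inputs are all hypothetical stand-ins (the real tool reads files from a DocC output directory). It only shows the general pattern of fanning JSON decoding out to goroutines and waiting on a sync.WaitGroup.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// doc is a hypothetical, heavily reduced stand-in for one DocC JSON file.
type doc struct {
	Kind  string `json:"kind"`
	Title string `json:"title"`
}

// parseAll decodes every input concurrently using a fixed number of workers.
// Successfully decoded documents are returned in input order, together with
// the number of inputs that failed to decode.
func parseAll(inputs [][]byte, workers int) ([]doc, int) {
	if workers < 1 {
		workers = 1
	}
	results := make([]*doc, len(inputs))
	jobs := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				var d doc
				if err := json.Unmarshal(inputs[i], &d); err == nil {
					results[i] = &d // each index is written by exactly one worker
				}
			}
		}()
	}
	for i := range inputs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()

	var docs []doc
	failed := 0
	for _, r := range results {
		if r == nil {
			failed++
			continue
		}
		docs = append(docs, *r)
	}
	return docs, failed
}

func main() {
	inputs := [][]byte{
		[]byte(`{"kind":"symbol","title":"Wish"}`),
		[]byte(`not json`),
		[]byte(`{"kind":"symbol","title":"WishStatus"}`),
	}
	docs, failed := parseAll(inputs, 2)
	fmt.Println(len(docs), failed) // prints "2 1"
}
```

Writing each result to its own slice index avoids a mutex and keeps the output order deterministic regardless of which worker finishes first.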

Problems Solved

  1. Pain Point: Apple DocC documentation is locked inside Xcode and challenging to reuse for web deployment, custom tooling, or cross-platform SDK analysis. Raw DocC JSON is complex and not web-optimized.
  2. Target Audience:
    • iOS/Swift SDK Developers: Teams needing to publish public API documentation for their frameworks.
    • Technical Writers: Creating custom documentation portals from DocC sources.
    • Tooling Engineers: Building IDE plugins, linters, or analysis tools requiring structured SDK metadata.
    • DevOps Engineers: Automating SDK documentation deployment pipelines.
  3. Use Cases:
    • Generating JSON feeds for custom React/Vue documentation websites.
    • Providing autocomplete/IntelliSense data for IDEs.
    • Analyzing API surface area, dependencies, or conformance programmatically.
    • Embedding live code examples extracted from DocC into web docs.
    • Auditing public vs. internal API exposure.
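
As one illustration of the last use case, a consumer of the generated JSON might count symbols per access level. The sketch below is an assumption-heavy example: the JSON key names (modules, types, protocols, enums, name, kind, access) mirror the fields this page lists, but their exact spelling and nesting in real docc2json output are not confirmed here, and the sample module name WishKit is invented.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// symbol holds the subset of per-symbol fields this audit needs.
// Key names are assumed, not taken from a published schema.
type symbol struct {
	Name   string `json:"name"`
	Kind   string `json:"kind"`
	Access string `json:"access"`
}

// module assumes symbols are split into types, protocols, and enums.
type module struct {
	Name      string   `json:"name"`
	Types     []symbol `json:"types"`
	Protocols []symbol `json:"protocols"`
	Enums     []symbol `json:"enums"`
}

type schema struct {
	Modules []module `json:"modules"`
}

// countByAccess returns how many symbols exist per access level.
func countByAccess(data []byte) (map[string]int, error) {
	var s schema
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	counts := make(map[string]int)
	for _, m := range s.Modules {
		for _, group := range [][]symbol{m.Types, m.Protocols, m.Enums} {
			for _, sym := range group {
				counts[sym.Access]++
			}
		}
	}
	return counts, nil
}

func main() {
	// Hypothetical sample shaped like the assumed schema.
	sample := []byte(`{"modules":[{"name":"WishKit",
		"types":[{"name":"Wish","kind":"struct","access":"public"},
		         {"name":"WishCache","kind":"class","access":"internal"}],
		"enums":[{"name":"WishStatus","kind":"enum","access":"public"}]}]}`)
	counts, err := countByAccess(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(counts["public"], counts["internal"]) // prints "2 1"
}
```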

Unique Advantages

  1. Differentiation: Unlike generic DocC renderers or static site generators, docc2json focuses exclusively on transforming DocC's JSON into a minimal, analysis-friendly schema. Compared with manual extraction or working on raw DocC JSON directly, it adds protocol unmangling, conformance filtering, and structured grouping.
  2. Key Innovation: Automated protocol handling (unmangling plus default filtering of common protocols) and a flexible grouping system (prefix-based or YAML-configured) set it apart. The parallel parsing engine processes large SDKs in under a second, which matters for automation.
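
To make the protocol handling concrete, here is a minimal sketch of dictionary-based unmangling followed by exclusion filtering. Everything in it is illustrative: the function filterConformances, the single dictionary entry (which assumes the mangled form of Sendable is s8SendableP), the protocol name WishProviding, and the rule that keeping all conformances overrides any exclusion list are assumptions, not documented docc2json behavior.

```go
package main

import "fmt"

// unmangle maps mangled identifiers to readable names.
// The single entry is an assumed example, not docc2json's real dictionary.
var unmangle = map[string]string{
	"s8SendableP": "Sendable",
}

// defaultExcluded lists the protocols this page says are filtered by default.
var defaultExcluded = []string{"Equatable", "Hashable", "Sendable", "View"}

// filterConformances unmangles each name, then drops excluded protocols.
// extra plays the role of a custom exclusion list; keepAll disables filtering.
func filterConformances(raw, extra []string, keepAll bool) []string {
	excluded := make(map[string]bool)
	if !keepAll {
		for _, name := range defaultExcluded {
			excluded[name] = true
		}
		for _, name := range extra {
			excluded[name] = true
		}
	}
	kept := []string{}
	for _, name := range raw {
		if readable, ok := unmangle[name]; ok {
			name = readable
		}
		if !excluded[name] {
			kept = append(kept, name)
		}
	}
	return kept
}

func main() {
	raw := []string{"s8SendableP", "Codable", "Equatable", "WishProviding"}
	fmt.Println(filterConformances(raw, []string{"Codable"}, false)) // prints "[WishProviding]"
	fmt.Println(filterConformances(raw, nil, true))                  // prints "[Sendable Codable Equatable WishProviding]"
}
```

Unmangling before filtering matters: a mangled Sendable would otherwise slip past a filter keyed on readable names.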

Frequently Asked Questions (FAQ)

  1. How do I install docc2json on macOS?
    Download the pre-built binary for Apple Silicon (docc2json-darwin-arm64) via curl, make it executable (chmod +x), and move it to /usr/local/bin/. Alternatively, build from source with go build.
  2. Can docc2json filter out common Swift protocols like Equatable?
    Yes! By default, it filters common standard library and framework protocols (Equatable, Hashable, Sendable, View, etc.). Use --filter-protocols to add custom exclusions or --keep-all-conformances to disable filtering.
  3. How does docc2json handle DocC articles and tutorials?
    Use the --include-articles flag. docc2json will parse and include documentation articles, tutorials, and guides from the DocC output in the generated JSON schema under appropriate sections.
  4. What's the output schema structure of docc2json?
    The JSON includes metadata, modules (containing types, protocols, enums), and navigation. Each symbol has kind, name, access, declaration, conformances, properties, methods, description (with abstract, discussion, examples), and structured signature data for methods.
  5. How do I group related types like 'Wish' and 'WishStatus' together?
    Use --group-by-prefix for automatic prefix grouping (e.g., Wish*). For advanced control, create a group-config.yaml defining groups with patterns (e.g., "Wish*"), explicit types, and descriptions, then run with --group-config ./group-config.yaml.
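
The pattern matching behind grouping can be sketched with Go's regexp package. This is a rough illustration under stated assumptions: the group struct, the compileGlob and groupSymbols functions, and the rule that the first matching group wins are hypothetical. Only the "Wish*" pattern style and the "Other" fallback come from the description above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// group is a hypothetical in-memory form of one grouping rule.
type group struct {
	Name     string
	Patterns []string
}

// compileGlob turns a pattern such as "Wish*" into an anchored regexp,
// treating "*" as "any run of characters" and everything else literally.
func compileGlob(pattern string) (*regexp.Regexp, error) {
	quoted := regexp.QuoteMeta(pattern)
	return regexp.Compile("^" + strings.ReplaceAll(quoted, `\*`, ".*") + "$")
}

// groupSymbols assigns each name to the first group with a matching pattern.
// Names that match no group are collected under "Other".
func groupSymbols(names []string, groups []group) (map[string][]string, error) {
	type matcher struct {
		name string
		re   *regexp.Regexp
	}
	var matchers []matcher
	for _, g := range groups {
		for _, p := range g.Patterns {
			re, err := compileGlob(p)
			if err != nil {
				return nil, err
			}
			matchers = append(matchers, matcher{g.Name, re})
		}
	}
	out := make(map[string][]string)
	for _, n := range names {
		target := "Other"
		for _, m := range matchers {
			if m.re.MatchString(n) {
				target = m.name
				break
			}
		}
		out[target] = append(out[target], n)
	}
	return out, nil
}

func main() {
	groups := []group{{Name: "Wishes", Patterns: []string{"Wish*"}}}
	grouped, err := groupSymbols([]string{"Wish", "WishStatus", "Cart"}, groups)
	if err != nil {
		panic(err)
	}
	fmt.Println(grouped["Wishes"], grouped["Other"]) // prints "[Wish WishStatus] [Cart]"
}
```

Quoting the pattern before expanding the wildcard keeps characters such as "." or "+" in a type name from being read as regexp syntax.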
