Product Introduction
- Definition: docc2json is a Go-based CLI tool specializing in DocC JSON transformation. It processes Apple's native DocC documentation output (.doccarchive or docs/data folders) into a structured JSON schema optimized for web deployment, IDE integration, and SDK analysis.
- Core Value Proposition: It exists to solve the reusability gap of Apple DocC documentation, enabling developers to convert Xcode-built docs into a web-friendly, machine-readable JSON format for custom documentation sites, tooling, and automated analysis.
Main Features
- DocC JSON Parsing & Transformation:
- How it works: Scans DocC output directories (38k+ files tested), parses JSON files concurrently using Go goroutines, and restructures data into a custom SDK schema. Extracts symbols (types, protocols, enums), declarations, relationships, and full documentation.
- Technologies: Utilizes Go's encoding/json for parsing, custom token reconstruction for method signatures, and cross-reference resolution.
- Comprehensive Documentation Extraction:
- How it works: Parses DocC comment structures to extract abstracts, discussions, parameters, return values, throws clauses, and code examples (marked with ## Discussion, ## Example). Handles Swift markup accurately.
- Technologies: Processes DocC Abstract, DiscussionSections, and Content nodes, converting inline content (text, code listings) into clean JSON.
- Protocol Conformance Processing & Filtering:
- How it works: Automatically unmangles Swift/Obj-C protocol names (e.g., s8SendableP → Sendable, ScA → NSSecureCoding). Filters out common stdlib protocols (Equatable, Hashable, View) by default. Supports custom exclusion lists (--filter-protocols) or full retention (--keep-all-conformances).
- Technologies: Implements a protocol unmangling dictionary and configurable filtering logic.
- Flexible Type Grouping:
- How it works: Groups related symbols (types, enums, protocols) using prefix-based patterns (--group-by-prefix) or a custom YAML configuration (--group-config). Groups define patterns, explicit types/protocols/enums, and descriptions. Ungrouped items go into "Other".
- Technologies: Uses Go's regexp for pattern matching and YAML parsing for custom configs.
- Access Control Filtering & Performance:
- How it works: Filters symbols by access level (--public-only flag). Processes large datasets (38k+ files) in <1 second via parallel parsing. Outputs compact or formatted JSON (--compact).
- Technologies: Leverages Go concurrency (goroutines, sync.WaitGroup), efficient JSON marshaling, and structured logging (zap).
Problems Solved
- Pain Point: Apple DocC documentation is locked inside Xcode and challenging to reuse for web deployment, custom tooling, or cross-platform SDK analysis. Raw DocC JSON is complex and not web-optimized.
- Target Audience:
- iOS/Swift SDK Developers: Teams needing to publish public API documentation for their frameworks.
- Technical Writers: Creating custom documentation portals from DocC sources.
- Tooling Engineers: Building IDE plugins, linters, or analysis tools requiring structured SDK metadata.
- DevOps Engineers: Automating SDK documentation deployment pipelines.
- Use Cases:
- Generating JSON feeds for custom React/Vue documentation websites.
- Providing autocomplete/intellisense data for IDEs.
- Analyzing API surface area, dependencies, or conformance programmatically.
- Embedding live code examples extracted from DocC into web docs.
- Auditing public vs. internal API exposure.
Unique Advantages
- Differentiation: Unlike generic DocC renderers or static site generators, docc2json focuses exclusively on transforming DocC's JSON into a minimal, analysis-friendly schema. It outperforms manual extraction or raw DocC processing with features like protocol unmangling, conformance filtering, and structured grouping.
- Key Innovation: Its automated protocol handling (unmangling + smart stdlib filtering) and flexible grouping system (prefix-based or YAML-configured) are unique. The parallel parsing engine delivers sub-second processing for large SDKs, a critical advantage for automation.
Frequently Asked Questions (FAQ)
- How do I install docc2json on macOS?
Download the pre-built binary for Apple Silicon (docc2json-darwin-arm64) via curl, make it executable (chmod +x), and move it to /usr/local/bin/. Alternatively, build from source with go build.
- Can docc2json filter out internal Swift protocols like Equatable?
Yes! By default, it filters common stdlib protocols (Equatable, Hashable, Sendable, View, etc.). Use --filter-protocols to add custom exclusions or --keep-all-conformances to disable filtering.
- How does docc2json handle DocC articles and tutorials?
Use the --include-articles flag. docc2json will parse and include documentation articles, tutorials, and guides from the DocC output in the generated JSON schema under appropriate sections.
- What's the output schema structure of docc2json?
The JSON includes metadata, modules (containing types, protocols, enums), and navigation. Each symbol has kind, name, access, declaration, conformances, properties, methods, description (with abstract, discussion, examples), and structured signature data for methods.
- How do I group related types like 'Wish', 'WishStatus' together?
Use --group-by-prefix for automatic prefix grouping (e.g., Wish*). For advanced control, create a group-config.yaml defining groups with patterns (e.g., "Wish*"), explicit types, and descriptions, then run with --group-config ./group-config.yaml.
