After years of config file debates at Šikulovi s.r.o., here's my honest take on when to use what. Spoiler: I use all three, but for very different reasons.
The format war nobody asked for
I've had this debate at least a dozen times. Someone on the team insists YAML is "more readable." Someone else points out YAML's indentation bugs nearly took down production. Meanwhile, the Java devs are still using XML because "enterprise."
Here's the thing: there's no universally best format. At Šikulovi s.r.o., I use JSON for APIs, YAML for Kubernetes, and occasionally XML when clients require it. Let me explain when each actually makes sense.
JSON: JavaScript Object Notation
JSON emerged from JavaScript but has become language-agnostic. It uses a minimal syntax with curly braces for objects, square brackets for arrays, and colons for key-value pairs. JSON supports six data types: strings, numbers, booleans, null, objects, and arrays.
- Syntax: Uses {} for objects, [] for arrays, "key": value pairs
- Data types: string, number, boolean, null, object, array
- No comments allowed in standard JSON
- Strict syntax: requires double quotes for strings and keys
- Native parsing in all modern browsers, and a parser in the standard library or a common package for virtually every language
- Compact representation with minimal overhead
- Example: {"name": "Alice", "age": 30, "active": true}
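To make the six types concrete, here's a minimal sketch using Python's standard json module. The fields beyond name, age, and active are hypothetical, added only so every type shows up once.

```python
import json

# Sample document covering all six JSON types.
# Fields other than name/age/active are hypothetical, added for illustration.
text = """
{
  "name": "Alice",
  "age": 30,
  "active": true,
  "manager": null,
  "tags": ["admin", "dev"],
  "address": {"city": "Berlin"}
}
"""

record = json.loads(text)

# Each JSON type maps onto a native Python type.
for key, value in record.items():
    print(f"{key}: {type(value).__name__}")

# Serializing again yields strict JSON: double quotes, lowercase true/null.
compact = json.dumps(record, separators=(",", ":"))
print(compact)
```

The separators argument drops the optional whitespace, which is roughly what an API would send over the wire.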
XML: eXtensible Markup Language
XML is a markup language that uses tags to define elements. It predates JSON and was designed for document markup and data interchange. XML supports attributes, namespaces, schemas, and complex document structures.
- Syntax: Uses <tag>content</tag> for elements
- Supports attributes: <person name="Alice" age="30"/>
- Allows comments: <!-- comment -->
- Namespace support for avoiding naming conflicts
- Schema validation with XSD for strict typing
- XSLT for transforming XML documents
- Example: <person><name>Alice</name><age>30</age></person>
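The element and attribute styles from the list above can both be read with Python's standard xml.etree.ElementTree module. This is only a sketch of the two access patterns, not a recommendation for either modelling style.

```python
import xml.etree.ElementTree as ET

# The same person expressed two ways: as child elements and as attributes.
as_elements = "<person><name>Alice</name><age>30</age></person>"
as_attributes = '<person name="Alice" age="30"/>'

root = ET.fromstring(as_elements)
name = root.findtext("name")
age = int(root.findtext("age"))  # XML text is always a string; convert explicitly

attr_root = ET.fromstring(as_attributes)
attr_name = attr_root.get("name")
attr_age = int(attr_root.get("age"))

print(name, age)
print(attr_name, attr_age)
```

Note the explicit int() calls: without a schema, XML carries no type information, so every value arrives as text.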
YAML: YAML Ain't Markup Language
YAML prioritizes human readability with indentation-based structure. It uses minimal punctuation and supports features like anchors, aliases, and multi-line strings. YAML 1.2 is designed as a superset of JSON, so valid JSON is generally also valid YAML, though older YAML 1.1 parsers can disagree on edge cases.
- Syntax: Uses indentation for structure, no brackets required
- Supports comments with # symbol
- Multi-line strings with | (literal) or > (folded)
- Anchors (&) and aliases (*) for reusing values
- Multiple documents in one file with --- separator
- Type inference: 30 is number, "30" is string, true is boolean
- Example (one key per line):
    name: Alice
    age: 30
    active: true
Syntax comparison: The same data in three formats
Seeing identical data in all three formats clarifies the syntactic differences. Consider a simple user profile with nested address data.
- JSON: {"user": {"name": "Alice", "email": "[email protected]", "address": {"city": "Berlin", "country": "Germany"}}}
- XML: <user><name>Alice</name><email>[email protected]</email><address><city>Berlin</city><country>Germany</country></address></user>
- YAML (nested keys indented by two spaces):
    user:
      name: Alice
      email: [email protected]
      address:
        city: Berlin
        country: Germany
- JSON is compact but requires quotes and brackets
- XML is verbose with opening and closing tags
- YAML is clean but whitespace-sensitive
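As a rough sketch of how one in-memory structure turns into two of these formats, the snippet below serializes the same profile to JSON and XML with Python's standard library. The dict_to_element helper is hypothetical, and the email address is a made-up placeholder. YAML is left out because Python has no YAML module in its standard library.

```python
import json
import xml.etree.ElementTree as ET

# Placeholder profile; the email address is a made-up example value.
profile = {
    "user": {
        "name": "Alice",
        "email": "alice@example.com",
        "address": {"city": "Berlin", "country": "Germany"},
    }
}

def dict_to_element(tag, value):
    """Recursively turn nested dicts into XML elements (leaf values become text)."""
    element = ET.Element(tag)
    if isinstance(value, dict):
        for child_tag, child_value in value.items():
            element.append(dict_to_element(child_tag, child_value))
    else:
        element.text = str(value)
    return element

as_json = json.dumps(profile)
as_xml = ET.tostring(dict_to_element("user", profile["user"]), encoding="unicode")

print(as_json)
print(as_xml)
```

The JSON side is a single library call, while the XML side needs a mapping decision (elements or attributes?), which is a fair summary of the two formats.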
Performance: Parsing and serialization speed
JSON offers the best parsing performance in most environments. Native JSON parsers are highly optimized, especially in JavaScript engines. XML parsers are mature but slower due to the complexity of handling attributes, namespaces, and validation.
- JSON: Fastest parsing in JavaScript (JSON.parse is native and optimized)
- JSON: Lightweight parsers available in all languages
- XML: DOM parsing loads entire document into memory
- XML: SAX parsing is memory-efficient for large files
- YAML: Slower than JSON due to complex features (anchors, type inference)
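- YAML: Anchors and aliases add bookkeeping, since the parser must track anchored nodes and resolve every alias
- For high-throughput APIs, JSON is typically 2-10x faster than YAML, though the exact gap depends on the parser and the payload, so benchmark your own data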
- Binary alternatives like Protocol Buffers or MessagePack outperform all three
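Performance numbers like these are easy to check for yourself. Here's a minimal benchmark sketch with Python's timeit, assuming a synthetic payload of 1,000 small records (not real API traffic). It compares JSON and XML only, since both parsers ship with Python. The timings vary by machine, so the code prints them instead of promising a ratio.

```python
import json
import timeit
import xml.etree.ElementTree as ET

# Synthetic payload: 1,000 small records (an assumption, not real API traffic).
records = [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(1000)]

json_text = json.dumps(records)
xml_text = "<users>" + "".join(
    f'<user><id>{r["id"]}</id><name>{r["name"]}</name>'
    f'<active>{str(r["active"]).lower()}</active></user>'
    for r in records
) + "</users>"

def parse_json():
    return json.loads(json_text)

def parse_xml():
    return ET.fromstring(xml_text)

# Time 50 full parses of each document.
json_seconds = timeit.timeit(parse_json, number=50)
xml_seconds = timeit.timeit(parse_xml, number=50)

print(f"JSON: {json_seconds:.4f}s for 50 parses")
print(f"XML:  {xml_seconds:.4f}s for 50 parses")
```

Swap in a payload that looks like your real data before drawing conclusions; record shape and size change the picture considerably.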
File size and verbosity
XML produces the largest files due to repeated tag names. JSON is more compact. YAML can be even smaller than JSON because it omits brackets and quotes, but this depends on the data structure.
- XML: Most verbose - tags appear twice (opening and closing)
- JSON: Compact - keys appear once, minimal syntax
- YAML: Often smallest - no brackets, quotes optional for many values
- For the same data: XML is typically 30-50% larger than JSON, depending on tag names and nesting
- Gzip compression reduces differences significantly
- Network transfer: JSON is preferred for API responses
- Storage: Consider compression for any format with large datasets
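To see the size difference and the effect of compression on your own data, something like the following sketch works. The 500 records are synthetic and deliberately repetitive, so treat the printed numbers as an illustration, not a benchmark.

```python
import gzip
import json

# Synthetic, repetitive records (an assumption for illustration only).
records = [{"id": i, "name": f"user{i}", "city": "Berlin"} for i in range(500)]

json_bytes = json.dumps(records, separators=(",", ":")).encode("utf-8")
xml_bytes = ("<users>" + "".join(
    f'<user><id>{r["id"]}</id><name>{r["name"]}</name><city>{r["city"]}</city></user>'
    for r in records
) + "</users>").encode("utf-8")

# Raw and gzip-compressed sizes for both encodings.
sizes = {
    "json": len(json_bytes),
    "xml": len(xml_bytes),
    "json+gzip": len(gzip.compress(json_bytes)),
    "xml+gzip": len(gzip.compress(xml_bytes)),
}
for label, size in sizes.items():
    print(f"{label:>10}: {size} bytes")
```

Repeated tag names are exactly the kind of redundancy gzip removes, which is why the gap between the formats tends to shrink after compression.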
Human readability and editing
YAML wins for human readability when properly formatted. Its clean syntax makes it ideal for configuration files that developers edit manually. JSON requires careful attention to commas and brackets. XML structure is clear but verbose.
- YAML: Excellent readability, minimal visual noise
- YAML: Comments support documentation inline
- JSON: Good readability when formatted, but no comments
- JSON: Missing commas and brackets cause common errors
- XML: Clear structure but tag repetition reduces readability
- XML: Comments can document complex structures
- For hand-edited files: YAML > XML > JSON
- For generated files: JSON > YAML > XML (easier to validate)
Comment support
Comments are essential for configuration files. Standard JSON does not support comments, which limits its use for configuration. XML and YAML both have comment syntax.
- JSON: No comments in standard specification
- JSON5 and JSONC: Extensions that add comment support
- Strict parsers reject comments; tools that accept JSONC typically strip them before parsing
- XML: <!-- comment --> for block comments
- YAML: # for line comments (no block comments)
- Configuration files need comments for documentation
- This limitation makes JSON less suitable for configs
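Here's a small sketch of what "no comments" means in practice: Python's strict parser rejects a commented file, and a hypothetical strip_line_comments helper shows roughly what JSONC-aware tools do before parsing. It handles only // line comments and is not a full JSONC implementation.

```python
import json

def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside of string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

commented = """{
  // where the API lives
  "url": "http://example.com",  // the slashes inside the string must survive
  "port": 8080
}"""

try:
    json.loads(commented)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

config = json.loads(strip_line_comments(commented))
print(strict_ok, config)
```

The string tracking matters: a naive search for // would cut the URL in half.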
Schema validation and typing
XML has the most mature schema ecosystem with XSD (XML Schema Definition) and DTD (Document Type Definition). JSON Schema exists but is less widely adopted. YAML typically relies on application-level validation.
- XML: XSD provides comprehensive type validation
- XML: Namespaces prevent naming conflicts in complex documents
- JSON: JSON Schema defines structure and types
- JSON: Many validators available but adoption varies
- YAML: No native schema language
- YAML: Often validated by application code or JSON Schema
- For strict contracts: XML with XSD is most robust
- For flexible APIs: JSON Schema provides sufficient validation
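Application-level validation, the fallback mentioned for YAML, can be as simple as the sketch below. The USER_SHAPE mapping and the validate function are hypothetical. This is deliberately not JSON Schema or XSD, just a hand-rolled check of required fields and types.

```python
import json

# Hypothetical expected shape: field name -> required Python type.
USER_SHAPE = {"name": str, "age": int, "active": bool}

def validate(record, shape):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, expected in shape.items():
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        # bool is a subclass of int in Python, so reject it for int fields.
        if expected is int and isinstance(value, bool):
            problems.append(f"{field}: expected int, got bool")
        elif not isinstance(value, expected):
            problems.append(
                f"{field}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems

good = json.loads('{"name": "Alice", "age": 30, "active": true}')
bad = json.loads('{"name": "Alice", "age": "30"}')

print(validate(good, USER_SHAPE))
print(validate(bad, USER_SHAPE))
```

Once checks like this grow past a handful of fields, a real schema language usually pays for itself.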
Use case: APIs and web services
JSON dominates modern web APIs. REST APIs almost universally use JSON for request and response bodies. SOAP services historically used XML, but new APIs rarely choose XML unless integrating with legacy systems.
- JSON: Standard for REST APIs, GraphQL, and modern web services
- JSON: Native to JavaScript, no transformation needed in browsers
- JSON: Smaller payloads reduce bandwidth and latency
- XML: Still used in SOAP services and enterprise integrations
- XML: Required for some industry standards (healthcare, finance)
- YAML: Rarely used for APIs due to parsing overhead
- Recommendation: Use JSON for new APIs unless standards require XML
Use case: Configuration files
YAML has become the standard for configuration files in modern tools. Kubernetes, Docker Compose, GitHub Actions, and many other tools use YAML. The comment support and clean syntax make it ideal for files that developers read and edit.
- YAML: Kubernetes manifests, Docker Compose, CI/CD pipelines
- YAML: Comments explain configuration options
- YAML: Multi-line strings for embedded scripts or templates
- JSON: Used where programmatic generation is primary
- JSON: package.json, tsconfig.json (tooling often scaffolds and rewrites these, though both are also edited by hand)
- XML: pom.xml (Maven), web.xml (older Java configurations)
- TOML: Gaining popularity for simpler configs (Cargo.toml, pyproject.toml)
- Recommendation: YAML for human-edited configs, JSON for generated configs
Use case: Data interchange and storage
For storing and exchanging data between systems, JSON is the practical choice. It balances human readability with parsing efficiency. XML is used when schema validation or document markup is essential.
- JSON: Document databases such as MongoDB store JSON-like documents (BSON internally)
- JSON: Log formats (structured logging)
- JSON: Message queues and event streaming
- XML: Document-centric data (books, articles with markup)
- XML: Configuration with complex validation requirements
- YAML: Not recommended for data interchange (whitespace sensitivity causes issues)
- For databases: JSON or binary formats (BSON, Protocol Buffers)
- For documents: XML with proper schema validation
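Structured logging is a good illustration of JSON as an interchange format. Here's a minimal sketch that writes one JSON object per line and reads it back. The log_event helper and its field names are hypothetical, and an in-memory stream stands in for a real log file or queue.

```python
import io
import json

def log_event(stream, level, message, **fields):
    """Write one event as a single JSON line (newline-delimited JSON)."""
    event = {"level": level, "message": message, **fields}
    stream.write(json.dumps(event) + "\n")

# An in-memory stream stands in for a log file or message queue.
stream = io.StringIO()
log_event(stream, "info", "user logged in", user="alice")
log_event(stream, "error", "payment failed", order_id=42, retryable=True)

# Consumers can parse each line independently of the others.
events = [json.loads(line) for line in stream.getvalue().splitlines()]
errors = [e for e in events if e["level"] == "error"]
print(errors)
```

Because json.dumps escapes newlines inside strings, each event stays on one line, and a corrupted line doesn't take the rest of the file with it.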
Tooling and ecosystem support
All three formats have mature tooling, but JSON has the broadest support in modern development environments. Every language has JSON parsers in its standard library or readily available.
- JSON: Built-in support in JavaScript, Python, Go, and most languages
- JSON: Extensive IDE support with syntax highlighting and validation
- XML: Mature libraries for parsing, XPath queries, XSLT transforms
- XML: Strong enterprise tooling and IDE support
- YAML: Good library support (js-yaml, PyYAML, go-yaml)
- YAML: IDE support varies; some editors struggle with complex YAML
- Online tools: Formatters and validators available for all three
- Consider team familiarity when choosing a format
Common pitfalls and issues
Each format has gotchas that cause bugs and frustration. Understanding these helps you avoid common mistakes and choose the right format for your needs.
- JSON: Trailing commas cause parse errors
- JSON: No comments means documentation lives elsewhere
- JSON: Large numbers may lose precision (JavaScript Number limits)
- XML: Namespace complexity in mixed documents
- XML: Entity encoding issues (&, <, >)
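- YAML: Indentation errors can fail silently: a mis-indented key often still parses, just into a different structure
- YAML: Norway problem - an unquoted NO is read as boolean false by YAML 1.1 parsers
- YAML: Type coercion surprises (an unquoted 1.10 is read as the number 1.1, and 1.0 comes back as 1 in some parsers)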
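A few of the JSON and XML pitfalls are easy to reproduce with the standard library. In this sketch the precision loss is simulated by converting through a Python float, since Python integers are exact while JavaScript's Number is an IEEE 754 double.

```python
import json
from xml.sax.saxutils import escape

# 1. Trailing commas are rejected by strict JSON parsers.
try:
    json.loads('{"name": "Alice",}')
    trailing_comma_ok = True
except json.JSONDecodeError:
    trailing_comma_ok = False

# 2. Integers above 2**53 cannot be represented exactly as IEEE 754 doubles,
#    which is what JavaScript's Number uses. Python ints are exact, so the
#    loss is simulated here by converting through float.
big_id = 9007199254740993  # 2**53 + 1
as_double = int(float(big_id))

# 3. &, < and > must be escaped before being placed in XML text.
raw_text = "Tom & Jerry <3"
safe_text = escape(raw_text)

print(trailing_comma_ok)
print(big_id, "->", as_double)
print(safe_text)
```

A common workaround for the second pitfall is to send large identifiers as strings.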
Security considerations
All three formats can have security implications when parsing untrusted input. Understanding the risks helps you use each format safely.
- JSON: Generally safe, but deeply nested or very large input can exhaust the stack or memory in JSON.parse and similar parsers (a denial-of-service risk)
- XML: XXE (XML External Entity) attacks can read files or cause SSRF
- XML: Always disable external entity processing when parsing untrusted XML
- YAML: Code execution risk with unsafe loaders (!!python/object)
- YAML: Always use safe loaders when parsing untrusted YAML
- All: Limit input size to prevent memory exhaustion
- All: Validate structure before processing
- Recommendation: Never parse untrusted input without proper safeguards
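The "limit input size" and "validate structure" advice can be sketched for JSON as follows. The safe_loads and check_nesting helpers are hypothetical, and the limits are arbitrary defaults you would tune for your own service.

```python
import json

def check_nesting(text, max_depth):
    """Scan raw JSON text and fail fast if brackets nest deeper than max_depth."""
    depth = 0
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "[{":
            depth += 1
            if depth > max_depth:
                raise ValueError(f"nesting deeper than {max_depth}")
        elif ch in "]}":
            depth -= 1

def safe_loads(text, max_chars=1_000_000, max_depth=32):
    """Parse JSON only after size and depth checks pass (limits are arbitrary)."""
    if len(text) > max_chars:
        raise ValueError(f"input longer than {max_chars} characters")
    check_nesting(text, max_depth)
    return json.loads(text)

print(safe_loads('{"a": [1, 2, {"b": "]]]["}]}'))
```

The scan ignores brackets inside strings, so a value like "]]][" doesn't confuse the depth counter. The same idea (check before you parse) carries over to XML and YAML, on top of the safe-loader settings listed above.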
Making the decision: Choosing the right format
The best format depends on your specific requirements. Consider who creates the data, how it will be processed, and what ecosystem you are working in.
- Choose JSON for: APIs, web services, JavaScript applications, data interchange
- Choose JSON for: Programmatically generated configuration
- Choose XML for: Document markup, strict schema requirements, SOAP services
- Choose XML for: Industry standards requiring XML (HL7, XBRL)
- Choose YAML for: Human-edited configuration files
- Choose YAML for: DevOps tools (Kubernetes, Docker, CI/CD)
- Consider TOML for: Simple configuration with explicit, unambiguous types
- Consider binary formats for: High-performance or bandwidth-constrained scenarios
Conclusion
JSON, XML, and YAML each serve different purposes well. JSON excels at data interchange and web APIs with its compact syntax and fast parsing. XML provides robust document markup and schema validation for enterprise systems. YAML offers the best human readability for configuration files.
Most modern projects use multiple formats: JSON for APIs, YAML for configuration, and occasionally XML for specific integrations. Understanding each format's strengths lets you choose appropriately rather than defaulting to one format for everything.
FAQ
Which format is fastest to parse?
JSON wins, especially in JavaScript. YAML is slowest due to anchors and type inference. For high-throughput APIs, JSON can be 2-10x faster than YAML.
Why does JSON not support comments?
Crockford left them out intentionally. JSON is for data interchange, not configs. If you need comments, use YAML or JSONC.
Can I convert between JSON, XML, and YAML?
Yes, but not always losslessly. JSON to YAML is clean. XML to JSON loses attributes unless you adopt a convention such as {"@attr": "value"}.
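Here's a minimal sketch of one such attribute convention, using "@" for attributes and "#text" for element text. The element_to_dict helper is hypothetical and deliberately simplified: repeated child tags overwrite each other and mixed content is not preserved.

```python
import json
import xml.etree.ElementTree as ET

def element_to_dict(element):
    """Convert an element to a dict, keeping attributes under '@'-prefixed keys.

    Simplifications: repeated child tags overwrite each other, and mixed
    content (text interleaved with children) is not preserved.
    """
    node = {f"@{key}": value for key, value in element.attrib.items()}
    for child in element:
        node[child.tag] = element_to_dict(child)
    text = (element.text or "").strip()
    if text:
        if node:
            node["#text"] = text  # element has attributes or children too
        else:
            return text  # plain leaf: just the text
    return node

xml_text = '<person id="7"><name lang="en">Alice</name><age>30</age></person>'
root = ET.fromstring(xml_text)
converted = {root.tag: element_to_dict(root)}
print(json.dumps(converted))
```

Notice that age comes out as the string "30": XML has no numeric type without a schema, so type information is another thing the conversion can't recover on its own.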
Is YAML just JSON with different syntax?
Not quite. YAML 1.2 is designed as a superset of JSON, so JSON documents are generally valid YAML. But YAML adds comments, anchors, and type inference. More features = more complexity = more bugs.
Why is XML still used if JSON is simpler?
XSD validation, namespaces, and industry standards (healthcare, finance). Plus, legacy systems aren't going anywhere. When a bank requires XML, you use XML.
What is the Norway problem in YAML?
YAML 1.1 parsers turn an unquoted NO into false. The country code for Norway becomes a boolean. Always quote: country: "NO". This has caused real production bugs.
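If you generate YAML by hand (templating, string formatting), a guard like the sketch below can catch this. The pattern follows the boolean spellings listed in the YAML 1.1 bool type; individual parsers may implement only a subset. The quote_if_ambiguous helper is hypothetical and covers booleans only, not numbers or nulls.

```python
import re

# Plain scalars that YAML 1.1 resolves to booleans (per the YAML 1.1 bool type).
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)

def quote_if_ambiguous(value: str) -> str:
    """Wrap a string in double quotes if a YAML 1.1 parser could read it as a boolean."""
    if YAML11_BOOL.fullmatch(value):
        return f'"{value}"'
    return value

country_codes = ["DE", "NO", "SE"]
lines = [f"country: {quote_if_ambiguous(code)}" for code in country_codes]
print("\n".join(lines))
```

In real code, letting a YAML library serialize the values is the safer route, since it applies the quoting rules for every type.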
Should I use XML or JSON for configuration files?
For hand-edited configs, neither: use YAML, which has comments and clean syntax. Use JSON for generated configs, and XML only if you need XSD validation.
Which format should I use for a new REST API?
JSON. It's the standard, it's fast, and every tool expects it. XML only if the other system requires it.