XML in 2026: When I Still Reach for It Over JSON

Everyone says JSON won. But I still use XML regularly - for document processing, enterprise integrations, and anywhere I need real validation. Here's when and why.

2025-07-10, 12 min

The project that made me appreciate XML

I used to roll my eyes at XML. Then I worked on a healthcare integration project where data validation was legally required. JSON Schema couldn't cut it. XSD saved us weeks of manual validation code.

Now I see XML and JSON as tools for different jobs. JSON for APIs and lightweight data. XML when you need schema validation, document structure, or you're integrating with systems that have used XML for decades. Knowing both makes you more versatile.

XML vs JSON: When to Choose Which

The choice between XML and JSON is not about which is better—each format serves different purposes. Understanding their strengths helps you make informed architectural decisions.

Choose JSON for: REST APIs, JavaScript applications, simple data structures, mobile apps, and when minimizing payload size matters
Choose XML for: Document-centric data, strict schema validation, mixed content (text with markup), namespaces for vocabulary mixing, and regulated industries requiring formal schemas
XML attributes carry metadata separately from element content; JSON has no equivalent distinction
XML namespaces allow mixing vocabularies; JSON has no native equivalent
XML supports document validation via XSD, DTD, or RELAX NG; JSON Schema is less mature and less widely adopted
XML can preserve whitespace and formatting when needed (xml:space="preserve"); in JSON, whitespace is significant only inside string values
XML comments are part of the format; JSON does not support comments
Both support Unicode and can represent hierarchical data structures
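
To make the mixed-content and attribute points concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The sample sentence and the JSON shape are invented for illustration; JSON has no standard convention for this, so the structure below is just one possible encoding.

```python
import json
import xml.etree.ElementTree as ET

# Mixed content: running text with inline markup, plus an attribute as metadata.
para = ET.fromstring('<p lang="en">Take <b>two</b> tablets daily.</p>')

# ElementTree exposes the text before a child as .text
# and the text that follows a child as that child's .tail.
bold = para[0]
pieces = [para.text, bold.text, bold.tail]
language = para.get("lang")

# One possible (invented) JSON encoding of the same sentence. The format gives
# no guidance here, so producer and consumer must agree on the shape themselves.
as_json = json.dumps({
    "lang": language,
    "content": ["Take ", {"b": "two"}, " tablets daily."],
})

print(pieces)
print(as_json)
```

The XML version needs no extra agreement between parties: any XML parser hands back the same text, child, and attribute structure.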

Understanding XML Namespaces

Namespaces solve the problem of element name collisions when combining XML vocabularies. Each vocabulary is identified by a URI (usually written as a URL), and elements are matched by that URI rather than by the prefix used in the document.

Default namespace: xmlns="http://example.com/ns" applies to unprefixed elements
Prefixed namespace: xmlns:prefix="http://example.com/ns" requires explicit prefix:element syntax
Namespace scope: Declaration applies to the element and all descendants unless overridden
The URI is just an identifier—it does not need to be a valid URL or resolve to anything
Common namespaces: XHTML (http://www.w3.org/1999/xhtml), SOAP, XLink, and SVG
Namespace-aware parsers distinguish elements by (namespace, local-name) pairs
Attributes can be namespace-qualified too, but only via a prefix: unprefixed attributes belong to no namespace, even when a default namespace is in scope
Multiple namespaces can coexist in a single document using different prefixes
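
Here is a small sketch of how a namespace-aware parser keeps two "title" elements apart, using Python's standard xml.etree.ElementTree. The two example.com URIs and the catalog document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two vocabularies that both define <title>.
DOC = """<catalog xmlns="http://example.com/books"
         xmlns:p="http://example.com/people">
  <book>
    <title>XML in Practice</title>
    <p:author>
      <p:title>Dr.</p:title>
      <p:name>A. Writer</p:name>
    </p:author>
  </book>
</catalog>"""

root = ET.fromstring(DOC)

# The prefixes used for lookup are local to this code; only the URIs must match.
ns = {"b": "http://example.com/books", "p": "http://example.com/people"}

book_title = root.find("b:book/b:title", ns).text
honorific = root.find("b:book/p:author/p:title", ns).text

# Internally each element is identified by the (namespace, local-name) pair.
print(root.tag)
print(book_title, "/", honorific)
```

Note that the lookup uses the prefix "b" even though the document uses a default namespace for the same URI. The prefix is only shorthand; the URI is what identifies the vocabulary.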

XML Validation with XSD (XML Schema Definition)

XSD defines the structure, data types, and constraints of XML documents with precision. Features such as its built-in type system and key/keyref identity constraints go beyond what JSON Schema offers.

Element types: Define complex types with sequences, choices, or all groups
Data types: XSD 1.0 defines 44 built-in types (string, integer, decimal, date, boolean, etc.)
Constraints: minOccurs, maxOccurs, minLength, maxLength, pattern (regex), enumeration
Custom types: Extend or restrict base types to create domain-specific types
Namespace support: XSD validates namespaced documents correctly
Identity constraints: unique and key enforce uniqueness; keyref enforces referential integrity
Substitution groups: Allow polymorphic element substitution
Annotations: Document your schema with appinfo and documentation elements
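
As a rough illustration of typed elements, an enumeration, and a required attribute, here is a sketch that validates a record with lxml. The patient schema and sample documents are invented for this example, not taken from any real healthcare standard. lxml is a third-party package, so the import is guarded and the helpers only work when it is installed.

```python
try:
    from lxml import etree  # third-party: pip install lxml
except ImportError:
    etree = None  # the helpers below need lxml

# Hypothetical schema: typed child elements, an enumeration, a required attribute.
PATIENT_XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="patient">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="birthDate" type="xs:date"/>
        <xs:element name="bloodType">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="A"/>
              <xs:enumeration value="B"/>
              <xs:enumeration value="AB"/>
              <xs:enumeration value="O"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="id" type="xs:positiveInteger" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

GOOD = (b'<patient id="42"><name>Jane Doe</name>'
        b'<birthDate>1980-02-29</birthDate><bloodType>AB</bloodType></patient>')
# Wrong date format and a blood type outside the enumeration.
BAD = (b'<patient id="42"><name>Jane Doe</name>'
       b'<birthDate>29/02/1980</birthDate><bloodType>X</bloodType></patient>')


def build_schema():
    """Compile the XSD once; reuse the result for every document."""
    return etree.XMLSchema(etree.fromstring(PATIENT_XSD))


def is_valid(xml_bytes, schema):
    """Return True if the document conforms to the schema."""
    return schema.validate(etree.fromstring(xml_bytes))
```

With lxml installed, is_valid(GOOD, build_schema()) should accept the first document and reject the second, with no hand-written checks for the date format or the allowed blood types.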

Validation Alternatives: DTD and RELAX NG

XSD is not the only validation option. DTD (Document Type Definition) and RELAX NG offer alternatives with different trade-offs.

DTD: Original XML validation method; simpler but limited (no namespaces, few data types)
DTD is still used for entity declarations and basic structure validation
RELAX NG: Modern alternative with simpler syntax than XSD and full namespace support
RELAX NG Compact: An even more readable non-XML syntax for RELAX NG schemas
Schematron: Rule-based validation for business logic constraints XSD cannot express
Many projects combine schemas: XSD for structure, Schematron for business rules
For new projects, prefer XSD for widespread tool support or RELAX NG for simplicity
Legacy systems often use DTDs; migration to XSD is possible but may not be worth the effort

DOM Parsing: Full Document Access

DOM (Document Object Model) parsing loads the entire XML document into memory as a tree structure. This provides random access to any element but requires memory proportional to document size.

Loads complete document into memory as navigable tree
Allows random access: jump to any element, traverse up/down/sideways
Supports document modification: add, remove, and change elements
XPath queries work on DOM trees for powerful element selection
Memory usage: typically several times the file size; 5-10x is a common rule of thumb, and the real figure depends on the parser and the document
Best for: Small to medium documents (under 10MB), documents needing modification, complex queries
Available in all major languages: JavaScript (DOMParser), Python (xml.dom.minidom), Java (javax.xml.parsers)
Modern browsers provide native DOM parsing for XML just like HTML
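
A minimal sketch of what random access and modification look like with a DOM, using Python's standard xml.dom.minidom. The orders document is invented for illustration.

```python
from xml.dom import minidom

# Hypothetical input; the whole document is parsed into an in-memory tree.
doc = minidom.parseString(
    '<orders>'
    '<order id="1"><status>new</status></order>'
    '<order id="2"><status>new</status></order>'
    '</orders>'
)

# Random access: jump straight to the second order.
orders = doc.getElementsByTagName("order")
second = orders[1]

# Modification: change text content and add an attribute in place.
second.getElementsByTagName("status")[0].firstChild.data = "shipped"
second.setAttribute("priority", "high")

# Traversal works in every direction, including back up to the parent.
parent_name = second.parentNode.tagName

# Serialize the modified tree back to XML.
xml_out = doc.documentElement.toxml()
print(xml_out)
```

All of this is possible only because the whole tree is in memory, which is exactly the cost the next section avoids.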

SAX Parsing: Memory-Efficient Streaming

SAX (Simple API for XML) is an event-based parser that reads XML sequentially without building a tree. It calls handler functions when encountering elements, attributes, and text. This uses minimal memory but only allows forward-only reading.

Event-driven: Your code receives callbacks for startElement, endElement, characters, etc.
Constant memory: Processes gigabyte files with minimal RAM
Forward-only: Cannot go back or look ahead in the document
No modification: Read-only; you cannot change the document
Best for: Large files, data extraction tasks, streaming scenarios
Requires managing state: You track context (current element path) manually
Pull parsers (StAX in Java, ElementTree.iterparse or xml.dom.pulldom in Python) offer similar efficiency with an iterator-style API
For validation during SAX parsing, use a validating parser with schema attached
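
The callbacks and the manual state tracking can be sketched with Python's standard xml.sax module. The catalog structure and the PriceHandler class are invented for this example.

```python
import xml.sax


class PriceHandler(xml.sax.ContentHandler):
    """Collects every /catalog/item/price value without building a tree."""

    def __init__(self):
        super().__init__()
        self.path = []     # current element path, maintained by hand
        self.buffer = []   # text chunks of the element being read
        self.prices = []

    def startElement(self, name, attrs):
        self.path.append(name)
        self.buffer = []

    def characters(self, content):
        # May be called several times for one text node, so accumulate.
        self.buffer.append(content)

    def endElement(self, name):
        if self.path == ["catalog", "item", "price"]:
            self.prices.append(float("".join(self.buffer)))
        self.path.pop()
        self.buffer = []


SAMPLE = (b"<catalog>"
          b"<item><name>A</name><price>9.50</price></item>"
          b"<item><name>B</name><price>20</price></item>"
          b"</catalog>")

handler = PriceHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.prices)
```

The handler only ever holds the current path and one text buffer, which is why memory stays flat no matter how many items the file contains. For a real file, xml.sax.parse accepts a filename or file object in place of the in-memory bytes.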

Choosing Between DOM and SAX

The choice between DOM and SAX depends on your use case. Modern applications often use hybrid approaches, processing large documents in chunks while using DOM for complex subsections.

File size under 10MB with complex queries? Use DOM
File size over 100MB? SAX or streaming parser is essential
Need to modify the document? DOM (or load section, modify, stream out)
Extracting specific data from large files? SAX with state machine
Web services with medium payloads? DOM is usually fine
Processing log files or data feeds? SAX for efficiency
XPath queries required? DOM or streaming XPath libraries
Memory-constrained environment? SAX or pull parser
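
One way to get the hybrid approach in Python is ElementTree.iterparse: it streams the file but hands you each completed element as a small tree. The sketch below assumes a made-up feed of record elements, each with an amount child.

```python
import io
import xml.etree.ElementTree as ET


def total_amount(stream):
    """Sum <amount> over all <record> elements, one record at a time."""
    total = 0.0
    # "end" events fire once an element and all its children are fully parsed.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "record":
            # Inside one record we get tree-style access (find, findtext, ...).
            total += float(elem.findtext("amount"))
            # Drop the record's children and text so memory does not grow.
            # Emptied <record> shells still hang off the root; very long feeds
            # may want to remove those from the root as well.
            elem.clear()
    return total


feed = io.BytesIO(
    b"<feed>"
    b"<record><amount>1.5</amount></record>"
    b"<record><amount>2.5</amount></record>"
    b"</feed>"
)
result = total_amount(feed)
print(result)
```

Compared with raw SAX there is no path stack to maintain, and compared with DOM only one record's subtree is fully populated at any moment.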

Modern XML Processing Libraries

Modern XML libraries offer better APIs than the standard DOM and SAX implementations. They provide cleaner syntax, better error messages, and often better performance.

Python: lxml (C-based, fast, XPath 1.0, XSLT 1.0), ElementTree (stdlib, simple API)
Java: JAXB (binding to objects), StAX (pull parser), Jackson XML (JSON-like API)
JavaScript: fast-xml-parser (fast, configurable), xml2js (promises, simple), libxmljs (native bindings)
.NET: LINQ to XML (XDocument and XElement, a modern API with built-in querying), XmlReader (streaming)
Go: encoding/xml (stdlib), etree (ElementTree-like API)
Rust: quick-xml (fast streaming), serde-xml-rs (serde integration)
For web browsers: DOMParser and XMLSerializer are built-in and efficient
Consider schema-driven code generation (JAXB, xjc) for strongly-typed access to known schemas

Where XML Still Dominates

Despite JSON popularity, XML remains the standard format in many domains. Knowing these areas helps you understand where XML expertise is valuable.

Enterprise integration: SOAP services, EDI, B2B communications
Document formats: OOXML (Office), ODF (LibreOffice), EPUB, DocBook
Configuration: Maven pom.xml, Ant, Spring XML config, Android layouts
Graphics: SVG (scalable vector graphics) used everywhere on the web
Feeds: RSS and Atom for content syndication
Healthcare: HL7 CDA, FHIR (which defines both XML and JSON representations), DICOM metadata
Finance: FpML, XBRL for regulatory reporting, ISO 20022 messaging
Government and legal: Often mandated by regulation for data exchange

Best Practices for XML in 2026

Whether maintaining legacy systems or choosing XML for new projects, following best practices ensures maintainability and interoperability.

Always use a schema (XSD preferred) for production XML formats
Validate early: Validate input at system boundaries before processing
Namespace everything: Even if you control all vocabularies, namespaces prevent future conflicts
Use meaningful element names: Self-documenting XML reduces need for external documentation
Prefer elements over attributes for data; use attributes for metadata
Handle encoding correctly: Declare encoding in XML declaration, prefer UTF-8
Pretty-print for human consumption, minify for transmission if size matters
Use XML tools for XML: Regex parsing XML is fragile and error-prone
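
The encoding and "use XML tools" points can be shown in a few lines of Python with the standard library's ElementTree (the xml_declaration argument to tostring needs Python 3.8 or later). The note element and its text are invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Text with a non-ASCII character and characters that are special in XML.
note = ET.Element("note", {"lang": "en"})
note.text = "Café & bar <draft>"

# Let the library escape and encode; ask explicitly for the XML declaration.
data = ET.tostring(note, encoding="utf-8", xml_declaration=True)
print(data.decode("utf-8"))

# Round trip: a real parser restores the original text exactly.
restored = ET.fromstring(data).text

# By contrast, naive string building produces a document that is not well-formed.
naive = "<note>" + note.text + "</note>"
try:
    ET.fromstring(naive)
    naive_ok = True
except ET.ParseError:
    naive_ok = False
```

The serializer handles the ampersand, the angle brackets, and the UTF-8 bytes; the hand-built string breaks on the first unescaped ampersand.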

Conclusion

XML is not dead—it is specialized. While JSON handles most API and configuration needs today, XML remains essential for document processing, strict validation, enterprise integration, and regulated industries.

Understanding XML deeply—namespaces, validation with XSD, and efficient parsing strategies—makes you effective in the many domains where XML is the standard. The choice between DOM and SAX parsing depends on your specific use case: random access and modification favor DOM, while large files and streaming scenarios demand SAX or pull parsers.

FAQ

Is XML still used in modern development?

Absolutely. Enterprise systems, Office documents, SVG graphics, RSS feeds, healthcare data, financial reporting. It's everywhere once you look. JSON didn't replace XML - they serve different purposes.

What is the main difference between XML and JSON?

XML is for documents with validation, namespaces, and mixed content. JSON is for simple data structures. I use JSON for APIs, XML when I need schema validation or I'm working with document-centric data.

What are XML namespaces and why do they matter?

They prevent name collisions when combining XML vocabularies. Without them, 'title' in one format would conflict with 'title' in another. Essential for large integrations.

When should I use DOM vs SAX parsing?

DOM loads everything into memory - fine for small files. SAX streams through with constant memory - necessary for large files. My rule: under 10MB use DOM, over 100MB use SAX.

What is XSD and why should I use it?

XSD defines the structure and data types for XML documents. It catches invalid data automatically. I use it whenever data correctness matters - it's like TypeScript for XML.

Can I use XML and JSON together?

All the time. Accept JSON from web clients, convert to XML for enterprise backends, convert back. Every language has libraries for this. Pick the right format for each context.
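
A bare-bones sketch of the JSON-to-XML direction in Python, using only the standard library. The dict_to_xml helper and the "item" wrapper name for list entries are my own conventions for this example, not a standard mapping; real integrations usually follow whatever the target schema dictates.

```python
import json
import xml.etree.ElementTree as ET


def dict_to_xml(tag, value):
    """Convert parsed JSON into an element tree (illustrative mapping only).

    Assumes every dict key is a valid XML element name.
    """
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(dict_to_xml(key, child))
    elif isinstance(value, list):
        # JSON arrays have no element name, so wrap each entry in <item>.
        for entry in value:
            elem.append(dict_to_xml("item", entry))
    else:
        # Scalars become text; the JSON type (number vs string) is lost here.
        elem.text = str(value)
    return elem


payload = json.loads('{"id": 7, "tags": ["a", "b"]}')
xml_text = ET.tostring(dict_to_xml("order", payload), encoding="unicode")
print(xml_text)
```

The lost type information is the reason to validate the generated XML against the backend's schema before sending it on.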

What XML library should I use in JavaScript?

Browser: built-in DOMParser. Node.js: fast-xml-parser for speed, xml2js for simplicity. I usually reach for fast-xml-parser these days.

How do I handle large XML files efficiently?

SAX or streaming parsers - they don't load the whole file into memory. Combine with state machines to track position. For Java, StAX. For Python, ElementTree.iterparse or xml.sax. Don't try DOM on gigabyte files.

Martin Šikula

Founder of CodeUtil. Web developer building tools I actually use. When I'm not coding, I experiment with productivity techniques (with mixed success).
