Introduction to XML
XML (Extensible Markup Language) is a markup language for storing and transporting data in a structured, human-readable format. Unlike HTML, XML has no predefined tags; you define your own.
XML emphasizes self-describing data and strict structure, making it suitable for data interchange, configuration, and document storage across heterogeneous systems.
Basic XML Example
<note>
<to>Alice</to>
<from>Bob</from>
<heading>Reminder</heading>
<body>Don't forget the meeting at 10 AM</body>
</note>
XML Basics
Elements and Tags
<tagname>Content</tagname>
Example: <name>John</name>
Attributes
<note date="2026-02-04">
<to>Alice</to>
</note>
Comments
<!-- This is a comment in XML -->
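Well-formed XML requires proper nesting and valid tag names (a name cannot start with a digit or contain spaces). The parser passes whitespace inside elements through to the application, which decides whether it is significant.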
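To show how elements, attributes, and comments look from code, here is a minimal sketch using the JDK's built-in DOM parser. The class name BasicsDemo and the helper parse are illustrative names, not part of any library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Illustrative sketch: BasicsDemo and parse() are hypothetical names.
final class BasicsDemo {
    // Parses an XML string into an in-memory DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!-- a comment --><note date=\"2026-02-04\"><to>Alice</to></note>";
        Element note = parse(xml).getDocumentElement();
        System.out.println(note.getTagName());           // element name: note
        System.out.println(note.getAttribute("date"));   // attribute value: 2026-02-04
        // Text content of the first <to> child: Alice
        System.out.println(note.getElementsByTagName("to").item(0).getTextContent());
    }
}
```

The comment before the root element is legal XML and does not affect the element tree that the code reads.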
XML Structure Rules
- XML must have a single root element.
- Tags must be properly nested.
- XML is case-sensitive.
- Attribute values must be quoted.
- All elements must have a closing tag or be self-closing.
Example of Properly Nested XML
<library>
<book>
<title>Java Basics</title>
<author>John</author>
</book>
</library>
These rules ensure XML is well-formed. Validation against a DTD or XSD goes further: it checks that the document uses the expected elements and attributes and, with XSD, the expected data types.
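The rules above can be checked mechanically, because any XML parser rejects a document that is not well-formed. The following sketch assumes the JDK's DOM parser; WellFormedCheck and isWellFormed are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative sketch: WellFormedCheck is a hypothetical helper class.
final class WellFormedCheck {
    // Returns true if the parser accepts the string as well-formed XML.
    static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors and keeps the parser from printing to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a well-formedness violation is a fatal parse error
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWellFormed("<library><book><title>Java Basics</title></book></library>"));
        System.out.println(isWellFormed("<a><b></a></b>"));         // improper nesting
        System.out.println(isWellFormed("<a/><b/>"));               // two root elements
        System.out.println(isWellFormed("<Note></note>"));          // case mismatch
        System.out.println(isWellFormed("<note date=2026></note>")); // unquoted attribute value
    }
}
```

Only the first document passes; each of the others breaks one rule from the list.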
XML Data Types
XML itself has no data types: all content is stored as text. To constrain structure and values, you can use:
- DTD (Document Type Definition)
- XSD (XML Schema Definition)
Example: DTD
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
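As a rough sketch of how the DTD above is enforced, the JDK's DOM parser can validate a document against its internal DTD when setValidating(true) is set. DtdValidation and isValid are hypothetical names. The error handler is needed because validation errors are recoverable by default and would otherwise only be reported, not thrown.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative sketch: DtdValidation is a hypothetical helper class.
final class DtdValidation {
    // The internal DTD from the example above.
    static final String DTD =
            "<!DOCTYPE note ["
          + "<!ELEMENT note (to,from,heading,body)>"
          + "<!ELEMENT to (#PCDATA)>"
          + "<!ELEMENT from (#PCDATA)>"
          + "<!ELEMENT heading (#PCDATA)>"
          + "<!ELEMENT body (#PCDATA)>"
          + "]>";

    // Returns true if the root element given in 'body' conforms to the DTD.
    static boolean isValid(String body) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // enables DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // turn validation errors into exceptions
            }
        });
        try {
            builder.parse(new InputSource(new StringReader(DTD + body)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid(
                "<note><to>Alice</to><from>Bob</from><heading>Reminder</heading><body>Hi</body></note>"));
        System.out.println(isValid("<note><to>Alice</to></note>")); // missing required children
    }
}
```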
Example: XSD
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
XSD supports rich type constraints (dates, patterns, ranges) and is preferred for complex validation.
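To illustrate a typed constraint, the sketch below validates a document with the JDK's javax.xml.validation API. The schema is a reduced, hypothetical variant of the one above: it keeps only the to element and adds a date attribute of type xs:date. XsdValidation and isValid are illustrative names.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Illustrative sketch: XsdValidation and its schema are assumptions for this example.
final class XsdValidation {
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
          + "<xs:element name=\"note\">"
          + "<xs:complexType>"
          + "<xs:sequence>"
          + "<xs:element name=\"to\" type=\"xs:string\"/>"
          + "</xs:sequence>"
          + "<xs:attribute name=\"date\" type=\"xs:date\"/>"
          + "</xs:complexType>"
          + "</xs:element>"
          + "</xs:schema>";

    // Returns true if the document conforms to the schema.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
        try {
            // With no custom error handler, the validator throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<note date=\"2026-02-04\"><to>Alice</to></note>"));
        System.out.println(isValid("<note date=\"tomorrow\"><to>Alice</to></note>")); // not an xs:date
    }
}
```

A DTD could require the date attribute to be present, but it could not check that the value is a real calendar date.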
Parsing XML
There are two main ways to read XML in programming:
- DOM Parser (Document Object Model) – Loads the full document into memory as a tree.
- SAX Parser (Simple API for XML) – Event-based; reads the document sequentially and is memory-efficient.
Java DOM Parser Example
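import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.*;
import java.io.File;
public class DomExample {
    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Parse the whole file into an in-memory tree
        Document doc = builder.parse(new File("note.xml"));
        NodeList nList = doc.getElementsByTagName("note");
        for (int i = 0; i < nList.getLength(); i++) {
            Element element = (Element) nList.item(i);
            // Print the text of the first <to> child of each <note>
            System.out.println("To: " + element.getElementsByTagName("to").item(0).getTextContent());
        }
    }
}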
Java SAX Parser Example
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;
public class SaxExample {
    public static void main(String[] args) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        // The handler receives a callback for each parsing event
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                System.out.println("Start Element: " + qName);
            }
        };
        parser.parse("note.xml", handler);
    }
}
Use DOM for smaller files when you need random access, and SAX or StAX for streaming large documents.
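StAX is mentioned above but not shown, so here is a minimal sketch using the JDK's javax.xml.stream API. Unlike SAX, where the parser pushes events to a handler, StAX lets the application pull the next event when it is ready. StaxDemo and elementNames are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Illustrative sketch: StaxDemo is a hypothetical helper class.
final class StaxDemo {
    // Collects element names in document order without building a tree.
    static List<String> elementNames(String xml) throws Exception {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        while (reader.hasNext()) {
            // next() advances to the following event and returns its type
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(reader.getLocalName());
            }
        }
        reader.close();
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Prints [note, to, from]
        System.out.println(elementNames("<note><to>Alice</to><from>Bob</from></note>"));
    }
}
```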
XPath in XML
XPath is used to navigate XML documents and select nodes.
<note>
<to>Alice</to>
<from>Bob</from>
</note>
XPath Examples:
1. /note/to → Selects <to> element
2. //from → Selects all <from> elements
3. /note/* → Selects all child elements of <note>
Java XPath Example
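import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.*;
import org.w3c.dom.Document;
public class XPathExample {
    public static void main(String[] args) throws Exception {
        // Load the document that the expression is evaluated against
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse("note.xml");
        XPath xpath = XPathFactory.newInstance().newXPath();
        String result = xpath.evaluate("/note/to", doc); // string value of the first match
        System.out.println(result); // Alice
    }
}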
XPath supports predicates, attribute selection, and functions for powerful queries.
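The three features just named can be sketched with the same javax.xml.xpath API. The sample document is hypothetical: it extends the library example with id attributes and a second book. XPathQueries and eval are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustrative sketch: XPathQueries and the sample data are assumptions.
final class XPathQueries {
    static final String XML =
            "<library>"
          + "<book id=\"b1\"><title>Java Basics</title><author>John</author></book>"
          + "<book id=\"b2\"><title>XML Basics</title><author>Jane</author></book>"
          + "</library>";

    // Evaluates an XPath expression and returns its string value.
    static String eval(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // Predicate: title of the book whose author is Jane
        System.out.println(eval(XML, "/library/book[author='Jane']/title"));
        // Attribute selection: id of the first book
        System.out.println(eval(XML, "/library/book[1]/@id"));
        // Function: number of book elements
        System.out.println(eval(XML, "count(//book)"));
    }
}
```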
XML Namespaces
Namespaces prevent element name conflicts when combining XML from different sources.
<note xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">
<h:to>Alice</h:to>
<f:table>Desk</f:table>
</note>
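Each prefix maps to a namespace URI. Parsers identify a namespace by its URI, not by its prefix, so the URI must be identical across documents, while the prefix can differ from one document to the next.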
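A small sketch of namespace-aware lookup with the JDK's DOM parser: the element is found by namespace URI and local name, so two documents that use different prefixes for the same URI give the same result. NamespaceDemo and textOf are hypothetical names, and the x prefix in the second document is made up for the comparison.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustrative sketch: NamespaceDemo is a hypothetical helper class.
final class NamespaceDemo {
    static final String HTML_NS = "http://www.w3.org/TR/html4/";

    // Returns the text of the first element with the given namespace URI and local name.
    static String textOf(String xml, String nsUri, String localName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for namespace URI lookups
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagNameNS(nsUri, localName).item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String withH = "<note xmlns:h=\"" + HTML_NS + "\"><h:to>Alice</h:to></note>";
        String withX = "<note xmlns:x=\"" + HTML_NS + "\"><x:to>Alice</x:to></note>";
        // Same URI, different prefixes: both lookups find the element.
        System.out.println(textOf(withH, HTML_NS, "to"));
        System.out.println(textOf(withX, HTML_NS, "to"));
    }
}
```

For brevity the sketch assumes that a match exists; item(0) returns null when none does.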
Advanced XML Topics
XML with XSLT (Transformation)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/note">
<html><body>
<h2><xsl:value-of select="to"/></h2>
</body></html>
</xsl:template>
</xsl:stylesheet>
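The stylesheet above can be applied from Java with the JDK's javax.xml.transform API. This is a minimal sketch: XsltDemo and transform are hypothetical names, and the exact whitespace of the output depends on the XSLT processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative sketch: XsltDemo is a hypothetical helper class.
final class XsltDemo {
    // The stylesheet from the example above.
    static final String XSLT =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:template match=\"/note\">"
          + "<html><body><h2><xsl:value-of select=\"to\"/></h2></body></html>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the result as a string.
    static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The result is an HTML page whose <h2> holds the text of <to>.
        System.out.println(transform("<note><to>Alice</to><from>Bob</from></note>"));
    }
}
```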
XML with Namespaces and Schema Validation
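DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true); // expose namespace URIs and local names
// Note: setValidating(true) enables DTD validation only.
// XSD validation requires factory.setSchema(...) or a javax.xml.validation.Validator.
factory.setValidating(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("file.xml");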
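For XSD validation during a namespace-aware DOM parse, one option is to attach a compiled Schema to the factory with setSchema. The sketch below makes several assumptions: the namespace urn:example:note, the reduced schema, and the names SchemaAwareParse and parseValidated are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative sketch: the schema and namespace URI are assumptions for this example.
final class SchemaAwareParse {
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\""
          + " targetNamespace=\"urn:example:note\" elementFormDefault=\"qualified\">"
          + "<xs:element name=\"note\">"
          + "<xs:complexType><xs:sequence>"
          + "<xs:element name=\"to\" type=\"xs:string\"/>"
          + "</xs:sequence></xs:complexType>"
          + "</xs:element>"
          + "</xs:schema>";

    // Parses the document and throws SAXException if it violates the schema.
    static Document parseValidated(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // schema validation needs namespace awareness
        factory.setSchema(schema);       // XSD validation; setValidating stays false
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // validation errors are recoverable unless rethrown
            }
        });
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseValidated(
                "<n:note xmlns:n=\"urn:example:note\"><n:to>Alice</n:to></n:note>");
        System.out.println(doc.getDocumentElement().getLocalName());
    }
}
```

A document whose elements are not in the target namespace, such as a plain note element, is rejected, because the schema declares no element with that name in no namespace.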
Using CDATA Sections
<note>
<body><![CDATA[Special characters < & > inside content]]></body>
</note>
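From the application's point of view, a CDATA section is just text: the parser returns the same characters that entity escaping would produce. A minimal sketch with the JDK's DOM parser follows; CdataDemo and bodyText are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustrative sketch: CdataDemo is a hypothetical helper class.
final class CdataDemo {
    // Returns the text content of the first <body> element.
    static String bodyText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("body").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String cdata =
                "<note><body><![CDATA[Special characters < & > inside content]]></body></note>";
        String escaped =
                "<note><body>Special characters &lt; &amp; &gt; inside content</body></note>";
        // Both forms yield the same string.
        System.out.println(bodyText(cdata));
        System.out.println(bodyText(cdata).equals(bodyText(escaped)));
    }
}
```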
Advanced XML workflows often combine XSLT transforms, schema validation, and namespace-aware parsing.