How to Read XML Files in Java?
Introduction
Today we are going to learn how to read XML files in Java. Before reading XML files, let us first learn what XML is.
XML Files
XML stands for Extensible Markup Language. XML looks similar to HTML (Hyper Text Markup Language), but in XML the tags are defined by the user rather than fixed in advance. This makes it a convenient way to store data. Most importantly, XML syntax is standardized, so the recipient can still understand the data when you share or transmit it across platforms, whether locally or over the internet.
Sample Structure of XML File
<root>
  <tag1>
    <subtag1>
      <subsubtag1>
      </subsubtag1>
    </subtag1>
    <subtag2>
    </subtag2>
  </tag1>
  <tag2>
  </tag2>
</root>
Example XML Code
<?xml version="1.0"?>
<class>
  <student>
    <id>101</id>
    <firstname>JOE</firstname>
    <lastname>ROOT</lastname>
    <number>66</number>
    <runs>10458</runs>
  </student>
</class>
Output
101
JOE
ROOT
66
10458
Example 2:
<?xml version="1.0"?>
<India>
  <medals>
    <place>4</place>
    <gold>21</gold>
    <silver>16</silver>
    <bronze>23</bronze>
    <total>61</total>
  </medals>
</India>
Output
4
21
16
23
61
We will see how this output is produced when we write the Java code.
Difference between XML and HTML
XML | HTML |
XML stands for Extensible Markup Language | HTML stands for Hyper Text Markup Language |
User-defined tags are used | Pre-defined tags are used |
XML parsers reject documents that contain errors | HTML ignores some errors |
Tag names are case sensitive | Tag names are not case sensitive |
XML tags are extensible | HTML tags are not extensible |
XML is dynamic in nature | HTML is static in nature |
XML stores data | HTML displays data |
Reading XML Files in Java
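Reading an XML file is different from reading a plain text file such as a .txt or .py file, because the markup has to be parsed into elements and values. We use a parser API for this task. Two commonly used APIs are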
- The SAX API
- The DOM API
DOM API
The DOM API provides the classes to read and write an XML file. Using the DOM API, we can create, remove, change, and rearrange nodes. The DOM parser processes the whole XML file and builds a tree of objects (the DOM) in memory. Each node in that tree represents a part of the XML file, such as an element or the text inside it. Because the entire file is loaded into memory, the DOM parser can be slow and memory-hungry for large XML files.
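To make the idea of changing, creating, and removing nodes concrete, here is a minimal sketch. It parses a small XML string rather than a file, and the class name DomEditSketch and its methods parse and edit are names chosen for this illustration only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class DomEditSketch {

    // Parses an XML string into an in-memory DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Changes, creates, and removes nodes, then returns the edited <student> element.
    static Element edit(Document doc) {
        Element student = (Element) doc.getElementsByTagName("student").item(0);

        // Change: overwrite the text of an existing element.
        student.getElementsByTagName("runs").item(0).setTextContent("10500");

        // Create: build a new element and attach it to the tree.
        Element team = doc.createElement("Team");
        team.setTextContent("England");
        student.appendChild(team);

        // Remove: detach a child element from its parent.
        Node number = student.getElementsByTagName("number").item(0);
        student.removeChild(number);

        return student;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<class><student><number>66</number><runs>10458</runs></student></class>";
        Element student = edit(parse(xml));
        System.out.println("runs = " + student.getElementsByTagName("runs").item(0).getTextContent());
        System.out.println("Team = " + student.getElementsByTagName("Team").item(0).getTextContent());
        System.out.println("number elements left = " + student.getElementsByTagName("number").getLength());
    }
}
```

All of these edits happen on the in-memory tree only. The original XML text is not modified unless the tree is written back out.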
XML Code
File Name: D.xml
<?xml version="1.0"?>
<class>
<student>
<number>66</number>
<firstname>Joe</firstname>
<lastname>Root</lastname>
<Team>England</Team>
<runs>10458</runs>
</student>
<student>
<number>18</number>
<firstname>Virat</firstname>
<lastname>Kohli</lastname>
<Team>India</Team>
<runs>8074</runs>
</student>
<student>
<number>11</number>
<firstname>Kumara</firstname>
<lastname>Sangakkara</lastname>
<Team>Sri Lanka</Team>
<runs>12400</runs>
</student>
<student>
<number>26</number>
<firstname>Alastair</firstname>
<lastname>Cook</lastname>
<Team>England</Team>
<runs>12472</runs>
</student>
<student>
<number>19</number>
<firstname>RAHUL</firstname>
<lastname>DRAVID</lastname>
<Team>India</Team>
<runs>13288</runs>
</student>
</class>
Rules for writing XML Files
- XML tag names are case sensitive
- White space inside element content is preserved, so spaces and line breaks in the text are passed on by the parser
- Every XML tag must have a closing tag
- XML parsers do not ignore errors, so we have to be very careful
- Every XML file must have exactly one root element
- All XML elements must be properly nested
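These rules can be checked by handing a piece of text to a parser and seeing whether it is accepted. The following is a small sketch along those lines. The class name WellFormedCheck and the method isWellFormed are invented for this example, and it only tests whether the text is well formed, not whether it matches any schema.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

    // Returns true when the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default error handler so that errors are not printed to System.err.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A properly nested document with one root element.
        System.out.println(isWellFormed("<class><student>Joe</student></class>"));
        // Missing closing tag for <student>.
        System.out.println(isWellFormed("<class><student>Joe</class>"));
        // Opening and closing tag differ in case.
        System.out.println(isWellFormed("<Team>India</team>"));
        // Elements are not properly nested.
        System.out.println(isWellFormed("<a><b></a></b>"));
        // Two root elements.
        System.out.println(isWellFormed("<a/><b/>"));
    }
}
```

Only the first call prints true. Each of the other four breaks one of the rules listed above and is rejected.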
The XML file shown above follows these rules. Let us now read it and print its contents by using the DOM parser.
Example Code:
File Name: DomAPI.java
// A program that uses the DOM parser to read an XML file
import java.io.File;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
public class DomAPI
{
    public static void main(String args[])
    {
        try
        {
            // a File object that points to the XML file on disk
            File file = new File("C:/new/D.xml");
            // an instance of the factory that gives us a document builder
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // an instance of the builder
            DocumentBuilder db = dbf.newDocumentBuilder();
            // parse the XML file into an in-memory document
            Document doc = db.parse(file);
            // merge adjacent text nodes so each element's text is read in one piece
            doc.getDocumentElement().normalize();
            System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
            NodeList nodeList = doc.getElementsByTagName("student");
            // NodeList does not implement Iterable, so it cannot be used in a for-each loop;
            // an indexed for loop is used instead
            for(int it = 0; it < nodeList.getLength(); it++)
            {
                Node node = nodeList.item(it);
                System.out.println("\nNode Name: " + node.getNodeName());
                if(node.getNodeType() == Node.ELEMENT_NODE)
                {
                    Element e = (Element) node;
                    System.out.println("Number: "
                        + e.getElementsByTagName("number").item(0).getTextContent());
                    System.out.println("First Name: "
                        + e.getElementsByTagName("firstname").item(0).getTextContent());
                    System.out.println("Last Name: "
                        + e.getElementsByTagName("lastname").item(0).getTextContent());
                    System.out.println("Team: " + e.getElementsByTagName("Team").item(0).getTextContent());
                    System.out.println("Runs: " + e.getElementsByTagName("runs").item(0).getTextContent());
                }
            }
        }
        catch(Exception ex)
        {
            ex.printStackTrace();
        }
    }
}
Output:
C:\new>javac DomAPI.java
C:\new>java DomAPI
Root element: class
Node Name: student
Number: 66
First Name: Joe
Last Name: Root
Team: England
Runs: 10458
Node Name: student
Number: 18
First Name: Virat
Last Name: Kohli
Team: India
Runs: 8074
Node Name: student
Number: 11
First Name: Kumara
Last Name: Sangakkara
Team: Sri Lanka
Runs: 12400
Node Name: student
Number: 26
First Name: Alastair
Last Name: Cook
Team: England
Runs: 12472
Node Name: student
Number: 19
First Name: RAHUL
Last Name: DRAVID
Team: India
Runs: 13288
C:\new>
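The DOM program calls item(0) for every tag it looks up. If a student entry does not contain one of those tags, item(0) returns null and the following getTextContent() call fails with a NullPointerException. One possible way to guard against this is sketched below. The helper textOf and the class SafeTextSketch are hypothetical names used only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SafeTextSketch {

    // Hypothetical helper: text of the first descendant with the given tag,
    // or the fallback value when no such element exists.
    static String textOf(Element parent, String tag, String fallback) {
        NodeList matches = parent.getElementsByTagName(tag);
        return matches.getLength() > 0 ? matches.item(0).getTextContent() : fallback;
    }

    // Parses an XML string and returns its first <student> element.
    static Element firstStudent(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return (Element) doc.getElementsByTagName("student").item(0);
    }

    public static void main(String[] args) throws Exception {
        // This student has a <firstname> but no <Team>.
        Element student = firstStudent("<class><student><firstname>Joe</firstname></student></class>");
        System.out.println("First Name: " + textOf(student, "firstname", "(missing)"));
        System.out.println("Team: " + textOf(student, "Team", "(missing)"));
    }
}
```

With this input the first line prints Joe and the second prints (missing) instead of throwing an exception.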
Note:
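The packages used in this program (javax.xml.parsers and org.w3c.dom) are part of the JDK, so no extra library should normally be needed. If the program still does not work on your setup, download the dom-2.3.0-jaxb-1.0.6.jar file and add it to the classpath.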
SAX Parser
SAX stands for Simple API for XML. The SAX parser does not read an XML file as a tree. It reads the file sequentially, and whenever it comes across an opening tag, a closing tag, or character data, it fires an event. For this reason the SAX parser is also called an event-based parser.
Because SAX is a streaming interface for XML, an XML file is parsed in order from the top of the document all the way down to the closing tag of the root element.
The SAX parser does not load the whole XML file into memory, and it does not create an object representation of the XML document. Instead, it uses callback methods to notify the client of the structure of the document as it reads. Compared to the DOM parser, it is generally quicker and requires less memory.
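The callback idea can be sketched with a handler that keeps only a running total instead of the whole document. This is an illustration under some assumptions: the XML comes from a string rather than a file, and the names SaxTotalSketch, RunsHandler, and scan are made up for the example. The text is collected in a StringBuilder because the parser is allowed to deliver the text of one element in several characters() calls.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTotalSketch {

    // Handler that remembers only two numbers while the document streams past.
    static class RunsHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        long totalRuns = 0;
        int students = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            text.setLength(0); // start collecting text for the new element
            if (qName.equals("student")) {
                students++;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called more than once per element
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("runs")) {
                totalRuns += Long.parseLong(text.toString().trim());
            }
        }
    }

    // Streams the XML through the handler and returns it with its totals filled in.
    static RunsHandler scan(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        RunsHandler handler = new RunsHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<class>"
                + "<student><firstname>Joe</firstname><runs>10458</runs></student>"
                + "<student><firstname>Virat</firstname><runs>8074</runs></student>"
                + "</class>";
        RunsHandler result = scan(xml);
        System.out.println("Students: " + result.students);    // Students: 2
        System.out.println("Total runs: " + result.totalRuns); // Total runs: 18532
    }
}
```

Nothing except the two counters and the text of the current element is held in memory, which is why this style suits large files.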
Example XML File
File Name : D.xml
<?xml version="1.0"?>
<class>
<student>
<number>66</number>
<firstname>Joe</firstname>
<lastname>Root</lastname>
<Team>England</Team>
<runs>10458</runs>
</student>
<student>
<number>18</number>
<firstname>Virat</firstname>
<lastname>Kohli</lastname>
<Team>India</Team>
<runs>8074</runs>
</student>
<student>
<number>11</number>
<firstname>Kumara</firstname>
<lastname>Sangakkara</lastname>
<Team>Sri Lanka</Team>
<runs>12400</runs>
</student>
<student>
<number>26</number>
<firstname>Alastair</firstname>
<lastname>Cook</lastname>
<Team>England</Team>
<runs>12472</runs>
</student>
<student>
<number>19</number>
<firstname>RAHUL</firstname>
<lastname>DRAVID</lastname>
<Team>India</Team>
<runs>13288</runs>
</student>
</class>
Example Java Program
File Name: ReadXMLFileExample3.java
// A program that uses the SAX parser to read an XML file
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class ReadXMLFileExample3
{
    public static void main(String args[])
    {
        try
        {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            DefaultHandler handler = new DefaultHandler()
            {
                // flags that record which element the parser is currently inside
                boolean number = false;
                boolean firstname = false;
                boolean lastname = false;
                boolean team = false;
                boolean runs = false;
                // called when the parser reaches the opening tag of an element
                @Override
                public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException
                {
                    System.out.println("\nStart Element :" + qName);
                    if(qName.equalsIgnoreCase("number"))
                    {
                        number = true;
                    }
                    if(qName.equalsIgnoreCase("firstname"))
                    {
                        firstname = true;
                    }
                    if(qName.equalsIgnoreCase("lastname"))
                    {
                        lastname = true;
                    }
                    if(qName.equalsIgnoreCase("Team"))
                    {
                        team = true;
                    }
                    if(qName.equalsIgnoreCase("runs"))
                    {
                        runs = true;
                    }
                }
                // called when the parser reaches the closing tag of an element
                @Override
                public void endElement(String uri, String localName, String qName) throws SAXException
                {
                    System.out.println("\nEnd Element:" + qName);
                }
                // called with the text found inside the element currently being parsed
                @Override
                public void characters(char ch[], int start, int length) throws SAXException
                {
                    if(number)
                    {
                        System.out.println("Number : " + new String(ch, start, length));
                        number = false;
                    }
                    if(firstname)
                    {
                        System.out.println("First Name: " + new String(ch, start, length));
                        firstname = false;
                    }
                    if(lastname)
                    {
                        System.out.println("Last Name: " + new String(ch, start, length));
                        lastname = false;
                    }
                    if(team)
                    {
                        System.out.println("Team: " + new String(ch, start, length));
                        team = false;
                    }
                    if(runs)
                    {
                        System.out.println("Runs : " + new String(ch, start, length));
                        runs = false;
                    }
                }
            };
            // parse the file; the handler's methods are called as the parser reads it
            saxParser.parse("C:/new/D.xml", handler);
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }
}
Output:
C:\new>javac ReadXMLFileExample3.java
C:\new>java ReadXMLFileExample3
Start Element: class
Start Element: student
Start Element: number
Number: 66
End Element: number
Start Element :firstname
First Name: Joe
End Element: firstname
Start Element :lastname
Last Name: Root
End Element:lastname
Start Element :Team
Team: England
End Element: Team
Start Element :runs
Runs : 10458
End Element: runs
End Element: student
Start Element :student
Start Element :number
Number : 18
End Element: number
Start Element :firstname
First Name: Virat
End Element: firstname
Start Element :lastname
Last Name: Kohli
End Element:lastname
Start Element :Team
Team: India
End Element: Team
Start Element :runs
Runs : 8074
End Element: runs
End Element: student
Start Element :student
Start Element :number
Number : 11
End Element: number
Start Element :firstname
First Name: Kumara
End Element: firstname
Start Element :lastname
Last Name: Sangakkara
End Element: lastname
Start Element: Team
Team: Sri Lanka
End Element: Team
Start Element :runs
Runs : 12400
End Element: runs
End Element: student
Start Element :student
Start Element :number
Number : 26
End Element: number
Start Element :firstname
First Name: Alastair
End Element: firstname
Start Element :lastname
Last Name: Cook
End Element: lastname
Start Element :Team
Team: England
End Element: Team
Start Element :runs
Runs : 12472
End Element: runs
End Element: student
Start Element :student
Start Element :number
Number : 19
End Element: number
Start Element :firstname
First Name: RAHUL
End Element: firstname
Start Element :lastname
Last Name: DRAVID
End Element: lastname
Start Element :Team
Team: India
End Element: Team
Start Element :runs
Runs : 13288
End Element: runs
End Element: student
End Element: class
C:\new>