SAX - Java working with XML overview

Complete Source Code for all examples below

Points to remember:

  • Simple API for XML - the most basic approach
  • Event driven - the parser reports events (element start, element end, text) as it reads the XML
  • The API stores nothing in memory; you have to do that yourself
  • Use it when you have memory constraints, when working with huge files, or when you only need a few pieces of information from the XML
  • We get only one pass over the document, while it is being parsed. There is no way to go back and retrieve a portion of the XML - we only have what the events deliver.
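
To make "event driven" concrete, here is a minimal sketch that records the callbacks in the order the parser fires them. The class name SaxEventTrace and the inline XML string are illustrative assumptions, not part of the original example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: lists the SAX callbacks in the order they happen. */
class SaxEventTrace {

	static List<String> trace(String xml) {
		final List<String> events = new ArrayList<String>();
		DefaultHandler handler = new DefaultHandler() {
			@Override
			public void startElement(String uri, String localName, String qName, Attributes attributes) {
				events.add("start " + qName);
			}

			@Override
			public void characters(char[] ch, int start, int length) {
				// Skip whitespace-only text between elements
				String text = new String(ch, start, length).trim();
				if (text.length() > 0) {
					events.add("text " + text);
				}
			}

			@Override
			public void endElement(String uri, String localName, String qName) {
				events.add("end " + qName);
			}
		};
		try {
			// Default factory settings: not namespace aware, so qName carries the element name
			SAXParserFactory.newInstance().newSAXParser()
					.parse(new InputSource(new StringReader(xml)), handler);
		} catch (Exception e) {
			throw new IllegalStateException(e);
		}
		return events;
	}

	public static void main(String[] args) {
		for (String event : trace("<stock><symbol>PIPE</symbol><quantity>10</quantity></stock>")) {
			System.out.println(event);
		}
	}
}
```

Nothing is kept unless the handler keeps it: once parse() returns, the events list is the only trace of the document.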

 

Simple example:

We have a stock.xml file and a Stock class that holds only a few of the data elements from the XML. We need to extract the data as a List of Stock Java objects.

 

The XML:

 



  
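<!-- Tags reconstructed: stock, symbol and quantity come from the handler code below; stocks, name and price are assumed names -->
<stocks>
  <stock>
    <symbol>PIPE</symbol>
    <name>Smoking pipe</name>
    <price>10.90</price>
    <quantity>10</quantity>
  </stock>
  <stock>
    <symbol>VIO</symbol>
    <name>Violin</name>
    <price>99.99</price>
    <quantity>5</quantity>
  </stock>
  <stock>
    <name>Hat</name>
    <price>9.49</price>
  </stock>
</stocks>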
  


 

The class:

 

package dp.test.xml.jaxp.sax.entity;

import java.math.BigDecimal;

/**
 * Example entity.
 * 
 * @author DPavlov
 */
public class Stock
{

	private String symbol;
	private BigDecimal quantity;
	
	public String getSymbol() {
		return symbol;
	}
	
	public void setSymbol(String symbol) {
		this.symbol = symbol;
	}
	
	public BigDecimal getQuantity() {
		return quantity;
	}
	
	public void setQuantity(BigDecimal quantity) {
		this.quantity = quantity;
	}

	public String toString() {
	    final String TAB = "    ";
	    
	    String retValue = "";
	    
	    retValue = "Stock ( "
	        + super.toString() + TAB
	        + "symbol = " + this.symbol + TAB
	        + "quantity = " + this.quantity + TAB
	        + " )";
	
	    return retValue;
	}
	
}

 

The code to do the task:

package dp.test.xml.jaxp.sax;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.junit.Test;
import org.xml.sax.SAXException;

import dp.test.xml.jaxp.sax.entity.Stock;

/**
 * JAXP SAX (Simple API for XML) is the lowest-level of the standard Java XML parsing APIs
 * (a DOM tree is typically built from the same kind of parse events). Using SAX directly
 * pays off when you know exactly which data you need to extract from the XML.
 * SAX is event driven: it calls your handler each time it encounters a node while the
 * document is being parsed. No data is stored in memory unless you write the code to do so.
 * The advantages are lower memory consumption and possibly some gain in speed, since your
 * code can bail out early or skip the parts it does not need. The cost is that you get only
 * one pass over the document (you see each node only at the moment it is parsed), and you
 * cannot modify the XML with this API.
 * 
 * @author DPavlov
 */
public class JAXPSAXExample
{
	
	@Test
	public void testSAXNoNamespace() throws ParserConfigurationException, SAXException, IOException {
		
		SAXParserFactory factory = SAXParserFactory.newInstance();
		factory.setValidating(true);
		factory.setNamespaceAware(false); // This setting is very important - it influences the node name
		SAXParser parser = factory.newSAXParser();
		
		final List<Stock> result = new ArrayList<Stock>();
		parser.parse(JAXPSAXExample.class.getResourceAsStream("stock.xml"), new SAXHandlerNoNamespace(result));
		
		System.out.println(result);
		
	}
	
}
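
The class comment above mentions bailing out once you have what you need. A common way to do that with SAX is to throw an exception from the handler, which makes parse() return early. The sketch below stops after the first symbol; the names FirstSymbolFinder and StopParsing are hypothetical, and the marker exception is one possible technique rather than anything the original code uses.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical example: read only the first symbol, then stop parsing. */
class FirstSymbolFinder {

	/** Marker exception used only to unwind out of the parser. */
	static class StopParsing extends SAXException {
		private static final long serialVersionUID = 1L;

		StopParsing() {
			super("first symbol found");
		}
	}

	static class Handler extends DefaultHandler {
		private final StringBuilder text = new StringBuilder();
		private boolean inSymbol;
		String symbol;      // the value we are looking for
		int elementsSeen;   // how many start tags were reported before we stopped

		@Override
		public void startElement(String uri, String localName, String qName, Attributes attributes) {
			elementsSeen++;
			if ("symbol".equals(qName)) {
				inSymbol = true;
				text.setLength(0);
			}
		}

		@Override
		public void characters(char[] ch, int start, int length) {
			if (inSymbol) {
				text.append(ch, start, length);
			}
		}

		@Override
		public void endElement(String uri, String localName, String qName) throws SAXException {
			if ("symbol".equals(qName)) {
				symbol = text.toString().trim();
				throw new StopParsing(); // nothing after this point is parsed
			}
		}
	}

	static Handler find(String xml) {
		Handler handler = new Handler();
		try {
			SAXParserFactory.newInstance().newSAXParser()
					.parse(new InputSource(new StringReader(xml)), handler);
		} catch (StopParsing stopped) {
			// Expected: we asked the parser to stop
		} catch (Exception e) {
			throw new IllegalStateException(e);
		}
		return handler;
	}

	public static void main(String[] args) {
		Handler h = find("<stocks><stock><symbol>PIPE</symbol></stock><stock><symbol>VIO</symbol></stock></stocks>");
		System.out.println(h.symbol + " after " + h.elementsSeen + " elements");
	}
}
```

On a huge file this avoids reading the rest of the document, at the price of using an exception for control flow.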

package dp.test.xml.jaxp.sax;

import java.math.BigDecimal;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import dp.test.xml.jaxp.sax.entity.Stock;

/**
 * Extending DefaultHandler lets us override only the callback methods we need;
 * the parser invokes them during parsing.
 * 
 * @author DPavlov
 */
public class SAXHandlerNoNamespace extends DefaultHandler
{

	private final List<Stock> stocks;

	private Stock currentStock;
	
	private boolean handleSymbol = false;
	private boolean handleQuantity = false;
	
	private boolean hasData = false;
	
	// Text of the element being read; the parser may deliver it in several chunks
	private final StringBuilder text = new StringBuilder();
	
	public SAXHandlerNoNamespace(List<Stock> stocks) {
		super();
		this.stocks = stocks;
	}

	/**
	 * @param uri namespace URI (may be empty when the parser is not namespace aware)
	 * @param localName name without prefix (may be empty when the parser is not namespace aware)
	 * @param name qualified name, including the prefix if any
	 */
	@Override
	public void endElement(String uri, String localName, String name) throws SAXException {
		if ("stock".equals(name)) {
			if (this.hasData) {
				this.stocks.add(this.currentStock);
			}
			this.currentStock = null;
			this.hasData = false;
		} else if ("symbol".equals(name)) {
			// The complete text is known only here, at the end of the element
			String value = this.text.toString().trim();
			if (value.length() > 0) {
				this.currentStock.setSymbol(value);
				this.hasData = true;
			}
			this.handleSymbol = false;
		} else if ("quantity".equals(name)) {
			String value = this.text.toString().trim();
			if (value.length() > 0) {
				this.currentStock.setQuantity(new BigDecimal(value));
				this.hasData = true;
			}
			this.handleQuantity = false;
		}
	}

	/**
	 * @param uri namespace URI (may be empty when the parser is not namespace aware)
	 * @param localName name without prefix (may be empty when the parser is not namespace aware)
	 * @param name qualified name, including the prefix if any
	 * @param attributes attributes of the node, if any
	 */
	@Override
	public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException {
		if ("stock".equals(name)) {
			this.currentStock = new Stock(); 
		} else if ("symbol".equals(name)) {
			this.text.setLength(0);
			this.handleSymbol = true;
		} else if ("quantity".equals(name)) {
			this.text.setLength(0);
			this.handleQuantity = true;
		}
	}

	/**
	 * Text inside the current node. May be called more than once per node,
	 * so the chunks are buffered here and used in endElement.
	 */
	@Override
	public void characters(char[] ch, int start, int length) throws SAXException {
		if (this.handleSymbol || this.handleQuantity) {
			this.text.append(ch, start, length);
		}
	}
	
	
	
}

 

The output:

[
  Stock ( dp.test.xml.jaxp.sax.entity.Stock@ee22f7    symbol = PIPE    quantity = 10     ), 
  Stock ( dp.test.xml.jaxp.sax.entity.Stock@39ab89    symbol = VIO    quantity = 5     )
]
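
One detail of characters() is easy to miss: SAX allows the parser to deliver the text of a single element in several characters() calls (this often happens around entity references or buffer boundaries), so a value taken from one call may be only a fragment. The sketch below shows the usual buffering pattern in isolation; the class name TextCollector and the sample XML are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical example: buffer text chunks and read the value at the end tag. */
class TextCollector {

	static class Handler extends DefaultHandler {
		private final StringBuilder buffer = new StringBuilder();
		private boolean inSymbol;
		final List<String> symbols = new ArrayList<String>();
		int chunks; // number of characters() calls received inside symbol elements

		@Override
		public void startElement(String uri, String localName, String qName, Attributes attributes) {
			if ("symbol".equals(qName)) {
				inSymbol = true;
				buffer.setLength(0); // start a fresh value
			}
		}

		@Override
		public void characters(char[] ch, int start, int length) {
			if (inSymbol) {
				chunks++;
				buffer.append(ch, start, length); // may be only part of the text
			}
		}

		@Override
		public void endElement(String uri, String localName, String qName) {
			if ("symbol".equals(qName)) {
				symbols.add(buffer.toString()); // the value is complete only here
				inSymbol = false;
			}
		}
	}

	static Handler parse(String xml) {
		Handler handler = new Handler();
		try {
			SAXParserFactory.newInstance().newSAXParser()
					.parse(new InputSource(new StringReader(xml)), handler);
		} catch (Exception e) {
			throw new IllegalStateException(e);
		}
		return handler;
	}

	public static void main(String[] args) {
		Handler h = parse("<stock><symbol>A&amp;B</symbol></stock>");
		// The chunk count depends on the parser; the collected value does not
		System.out.println(h.symbols + " in " + h.chunks + " chunk(s)");
	}
}
```

How many chunks arrive is up to the parser implementation, which is exactly why the value is read in endElement rather than in characters().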

SAX also supports namespaces. Our modified example looks like this:

The XML:

 



  
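<!-- Tags reconstructed: stock, symbol and quantity come from the handler code below; the st prefix, the namespace URI, stocks, name and price are assumed -->
<st:stocks xmlns:st="urn:example:stock">
  <st:stock>
    <st:symbol>PIPE</st:symbol>
    <st:name>Smoking pipe</st:name>
    <st:price>10.90</st:price>
    <st:quantity>10</st:quantity>
  </st:stock>
  <st:stock>
    <st:symbol>VIO</st:symbol>
    <st:name>Violin</st:name>
    <st:price>99.99</st:price>
    <st:quantity>5</st:quantity>
  </st:stock>
  <st:stock>
    <st:name>Hat</st:name>
    <st:price>9.49</st:price>
  </st:stock>
</st:stocks>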
  



The code to do the task:

public class JAXPSAXExample
{
	

	@Test
	public void testSAXWithNamespace() throws ParserConfigurationException, SAXException, IOException {
		
		SAXParserFactory factory = SAXParserFactory.newInstance();
		factory.setValidating(true);
		factory.setNamespaceAware(true); // This setting is very important - it influences the node name
		SAXParser parser = factory.newSAXParser();
		
		final List<Stock> result = new ArrayList<Stock>();
		parser.parse(JAXPSAXExample.class.getResourceAsStream("stock-ns.xml"), new SAXHandlerWithNamespace(result));
		
		System.out.println(result);
		
	}
	
}
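
Both test methods call factory.setValidating(true). DTD validation only has something to check when the document declares a DOCTYPE, and validity problems are reported to the handler's error() callback, which DefaultHandler leaves empty. The sketch below counts those reports; the class name ValidationErrorCounter and the inline DTD are assumptions for illustration, since the original DTD (if there was one) is not shown.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical example: make DTD validation errors visible. */
class ValidationErrorCounter {

	// Assumed DTD: every stock must have a symbol followed by a quantity
	static final String DTD = "<!DOCTYPE stocks [\n"
			+ "<!ELEMENT stocks (stock*)>\n"
			+ "<!ELEMENT stock (symbol, quantity)>\n"
			+ "<!ELEMENT symbol (#PCDATA)>\n"
			+ "<!ELEMENT quantity (#PCDATA)>\n"
			+ "]>\n";

	static int countErrors(String body) {
		final int[] errors = { 0 };
		DefaultHandler handler = new DefaultHandler() {
			@Override
			public void error(SAXParseException e) {
				// Validity errors are recoverable: parsing continues after this call
				errors[0]++;
			}
		};
		try {
			SAXParserFactory factory = SAXParserFactory.newInstance();
			factory.setValidating(true);
			factory.newSAXParser().parse(new InputSource(new StringReader(DTD + body)), handler);
		} catch (Exception e) {
			throw new IllegalStateException(e);
		}
		return errors[0];
	}

	public static void main(String[] args) {
		String valid = "<stocks><stock><symbol>PIPE</symbol><quantity>10</quantity></stock></stocks>";
		String invalid = "<stocks><stock><symbol>PIPE</symbol></stock></stocks>";
		System.out.println("valid document errors: " + countErrors(valid));
		System.out.println("invalid document errors: " + countErrors(invalid));
	}
}
```

If invalid input should abort the parse instead, error() can rethrow the SAXParseException it receives.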

package dp.test.xml.jaxp.sax;

import java.math.BigDecimal;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import dp.test.xml.jaxp.sax.entity.Stock;

/**
 * Extending DefaultHandler lets us override only the callback methods we need;
 * the parser invokes them during parsing.
 * 
 * @author DPavlov
 */
public class SAXHandlerWithNamespace extends DefaultHandler
{

	private final List<Stock> stocks;

	private Stock currentStock;
	
	private boolean handleSymbol = false;
	private boolean handleQuantity = false;
	
	private boolean hasData = false;
	
	// Text of the element being read; the parser may deliver it in several chunks
	private final StringBuilder text = new StringBuilder();
	
	public SAXHandlerWithNamespace(List<Stock> stocks) {
		super();
		this.stocks = stocks;
	}

	/**
	 * @param uri namespace URI of the element
	 * @param localName name without the namespace prefix
	 * @param name qualified name, including the prefix if any
	 */
	@Override
	public void endElement(String uri, String localName, String name) throws SAXException {
		if ("stock".equals(localName)) {
			if (this.hasData) {
				this.stocks.add(this.currentStock);
			}
			this.currentStock = null;
			this.hasData = false;
		} else if ("symbol".equals(localName)) {
			// The complete text is known only here, at the end of the element
			String value = this.text.toString().trim();
			if (value.length() > 0) {
				this.currentStock.setSymbol(value);
				this.hasData = true;
			}
			this.handleSymbol = false;
		} else if ("quantity".equals(localName)) {
			String value = this.text.toString().trim();
			if (value.length() > 0) {
				this.currentStock.setQuantity(new BigDecimal(value));
				this.hasData = true;
			}
			this.handleQuantity = false;
		}
	}

	/**
	 * @param uri namespace URI of the element
	 * @param localName name without the namespace prefix
	 * @param name qualified name, including the prefix if any
	 * @param attributes attributes of the node, if any
	 */
	@Override
	public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException {
		if ("stock".equals(localName)) {
			this.currentStock = new Stock();
		} else if ("symbol".equals(localName)) {
			this.text.setLength(0);
			this.handleSymbol = true;
		} else if ("quantity".equals(localName)) {
			this.text.setLength(0);
			this.handleQuantity = true;
		}
	}

	/**
	 * Text inside the current node. May be called more than once per node,
	 * so the chunks are buffered here and used in endElement.
	 */
	@Override
	public void characters(char[] ch, int start, int length) throws SAXException {
		if (this.handleSymbol || this.handleQuantity) {
			this.text.append(ch, start, length);
		}
	}
	
	
	
}
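
The only difference between the two handlers is which name parameter they compare against. A small probe makes the reason visible by capturing what the parser passes for the root element under each factory setting. The class name ElementNameProbe, the st prefix and the urn:example:stock URI are made up for this sketch.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical example: show the names a handler receives for the root element. */
class ElementNameProbe {

	/** Returns { uri, localName, qName } as passed to startElement for the root. */
	static String[] rootNames(String xml, boolean namespaceAware) {
		final String[] names = new String[3];
		DefaultHandler handler = new DefaultHandler() {
			private boolean seen;

			@Override
			public void startElement(String uri, String localName, String qName, Attributes attributes) {
				if (!seen) {
					seen = true;
					names[0] = uri;
					names[1] = localName;
					names[2] = qName;
				}
			}
		};
		try {
			SAXParserFactory factory = SAXParserFactory.newInstance();
			factory.setNamespaceAware(namespaceAware);
			factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
		} catch (Exception e) {
			throw new IllegalStateException(e);
		}
		return names;
	}

	public static void main(String[] args) {
		String xml = "<st:stock xmlns:st=\"urn:example:stock\"/>";
		for (boolean aware : new boolean[] { false, true }) {
			String[] n = rootNames(xml, aware);
			System.out.println("namespaceAware=" + aware
					+ " uri=[" + n[0] + "] localName=[" + n[1] + "] qName=[" + n[2] + "]");
		}
	}
}
```

With namespace awareness on, localName is the part to match and uri tells you which vocabulary the element belongs to; with it off, only the qualified name, prefix included, is dependable.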

 

Summary:

  • The SAXParser instance is obtained through SAXParserFactory.
  • Namespace awareness is set on the factory; it determines how node names are passed as parameters to the handler.
  • We process the data in a handler class that extends DefaultHandler, overriding the methods we need in order to define how the XML data should be extracted.

 

 



© Inspire Software, Denys Pavlov, 2005-2012