RapidXml – A Lightweight XML Library for C++

I don’t write in C++ frequently and I can’t say that I am sad about that. However, we all have those projects where, for some reason or another, we must use a tool that wouldn’t normally be our first choice. In one particular case, just recently, I was given a library written in C++ that I needed to use to perform a specific action on some data. The front end of the application that I was working on was written in Ruby and it basically collected information from different sources and stuffed the data into an XML file. I needed to write a simple shim, in C++, that could read in the data from the XML file and then call the correct functions in the library based on the data contained within.

The C++ standard library has no support for parsing XML. I started with a typical Google search, “C++ XML Parsing”, which returned a large number of results… of course. My goal was to find something really lightweight and simple to use. All I wanted to do was iterate over the tree, identify nodes and read attributes. I skimmed through several README files, downloaded a couple of libraries and finally stumbled upon RapidXml.

As stated in the project documentation, “RapidXml is an attempt to create the fastest XML parser possible, while retaining usability, portability and reasonable W3C compatibility.” Sounded great to me! I gave it a try and the results were very positive. The library is header-only and is composed of four files, the total size of which is 141 KB.

Demonstration

To demonstrate its functionality, I created a sample XML file which contains information about a few breweries that I have visited. It looks like this:


<?xml version="1.0" encoding="utf-8"?>
<MyBeerJournal>
    <Brewery name="Founders Brewing Company" location="Grand Rapids, MI">
        <Beer name="Centennial" description="IPA" rating="A+" dateSampled="01/02/2011">
            "What an excellent IPA. This is the most delicious beer I have ever tasted!"
        </Beer>
    </Brewery>
    <Brewery name="Brewery Vivant" location="Grand Rapids, MI">
        <Beer name="Farmhouse Ale" description="Belgian Ale" rating="B" dateSampled="02/07/2015">
            This beer is not so good... but I am not that big of a fan of english style ales.
        </Beer>
    </Brewery>
    <Brewery name="Bells Brewery" location="Kalamazoo, MI">
        <Beer name="Two Hearted Ale" description="IPA" rating="A" dateSampled="03/15/2012">
            Another execllent brew. Two Hearted gives Founders Centennial a run for it's money.
        </Beer>
    </Brewery>
</MyBeerJournal>

The code below simply reads in the data from the XML file and outputs easily readable text describing the data.


#include <cstdio>
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>
#include "rapidxml-1.13/rapidxml.hpp"

using namespace rapidxml;
using namespace std;

int main(void)
{
    cout << "Parsing my beer journal..." << endl;
    xml_document<> doc;
    xml_node<> * root_node;
    // Read the XML file into a vector
    ifstream theFile("beerJournal.xml");
    if (!theFile)
    {
        cerr << "Could not open beerJournal.xml" << endl;
        return 1;
    }
    vector<char> buffer((istreambuf_iterator<char>(theFile)), istreambuf_iterator<char>());
    // RapidXml expects a zero-terminated string
    buffer.push_back('\0');
    // Parse the buffer into doc. Parsing happens in place, so buffer
    // must stay alive for as long as doc is used.
    doc.parse<0>(&buffer[0]);
    // Find our root node
    root_node = doc.first_node("MyBeerJournal");
    if (!root_node)
    {
        cerr << "No MyBeerJournal element found" << endl;
        return 1;
    }
    // Iterate over the breweries (only siblings named "Brewery")
    for (xml_node<> * brewery_node = root_node->first_node("Brewery"); brewery_node; brewery_node = brewery_node->next_sibling("Brewery"))
    {
        // The attribute lookups assume every attribute is present, as in the sample file
        printf("I have visited %s in %s. ",
            brewery_node->first_attribute("name")->value(),
            brewery_node->first_attribute("location")->value());
        // Iterate over the beers (only siblings named "Beer")
        for (xml_node<> * beer_node = brewery_node->first_node("Beer"); beer_node; beer_node = beer_node->next_sibling("Beer"))
        {
            printf("On %s, I tried their %s which is a %s. ",
                beer_node->first_attribute("dateSampled")->value(),
                beer_node->first_attribute("name")->value(),
                beer_node->first_attribute("description")->value());
            printf("I gave it the following review: %s", beer_node->value());
        }
        cout << endl;
    }
    return 0;
}

And here is the output:

I have visited Founders Brewing Company in Grand Rapids, MI. On 01/02/2011, I tried their Centennial which is a IPA. I gave it the following review:
"What an excellent IPA. This is the most delicious beer I have ever tasted!"

I have visited Brewery Vivant in Grand Rapids, MI. On 02/07/2015, I tried their Farmhouse Ale which is a Belgian Ale. I gave it the following review:
This beer is not so good... but I am not that big of a fan of english style ales.

I have visited Bells Brewery in Kalamazoo, MI. On 03/15/2012, I tried their Two Hearted Ale which is a IPA. I gave it the following review:
Another execllent brew. Two Hearted gives Founders Centennial a run for it's money.
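
One detail in the program above is easy to miss: RapidXml parses the text in place and expects a modifiable, zero-terminated buffer, which is why the file is read into a vector<char> and a '\0' is appended. Here is a minimal sketch of that loading step pulled out into its own function. The name load_zero_terminated is my own invention for illustration and is not part of RapidXml, and throwing std::runtime_error on a missing file is just one possible choice.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of RapidXml): read a whole file into a
// modifiable, zero-terminated buffer suitable for in-place parsing.
std::vector<char> load_zero_terminated(const std::string& path)
{
    std::ifstream file(path.c_str());
    if (!file)
        throw std::runtime_error("could not open " + path);

    // Copy every character of the file into the buffer
    std::vector<char> buffer((std::istreambuf_iterator<char>(file)),
                             std::istreambuf_iterator<char>());
    // Append the terminator the parser expects
    buffer.push_back('\0');
    return buffer;
}
```

With a helper like this, the parsing call would become something like doc.parse<0>(&buffer[0]) on the returned vector, which has to stay alive for as long as the document's nodes are used.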

Overall, I found that RapidXml works very well and is simple to use. One complaint is that if there is a syntax error in the XML file, it doesn’t give you much indication as to what the problem is. But I suppose that is one of the tradeoffs of using a lightweight tool. The next time you need to parse some XML from C++, you should give RapidXml a try!
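
On the syntax error complaint, there is a little more information available if you go looking for it. When parsing fails, RapidXml throws a rapidxml::parse_error (unless exceptions have been disabled), which carries a short message in what() and a pointer into the buffer in where<char>(). That pointer is not very readable by itself, so below is a rough sketch of turning a character offset into a line and column. The function line_and_column is hypothetical and not part of the library, and it assumes you kept an unmodified copy of the file contents, since the default parse mode writes terminators into the buffer it is given.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not part of RapidXml): translate a character offset
// into a 1-based (line, column) pair by counting the newlines before it.
std::pair<std::size_t, std::size_t> line_and_column(const std::string& text,
                                                    std::size_t offset)
{
    // The offset must point inside the text (or one past its end)
    assert(offset <= text.size());
    std::size_t line = 1;
    std::size_t column = 1;
    for (std::size_t i = 0; i < offset; ++i)
    {
        if (text[i] == '\n')
        {
            // A newline starts the next line at column 1
            ++line;
            column = 1;
        }
        else
        {
            ++column;
        }
    }
    return std::make_pair(line, column);
}
```

Inside a catch block for rapidxml::parse_error, the offset would be e.where<char>() - &buffer[0], and the resulting line and column could be printed next to e.what(). It is still not a full diagnostic, but it at least points you at the right place in the file.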