Parsing Data from External Sources in Python

XML

XML is a widely used tool for storing and transporting data. XML stands for eXtensible Markup Language. It stores data in tags, much like HTML, except that XML tags are not predefined: the author of a document defines its own tags.

A sample XML file looks like this:

<book>
    <title> Parsing Data in Python </title>
    <author> John Smith </author>
    <year> 2016  </year>
    <price> 25.00  </price>
</book>

Each field is given by a tag. Given an XML file, we can retrieve data from it with XPath. Here, we illustrate how to use XPath to retrieve data from XML. XPath uses path-like expressions to select nodes in the document tree.

  • First, we need to create an element object from an XML string.

In [4]:
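import lxml.etree

# A small XML string with nested tags and mixed text content
sample_data = '<T1> <T2> Child 1 <T3> Child 3</T3> Child 2</T2> </T1>'

# Parse the string; the result is the root element <T1>
lxml.etree.XML(sample_data)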


Out[4]:
<Element T1 at 0x7f0b44d47908>
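
As a minimal sketch of the retrieval step, the book sample from earlier can be queried with simple path expressions once it has been parsed. This version uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra packages. ElementTree supports only a subset of XPath. The variable names (book_xml, book, fields) are illustrative and not part of the original notebook.

```python
import xml.etree.ElementTree as ET

book_xml = """
<book>
    <title> Parsing Data in Python </title>
    <author> John Smith </author>
    <year> 2016  </year>
    <price> 25.00  </price>
</book>
"""

# Parse the string; `book` is the root <book> element.
book = ET.fromstring(book_xml)

# A path relative to the root selects the matching child element.
# findtext returns the element's text, including surrounding whitespace.
title = book.findtext("title").strip()
author = book.findtext("author").strip()

# Element text is always a string, so numeric fields are converted explicitly.
year = int(book.findtext("year"))
price = float(book.findtext("price"))

# "./*" selects every direct child of the current element.
fields = [child.tag for child in book.findall("./*")]

print(title, author, year, price, fields)
```

lxml elements offer the same find, findall, and findtext methods, and additionally an xpath() method that accepts full XPath 1.0 expressions.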
