THE XML
I wanted to validate the XML sheets produced by auteurexample XML produced by auteur.
<?xml version="1.0" encoding="UTF-8"?> <auteur> <source> <location>/home/neil/Videos/2010_04_canada/M2U00547.MPG</location> <timestamp pos="11" /> <clip end="7.5" id="0001" start="1.5" /> <clip end="13.6" id="0002" start="7.5" /> <clip end="1.5" id="0003" start="0.0" /> </source> <source> <location>/home/neil/Videos/2010_04_canada/M2U00549.MPG</location> <timestamp pos="2.678" /> </source> </auteur>
The Schema
and here's the schema I wrote to check the validity of that data.(saved as foo.xsd)
<?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:annotation> <xs:documentation> The rules in this schema will be used to validate the content of an auteur project file. </xs:documentation> <xs:appinfo source="http://www.auteur-project.org" > </xs:appinfo> </xs:annotation> <!--RULES START HERE --> <!-- ROOT NODE --> <xs:element name="auteur" > <xs:complexType> <xs:sequence> <!-- SOURCES --> <xs:element name="source" minOccurs="0" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <!-- only 1 location allowed per source --> <xs:element name = "location" type="xs:string" minOccurs="1" maxOccurs="1"/> <!-- TIMESTAMPS --> <xs:element name = "timestamp" minOccurs="0" maxOccurs="unbounded" > <xs:complexType> <xs:attribute name="pos" type="xs:decimal" use="required" /> </xs:complexType> </xs:element> <!-- CLIPS --> <xs:element name="clip" minOccurs="0" maxOccurs="unbounded" > <xs:complexType> <xs:attribute name = "id" type="xs:string" use="required" /> <xs:attribute name = "start" type="xs:decimal" /> <xs:attribute name = "end" type="xs:decimal" /> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
THE PYTHON to put it all together
a short python script to check foo.xml's validity with foo.xsd(note lxml is NOT in python standard lib - a real pity!)
#! /usr/bin/env python from lxml import etree try: doc = etree.parse("foo.xml") xsd = etree.parse("foo.xsd") xmlschema = etree.XMLSchema(xsd) xmlschema.assertValid(doc) print ("document validates!") except etree.XMLSyntaxError as e: print ("PARSING ERROR", e) except AssertionError as e: print ("INVALID DOCUMENT", e)
No comments:
Post a Comment