Validate XML with namespaces against Schematron using lxml in Python

The fix only required a small change to the Schematron file: declaring the namespace prefix with an "ns" element, as follows:

<?xml version='1.0' encoding='UTF-8'?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
   <ns uri="http://foo" prefix="ns1" />
   <pattern>
      <rule context="//ns1:bar">
         <assert test="number(.) = 2">
            bar must be 2
         </assert>
      </rule>
   </pattern>
</schema>
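
A minimal sketch of how this schema could be exercised from Python, assuming lxml is installed. The rules and the XML document are inlined as strings rather than read from files, and the helper name make_doc is hypothetical, introduced only for illustration:

```python
from lxml import etree, isoschematron

# Schematron rules: the <ns> element binds the prefix "ns1" to "http://foo".
RULES = b"""<?xml version='1.0' encoding='UTF-8'?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
   <ns uri="http://foo" prefix="ns1" />
   <pattern>
      <rule context="//ns1:bar">
         <assert test="number(.) = 2">bar must be 2</assert>
      </rule>
   </pattern>
</schema>"""


def make_doc(value):
    """Build a small namespaced test document (hypothetical helper)."""
    xml = '<zip xmlns:ns1="http://foo"><ns1:bar>%d</ns1:bar></zip>' % value
    return etree.fromstring(xml)


# Compile the rules once, then validate two documents against them.
schematron = isoschematron.Schematron(etree.fromstring(RULES))
valid_for_2 = schematron.validate(make_doc(2))
valid_for_3 = schematron.validate(make_doc(3))
print(valid_for_2, valid_for_3)
```

With the prefix declared through the "ns" element, the rule should match ns1:bar, so only the document whose value is 2 passes.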

Suggestion : 2

lxml also provides support for ISO-Schematron, based on the pure-XSLT skeleton implementation of Schematron. Because it is built on a pure-XSLT implementation, the actual validator is created as an XSLT 1.0 stylesheet in several steps, the first (optional) one being to extract embedded Schematron from an XML Schema or RelaxNG schema. Using the phase parameter of isoschematron.Schematron allows for selective validation of predefined pattern groups.

The examples below show lxml's XML Schema and DTD validators:

>>> from io import StringIO
>>> from lxml import etree

>>> # Validate against a DTD while parsing
>>> parser = etree.XMLParser(dtd_validation=True)

>>> # Build an XML Schema validator from an in-memory schema
>>> schema_root = etree.XML('''\
...   <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
...     <xsd:element name="a" type="xsd:integer"/>
...   </xsd:schema>
... ''')
>>> schema = etree.XMLSchema(schema_root)

>>> # Validate against the schema while parsing
>>> parser = etree.XMLParser(schema=schema)
>>> root = etree.fromstring("<a>5</a>", parser)
>>> root = etree.fromstring("<a>no int</a>", parser) # doctest: +ELLIPSIS
Traceback (most recent call last):
lxml.etree.XMLSyntaxError: Element 'a': 'no int' is not a valid value of the atomic type 'xs:integer'...

>>> # Validate an already parsed tree against a DTD
>>> f = StringIO("<!ELEMENT b EMPTY>")
>>> dtd = etree.DTD(f)
>>> root = etree.XML("<b />")
>>> print(dtd.validate(root))
True

>>> root = etree.XML("<b><a /></b>")
>>> print(dtd.validate(root))
False
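
The phase parameter of isoschematron.Schematron mentioned above can be sketched as follows. This is an illustration under assumptions: the pattern ids, phase ids, and the Total/Percent document are made up for the example and are not taken from the question.

```python
from lxml import etree, isoschematron

# Two patterns, each activated by its own phase (all ids are illustrative).
RULES = etree.fromstring(b"""<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <phase id="phase.sum_check">
    <active pattern="sum_equals_100_percent"/>
  </phase>
  <phase id="phase.entries_check">
    <active pattern="all_positive"/>
  </phase>
  <pattern id="sum_equals_100_percent">
    <rule context="Total">
      <assert test="sum(//Percent) = 100">Sum is not 100%.</assert>
    </rule>
  </pattern>
  <pattern id="all_positive">
    <rule context="Percent">
      <assert test="number(.) > 0">Entry is not positive.</assert>
    </rule>
  </pattern>
</schema>""")

# The entries sum to 100, but one of them is negative.
DOC = etree.fromstring(
    b"<Total><Percent>60</Percent><Percent>50</Percent><Percent>-10</Percent></Total>"
)

# Without a phase, every pattern is checked.
all_patterns = isoschematron.Schematron(RULES).validate(DOC)

# With a phase, only the patterns that phase activates are checked.
sum_only = isoschematron.Schematron(RULES, phase="phase.sum_check").validate(DOC)
entries_only = isoschematron.Schematron(RULES, phase="phase.entries_check").validate(DOC)

print(all_patterns, sum_only, entries_only)
```

Here the document should pass the sum phase alone and fail whenever the positivity pattern is active.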

Suggestion : 3

I am not able to get the lxml Schematron validator to recognize namespaces. Validation works fine in code without namespaces: if I remove ns1 from both the XML file and the Schematron file, the example works perfectly, with no error message. There must be a trick to registering namespaces in lxml Schematron that I am missing. Has anyone done this?

Here is the Schematron file:

<?xml version='1.0' encoding='UTF-8'?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" xmlns:ns1="http://foo">
   <pattern>
      <rule context="//ns1:bar">
         <assert test="number(.) = 2">
            bar must be 2
         </assert>
      </rule>
   </pattern>
</schema>

And here is the XML file:

<?xml version="1.0" encoding="UTF-8"?>
<zip xmlns:ns1="http://foo">
   <ns1:bar>3</ns1:bar>
</zip>

Here is the Python code:

from lxml import etree, isoschematron
from plumbum import local

# Compile the Schematron rules into a validator
schematron_doc = etree.parse(local.path('rules.sch'))
schematron = isoschematron.Schematron(schematron_doc)

# Validate the XML document; bar is 3, so validation is expected to fail
xml_doc = etree.parse(local.path('test.xml'))
is_valid = schematron.validate(xml_doc)
assert not is_valid

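It only required a small change to the Schematron file. In Schematron, prefixes used in XPath expressions must be declared with an "ns" element; an xmlns:ns1 attribute on the schema element is not enough. Adding the "ns" element looks as follows: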

<?xml version='1.0' encoding='UTF-8'?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
   <ns uri="http://foo" prefix="ns1" />
   <pattern>
      <rule context="//ns1:bar">
         <assert test="number(.) = 2">
            bar must be 2
         </assert>
      </rule>
   </pattern>
</schema>
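
When validation fails, it can help to see which assertion fired rather than only a boolean. The sketch below assumes the store_report option of isoschematron.Schematron and reads the failed assertions from the resulting SVRL report; the rules and document are inlined copies of the files above, and the variable names are illustrative.

```python
from lxml import etree, isoschematron

# Namespace of the SVRL validation report produced by ISO Schematron.
SVRL_NS = "http://purl.oclc.org/dsdl/svrl"

rules = etree.fromstring(b"""<schema xmlns="http://purl.oclc.org/dsdl/schematron">
   <ns uri="http://foo" prefix="ns1" />
   <pattern>
      <rule context="//ns1:bar">
         <assert test="number(.) = 2">bar must be 2</assert>
      </rule>
   </pattern>
</schema>""")
doc = etree.fromstring(b'<zip xmlns:ns1="http://foo"><ns1:bar>3</ns1:bar></zip>')

# store_report=True keeps the report of the last validation run.
schematron = isoschematron.Schematron(rules, store_report=True)
is_valid = schematron.validate(doc)

# Collect the text of every failed assertion, with whitespace normalized.
report = schematron.validation_report
failed = list(report.getroot().iter("{%s}failed-assert" % SVRL_NS))
messages = [" ".join("".join(fa.itertext()).split()) for fa in failed]

print(is_valid, messages)
```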

Suggestion : 5

Import the xmlschema library and then create an instance of a schema, using the path of the file containing the schema as the argument. A schema instance has methods to validate an XML document against the schema. An alternative mode for validating an XML document is implemented by the method xmlschema.XMLSchemaBase.validate(), which raises an error when the XML doesn't conform to the schema. You can also convert XML data using the lxml library, which works better because namespace information is associated with each node of the tree.

>>> import xmlschema
>>> schema = xmlschema.XMLSchema('tests/test_cases/examples/vehicles/vehicles.xsd')

>>> schema_file = open('tests/test_cases/examples/collection/collection.xsd')
>>> schema = xmlschema.XMLSchema(schema_file)

>>> schema = xmlschema.XMLSchema("""
... <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
...   <xs:element name="block" type="xs:string"/>
... </xs:schema>
... """)

>>> schema_xsd = open('tests/test_cases/examples/vehicles/vehicles.xsd').read()
>>> schema = xmlschema.XMLSchema(schema_xsd)
Traceback (most recent call last):
...
...
xmlschema.validators.exceptions.XMLSchemaParseError: unknown element '{http://example.com/vehicles}cars':

Schema:

  <xs:element xmlns:xs="http://www.w3.org/2001/XMLSchema" ref="vh:cars" />

Path: /xs:schema/xs:element/xs:complexType/xs:sequence/xs:element

>>> schema_file = open('tests/test_cases/examples/vehicles/vehicles.xsd')
>>> schema = xmlschema.XMLSchema(schema_file, base_url='tests/test_cases/examples/vehicles/')

>>> schema_file = open('tests/test_cases/examples/vehicles/vehicles.xsd')
>>> schema = xmlschema.XMLSchema(schema_file, build=False)
>>> _ = schema.include_schema('tests/test_cases/examples/vehicles/cars.xsd')
>>> _ = schema.include_schema('tests/test_cases/examples/vehicles/bikes.xsd')
>>> schema.build()