Using ElementTree
:
import xml.etree.ElementTree
# Open original file
et = xml.etree.ElementTree.parse('file.xml')
# Append new tag: <a x='1' y='abc'>body text</a>
new_tag = xml.etree.ElementTree.SubElement(et.getroot(), 'a')
new_tag.text = 'body text'
new_tag.attrib['x'] = '1' # must be str; cannot be an int
new_tag.attrib['y'] = 'abc'
# Write back to file
#et.write('file.xml')
et.write('file_new.xml')
An example might help:
my_file = open(filename, "r")
lines_of_file = my_file.readlines()
lines_of_file.insert(-1, "This line is added one before the last line")
my_file.writelines(lines_of_file)
I likely would not try to use a single file pointer for what you are describing, and instead read the file into memory, edit it, then write it out.:
inFile = open('file.xml', 'r') data = inFile.readlines() inFile.close() # some manipulation on `data` outFile = open('file.xml', 'w') outFile.writelines(data) outFile.close()
For the modification, you could use tag.text
from xml. Here is snippet:
import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
for rank in root.iter('rank'):
new_rank = int(rank.text) + 1
rank.text = str(new_rank)
tree.write('output.xml')
It worked in python 2.7 modifying an .xml file named "b.xml" located in folder "a", where "a" was located in the "working folder" of python. It outputs the new modified file as "c.xml" in folder "a", without yielding encoding errors (for me) in further use outside of python 2.7.
pattern = '<Author>'
subst = ' <Author>' + domain + '\\' + user_name + '</Author>'
line_index =0 #set line count to 0 before starting
file = io.open('a/b.xml', 'r', encoding='utf-16')
lines = file.readlines()
outFile = open('a/c.xml', 'w')
for line in lines[0:len(lines)]:
line_index =line_index +1
if line_index == len(lines):
#1. & 2. delete last line and adding another line in its place not writing it
outFile.writelines("Write extra line here" + '\n')
# 4. Close root tag:
outFile.writelines("</phonebook>") # as in:
#http://tizag.com/xmlTutorial/xmldocument.php
else:
#3. Substitue a line if it finds the following substring in a line:
pattern = '<Author>'
subst = ' <Author>' + domain + '\\' + user_name + '</Author>'
if pattern in line:
line = subst
print line
outFile.writelines(line)#just writing/copying all the lines from the original xml except for the last.
Import Element Tree module and open xml file, get an xml element,Element object can be manipulated by changing its fields, adding and modifying attributes, adding and removing children, Opening and reading using an ElementTree , Opening and reading using an ElementTree
Import Element Tree module and open xml file, get an xml element
import xml.etree.ElementTree as ET
tree = ET.parse('sample.xml')
root = tree.getroot()
element = root[0] #get first child of root element
Element object can be manipulated by changing its fields, adding and modifying attributes, adding and removing children
element.set('attribute_name', 'attribute_value') #set the attribute to xml element
element.text = "string_text"
If you want to remove an element use Element.remove() method
root.remove(element)
In the above code, we load the xmldocument.xml to mytree python object. We store its root element in myroot python object. Then using element methods, we iterate through the elements and add attribute ‘newprices’ to each node. We finally write the modified XML tree to output.xml file.,XML contains data represented in a hierarchical format so we will treat it as a tree for ease of use. For our purpose, we will use xml.etree.ElementTree for the purpose of parsing, searching & modifying XML tree. It offers many features & functions but we will use ElementTree and Element for our purpose. ElementTree contains the whole XML document as a tree, while Element contains a single node. ElementTree is useful for searching, reading & writing the whole document. Element is useful for reading & writing a single element. Each element has the following properties:,Here is a sample XML file that we will modify. It contains information about breakfast menu at a restaurant, where each element contains information about a food item, with attribute such as name, price, description, calories.,In this article, we have learnt how to easily modify XML document in python. Basically, you need to parse XML document or string to a python object, then use element methods, depending on your requirement, to modify the XML document, and then write back the modified XML tree back to an XML file.
Here is a python code to create XML code. We have shown how to parse XML directly from string, as well as from XML file xmldocument.xml. When you parse XML document from file you need to explicitly call getroot() function to get the root. When you parse XML from string, it automatically returns the root.
# importing the module.
import xml.etree.ElementTree as ET
XMLexample_stored_in_a_string ='''
<?xml version ="1.0"?>
<COUNTRIES>
<country name="INDIA">
<neighbor name="Dubai" direction="W" />
</country>
<country name="Singapore">
<neighbor name="Malaysia" direction="N" />
</country>
</COUNTRIES>
'''
# parsing directly.
tree = ET.parse('xmldocument.xml')
root = tree.getroot()
# parsing using the string.
stringroot = ET.fromstring(XMLexample_stored_in_a_string)
# printing the root.
print(root)
print(stringroot)
Here is the XML document used in the above python code. It contains the same content as the string.
<?xml version="1.0"?>
<!--COUNTRIES is the root element-->
<COUNTRIES>
<country name="INDIA">
<neighbor name="Dubai" direction="W" />
</country>
<country name="Singapore">
<neighbor name="Malaysia" direction="N" />
</country>
</COUNTRIES>
Here is a sample XML file that we will modify. It contains information about breakfast menu at a restaurant, where each element contains information about a food item, with attribute such as name, price, description, calories.
<?xml version="1.0"?>
<breakfast_menu>
<food>
<name itemid="11">Belgian Waffles</name>
<price>5.95</price>
<description>Two of our famous Belgian Waffles
with plenty of real maple syrup</description>
<calories>650</calories>
</food>
<food>
<name itemid="31">Berry-Berry Belgian Waffles</name>
<price>8.95</price>
<description>Light Belgian waffles covered with
an assortment of fresh berries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name itemid="41">French Toast</name>
<price>4.50</price>
<description>Thick slices made from our
homemade sourdough bread</description>
<calories>600</calories>
</food>
</breakfast_menu>
The elements present your XML file can be manipulated. To do this, you can use the set() function. Let us first take a look at how to add something to XML.,Once you execute this, you will be able to split the XML file and fetch the required data. You can also parse an open file using this function.,To calculate the number of items on our menu, you can make use of the len() function as follows:,The output shows that the first subelement of the food tag has been deleted. In case you want to delete all tags, you can make use of the clear() function as follows:
EXAMPLE:
<?xml version="1.0" encoding="UTF-8"?>
<metadata>
<food>
<item name="breakfast">Idly</item>
<price>$2.5</price>
<description>
Two idly's with chutney
</description>
<calories>553</calories>
</food>
<food>
<item name="breakfast">Paper Dosa</item>
<price>$2.7</price>
<description>
Plain paper dosa with chutney
</description>
<calories>700</calories>
</food>
<food>
<item name="breakfast">Upma</item>
<price>$3.65</price>
<description>
Rava upma with bajji
</description>
<calories>600</calories>
</food>
<food>
<item name="breakfast">Bisi Bele Bath</item>
<price>$4.50</price>
<description>
Bisi Bele Bath with sev
</description>
<calories>400</calories>
</food>
<food>
<item name="breakfast">Kesari Bath</item>
<price>$1.95</price>
<description>
Sweet rava with saffron
</description>
<calories>950</calories>
</food>
</metadata>
import xml.etree.ElementTree as ET
mytree = ET.parse('sample.xml')
myroot = mytree.getroot()
print(myroot)
You can also use fromstring() function to parse your string data. In case you want to do this, pass your XML as a string within triple quotes as follows:
import xml.etree.ElementTree as ET
data='''
<?xml version="1.0" encoding="UTF-8"?>
<metadata>
<food>
<item name="breakfast">Idly</item>
<price>$2.5</price>
<description>
Two idly's with chutney
</description>
<calories>553</calories>
</food>
</metadata>
'''
myroot = ET.fromstring(data)
#print(myroot)
print(myroot.tag)
for x in myroot.findall('food'):
item = x.find('item').text
price = x.find('price').text
print(item, price)
Modifying an XML File,XML Processing Modules,Building XML documents,Finding interesting elements
<?xml version="1.0"?>
<data>
<country name="Liechtenstein">
<rank>1</rank>
<year>2008</year>
<gdppc>141100</gdppc>
<neighbor name="Austria" direction="E" />
<neighbor name="Switzerland" direction="W" />
</country>
<country name="Singapore">
<rank>4</rank>
<year>2011</year>
<gdppc>59900</gdppc>
<neighbor name="Malaysia" direction="N" />
</country>
<country name="Panama">
<rank>68</rank>
<year>2011</year>
<gdppc>13600</gdppc>
<neighbor name="Costa Rica" direction="W" />
<neighbor name="Colombia" direction="E" />
</country>
</data>
import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
root = ET.fromstring(country_data_as_string)
>>> root.tag 'data' >>>
root.attrib {}
>>>
for child in root:
...print(child.tag, child.attrib)
...
country {
'name': 'Liechtenstein'
}
country {
'name': 'Singapore'
}
country {
'name': 'Panama'
}
>>> root[0][1].text '2008'
In the below image you see I have opened a cmd prompt and navigated to the directory where I have to create Python script for modifying XML using Python.,Now it’s time to test for the example on modifying XML using Python.,Here in the below Python XML builder script, I import the required module. Then I define a method that does the task of pretty printing of the XML structure otherwise all will be written in one line and it would a bit difficult to read the XMl file.,Now I will create a python script that will read the attached XML file and modify any node value in the XML content and update the same XML file.
I want to update or modify the below XML document and in this document I have the root element called bookstore. Then I have other sub-elements under the root element called book and so on.
<?xml version='1.0' encoding='utf-8'?>
<bookstore specialty="novel">
<book style="autobiography">
<author>
<first-name>Joe</first-name>
<last-name>Bob</last-name>
<award>Trenton Literary Review Honorable Mention</award>
</author>
<price>12</price>
</book>
<magazine frequency="monthly" style="glossy">
<price>12</price>
<subscription per="year" price="24" />
</magazine>
</bookstore>
So let’s create the below script called xml_modifier.py.
import xml.etree.ElementTree as ET
#pretty print method
def indent(elem, level = 0):
i = "\n" + level * " "
j = "\n" + (level - 1) * " "
if len(elem):
if not elem.text or not elem.text.strip():
elem.text = i + " "
if not elem.tail or not elem.tail.strip():
elem.tail = i
for subelem in elem:
indent(subelem, level + 1)
if not elem.tail or not elem.tail.strip():
elem.tail = j
else:
if level and(not elem.tail or not elem.tail.strip()):
elem.tail = j
return elem
tree = ET.parse('bookstore2.xml')
root = tree.getroot()
#update all price
for price in root.iter('price'):
new_price = int(price.text) + 5
price.text = str(new_price)
price.set('updated', 'yes')
#write to file
tree = ET.ElementTree(indent(root))
tree.write('bookstore2.xml', xml_declaration = True, encoding = 'utf-8')
Simply run the above script you should see the modified bookstore2.xml file in the current directory. Here is the below output XML file.
<?xml version='1.0' encoding='utf-8'?>
<bookstore specialty="novel">
<book style="autobiography">
<author>
<first-name>Joe</first-name>
<last-name>Bob</last-name>
<award>Trenton Literary Review Honorable Mention</award>
</author>
<price updated="yes">17</price>
</book>
<magazine frequency="monthly" style="glossy">
<price updated="yes">17</price>
<subscription per="year" price="24" />
</magazine>
</bookstore>