Sunday, 19 October 2014

Hadoop configuration XML encoding



I wrote a Python script to deploy Hadoop automatically. The script modifies the XML config files, such as "core-site.xml", and writes them back encoded as UTF-8. But Hadoop can't read the resulting XML file.


Here is my XML after it was modified by the script:



<?xml version='1.0' encoding='utf-8'?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
    <name>fs.default.name</name>
    <value>http://hdfsc31:9000</value>
  </property>
</configuration>
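
A quick way to narrow this down is to check whether the file is well-formed XML at all and which values a parser actually sees. The following is only a minimal sketch using Python's standard library, not Hadoop's own parser; the helper name `read_properties` and the inline `SAMPLE` copy of the file are illustrative assumptions. Note that the value shown above starts with http:// while the script below writes hdfs://, so that mismatch may also be worth checking.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_bytes):
    """Parse Hadoop-style configuration XML and return {name: value}.

    Hypothetical helper for illustration; raises ET.ParseError if the
    input is not well-formed XML.
    """
    root = ET.fromstring(xml_bytes)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Inline copy of the file shown above (bytes, as it would be read from disk).
SAMPLE = b"""<?xml version='1.0' encoding='utf-8'?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
    <name>fs.default.name</name>
    <value>http://hdfsc31:9000</value>
  </property>
</configuration>
"""

props = read_properties(SAMPLE)
print(props)
```

If this parses cleanly, the encoding and declaration are probably not the problem, and the property values themselves become the next thing to inspect.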


Here is my Python script:



import lxml.etree

def modify_core_site(namenode_hostname):
    doc = lxml.etree.parse("pkg/core-site.xml")
    root = doc.getroot()
    print dir(root)
    # print root.getchildren()
    for child in root.iter("property"):
        name = child.find("name").text
        if name == "fs.default.name":
            text = "hdfs://%s:9000" % namenode_hostname
            child.find("value").text = text
    s = lxml.etree.tostring(doc, encoding="utf-8", xml_declaration=True, pretty_print=True)
    with open("pkg/core-site.xml", "w") as f:
        f.write(s)
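
For comparison, here is a minimal sketch of the same edit that lets the XML library write the file itself, which avoids mixing encoded bytes with a text-mode file handle (on Python 3, `tostring` with `encoding="utf-8"` returns bytes, so a file opened with "w" would reject it). It uses the standard library's `xml.etree.ElementTree` rather than lxml, which is a swapped-in choice; unlike lxml, it drops the comment and the stylesheet processing instruction on rewrite. The function name `set_default_fs`, the `port` parameter, and the temporary demo file are assumptions for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def set_default_fs(path, namenode_hostname, port=9000):
    """Rewrite fs.default.name in a core-site.xml file in place (sketch)."""
    tree = ET.parse(path)
    for prop in tree.getroot().iter("property"):
        value = prop.find("value")
        if prop.findtext("name") == "fs.default.name" and value is not None:
            value.text = "hdfs://%s:%d" % (namenode_hostname, port)
    # Passing a path lets ElementTree open the file in binary mode itself,
    # so the declaration and the UTF-8 encoded body stay consistent.
    tree.write(path, encoding="utf-8", xml_declaration=True)


# Usage against a throwaway file; the path and hostname are illustrative.
demo_path = os.path.join(tempfile.mkdtemp(), "core-site.xml")
with open(demo_path, "wb") as f:
    f.write(b"<configuration><property><name>fs.default.name</name>"
            b"<value>placeholder</value></property></configuration>")

set_default_fs(demo_path, "hdfsc31")

with open(demo_path, "rb") as f:
    result = f.read()
print(result.decode("utf-8"))
```

With lxml, the analogous call would be `doc.write(path, encoding="utf-8", xml_declaration=True, pretty_print=True)`, which keeps the comment and processing instruction.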


Where is the bug? Can anybody help?

