XSLT newline insert does not work as expected with Hive



I am converting an XML file to CSV using XSLT. Here is my XSL file:



<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:cov="http://ift.tt/1xMfo15">
<xsl:output method="text"/>

<xsl:template match="testcase">
<xsl:value-of select ="@classname"/>
<xsl:text>,</xsl:text>
<xsl:value-of select ="@name"/>
<xsl:text>,</xsl:text>
<xsl:value-of select ="@time"/>
<xsl:text>&#xD;</xsl:text>
</xsl:template>

</xsl:stylesheet>


The CSV file looks good and all the line breaks are there, but when I try to create an external table with Hive (from Cloudera Hadoop) using this query:



Create external table csv_test(className STRING, testName STRING, duration DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/TEST/';


For each line in my CSV file, I get two rows: one with the values and one with NULLs. It is as if Hive were not reading the line break correctly and saw two lines instead of one.


I tried different ways of writing the line break, such as &#10;, &#xa;, &#xd;, &#13;, \n, combinations of these, and even placing the tags on two separate lines, but I get the same result.


The other issue is with the third field, the duration: it is always NULL. If I replace DOUBLE with STRING in my CREATE TABLE query, the value shows up.


Everything (line breaks and DOUBLE) works fine if I create my CSV file manually with the same data; the issue is only with the CSV file created by the XSLT.
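
Since the two files look identical in an editor, one way to compare them is to inspect the raw bytes. The following is a minimal Python sketch, not part of my actual setup: the function names and the three-field default are made up for illustration, and it assumes a reader that splits records on LF only, which may or may not match what Hive does in a given version. It counts each kind of line terminator and lists rows that are blank, have the wrong number of fields, or carry stray whitespace (such as a trailing carriage return) inside a field.

```python
from collections import Counter


def summarize_line_endings(data: bytes) -> Counter:
    """Count CRLF, lone CR, and lone LF terminators in raw file bytes."""
    counts = Counter()
    i = 0
    while i < len(data):
        if data[i:i + 2] == b"\r\n":
            counts["CRLF"] += 1
            i += 2
        elif data[i:i + 1] == b"\r":
            counts["CR"] += 1
            i += 1
        elif data[i:i + 1] == b"\n":
            counts["LF"] += 1
            i += 1
        else:
            i += 1
    return counts


def suspicious_rows(data: bytes, expected_fields: int = 3) -> list:
    """Split on LF only (an assumption) and return rows that look wrong.

    A row is flagged if it does not have `expected_fields` comma-separated
    fields, or if any field has leading or trailing whitespace, which
    includes a carriage return left at the end of the last field.
    """
    lines = data.split(b"\n")
    # A file that ends with LF yields one empty trailing piece; ignore it.
    if lines and lines[-1] == b"":
        lines.pop()
    flagged = []
    for line in lines:
        fields = line.split(b",")
        if len(fields) != expected_fields or any(f != f.strip() for f in fields):
            flagged.append(line)
    return flagged
```

Reading each file with open(path, "rb").read() and passing the bytes to both functions should show whether the XSLT output and the hand-made file really end their lines the same way.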


Am I doing something wrong?

