Tuesday, 17 February 2015

How to ignore HTML elements with XmlPullParser?



I have an XML file that is streamed to an XML parser.


The content of the XML file contains HTML tags, which I would like to ignore:



<overview>
<p>Situated on a peninsula halfway up the west coast of India, Mumbai (formerly Bombay) is India's economic powerhouse and home to more millionaires than any other city on the Indian sub-continent.</p>
<p>The Portuguese established this old Hindu city as a colony in 1509.</p>
<p>Like many Indian cities, the streets of Mumbai are congested with cattle, carts and motor vehicles and the air is thick with smog.</p>
</overview>


The method to parse the overview is:



private String readOverview(XmlPullParser parser) throws IOException, XmlPullParserException {
    parser.require(XmlPullParser.START_TAG, ns, TAG_OVERVIEW);
    String overview = readText(parser);
    parser.require(XmlPullParser.END_TAG, ns, TAG_OVERVIEW);
    return overview;
}


The error is: expected: END_TAG {null}overview (position:START_TAG <p>@6:10 in java.io.InputStreamReader@537c80f4).
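
The error message points at the cause: after <overview>, the next event is the START_TAG for <p>, not text. Assuming readText (not shown here) follows the common pattern of reading a single text event and then stepping to the end tag, the parser ends up on <p> when END_TAG for overview is required. One possible way around this is to keep calling next() and track nesting depth: add one for each START_TAG, subtract one for each END_TAG, append every TEXT event, and stop when the depth returns to zero. The sketch below shows that loop with the JDK's built-in StAX reader (javax.xml.stream) instead of XmlPullParser, purely so that it runs on a plain JVM without extra libraries. The names OverviewTextSketch, readTextIgnoringTags and parseOverview are made up for illustration. With XmlPullParser the same loop would use XmlPullParser.START_TAG, END_TAG and TEXT together with parser.next() and parser.getText().

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class OverviewTextSketch {

    // Collects all character data inside the current element and skips nested tags.
    // Precondition: the reader is positioned on the element's START_ELEMENT.
    static String readTextIgnoringTags(XMLStreamReader reader) throws XMLStreamException {
        StringBuilder text = new StringBuilder();
        int depth = 1; // already inside the target element
        while (depth > 0) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT: // corresponds to XmlPullParser.START_TAG
                    depth++;
                    break;
                case XMLStreamConstants.END_ELEMENT:   // corresponds to XmlPullParser.END_TAG
                    depth--;
                    break;
                case XMLStreamConstants.CHARACTERS:    // corresponds to XmlPullParser.TEXT
                case XMLStreamConstants.CDATA:
                    text.append(reader.getText());
                    break;
                default:
                    break; // comments, processing instructions, etc. are ignored
            }
        }
        // The loop exits on the END_ELEMENT that closes the target element.
        return text.toString();
    }

    // Hypothetical entry point: parses a document whose root element is <overview>.
    static String parseOverview(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        reader.nextTag(); // advance from the document start to <overview>
        reader.require(XMLStreamConstants.START_ELEMENT, null, "overview");
        String overview = readTextIgnoringTags(reader);
        reader.require(XMLStreamConstants.END_ELEMENT, null, "overview");
        reader.close();
        return overview;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<overview><p>First paragraph.</p><p>Second paragraph.</p></overview>";
        System.out.println(parseOverview(xml));
    }
}
```

Because the tags are simply dropped, the paragraph texts are concatenated with nothing between them. If the paragraphs need to stay separated, a space or newline could be appended whenever a nested element closes.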

