XML: A decimal representation must immediately follow the "&#"

I am extracting the content of PDF documents with Tika and sending it to Solr for indexing through an XML request in ColdFusion.

But I am facing several issues:

Issue 1:

An invalid XML character (Unicode: 0xb) was found in the element content of the document

I used the following solution to strip the invalid Unicode characters, and also tried many others:

  p = createObject("java","java.util.regex.Pattern").compile("[^\\u0009\\u000A\\u000D\u0020-\\uD7FF\\uE000-\\uFFFD\\u10000-\\u10FFF]+");
  p.matcher(myText).replaceAll("");

Now I am facing the following error:

A decimal representation must immediately follow the "&#" in a character reference.

Can anyone please help me resolve this?
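
One possible direction, sketched in Java because the snippet above already goes through java.util.regex.Pattern. The second error most likely means the extracted text contains a literal "&" (for example "&#" followed by something that is not a number), which has to be escaped before it is placed in the XML request. The sketch below first removes characters outside the XML 1.0 character range and then escapes the markup characters. It uses `\x{...}` code point escapes (Java 7 or later), because `\u10000` is read as `\u1000` followed by a literal `0`. The class and method names (`XmlSanitizer`, `stripInvalidXmlChars`, `escapeXml`, `toSolrFieldValue`) are made up for illustration and are not part of Tika, Solr, or ColdFusion.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library mentioned in the post.
class XmlSanitizer {

    // Matches every code point NOT allowed by the XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    private static final Pattern INVALID_XML_CHARS = Pattern.compile(
        "[^\\x{9}\\x{A}\\x{D}\\x{20}-\\x{D7FF}\\x{E000}-\\x{FFFD}\\x{10000}-\\x{10FFFF}]");

    // Removes characters such as 0xB (vertical tab) that XML 1.0 forbids.
    static String stripInvalidXmlChars(String text) {
        // replaceAll returns a new string; the result must be kept.
        return INVALID_XML_CHARS.matcher(text).replaceAll("");
    }

    // Escapes the characters that have special meaning in XML content,
    // so a literal "&#" in the text is no longer read as a character reference.
    static String escapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Strip first, then escape, so the escaping step only sees valid characters.
    static String toSolrFieldValue(String text) {
        return escapeXml(stripInvalidXmlChars(text));
    }
}
```

In ColdFusion the same two steps could be done by keeping the result of `replaceAll` in a variable and passing it through the built-in `XmlFormat()` function before building the request. Also note that, as far as I know, CFML string literals do not treat the backslash as an escape character, so the doubled backslashes needed in Java source would be written as single backslashes there.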
