I'm trying to use escape_utils via xmlhasher:
http://ift.tt/176lJJe http://ift.tt/1ym2qHs
I'm unzipping an XML file directly in memory with rubyzip. The file is apparently ISO-8859-1 rather than UTF-8, so I transcode it right after reading:
content = entry.get_input_stream.read.encode('UTF-8', invalid: :replace, undef: :replace, replace: '?')
puts content.encoding # prints UTF-8
The problem is that when I run:
test = XmlHasher.parse(content)
I get an error:
expected no Exception, got #<Encoding::CompatibilityError: Input must be UTF-8 or US-ASCII, ISO-8859-1 given> with backtrace:
# /Users/x/.rvm/gems/ruby-2.1.2/gems/xmlhasher-0.0.6/lib/xmlhasher/handler.rb:48:in `unescape_html'
# /Users/x/.rvm/gems/ruby-2.1.2/gems/xmlhasher-0.0.6/lib/xmlhasher/handler.rb:48:in `escape'
# /Users/x/.rvm/gems/ruby-2.1.2/gems/xmlhasher-0.0.6/lib/xmlhasher/handler.rb:21:in `attr'
# /Users/x/.rvm/gems/ruby-2.1.2/gems/xmlhasher-0.0.6/lib/xmlhasher/parser.rb:12:in `sax_parse'
# /Users/x/.rvm/gems/ruby-2.1.2/gems/xmlhasher-0.0.6/lib/xmlhasher/parser.rb:12:in `parse'
I don't get it: the string should now be UTF-8, but I still get this error.
What can I do to move forward? Any pointers are appreciated, thanks.
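One thing that may be worth checking, sketched here under assumptions rather than confirmed against xmlhasher: if the bytes coming out of the zip entry are tagged as binary (ASCII-8BIT), then encode with invalid: :replace and undef: :replace does not transcode them from ISO-8859-1 at all; it replaces every non-ASCII byte with '?'. Separately, the XML declaration inside the file may still say encoding="ISO-8859-1", and the parser may label the strings it hands to escape_utils from that declaration, whatever encoding the Ruby string itself reports. The snippet below uses a hypothetical stand-in document (raw) in place of the real zip entry, reinterprets the bytes as ISO-8859-1, transcodes them to UTF-8, and rewrites the declared encoding so the two agree.

```ruby
# Hypothetical stand-in for entry.get_input_stream.read: an ISO-8859-1
# XML document whose bytes are tagged as binary (ASCII-8BIT).
raw = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\xE9</name>".b

# Tell Ruby what the bytes really are, then transcode them to UTF-8.
utf8 = raw.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

# Rewrite the encoding named in the XML declaration so it matches the
# actual bytes. Assumption: the parser trusts this declaration.
utf8 = utf8.sub(/\A(<\?xml[^>]*encoding=["'])[^"']+(["'])/i) { "#{$1}UTF-8#{$2}" }

puts utf8.encoding
puts utf8.valid_encoding?
puts utf8
```

If this is the cause, passing the resulting string to XmlHasher.parse instead of the original content would be the next thing to try. Whether xmlhasher actually behaves this way is untested here.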