XML : Eliminate or translate HTML character in SQL import from XML file

I came across an issue when running a procedure that shreds an XML file and imports the data into SQL server.

It has been running successfully for a few months, but today I got an error:

Conversion failed when converting date and/or time from character string.

This is the line where it fails

  SELECT   ltrim(rtrim(T.X.value('Cell[4]/Data[1]','varchar(max)'))) AS StartDate  ,ltrim(rtrim(T.X.value('Cell[5]/Data[1]','varchar(max)'))) AS EndDate  FROM @xml.nodes('/Workbook[1]/Worksheet1]/Table[1]/Row') as T(X)    

When I looked at the XML file, I noticed that some of the dates were written like this:

01/12/2016&#160

This character &#160 is a Non-breaking space.

I would like to know if there is any way in SQL Server to account for these types of issues? For this specific problem, I can use REPLACE:

  SELECT   REPLACE(ltrim(rtrim(T.X.value('Cell[4]/Data[1]','varchar(max)'))),'&#160','') AS StartDate  ,ltrim(rtrim(T.X.value('Cell[5]/Data[1]','varchar(max)'))) AS EndDate  FROM @xml.nodes('/Workbook[1]/Worksheet1]/Table[1]/Row') as T(X)    

but if other XML/HTML characters come up, is there a way to universally check for/deal with them?

No comments:

Post a Comment