Extract Base64 String



I have a string from an XML file that I need to base64 decode. However, it looks like there are some invalid characters within the string.


I tried writing some code to remove any invalid characters and then re-pad, but it appears that my source data may also contain invisible invalid characters.


So, since I know which characters are valid in base64, I figured the best approach is a function that extracts only the valid characters, so that every invalid character is purged.
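
A minimal sketch of that extraction idea in Python, assuming the data uses the standard base64 alphabet (A-Z, a-z, 0-9, + and /). The names `extract_base64` and `decode_lenient` are made up for illustration. Note that this also drops every = and rebuilds the padding from the length, which is only correct if the input is a single base64 unit.

```python
import base64
import re

# Anything outside the standard Base64 alphabet (RFC 4648, section 4).
# Assumption: the data uses '+' and '/' as the last two characters.
_NON_B64 = re.compile(r"[^A-Za-z0-9+/]")


def extract_base64(raw: str) -> str:
    """Keep only standard-alphabet characters, then rebuild the '=' padding."""
    cleaned = _NON_B64.sub("", raw)
    # Encoded text must be a multiple of 4 characters long.
    cleaned += "=" * (-len(cleaned) % 4)
    return cleaned


def decode_lenient(raw: str) -> bytes:
    """Decode after cleaning; raises binascii.Error if the result is still invalid."""
    return base64.b64decode(extract_base64(raw), validate=True)
```

For example, `decode_lenient("aGVs\u200bbG8=\n")` skips the zero-width space and the newline. If the cleaned length is one more than a multiple of 4, no amount of padding helps and the decode still fails, which usually means real data characters were lost or changed rather than junk being added.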


Is there a quick way to test this theory?
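
One quick check, sketched here in Python, is to list every character that falls outside the expected alphabet along with its position and code point, so that invisible characters become visible. The helper name `report_invalid` is hypothetical, and the alphabet is assumed to be the standard one plus = for padding.

```python
import string
import unicodedata

# Assumption: standard Base64 alphabet plus the '=' padding character.
ALPHABET = set(string.ascii_letters + string.digits + "+/=")


def report_invalid(raw: str):
    """Return (index, code point, Unicode name) for each unexpected character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(raw)
        if ch not in ALPHABET
    ]
```

Running this over the string taken from the XML file shows whether the offenders are whitespace, zero-width characters, or something that hints at a different encoding altogether. Control characters such as newlines have no Unicode name, so they show up as `<unnamed>`.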


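My other question is about base64 itself. After researching, it appears that while A-Z, a-z, and 0-9 account for 62 of the 64 valid base64 characters, the remaining two depend on the base64 variant in use (for example, + and / in the standard alphabet, - and _ in the URL-safe one).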


Is it possible my file is using some other base64 scheme?
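
A rough way to check this, sketched in Python: look at which of the non-alphanumeric characters actually occur. The sketch only distinguishes the two alphabets defined in RFC 4648 (standard with + and /, URL-safe with - and _); other schemes exist and are not covered. The function names `guess_variant` and `decode_any` are made up for illustration.

```python
import base64


def guess_variant(text: str) -> str:
    """Guess the alphabet from which of the two 'extra' character pairs appear."""
    has_std = any(c in "+/" for c in text)
    has_url = any(c in "-_" for c in text)
    if has_std and has_url:
        return "mixed"
    if has_url:
        return "urlsafe"
    # Also returned when neither pair occurs, since both variants then agree.
    return "standard"


def decode_any(text: str) -> bytes:
    """Decode with whichever of the two RFC 4648 alphabets the text appears to use."""
    variant = guess_variant(text)
    if variant == "mixed":
        raise ValueError("both '+/' and '-_' present; not a single known variant")
    if variant == "urlsafe":
        return base64.urlsafe_b64decode(text)
    return base64.b64decode(text, validate=True)
```

A "mixed" result suggests that the data is either not one base64 string or has been altered somewhere along the way.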


When I scanned my file, on top of the common A-Z, a-z, and 0-9 I found = characters in the middle of the data, not just as padding at the end. I tried removing those and re-padding, but I still seem to have invalid characters.
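
One possible explanation, and it is only a guess without seeing the data, is that the string is several separately encoded base64 chunks joined together, each carrying its own padding. Deleting the inner = characters would then shift every later character out of its 4-character group. The Python sketch below decodes each chunk on its own under that assumption; `decode_concatenated` is a hypothetical name.

```python
import base64
import re


def decode_concatenated(text: str) -> bytes:
    """Decode text assumed to be several padded Base64 chunks joined together."""
    # Each run of '=' is treated as the end of one independently encoded chunk.
    pieces = [p for p in re.split(r"=+", text) if p]
    out = bytearray()
    for piece in pieces:
        # Restore this chunk's own padding before decoding it.
        piece += "=" * (-len(piece) % 4)
        out += base64.b64decode(piece, validate=True)
    return bytes(out)
```

For example, `"aGk=dGhlcmU="` is the encoding of `hi` followed by the encoding of `there`, and decoding the two chunks separately recovers both. A chunk whose length happened to be a multiple of 4 has no padding and cannot be detected this way.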


This is now more of a general question, since I am unable to post my base64-encoded string.

