On Windows, I am currently using gawk to find the first occurrence of a string, plus the 100 bytes that follow it, in every XML file within a directory:
gawk "/[some string]/ { match($0, /[some string]/); print substr($0, RSTART, RLENGTH + 100) FILENAME; }" C:\XML*.xml > C:\Results.txt
What I would like to do now is output all the matches (not just the first) to C:\Results.txt for each XML and also include 100 characters before the match + 100 characters after the match.
Is it possible to easily change this to get the desired results?
I understand that gawk might not be the best tool for the job, but this is just a one-time task, and if it is slow I can let it run overnight.
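One possible way to get every match with context is to loop over each line with match() and widen the substr() window on both sides. The following is only a sketch under a few assumptions: the file name context.awk and the variables pat and ctx are made up for illustration, "some string" stands in for the real pattern, and a space is added before FILENAME to keep the output readable. Context is taken from the current line only, as in the original command, so it will not reach across line breaks.

```awk
# context.awk (hypothetical file name): print every match of a pattern
# with up to ctx characters of context on each side, then the file name.
BEGIN {
    pat = "some string"   # placeholder regex; replace with the real pattern
    ctx = 100             # characters of context to keep on each side
}
{
    line = $0
    pos = 1                               # where the next search starts in line
    while (pos <= length(line) && match(substr(line, pos), pat)) {
        start = pos + RSTART - 1          # match position within the whole line
        from = start - ctx
        if (from < 1) from = 1            # clamp at the start of the line
        to = start + RLENGTH - 1 + ctx    # substr() stops at the end of the line
        print substr(line, from, to - from + 1) " " FILENAME
        pos = start + (RLENGTH > 0 ? RLENGTH : 1)   # step past this match
    }
}
```

Keeping the program in a file avoids the quoting problems of a long one-liner on the Windows command line. It could then be run as gawk -f context.awk C:\XML*.xml > C:\Results.txt. Because each search runs on the remainder of the line, a pattern anchored with ^ would not behave as it does on the full line.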