I have a problem importing a big XML file (1.3 GB) into MongoDB in order to find the most frequent words in a map-reduce manner.
I know that I can't import XML directly into MongoDB. I tried some tools to do so, and I also tried some Python scripts, but all of them failed.
Which tool or script should I use? What should the key and value be? I think the best structure for finding the most frequent words would be this:
(_id : id, value: word )
Then I would sum all the elements, as in the example from the docs.
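To make the summing step concrete, here is a rough sketch in plain Python rather than the MongoDB docs example. It assumes documents shaped like `(_id : id, value: word)`; the function names `map_word` and `reduce_counts` and the sample data are made up for illustration.

```python
from collections import Counter


def map_word(doc):
    """Map step: emit a (word, 1) pair for one {"_id", "value"} document."""
    yield doc["value"].lower(), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per word."""
    totals = Counter()
    for word, n in pairs:
        totals[word] += n
    return totals


# Hypothetical sample documents in the (_id, value) shape.
docs = [
    {"_id": 1, "value": "the"},
    {"_id": 2, "value": "cat"},
    {"_id": 3, "value": "The"},
]

pairs = (pair for doc in docs for pair in map_word(doc))
totals = reduce_counts(pairs)
print(totals.most_common(2))  # [('the', 2), ('cat', 1)]
```

Inside MongoDB itself, the same count could probably be expressed as an aggregation pipeline that groups on the word and sums 1 per document, for example `{"$group": {"_id": "$value", "count": {"$sum": 1}}}` followed by `{"$sort": {"count": -1}}`.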
Any clues would be greatly appreciated. How can I import this file into MongoDB so that the collection contains documents like this?
(_id : id, value: word )
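One possible way to get documents of that shape is to stream the XML instead of loading it whole. This is a minimal sketch under several assumptions: the words sit in the text of the XML elements (tail text after child elements is ignored), a word is any run of `\w` characters, and `collection` is any object with an `insert_many` method, such as a pymongo collection. The names `iter_word_docs` and `load` and the sample XML are hypothetical.

```python
import io
import re
import xml.etree.ElementTree as ET

WORD_RE = re.compile(r"\w+")  # assumption: a "word" is a run of word characters


def iter_word_docs(source):
    """Stream the XML and yield one {"_id", "value"} document per word."""
    next_id = 0
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event != "end":
            continue
        # Only the element's own leading text is read; tail text is ignored.
        for word in WORD_RE.findall(elem.text or ""):
            next_id += 1
            yield {"_id": next_id, "value": word.lower()}
        elem.clear()  # drop this element's text and children
        root.clear()  # detach finished elements from the root to bound memory


def load(collection, source, batch_size=1000):
    """Insert the word documents in batches; returns how many were inserted."""
    batch, total = [], 0
    for doc in iter_word_docs(source):
        batch.append(doc)
        if len(batch) >= batch_size:
            collection.insert_many(batch)
            total += len(batch)
            batch = []
    if batch:
        collection.insert_many(batch)
        total += len(batch)
    return total


# Tiny in-memory demo; a real run would pass the path of the 1.3 GB file.
sample = io.BytesIO(b"<docs><doc><text>The cat</text></doc><doc><text>The end</text></doc></docs>")
for doc in iter_word_docs(sample):
    print(doc)
```

With pymongo installed, something like `load(MongoClient()["mydb"]["words"], "dump.xml")` should work, where the database, collection, and file names are placeholders. Because the parser clears elements as it goes, memory use should stay roughly constant regardless of file size.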
If you have any ideas, please share.