I receive a large number of XML and JSON files every day. The goal is to send these files to another machine and store them in HDFS in their original format (XML/JSON). After that, I will need to process them with Spark.
I know that I can use Flume for this kind of thing, but the tool seems better suited to log files. Once we start working with XML/JSON files, it becomes less trivial.
Is there a good way to handle XML/JSON file transfers with Flume, or do I have to use another file transfer tool?
Thanks for your help :)