I'm working on a project whose goal is to use "scikit-learn" inside Storm and integrate it with HDFS.
I've already implemented the "scikit-learn" part, but it currently reads from and writes to a text file. I need it to use HDFS instead, and I'm finding that integration hard.
Here is the link to GitHub. (I only included the files that I used, not the whole project.)
Basically, I have a few questions, if you don't mind answering them:
1) How do I read from and write to HDFS?
2) Is my "scikit-learn" implementation correct?
3) How do I create a Storm project? (I'm currently working in "storm-starter".)
These questions may sound a bit silly, but I really can't find a proper solution.
Thank you for your attention to this matter.