crunch-user mailing list archives

From Josh Wills <josh.wi...@gmail.com>
Subject Re: MongoDB Input Format
Date Thu, 25 Dec 2014 13:13:28 GMT
Hey Danny,

Maybe take a look at the sources we use for reading from HBase tables in
crunch-hbase? I agree that extending one of the File Source impl classes
probably isn't the right thing to do.

J
On Dec 24, 2014 8:13 PM, "Danny Morgan" <unluckyboy@hotmail.com> wrote:

> Hi Everyone,
>
> I'm working on getting a MongoDB Source working for crunch as a holiday
> project. Luckily there is already a MongoInputFormat provided by the
> mongo-hadoop project. I tried to follow the example of the JDBC input in
> crunch-contrib, but I couldn't quite get things working. I'm inheriting
> from FileTableSourceImpl and creating my own FormatBundle since
> MongoInputFormat extends InputFormat as opposed to FileInputFormat, which is
> what most of the crunch source classes expect.
>
> Anyway, I can't seem to get the darn thing to work. I have a feeling that
> CrunchInputs and friends expect the Source to be backed by some kind of
> file in HDFS, and in the case of the MongoInputFormat it's a connection over
> the network to a mongo server, so something isn't quite working. Any
> pointers to which interfaces and classes I should base my implementation
> off of or which methods I should override would be much appreciated.
>
> Thanks!
>
> -Danny
>

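The suggestion in the reply (model the source on a non-file source such as
the HBase ones, rather than on a file source) might be sketched roughly as
below. This is only an illustration under stated assumptions: SketchSource
and MongoBackedSource are hypothetical stand-ins, not the real Crunch Source
interface, and a plain Map stands in for the Hadoop job configuration. The
"sketch.input.format.class" key is made up for the example; "mongo.input.uri"
is the property mongo-hadoop is understood to read its input URI from. The
point is that a network-backed source describes itself through configuration
entries and a size hint instead of an HDFS path.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the parts of a source that matter here.
// This is NOT the real org.apache.crunch.Source interface.
interface SketchSource {
    // Write whatever the InputFormat needs into the job configuration.
    void configure(Map<String, String> jobConf);

    // Estimated input size in bytes, used only for planning.
    long estimatedSize();
}

// A source backed by a network connection rather than a file in HDFS.
final class MongoBackedSource implements SketchSource {
    // Property assumed to be read by mongo-hadoop for the input URI.
    static final String INPUT_URI_KEY = "mongo.input.uri";
    // Hypothetical key, used here only to record the chosen InputFormat.
    static final String INPUT_FORMAT_KEY = "sketch.input.format.class";

    private final String mongoUri;
    private final long sizeHint;

    MongoBackedSource(String mongoUri, long sizeHint) {
        this.mongoUri = mongoUri;
        this.sizeHint = sizeHint;
    }

    @Override
    public void configure(Map<String, String> jobConf) {
        // No input path is registered: the InputFormat gets everything
        // it needs from configuration properties.
        jobConf.put(INPUT_FORMAT_KEY, "com.mongodb.hadoop.MongoInputFormat");
        jobConf.put(INPUT_URI_KEY, mongoUri);
    }

    @Override
    public long estimatedSize() {
        // There is no file to stat, so the caller supplies a hint.
        return sizeHint;
    }
}
```

In a real implementation the same two responsibilities (configuring the job
for the InputFormat and reporting a size without touching the file system)
would be the methods to override, which is why a file-based base class is an
awkward starting point.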