crunch-user mailing list archives

From Josh Wills
Subject Re: MongoDB Input Format
Date Thu, 25 Dec 2014 13:13:28 GMT
Hey Danny,

Maybe take a look at the sources we use for reading from HBase tables in
crunch-hbase? I agree that extending one of the file source impl classes
probably isn't the right thing to do.

On Dec 24, 2014 8:13 PM, "Danny Morgan" wrote:

> Hi Everyone,
> I'm working on getting a MongoDB Source working for crunch as a holiday
> project. Luckily there is already a MongoInputFormat provided by the
> mongo-hadoop project. I tried to follow the example of the JDBC input in
> crunch-contrib, but I couldn't quite get things working. I'm inheriting
> from FileTableSourceImpl and creating my own FormatBundle since
> MongoInputFormat extends InputFormat as opposed to FileInputFormat which is
> what most of the crunch source classes expect.
> Anyway, I can't seem to get the darn thing to work. I have a feeling that
> CrunchInputs and friends expect the Source to be backed by some kind of
> file in HDFS, and in the case of the MongoInputFormat it's a connection over
> the network to a mongo server, so something isn't quite working. Any
> pointers to which interfaces and classes I should base my implementation
> on, or which methods I should override, would be much appreciated.
> Thanks!
> -Danny
