crunch-user mailing list archives

From Danny Morgan
Subject RE: MongoDB Input Format
Date Thu, 25 Dec 2014 17:12:29 GMT
Oh yeah, I always forget about HBase. I'll take a look there. Thanks, Josh!

Date: Thu, 25 Dec 2014 05:13:28 -0800
Subject: Re: MongoDB Input Format

Hey Danny,
Maybe look at the sources we use for reading from HBase tables in crunch-hbase? I agree that
extending one of the File Source impl classes probably isn't the right thing to do.
On Dec 24, 2014 8:13 PM, "Danny Morgan" wrote:

Hi Everyone,
I'm working on getting a MongoDB Source working for Crunch as a holiday project. Luckily, there
is already a MongoInputFormat provided by the mongo-hadoop project. I tried to follow the
example of the JDBC input in crunch-contrib, but I couldn't quite get things working. I'm
inheriting from FileTableSourceImpl and creating my own FormatBundle, since MongoInputFormat
extends InputFormat as opposed to FileInputFormat, which is what most of the Crunch source
classes expect. Anyway, I can't seem to get the darn thing to work. I have a feeling that CrunchInputs
and friends expect the Source to be backed by some kind of file in HDFS, and in the case of
the MongoInputFormat it's a connection over the network to a mongo server, so something isn't
quite working. Any pointers on which interfaces and classes I should base my implementation
on, or which methods I should override, would be much appreciated.