hadoop-user mailing list archives

From Pramod N <npramo...@gmail.com>
Subject Re: Not saving any output
Date Wed, 29 May 2013 06:38:07 GMT
*Sqoop* is often used in this scenario.

You might also want to look at the *MongoDB Hadoop Connector*:
https://github.com/mongodb/mongo-hadoop
More on streaming support can be found here:
http://api.mongodb.org/hadoop/Hadoop+Streaming+Support.html
There are pros and cons. Choose what suits you the best.
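
For comparison, here is a minimal sketch of the approach Kai describes in the quoted message below: a Python streaming mapper that writes nothing to stdout and sends its records to an external store in batches. The names parse_line, run_mapper and main, the batch size, the connection URI and the "mydb"/"lines" database and collection are all made up for illustration, and the pymongo wiring assumes a pymongo version that provides insert_many and is installed on every task node.

```python
import sys


def parse_line(line):
    """Turn one input line into a document (hypothetical record layout)."""
    text = line.strip()
    return {"text": text, "words": len(text.split())}


def run_mapper(lines, write_batch, batch_size=500):
    """Pass documents to write_batch in batches; emit nothing on stdout.

    write_batch is any callable that accepts a list of dicts, so the
    storage backend can be swapped without touching this loop.
    Returns the number of documents handed to write_batch.
    """
    batch = []
    total = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        batch.append(parse_line(line))
        if len(batch) >= batch_size:
            write_batch(batch)
            total += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        write_batch(batch)
        total += len(batch)
    return total


def main():
    # Assumption: MongoDB is reachable at this URI from every task node.
    from pymongo import MongoClient

    collection = MongoClient("mongodb://localhost:27017")["mydb"]["lines"]
    count = run_mapper(sys.stdin, collection.insert_many)
    # Streaming jobs can update counters by writing this line format to stderr.
    sys.stderr.write("reporter:counter:mongo,docs_written,%d\n" % count)
```

To use it as a mapper, end the script with a call to main(). Batching keeps the number of round trips per task low, which matters given Kai's warning about many tasks hitting the database in parallel. The job itself may still expect an output setting; passing -outputformat org.apache.hadoop.mapred.lib.NullOutputFormat is one option worth checking against your Hadoop version.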



Pramod N <http://atmachinelearner.blogspot.in>
Bruce Wayne of web
@machinelearner <https://twitter.com/machinelearner>

--


On Wed, May 29, 2013 at 2:13 AM, Kai Voigt <k@123.org> wrote:

> You can have your python streaming script simply not write any key/value
> pairs to stdout, so you'll get an empty job output.
>
> Independently, your script could do anything external, such as connecting
> to a remote database and storing data there. You probably want to avoid
> too many tasks doing this in parallel.
>
> But more common would be a regular job which writes data to HDFS, and then
> using Sqoop to store that data in an RDBMS. But it's your choice.
>
> Kai
>
> Am 28.05.2013 um 20:57 schrieb jamal sasha <jamalshasha@gmail.com>:
>
> > Hi,
> >   I want to process some text files and then save the output in a db.
> > I am using python (hadoop streaming).
> > I am using mongo as backend server.
> > Is it possible to run hadoop streaming jobs without specifying any
> > output?
> > What is the best way to deal with this?
> >
>
> --
> Kai Voigt
> k@123.org
>
>
>
>
>
