hadoop-common-user mailing list archives

From Rasit OZDAS <rasitoz...@gmail.com>
Subject Re: HELP: I wanna store the output value into a list not write to the disk
Date Thu, 02 Apr 2009 15:45:21 GMT
That seems interesting; we have a replication factor of 3 by default.
Is there a way to set, let's say, a replication factor of 1 for job-specific files only?

2009/4/2 Owen O'Malley <omalley@apache.org>:
>
> On Apr 2, 2009, at 2:41 AM, andy2005cst wrote:
>
>>
>> I need to use the output of the reduce, but I don't know how to do it.
>> Using the wordcount program as an example: if I want to collect the word counts
>> into a hashtable for further use, how can I do that?
>
> You can use an output format and then an input format that uses a database,
> but in practice, the cost of writing to hdfs and reading it back is not a
> problem, especially if you set the replication of the output files to 1.
> (You'll need to re-run the job if you lose a node, but it will be fast.)
>
> -- Owen
>
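
For reference, here is a minimal sketch of the "write it out, then read it back" approach for the wordcount case. It assumes the job wrote its results with the default text output, one "word<TAB>count" pair per line. The class name WordCountLoader and the method load are hypothetical names made up for this illustration. The sketch reads from a plain java.io.Reader so it can run on its own; in a real job the reader would presumably wrap the stream returned by FileSystem.open() for each part file in the job's output directory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: loads wordcount output into a map.
public class WordCountLoader {

    // Parses "word<TAB>count" lines (assumed output layout) into a HashMap.
    // Counts for a word that appears more than once are summed, so the same
    // map can be filled from several part files in turn.
    public static Map<String, Integer> load(Reader source) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab < 0) {
                    continue; // skip blank or malformed lines
                }
                String word = line.substring(0, tab);
                int count = Integer.parseInt(line.substring(tab + 1).trim());
                counts.merge(word, count, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the contents of one reduce output file.
        String sample = "hadoop\t3\nhdfs\t1\n";
        Map<String, Integer> counts = load(new StringReader(sample));
        System.out.println(counts);
    }
}
```

Once the map is built it can be used like any other in-memory table, which fits Owen's point that the extra write and read is usually cheap compared with the job itself.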



-- 
M. Raşit ÖZDAŞ
