hadoop-common-user mailing list archives

From Ondřej Klimpera <klimp...@fit.cvut.cz>
Subject Re: Using MultipleOutputs with new API (v1.0)
Date Wed, 25 Jan 2012 13:02:36 GMT
One more question. I just downloaded Hadoop 0.20.203.0, which is considered
to be the last stable release. What about the JobConf vs. Configuration
classes? Which should I use to avoid taking the wrong approach, given that
JobConf seems to be deprecated?
Sorry for bothering you with these questions. I'm just not used to having
deprecated things in my projects.

Thanks.


On 01/25/2012 01:46 PM, Ondřej Klimpera wrote:
> I'm using 1.0.0 beta; I suppose it was the wrong decision to use a beta
> version. So do you recommend using 0.20.203.X and sticking to the
> approaches in Hadoop: The Definitive Guide?
>
> Thanks for your reply
>
> On 01/25/2012 01:41 PM, Harsh J wrote:
>> Oh and btw, do not fear the @deprecated 'Old' API. We have
>> undeprecated it in the recent stable releases, and will continue to
>> support it for a long time. I'd recommend using the older API, as that
>> is more feature complete and test covered in the version you use.
>>
>> On Wed, Jan 25, 2012 at 6:09 PM, Harsh J<harsh@cloudera.com>  wrote:
>>> What version/release/distro of Hadoop are you using? Apache releases
>>> got the new (unstable) API MultipleOutputs only in 0.21+, and it was
>>> only very recently backported to branch-1.
>>>
>>> That said, the next release in 1.x (1.1.0, out soon) will carry the
>>> new API MultipleOutputs, but presently no release in 0.20.xxx/1.x has
>>> it.
>>>
>>> I'd still recommend sticking to the stable API if you are using a
>>> 0.20.x/1.x stable Apache release.
>>>
>>> On Wed, Jan 25, 2012 at 5:13 PM, Ondřej Klimpera <klimpond@fit.cvut.cz> wrote:
>>>> Hello,
>>>>
>>>> I'm trying to develop an application where the Reducer has to produce
>>>> multiple outputs.
>>>>
>>>> In detail, I need the Reducer to produce two types of files. Each file
>>>> will have different output.
>>>>
>>>> I found in Hadoop: The Definitive Guide that the new API uses only
>>>> MultipleOutputs, but working with MultipleOutputs requires a JobConf
>>>> instance, which is @deprecated (I'm using an
>>>> org.apache.hadoop.mapreduce.Job instance to handle job configuration).
>>>>
>>>> So I'm wondering how to get MultipleOutputs working.
>>>>
>>>> Can you please provide a short example or explanation?
>>>>
>>>> Thanks for your reply.
>>>>
>>>> Regards
>>>>
>>>> Ondrej Klimpera
>>>
>>>
>>> -- 
>>> Harsh J
>>> Customer Ops. Engineer, Cloudera
>>
>>
>
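
For readers following the advice above to stay on the stable (old) API, here is a minimal sketch of a reducer that writes two kinds of output with org.apache.hadoop.mapred.lib.MultipleOutputs and a JobConf. It is not taken from the thread: the class name TwoOutputReducer, the named outputs "summary" and "details", and the Text/IntWritable types are all hypothetical choices for illustration, and it assumes a 0.20.x/1.x Hadoop jar on the classpath.

```java
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.MultipleOutputs;

// Hypothetical reducer: sums the values per key into one named output and
// copies every individual value into a second named output.
public class TwoOutputReducer extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {

  // Named outputs (assumed names); the old API accepts letters and digits only.
  public static final String SUMMARY = "summary";
  public static final String DETAILS = "details";

  private MultipleOutputs outputs;

  // Call this from the driver, on the same JobConf that is submitted.
  public static void declareOutputs(JobConf conf) {
    MultipleOutputs.addNamedOutput(conf, SUMMARY,
        TextOutputFormat.class, Text.class, IntWritable.class);
    MultipleOutputs.addNamedOutput(conf, DETAILS,
        TextOutputFormat.class, Text.class, IntWritable.class);
  }

  @Override
  public void configure(JobConf conf) {
    // The old-API MultipleOutputs is built from the task's JobConf.
    outputs = new MultipleOutputs(conf);
  }

  @Override
  @SuppressWarnings("unchecked")
  public void reduce(Text key, Iterator<IntWritable> values,
      OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    OutputCollector<Text, IntWritable> details =
        outputs.getCollector(DETAILS, reporter);
    int sum = 0;
    while (values.hasNext()) {
      IntWritable value = values.next();
      sum += value.get();
      details.collect(key, value);   // one record per input value
    }
    // One record per key with the total.
    outputs.getCollector(SUMMARY, reporter).collect(key, new IntWritable(sum));
  }

  @Override
  public void close() throws IOException {
    // Flushes and closes the writers behind the named outputs.
    outputs.close();
  }
}
```

In a driver, this would be wired up with TwoOutputReducer.declareOutputs(conf) followed by conf.setReducerClass(TwoOutputReducer.class); the regular `output` collector passed to reduce() is left unused here and could still carry a default output.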

