hadoop-user mailing list archives

From Vikas Jadhav <vikascjadha...@gmail.com>
Subject Re: How can I record some position of context in Reduce()?
Date Tue, 09 Apr 2013 05:15:50 GMT
Hi,
I am also working on joins using MapReduce.
I think that, instead of finding the position of a table in the
RawKeyValueIterator, we could modify the context.write method to always
write the table name or id as the key.
Then we don't need to find the position; we can get the key and value from
"reducerContext".
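
To make the tagging idea concrete, here is a minimal sketch in plain Java
(no Hadoop classes). Tagged and TagSplit are hypothetical names used only for
illustration. The assumption is that every record reaching the reducer carries
the name of its source table, so rows can be sorted into per-table lists
without knowing where one table ends in the iterator.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical record type: a row plus the name of the table it came from.
final class Tagged {
    final String table;
    final String row;

    Tagged(String table, String row) {
        this.table = table;
        this.row = row;
    }
}

final class TagSplit {
    // Groups rows by their table tag, keeping the order in which tables first appear.
    static Map<String, List<String>> byTable(Iterable<Tagged> records) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Tagged t : records) {
            out.computeIfAbsent(t.table, k -> new ArrayList<>()).add(t.row);
        }
        return out;
    }
}
```

This version still buffers every row in memory, so it only shows the tagging
part, not a solution for large inputs.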

Before calling reducer.run(reducerContext) in ReduceTask.java, we can add a
join method to the Reducer class (Reducer.java) and call
reducer.join(reducerContext).


I just wonder how you are going to support non-equi joins.

I also have the same problem: how to do the join if the datasets can't fit
into memory.
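
One possible way around the memory limit, sketched in plain Java under my own
assumptions (this is not a Hadoop API, and SpillJoin is a made-up name): spill
one side to a local temporary file, then read the other side in small blocks
and re-scan the file once per block. The predicate can be any condition, so it
also covers non-equi joins. The sketch assumes records are single-line strings
and that the task has local disk available.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.Consumer;

final class SpillJoin {
    // Block nested-loop join: only blockSize left records are held in memory at a time.
    static void join(Iterator<String> left, Iterator<String> right, int blockSize,
                     BiPredicate<String, String> match, Consumer<String> emit) {
        if (blockSize < 1) {
            throw new IllegalArgumentException("blockSize must be at least 1");
        }
        try {
            Path spill = Files.createTempFile("right-side", ".txt");
            try {
                // Write the whole right side to disk, one record per line.
                try (BufferedWriter w = Files.newBufferedWriter(spill, StandardCharsets.UTF_8)) {
                    while (right.hasNext()) {
                        w.write(right.next());
                        w.newLine();
                    }
                }
                List<String> block = new ArrayList<>(blockSize);
                while (left.hasNext()) {
                    // Fill the next in-memory block from the left side.
                    block.clear();
                    while (left.hasNext() && block.size() < blockSize) {
                        block.add(left.next());
                    }
                    // Re-scan the spilled right side against this block.
                    try (BufferedReader r = Files.newBufferedReader(spill, StandardCharsets.UTF_8)) {
                        String r1;
                        while ((r1 = r.readLine()) != null) {
                            for (String l : block) {
                                if (match.test(l, r1)) {
                                    emit.accept(l + "," + r1);
                                }
                            }
                        }
                    }
                }
            } finally {
                Files.deleteIfExists(spill);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The cost is one full scan of the spilled file per block, so a larger block
size means fewer scans at the price of more memory.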


For now I am cloning using the following code:


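// Needs java.util.ArrayList and org.apache.hadoop.util.ReflectionUtils.
// Copy the current key into a fresh instance of the same class.
KEYIN key = context.getCurrentKey();
KEYIN outKey;
try {
    outKey = (KEYIN) key.getClass().newInstance();
} catch (Exception e) {
    // Fail here rather than pass a null destination to copy().
    throw new RuntimeException("cannot instantiate key class", e);
}
ReflectionUtils.copy(context.getConfiguration(), key, outKey);

// Copy every value of the current key into a list.
Iterable<VALUEIN> values = context.getValues();
ArrayList<VALUEIN> myValues = new ArrayList<VALUEIN>();
for (VALUEIN value : values) {
    VALUEIN outValue;
    try {
        outValue = (VALUEIN) value.getClass().newInstance();
    } catch (Exception e) {
        throw new RuntimeException("cannot instantiate value class", e);
    }
    ReflectionUtils.copy(context.getConfiguration(), value, outValue);
    // Keep the copy; "value" itself is overwritten on the next iteration.
    myValues.add(outValue);
}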
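
The copying is needed because Hadoop reuses the same key and value objects
while iterating in the reducer, so buffering only the references leaves every
list entry pointing at the last record. Here is a small plain-Java simulation
of that behaviour. MutableText and ReuseDemo are hypothetical stand-ins, not
Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a mutable Writable value.
final class MutableText {
    String text = "";
}

final class ReuseDemo {
    // Mimics the reducer's value iterator: next() overwrites and returns one shared instance.
    static Iterable<MutableText> reusingValues(List<String> raw) {
        final MutableText shared = new MutableText();
        return () -> new Iterator<MutableText>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < raw.size();
            }

            @Override
            public MutableText next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                shared.text = raw.get(i++);
                return shared;
            }
        };
    }

    // Buffers references only, so every entry ends up showing the last record.
    static List<String> bufferWithoutCopy(List<String> raw) {
        List<MutableText> kept = new ArrayList<>();
        for (MutableText v : reusingValues(raw)) {
            kept.add(v);
        }
        return texts(kept);
    }

    // Buffers a fresh copy per record, so each entry keeps its own contents.
    static List<String> bufferWithCopy(List<String> raw) {
        List<MutableText> kept = new ArrayList<>();
        for (MutableText v : reusingValues(raw)) {
            MutableText copy = new MutableText();
            copy.text = v.text;
            kept.add(copy);
        }
        return texts(kept);
    }

    private static List<String> texts(List<MutableText> items) {
        List<String> out = new ArrayList<>();
        for (MutableText m : items) {
            out.add(m.text);
        }
        return out;
    }
}
```

With the input a, b, c, bufferWithoutCopy returns [c, c, c] while
bufferWithCopy returns [a, b, c].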


If you have found any other solution, please feel free to share it.

Thank You.







On Thu, Mar 14, 2013 at 1:53 PM, Roth Effy <effyroth@gmail.com> wrote:

> In reduce() we have:
>
> key1 values1
> key2 values2
> ...
> keyn valuesn
>
> So what I want to do is join all the values, like this SQL:
>
> select * from values1,values2...valuesn;
>
> If there is not enough memory to cache the values, how can the join be completed?
> My idea is to clone the ReduceContext, but that may not be easy.
>
> Any help will be appreciated.
>
>
> 2013/3/13 Roth Effy <effyroth@gmail.com>
>
>> I want an n:n join, as a Cartesian product, but DataJoinReducerBase looks
>> like it only supports equi-joins.
>> I want a non-equi join, but I have no idea how to do it yet.
>>
>>
>> 2013/3/13 Azuryy Yu <azuryyyu@gmail.com>
>>
>>> Do you want an n:n join or a 1:n join?
>>> On Mar 13, 2013 10:51 AM, "Roth Effy" <effyroth@gmail.com> wrote:
>>>
>>>> I want to join the data of two tables in the reducer, so I need to find
>>>> the start of the table.
>>>> Someone said DataJoinReducerBase can help me. Is that right?
>>>>
>>>>
>>>> 2013/3/13 Azuryy Yu <azuryyyu@gmail.com>
>>>>
>>>>> You cannot use a RecordReader in a Reducer.
>>>>>
>>>>> What do you mean by getting the record position? I cannot
>>>>> understand; can you give a simple example?
>>>>>
>>>>>
>>>>> On Wed, Mar 13, 2013 at 9:56 AM, Roth Effy <effyroth@gmail.com> wrote:
>>>>>
>>>>>> Sorry, I still can't understand how to use a RecordReader in
>>>>>> reduce(), because the input is a RawKeyValueIterator in the class
>>>>>> ReduceContext, so I'm confused.
>>>>>> Anyway, thank you.
>>>>>>
>>>>>>
>>>>>> 2013/3/12 samir das mohapatra <samir.helpdoc@gmail.com>
>>>>>>
>>>>>>> Through the RecordReader and FileStatus you can get it.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Mar 12, 2013 at 4:08 PM, Roth Effy <effyroth@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi, everyone,
>>>>>>>> I want to join the k-v pairs in Reduce(), but how do I get the
>>>>>>>> record position?
>>>>>>>> What I have thought of so far is to save the context status, but
>>>>>>>> class Context doesn't provide a clone method or copy constructor.
>>>>>>>>
>>>>>>>> Any help will be appreciated.
>>>>>>>> Thank you very much.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>
>


-- 
Thanks and Regards,
Vikas Jadhav