hadoop-mapreduce-user mailing list archives

From Vikas Jadhav <vikascjadha...@gmail.com>
Subject Re: How can I record some position of context in Reduce()?
Date Wed, 10 Apr 2013 13:11:49 GMT
How are you going to support a non-equi join using MapReduce?
As I understand it, the only way to do this is to bring all the data to
one reducer, so that the reducer can compare lesser/greater values correctly.
Correct me if I am wrong.
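
To make the single-reducer idea concrete, here is a minimal sketch in plain Java (no Hadoop types). The class and method names are hypothetical, and it assumes both inputs fit in the memory of that one reducer:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: what a single reducer could do once it holds both inputs.
class ThetaJoinSketch {
    // Emits every (l, r) pair with l < r using a nested-loop comparison.
    static List<int[]> lessThanJoin(List<Integer> left, List<Integer> right) {
        List<int[]> out = new ArrayList<>();
        for (int l : left) {
            for (int r : right) {
                if (l < r) {
                    out.add(new int[] {l, r});
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> pairs = lessThanJoin(Arrays.asList(1, 5), Arrays.asList(3, 7));
        for (int[] p : pairs) {
            System.out.println(p[0] + " < " + p[1]);
        }
    }
}
```

The nested loop compares every row of one input with every row of the other, which is why all the data has to be visible in one place.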
Thank You.

Regards,
Vikas



On Wed, Apr 10, 2013 at 4:22 PM, Michel Segel <michael_segel@hotmail.com> wrote:

> Can you show an example of your join?
> All joins are an equality in that the key has to match.
> Whether it's a one to one, one to many, or many to many remains to be seen.
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Apr 9, 2013, at 10:35 AM, Effyroth Gu <effyroth@gmail.com> wrote:
>
> Only equality joins, outer joins, and left semi joins are supported in
> Hive. Hive does not support join conditions that are not equality
> conditions as it is very difficult to express such conditions as a
> map/reduce job. Also, more than two tables can be joined in Hive.
>
>
> 2013/4/9 Michael Segel <michael_segel@hotmail.com>
>
>> Hi,
>>
>> Your cross join is supported in both pig and hive. (Cross, and Theta
>> joins)
>>
>> So there must be code to do this.
>>
>> Essentially in the reducer you would have your key and then the set of
>> rows that match the key. You would then perform the cross product on the
>> key's result set and output them to the collector as separate rows.
>>
>> I'm not sure why you would need the reduce context.
>>
>> But then again, I'm still on my first cup of coffee. ;-)
>>
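
A minimal sketch of the reduce-side cross product described above, in plain Java without Hadoop types. The "A|" / "B|" tags marking the source table are an assumed mapper convention, the names are hypothetical, and both sides for a key are buffered in memory:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the work done for one key inside reduce().
class ReduceSideCrossSketch {
    // Each value is assumed to be tagged "A|" or "B|" by the mapper.
    static List<String> crossForKey(String key, Iterable<String> taggedValues) {
        List<String> fromA = new ArrayList<>();
        List<String> fromB = new ArrayList<>();
        // Split the key's result set by source table.
        for (String v : taggedValues) {
            if (v.startsWith("A|")) {
                fromA.add(v.substring(2));
            } else if (v.startsWith("B|")) {
                fromB.add(v.substring(2));
            }
        }
        // Emit one output row per (A row, B row) combination.
        List<String> out = new ArrayList<>();
        for (String a : fromA) {
            for (String b : fromB) {
                out.add(key + "\t" + a + "\t" + b);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows =
            crossForKey("k1", Arrays.asList("A|a1", "B|b1", "A|a2", "B|b2"));
        for (String row : rows) {
            System.out.println(row);
        }
    }
}
```

In a real reducer the output rows would go to the collector instead of a list.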
>>
>> On Apr 9, 2013, at 12:15 AM, Vikas Jadhav <vikascjadhav87@gmail.com>
>> wrote:
>>
>> Hi
>> I am also working on joins using MapReduce.
>> I think instead of finding the position of the table in RawKeyValueIterator,
>> what we can do is modify the context.write method to always write the key as
>> the table name or id.
>> Then we don't need to find the position; we can get the key and value from
>> "reducerContext".
>>
>> Before calling reducer.run(reducerContext) in ReduceTask.java, we can add a
>> join method to the Reducer class in Reducer.java and call
>> reducer.join(reduceContext).
>>
>>
>> I just wonder how you are going to support a non-equi join.
>>
>> I also have the same problem: how to do the join if the datasets can't fit
>> into memory.
>>
>>
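>> For now I am cloning using the following code:
>>
>>
>> // Needs: import java.util.ArrayList;
>> //        import org.apache.hadoop.util.ReflectionUtils;
>>
>> // Copy the current key, because Hadoop reuses the same key object.
>> KEYIN key = context.getCurrentKey();
>> KEYIN outKey = null;
>> try {
>>     outKey = (KEYIN) key.getClass().newInstance();
>> } catch (Exception e) {
>>     throw new RuntimeException("Cannot instantiate key class", e);
>> }
>> ReflectionUtils.copy(context.getConfiguration(), key, outKey);
>>
>> // Copy every value too, since the iterator also reuses one value object.
>> Iterable<VALUEIN> values = context.getValues();
>> ArrayList<VALUEIN> myValues = new ArrayList<VALUEIN>();
>> for (VALUEIN value : values) {
>>     VALUEIN outValue = null;
>>     try {
>>         outValue = (VALUEIN) value.getClass().newInstance();
>>     } catch (Exception e) {
>>         throw new RuntimeException("Cannot instantiate value class", e);
>>     }
>>     ReflectionUtils.copy(context.getConfiguration(), value, outValue);
>>     myValues.add(outValue); // keep the copy so it can be read again later
>> }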
>>
>>
>> If you have found any other solution, please feel free to share.
>>
>> Thank You.
>>
>>
>>
>>
>>
>>
>> On Thu, Mar 14, 2013 at 1:53 PM, Roth Effy <effyroth@gmail.com> wrote:
>>
>>> In reduce() we have:
>>>
>>> key1 values1
>>> key2 values2
>>> ...
>>> keyn valuesn
>>>
>>> So what I want to do is join all the values, like this SQL:
>>>
>>> select * from values1,values2...valuesn;
>>>
>>> If memory is not enough to cache the values, how can I complete the join
>>> operation?
>>> My idea is to clone the ReduceContext, but that may not be easy.
>>>
>>> Any help will be appreciated.
>>>
>>>
>>> 2013/3/13 Roth Effy <effyroth@gmail.com>
>>>
>>>> I want an n:n join as a Cartesian product, but DataJoinReducerBase looks
>>>> like it only supports equal joins.
>>>> I want a non-equal join, but I have no idea how to do it yet.
>>>>
>>>>
>>>> 2013/3/13 Azuryy Yu <azuryyyu@gmail.com>
>>>>
>>>>> Do you want an n:n join or a 1:n join?
>>>>> On Mar 13, 2013 10:51 AM, "Roth Effy" <effyroth@gmail.com> wrote:
>>>>>
>>>>>> I want to join the data of two tables in the reducer, so I need to find
>>>>>> the start of the table.
>>>>>> Someone said DataJoinReducerBase can help me. Is that right?
>>>>>>
>>>>>>
>>>>>> 2013/3/13 Azuryy Yu <azuryyyu@gmail.com>
>>>>>>
>>>>>>> You cannot use RecordReader in a Reducer.
>>>>>>>
>>>>>>> What do you mean by getting the record position? I cannot
>>>>>>> understand; can you give a simple example?
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 13, 2013 at 9:56 AM, Roth Effy <effyroth@gmail.com> wrote:
>>>>>>>
>>>>>>>> Sorry, I still can't understand how to use RecordReader in
>>>>>>>> reduce(), because the input is a RawKeyValueIterator in the class
>>>>>>>> ReduceContext. So I'm confused.
>>>>>>>> Anyway, thank you.
>>>>>>>>
>>>>>>>>
>>>>>>>> 2013/3/12 samir das mohapatra <samir.helpdoc@gmail.com>
>>>>>>>>
>>>>>>>>> Through the RecordReader and FileStatus you can get it.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Mar 12, 2013 at 4:08 PM, Roth Effy <effyroth@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi, everyone,
>>>>>>>>>> I want to join the k-v pairs in Reduce(), but how to get the
>>>>>>>>>> record position?
>>>>>>>>>> What I have thought of so far is to save the context status, but
>>>>>>>>>> class Context doesn't implement a clone (copy-construct) method.
>>>>>>>>>>
>>>>>>>>>> Any help will be appreciated.
>>>>>>>>>> Thank you very much.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>
>>
>> --
>> Thanx and Regards
>> Vikas Jadhav
>>
>>
>>
>


-- 
Thanx and Regards
Vikas Jadhav
