hive-user mailing list archives

From Malligarjunan S <malligarju...@gmail.com>
Subject Re: Hive UDF performance issue
Date Fri, 11 Jul 2014 02:23:48 GMT
Hello Edward,

Thank you very much for the update.
What size do you mean by a small table? In our case, the small table will have
a minimum of 1 million records.
Can we use this UDTF? How much of a time improvement would there be?

Appreciate your help!
Thanks and Regards
SankarS


On Thu, Jul 10, 2014 at 11:26 PM, Edward Capriolo <edlinuxguru@gmail.com>
wrote:

> There is no magic. Hopefully one table is smaller than the other. You
> could make a UDTF to do something like what this MR job is doing:
>
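> Make a mapper that runs over table A:
> FileInputFormat.setInputPaths(job, new Path("/path/to/table/a"));
>
> Then inside the mapper:
>
> // imports: org.apache.hadoop.mapred.*, org.apache.hadoop.fs.*,
> //          org.apache.hadoop.io.Text, java.io.*
> private JobConf conf;
> public void configure(JobConf conf) {
>   this.conf = conf;
> }
> public void map(Text key, Text value, OutputCollector<Text, Text> out,
>     Reporter reporter) throws IOException {
>   // Re-read table B for every row of table A and emit each pairing.
>   FileSystem fs = FileSystem.get(conf);
>   try (BufferedReader in = new BufferedReader(
>       new InputStreamReader(fs.open(new Path("/path/to/table/b"))))) {
>     String line;
>     while ((line = in.readLine()) != null) {
>       out.collect(key, new Text(value + line));
>     }
>   }
> }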
>
>
>
> On Thu, Jul 10, 2014 at 12:56 PM, Malligarjunan S <malligarjunan@gmail.com>
> wrote:
>
>> Hello Edward,
>>
>> Thank you very much for helping me.
>> I am new to Hive. Could you please provide the sample MapReduce job?
>>
>> Regards,
>> Sankar S
>>
>>
>>
>>
>> On Thu, Jul 10, 2014 at 8:19 AM, Edward Capriolo <edlinuxguru@gmail.com>
>> wrote:
>>
>>> Hive cross product stinks. I have a MapReduce job that will do it.
>>>
>>>
>>> On Wednesday, July 9, 2014, Navis류승우 <navis.ryu@nexr.com> wrote:
>>>
>>>> Yes, 2M x 1M makes 2T pairings in a single reducer.
>>>>
>>>> Thanks,
>>>> Navis
>>>>
>>>>
>>>> 2014-07-10 1:50 GMT+09:00 Malligarjunan S <malligarjunan@gmail.com>:
>>>>
>>>>> Hello All,
>>>>> Is it expected behavior for Hive to take so much time?
>>>>>
>>>>>
>>>>> Thanks and Regards,
>>>>> Sankar S
>>>>>
>>>>>
>>>>> On Tue, Jul 8, 2014 at 11:23 PM, Malligarjunan S <
>>>>> malligarjunan@gmail.com> wrote:
>>>>>
>>>>>> Hello All,
>>>>>>
>>>>>> Can anyone help answer my question posted on Stack Overflow?
>>>>>>
>>>>>> http://stackoverflow.com/questions/24416373/hive-udf-performance-too-slow
>>>>>> It is pretty urgent. Please help me.
>>>>>>
>>>>>> Thanks and Regards,
>>>>>> Sankar S.
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> --
>>> Sorry this was sent from mobile. Will do less grammar and spell check
>>> than usual.
>>>
>>
>>
>

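For readers following the thread, the map-side cross product Edward describes in the quoted reply can be sketched in plain Java, without any Hadoop classes. This is only an illustration under assumptions: the class name CrossProductSketch, both method names, and the tab separator are hypothetical and do not come from the thread, and in-memory collections stand in for the two tables. It also reproduces the pair count Navis mentions, which shows that the amount of output stays the same no matter where the join runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; not the job referred to in the thread.
class CrossProductSketch {

    // Number of pairs a full cross product emits; throws on long overflow.
    static long pairCount(long rowsA, long rowsB) {
        return Math.multiplyExact(rowsA, rowsB);
    }

    // Streams table A and, for each of its rows, walks all of table B,
    // handing every pairing to the sink (the analogue of collect()).
    static void crossJoin(Iterable<String> tableA, List<String> tableB,
                          Consumer<String> sink) {
        for (String a : tableA) {
            for (String b : tableB) {
                sink.accept(a + "\t" + b); // tab separator is an assumption
            }
        }
    }

    // Convenience overload that gathers the pairings into a list,
    // suitable only for small inputs.
    static List<String> crossJoin(Iterable<String> tableA, List<String> tableB) {
        List<String> out = new ArrayList<>();
        crossJoin(tableA, tableB, out::add);
        return out;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("a1", "a2");
        List<String> b = Arrays.asList("b1", "b2", "b3");
        crossJoin(a, b, System.out::println);
        // Pair count for the table sizes discussed in the thread.
        System.out.println(pairCount(2_000_000L, 1_000_000L));
    }
}
```

Running the pairing in the mappers spreads the work over many tasks instead of a single reducer, but every one of the pairs still has to be produced, so the expected speedup depends on how many mappers run in parallel.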