hive-user mailing list archives

From Suraj Nayak <snay...@gmail.com>
Subject Re: Reading 2 table data in MapReduce for Performing Join
Date Fri, 27 Mar 2015 06:00:59 GMT
This is solved. I declared the Mapper input key type as Writable instead of
LongWritable or NullWritable.
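
A minimal sketch of why the key type matters, assuming (as the stack trace in the quoted message suggests) that the ORC reader hands the mapper a NullWritable key while the TEXTFILE reader hands it a LongWritable. The types below are hypothetical stand-ins written for illustration, not the real Hadoop classes, so the example runs without Hadoop on the classpath. KeyTypeSketch, StrictMapper, LenientMapper and feed are invented names.

```java
public class KeyTypeSketch {
    // Hypothetical stand-ins for the Hadoop types named in this thread (NOT the real classes).
    public interface Writable {}

    public static final class LongWritable implements Writable {
        public final long value;
        public LongWritable(long value) { this.value = value; }
    }

    public static final class NullWritable implements Writable {
        public static final NullWritable INSTANCE = new NullWritable();
        private NullWritable() {}
    }

    // Simplified mapper: generics are erased, so the caller only sees map(Object, Object).
    public abstract static class Mapper<K, V> {
        public abstract String map(K key, V value);
    }

    // Declares LongWritable as the key, so the compiler-generated bridge method casts to it.
    public static final class StrictMapper extends Mapper<LongWritable, String> {
        @Override
        public String map(LongWritable key, String value) { return "strict:" + value; }
    }

    // Declares the common supertype, so any Writable key passes the cast.
    public static final class LenientMapper extends Mapper<Writable, String> {
        @Override
        public String map(Writable key, String value) { return "lenient:" + value; }
    }

    // Plays the role of the framework: calls the mapper through its raw type.
    @SuppressWarnings({"rawtypes", "unchecked"})
    public static String feed(Mapper mapper, Writable key, String value) {
        try {
            return (String) mapper.map(key, value);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(feed(new StrictMapper(), new LongWritable(0L), "row"));
        System.out.println(feed(new StrictMapper(), NullWritable.INSTANCE, "row"));
        System.out.println(feed(new LenientMapper(), NullWritable.INSTANCE, "row"));
    }
}
```

With a key type that both readers' keys implement, one mapper class can serve both tables, at the cost of not being able to use the key as a LongWritable without an explicit check.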

Thanks
Suraj Nayak
On 19-Mar-2015 9:48 PM, "Suraj Nayak" <snayakm@gmail.com> wrote:

> Is this related to https://issues.apache.org/jira/browse/HIVE-4329 ? Is
> there a workaround?
>
> On Thu, Mar 19, 2015 at 9:47 PM, Suraj Nayak <snayakm@gmail.com> wrote:
>
>> Hi All,
>>
>> I successfully integrated HCatMultipleInputs, using the patch, for tables
>> created as TEXTFILE. However, I get an error when I read a table created
>> as ORC. The error is below:
>>
>> 15/03/19 10:51:32 INFO mapreduce.Job: Task Id : attempt_1425012118520_9756_m_000000_0, Status : FAILED
>> Error: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable
>>     at com.abccompany.mapreduce.MyMapper.map(MyMapper.java:15)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>>
>>
>> Can anyone help?
>>
>> Thanks in advance!
>>
>> On Wed, Mar 18, 2015 at 11:00 PM, Suraj Nayak <snayakm@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> The patch from https://issues.apache.org/jira/browse/HIVE-4997 helped!
>>>
>>>
>>> On Tue, Mar 17, 2015 at 1:05 AM, Suraj Nayak <snayakm@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I tried reading data via HCatalog for 1 Hive table in MapReduce using
>>>> something similar to
>>>> https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput#HCatalogInputOutput-RunningMapReducewithHCatalog.
>>>> I was able to read successfully.
>>>>
>>>> Now I am trying to read 2 tables, as the requirement is to join them.
>>>> I did not find an API similar to *FileInputFormat.addInputPaths* in
>>>> *HCatInputFormat*. What is the equivalent in HCatalog?
>>>>
>>>> I had previously performed a join using FileInputFormat on HDFS (by
>>>> getting the split information in the mapper). This article
>>>> (http://www.codingjunkie.com/mapreduce-reduce-joins/) helped me code the
>>>> join. Can someone suggest how I can perform a join operation using
>>>> HCatalog?
>>>>
>>>> Briefly, the aim is to
>>>>
>>>>    - Read 2 tables (nearly identical schemas)
>>>>    - If a key exists in both tables, send its records to the same reducer.
>>>>    - Do some processing on the records in reducer.
>>>>    - Save the output into file/Hive table.
>>>>
>>>> *P.S.: The reason for using MapReduce to perform the join is a complex
>>>> requirement that can't be solved via Hive/Pig directly.*
>>>>
>>>> Any help will be greatly appreciated :)
>>>>
>>>> --
>>>> Thanks
>>>> Suraj Nayak M
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks
>>> Suraj Nayak M
>>>
>>
>>
>>
>> --
>> Thanks
>> Suraj Nayak M
>>
>
>
>
> --
> Thanks
> Suraj Nayak M
>
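
The reduce-side join outlined in the original question (read two tables, route records that share a key to the same reducer, process them there) can be sketched in plain Java as below. This is an in-memory illustration under stated assumptions, not HCatalog or Hadoop code: the rows are assumed to be "key,payload" strings, a TreeMap stands in for the shuffle, and ReduceSideJoinSketch, Tagged, map, reduce and join are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceSideJoinSketch {
    // A row tagged with the table it came from, as a mapper would emit it.
    public static final class Tagged {
        final String table;
        final String payload;
        Tagged(String table, String payload) { this.table = table; this.payload = payload; }
    }

    // "Map" phase: split each "key,payload" row (assumed format) and group it under its join key.
    static void map(String table, List<String> rows, Map<String, List<Tagged>> shuffle) {
        for (String row : rows) {
            String[] parts = row.split(",", 2);
            shuffle.computeIfAbsent(parts[0], k -> new ArrayList<>())
                   .add(new Tagged(table, parts[1]));
        }
    }

    // "Reduce" phase: for every key, pair each row from table A with each row from table B.
    static List<String> reduce(Map<String, List<Tagged>> shuffle) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<Tagged>> entry : shuffle.entrySet()) {
            List<String> left = new ArrayList<>();
            List<String> right = new ArrayList<>();
            for (Tagged t : entry.getValue()) {
                if (t.table.equals("A")) { left.add(t.payload); } else { right.add(t.payload); }
            }
            // Keys present in only one table produce no pairs (inner join).
            for (String l : left) {
                for (String r : right) {
                    out.add(entry.getKey() + "," + l + "," + r);
                }
            }
        }
        return out;
    }

    public static List<String> join(List<String> tableA, List<String> tableB) {
        Map<String, List<Tagged>> shuffle = new TreeMap<>(); // stand-in for the shuffle/sort
        map("A", tableA, shuffle);
        map("B", tableB, shuffle);
        return reduce(shuffle);
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("1,alice", "2,bob");
        List<String> b = Arrays.asList("2,paris", "3,rome");
        System.out.println(join(a, b)); // prints [2,bob,paris]
    }
}
```

In a real job the tag would travel in the map output value (or a composite key), and the per-key pairing would happen inside the reducer.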
