hive-issues mailing list archives

From "Aihua Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader
Date Mon, 15 Jun 2015 14:13:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586087#comment-14586087 ]

Aihua Xu commented on HIVE-10754:
---------------------------------

[~mithun] Sorry for the late reply; I was busy with something else. This seems to be a
Hadoop-version-related issue.

Would it be fair to update all the calls in HCatalog to use the new getInstance(), since the
old way of creating a Job is deprecated anyway? If you agree, I will use this JIRA to do that
and update the title to reflect it.

> Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-10754
>                 URL: https://issues.apache.org/jira/browse/HIVE-10754
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HCatalog
>    Affects Versions: 1.2.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>         Attachments: HIVE-10754.patch
>
>
> {noformat}
> CREATE TABLE tbl1 (key STRING, value STRING) STORED AS RCFILE;
> CREATE TABLE tbl2 (key STRING, value STRING);
> INSERT INTO tbl1 VALUES ('1', '111');
> INSERT INTO tbl2 VALUES ('1', '2');
> {noformat}
> Pig script:
> {noformat}
> src_tbl1 = FILTER tbl1 BY (key == '1');
> prj_tbl1 = FOREACH src_tbl1 GENERATE
>            key as tbl1_key,
>            value as tbl1_value,
>            '333' as tbl1_v1;
>            
> src_tbl2 = FILTER tbl2 BY (key == '1');
> prj_tbl2 = FOREACH src_tbl2 GENERATE
>            key as tbl2_key,
>            value as tbl2_value;
>            
> dump prj_tbl1;
> dump prj_tbl2;
> result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
> prj_result = FOREACH result 
>       GENERATE  prj_tbl1::tbl1_key AS key1,
>                 prj_tbl1::tbl1_value AS value1,
>                 prj_tbl1::tbl1_v1 AS v1,
>                 prj_tbl2::tbl2_key AS key2,
>                 prj_tbl2::tbl2_value AS value2;
>                
> dump prj_result;
> {noformat}
> The expected result is (1,111,333,1,2), but the actual result is (1,2,333,1,2). We need to
> clone the job instance in HCatLoader.
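
The symptom above (a value from tbl2 showing up where a value from tbl1 was expected) looks
like what happens when two loaders write their settings into one shared, mutable job object.
The following is only a minimal sketch of that general pattern in plain Java, not the
HCatalog or Hadoop code: FakeJob, configure(), run() and the "input.table" key are all
hypothetical names made up for illustration, and the assumption that shared state is the
cause is inferred from the report rather than confirmed by it.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedJobSketch {
    /** Hypothetical stand-in for a job object that carries mutable settings. */
    static final class FakeJob {
        private final Map<String, String> conf = new HashMap<>();

        FakeJob() {}

        /** Copy constructor: the "clone" step, giving the copy its own settings map. */
        FakeJob(FakeJob other) {
            conf.putAll(other.conf);
        }

        void set(String key, String value) { conf.put(key, value); }

        String get(String key) { return conf.get(key); }
    }

    /** Hypothetical loader setup: records the table name on the job it is handed. */
    static FakeJob configure(FakeJob job, String table, boolean cloneFirst) {
        // Without cloning, every loader mutates the same object.
        FakeJob target = cloneFirst ? new FakeJob(job) : job;
        target.set("input.table", table);
        return target;
    }

    /** Configures two loaders from one job and reports the table each one ends up seeing. */
    static String[] run(boolean cloneFirst) {
        FakeJob shared = new FakeJob();
        FakeJob first = configure(shared, "tbl1", cloneFirst);
        FakeJob second = configure(shared, "tbl2", cloneFirst);
        return new String[] { first.get("input.table"), second.get("input.table") };
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", run(false))); // prints tbl2,tbl2
        System.out.println(String.join(",", run(true)));  // prints tbl1,tbl2
    }
}
```

In the sketch, the run without cloning leaves both loaders pointing at tbl2 because the second
call overwrites the first one's setting; cloning before mutating keeps the two loaders
independent.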



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
