incubator-hcatalog-user mailing list archives

From Rajesh Balamohan <rajesh.balamo...@gmail.com>
Subject Re: HCatalog + PIG filter issue
Date Wed, 16 May 2012 23:05:56 GMT
Thanks a lot, David. I will try with 0.4.1.
On May 16, 2012 9:23 PM, "David Capwell" <dcapwell@gmail.com> wrote:

> Not at a computer right now, so I can't check JIRA, but this should be fixed
> in HCatalog 0.4.1.
>
> You should be able to compile trunk or branch 4. I live off trunk, and I
> remember this being fixed a while ago.
> On May 16, 2012 6:37 AM, "Rajesh Balamohan" <rajesh.balamohan@gmail.com>
> wrote:
>
>> Thanks for the reply David.
>>
>> I tried with pig 0.9.3 as well. It had the same issue.
>>
>> Would 0.4.1 fix this?
>> On May 16, 2012 6:57 PM, "David Capwell" <dcapwell@gmail.com> wrote:
>>
>>> If I remember correctly, upgrading to Pig 0.9.3 fixes this. Or it's fixed
>>> in HCatalog 0.4.1. Can't remember which. Try Pig first, since 0.4.1 isn't out.
>>> On May 15, 2012 10:53 PM, "Rajesh Balamohan" <rajesh.balamohan@gmail.com>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I am currently using the following setup. In certain scenarios the filter
>>>> condition is not applied and the job ends up scanning the entire data set.
>>>> A sample is given below.
>>>>
>>>>
>>>> Pig 0.9.0
>>>> HCatalog 0.4.0
>>>> Hadoop 0.20.20x
>>>>
>>>> dim_referrer = LOAD 'tableA' USING org.apache.hcatalog.pig.HCatLoader();
>>>> source_data = LOAD 'tableB' USING org.apache.hcatalog.pig.HCatLoader();
>>>> source_data_new = FILTER source_data BY d == '20120415';
>>>> joined_data_referrer = JOIN source_data_new BY referrer LEFT OUTER,
>>>>     dim_referrer BY referrer_url USING 'skewed';
>>>> DUMP joined_data_referrer;
>>>>
>>>> In this case, all records are scanned and the filtering is not applied
>>>> by HCatalog.
>>>>
>>>> Shouldn't it apply the filter first and then do the sampling M/R job
>>>> required for "skewed" join?
>>>>
>>>> Is this a known issue? Any pointers would be of great help.
>>>>
>>>>
>>>>
>>>> --
>>>> ~Rajesh.B
>>>>
>>>
