incubator-hcatalog-user mailing list archives

From Alan Gates <ga...@hortonworks.com>
Subject Re: HCatalog scans all partition even after mentioning date filter
Date Mon, 23 Apr 2012 20:43:57 GMT
What version of HCatalog are you using? How do you know it is scanning all the partitions?
Does it say so in the logs, or are you getting all the records back?

And yes, HCat is supposed to do partition pruning so that it only scans the required partitions.
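
As a rough illustration of what pruning means here, the sketch below is a toy model in Python, not HCatalog's actual implementation. The partition values are taken from the "show partitions" output quoted below; the function names prune and scan_all are hypothetical and exist only for this example.

```python
# Toy model of partition pruning. This is an illustration only and does not
# reflect how HCatalog or Pig implement it internally.

# Partition key values, copied from the "show partitions" output in the
# quoted message.
PARTITIONS = [
    "20111215", "20111216", "20111217", "20111218", "20111219", "20111220",
    "20111221", "20111222", "20111223", "20111224", "20111225", "20120415",
]


def prune(partitions, predicate):
    """Return only the partition directories whose key satisfies the predicate."""
    return [f"d={value}" for value in partitions if predicate(value)]


def scan_all(partitions):
    """Without pruning, every partition directory is read and filtered row by row."""
    return [f"d={value}" for value in partitions]


# Equivalent in spirit to: filter a by d == '20120415'
to_scan = prune(PARTITIONS, lambda d: d == "20120415")
print(to_scan)
```

With pruning, only the single d=20120415 directory is selected for reading; without it, all twelve directories are read and the filter is applied to every record afterwards.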

Alan.

On Apr 21, 2012, at 8:27 PM, Rajesh Balamohan wrote:

> Hi All,
> 
> I have an HCatalog table "partitioned by (d string)".
> 
> I have a couple of days' worth of data, and when I run "show partitions" it provides the correct data.
> 
> d=20111215
> d=20111216
> d=20111217
> d=20111218
> d=20111219
> d=20111220
> d=20111221
> d=20111222
> d=20111223
> d=20111224
> d=20111225
> d=20120415
> 
> However, when I run Pig with "filter a by d == '20120415'", it ends up scanning all the data.
> 
> Is this a known bug/enhancement in HCatalog? Ideally, shouldn't it scan only the d=20120415 directory?
> 
> Any pointers would be of great help.
> 
> 
> -- 
> ~Rajesh.B

