incubator-hcatalog-user mailing list archives

From Sushanth Sowmyan <khorg...@gmail.com>
Subject Re: HCataStorer performance
Date Fri, 01 Jun 2012 20:36:46 GMT
Hi Rajesh,

I'm afraid we haven't done a performance analysis for a while now (the
last time we did so was around the HCat 0.1 timeframe).

What we noticed at the time was that I/O was always the biggest
bottleneck, so factors like the underlying storage format (say, RCFile
versus text) and whether compression was enabled were the relevant
performance predictors. A 4x slowdown is not expected.

What data sizes are you looking at, and are there any other variables
between your HCatStorer() and PigStorage() cases?

Thanks,
-Sushanth

On Tue, May 22, 2012 at 5:06 PM, Rajesh Balamohan
<rajesh.balamohan@gmail.com> wrote:
> Hi All,
>
> Currently I am using HCat 0.4 & Pig 0.9.3.
>
> While running jobs, I observed that HCatStorer() is a lot slower than
> PigStorage(). (approximately 4x)
>
> Is this a known issue? Any pointers would be of great help.
>
> --
> ~Rajesh.B
