incubator-hcatalog-user mailing list archives

From Travis Crawford <traviscrawf...@gmail.com>
Subject Re: HCatalog Thrift Error
Date Fri, 31 Aug 2012 17:13:51 GMT
On Fri, Aug 31, 2012 at 9:50 AM, agateaaa <agateaaa@gmail.com> wrote:
> Thanks Rekha for looking at this.
>
> Travis, when you encountered the error, did your thrift server crash or
> behave abnormally (become unresponsive)?

The errors I see are when deserializing thrift structs in MapReduce
jobs; records are stored as serialized thrift structs, and the mapper
deserializes+processes each record. The stack trace looks similar to
the one posted previously.
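To make that failure mode concrete, here is a minimal, self-contained sketch (plain java.nio, not thrift itself; the class and method names are illustrative, not thrift's API): a TBinaryProtocol-style reader trusts the 4-byte length prefix off the wire, so corrupt input can decode to a ~2GB "string" size and trigger the allocation behind the OutOfMemoryError in readStringBody, while bounding the size by the bytes actually available (what setReadLength effectively does) turns it into an ordinary corrupt-record error:

```java
import java.nio.ByteBuffer;

// Illustrative sketch (not thrift's API): a binary-protocol reader takes a
// big-endian 4-byte length prefix off the wire and then allocates byte[size].
// On corrupt input the prefix can decode to Integer.MAX_VALUE; allocating
// that blindly is the OutOfMemoryError seen in readStringBody. Bounding the
// size by the bytes actually available rejects the record instead.
public class BoundedRead {

    static byte[] readString(byte[] buf) {
        ByteBuffer in = ByteBuffer.wrap(buf);
        int size = in.getInt();                   // length prefix from the wire
        if (size < 0 || size > in.remaining()) {  // guard: cannot exceed input
            throw new IllegalArgumentException("corrupt length: " + size);
        }
        byte[] out = new byte[size];              // safe: bounded allocation
        in.get(out);
        return out;
    }

    public static void main(String[] args) {
        // Well-formed record: 3-byte payload "abc" behind its length prefix.
        byte[] ok = {0, 0, 0, 3, 'a', 'b', 'c'};
        System.out.println(new String(BoundedRead.readString(ok)));

        // Corrupt record: the prefix decodes to Integer.MAX_VALUE. Without
        // the guard this would be a ~2GB allocation; with it, a clean error.
        byte[] bad = {0x7f, -1, -1, -1, 'a', 'b', 'c'};
        try {
            BoundedRead.readString(bad);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```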

--travis


> Also, what kind of hardware were you using? I was hoping this error was
> because of my puny 1GB/1CPU VM, so I just wanted to check with you.
>
> Thanks,
> A
>
>
>
> On Wed, Aug 29, 2012 at 10:31 AM, Travis Crawford <traviscrawford@gmail.com>
> wrote:
>>
>> Hey thrift & hcat gurus -
>>
>> We've also noticed this OOM issue when processing corrupt thrift
>> messages. We're attempting to work around this issue as follows (see
>> https://github.com/kevinweil/elephant-bird/pull/239/files#L5R45):
>>
>> @Override
>> public void deserialize(TBase base, byte[] bytes) throws TException {
>>     // set upper bound on bytes available so that protocol does not try
>>     // to allocate and read large amounts of data in case of corrupt input
>>     protocol.setReadLength(bytes.length);
>>     super.deserialize(base, bytes);
>> }
>>
>> Would it make sense to setReadLength directly in
>> TDeserializer.deserialize?
>>
>>
>> https://github.com/apache/thrift/blob/trunk/lib/java/src/org/apache/thrift/TDeserializer.java#L60
>>
>> --travis
>>
>>
>>
>>
>> On Wed, Aug 29, 2012 at 4:00 AM, Joshi, Rekha <Rekha_Joshi@intuit.com>
>> wrote:
>> > Thanks for confirming, Agateaa.
>> >
>> > Since the HCat server behaves normally, and you observed the issue in
>> > your log just once, it is less of a concern for me at the moment.
>> > Also not sure if it is CMS-related/environment-related behavior. At some
>> > point I might try to replicate your system, and update you if I face
>> > this too.
>> >
>> > However, cc-ing the thrift dev mailing list as well, as there are some
>> > known libthrift/TBinaryProtocol issues in line with yours -
>> > https://issues.apache.org/jira/browse/THRIFT-1643
>> >
>> > Thanks
>> > Rekha
>> >
>> > From: agateaaa <agateaaa@gmail.com<mailto:agateaaa@gmail.com>>
>> > Reply-To:
>> > <hcatalog-user@incubator.apache.org<mailto:hcatalog-user@incubator.apache.org>>
>> > Date: Tue, 28 Aug 2012 07:50:00 -0700
>> > To:
>> > <hcatalog-user@incubator.apache.org<mailto:hcatalog-user@incubator.apache.org>>
>> > Subject: Re: HCatalog Thrift Error
>> >
>> > Hi Rekha
>> >
>> > Yes, the hcatalog server was up and still running. I can query tables
>> > via pig scripts and also run hive queries. As a matter of fact it's still
>> > running.
>> >
>> > Before I applied a patch for THRIFT-1468 I had seen my server crash
>> > frequently under similar circumstances (OutOfMemory). Since applying the
>> > patch I haven't seen any crashes (just that error once).
>> >
>> > I did take a java heap dump just after I saw the error and did not see
>> > any increase in the heap size. I read in GC tuning docs that if a
>> > full GC is taking too long (98% of total time), the JVM may throw that
>> > OutOfMemoryError - but I am not really sure it applies here (I am
>> > using CMS)
>> >
>> > I can check if I get same error as THRIFT-1205
>> >
>> > Isn't HIVE-2715 the same as fixing THRIFT-1468 (at least in terms of
>> > its resolution)?
>> >
>> > Thanks
>> > A
>> >
>> >
>> >
>> >
>> >
>> > On Tue, Aug 28, 2012 at 2:33 AM, Joshi, Rekha
>> > <Rekha_Joshi@intuit.com<mailto:Rekha_Joshi@intuit.com>> wrote:
>> > Hi Agateaa,
>> >
>> > Impressive bug description.
>> >
>> > Can you confirm the HCat server was up (in spite of the thread dump/GC)
>> > and that, for all practical purposes, commands were executing normally
>> > for a fairly long time after the GC issues were noticed in the log?
>> > Unless there is a self-healing effect built in :-) /a timeout after which
>> > the error is automatically invalid/the system is reset/space is
>> > reclaimed, there must be a way it would have directly impacted the
>> > system, and not just been known because one checks the log.
>> >
>> > I do not have the same patched environment as yours, but would you care
>> > to unpatch Thrift-1468 and then check if your system bug behavior is in sync
>> > with -
>> > https://issues.apache.org/jira/browse/THRIFT-1205
>> > https://issues.apache.org/jira/browse/THRIFT-1468
>> > https://issues.apache.org/jira/browse/HIVE-2715
>> >
>> > Or, especially since you did not enter arbitrary data, can you confirm
>> > whether you get the same error if you do provide arbitrary data?
>> >
>> > Thanks
>> > Rekha
>> >
>> > From: agateaaa <agateaaa@gmail.com<mailto:agateaaa@gmail.com>>
>> > Reply-To:
>> > <hcatalog-user@incubator.apache.org<mailto:hcatalog-user@incubator.apache.org>>
>> > Date: Mon, 27 Aug 2012 10:38:01 -0700
>> > To:
>> > <hcatalog-user@incubator.apache.org<mailto:hcatalog-user@incubator.apache.org>>
>> > Subject: Re: HCatalog Thrift Error
>> >
>> > Correction:
>> >
>> > I have a fairly small server (VM) with 1GB RAM and 1 CPU, and am using
>> > HCatalog 0.4, Hive 0.9 (patched for HIVE-3008) with Thrift 0.7 (patched
>> > for THRIFT-1468)
>> >
>> >
>> > On Mon, Aug 27, 2012 at 10:27 AM, agateaaa
>> > <agateaaa@gmail.com<mailto:agateaaa@gmail.com>> wrote:
>> > Hi,
>> >
>> > I got this error over the weekend in the hcat.err log file.
>> >
>> > Noticed at approximately the same time that a Full GC was happening in
>> > the gc logs.
>> >
>> > Exception in thread "pool-1-thread-200" java.lang.OutOfMemoryError: Java
>> > heap space
>> >         at
>> > org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:353)
>> >         at
>> > org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:215)
>> >         at
>> > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:81)
>> >         at
>> > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
>> >         at
>> > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >         at
>> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >         at java.lang.Thread.run(Thread.java:662)
>> > [The identical OutOfMemoryError stack trace repeats for threads
>> > "pool-1-thread-201", "pool-1-thread-202", and "pool-1-thread-203".]
>> >
>> >
>> > I noticed that the hcatalog server had not shut down, and I don't see
>> > any other abnormality in the logs
>> >
>> >
>> > Searching led me to these two thrift issues
>> > https://issues.apache.org/jira/browse/THRIFT-601
>> > https://issues.apache.org/jira/browse/THRIFT-1205
>> >
>> > Only difference is that in my case the HCatalog server did not crash,
>> > and I wasn't trying to send any arbitrary data to the thrift server at
>> > the telnet port
>> >
>> > I have a fairly small server (VM) with 1GB RAM and 1 CPU, and am using
>> > HCatalog 0.4, Hive 0.9 (patched for HIVE-3008) with Thrift 0.7 (patched
>> > for THRIFT-1438)
>> >
>> > Has anyone seen this before ?
>> >
>> > Thanks
>> > - A
>> >
>> >
>> >
>
>
