incubator-hcatalog-user mailing list archives

From Timothy Potter <thelabd...@gmail.com>
Subject Re: Pig partition filter using operator other than ==
Date Tue, 20 Nov 2012 00:11:28 GMT
Ok, thanks for the tip! Moving to Postgres didn't help - same issue. Trying the
fix from HIVE-2609 ...

On Mon, Nov 19, 2012 at 4:50 PM, Aniket Mokashi <aniket486@gmail.com> wrote:

> There is an easy way to fix this. You need to re-compile the fix suggested
> in HIVE-2609 and jar it up into the datanucleus-rdbms jar along with the
> other class files.
>
> ~Aniket
>
>
> On Mon, Nov 19, 2012 at 12:51 PM, Timothy Potter <thelabdude@gmail.com>wrote:
>
>> Ok, nevermind - looks like a known issue with Hive's data nucleus
>> dependency: https://issues.apache.org/jira/browse/PIG-2339
>>
>> Will move to Postgres!
>>
>>
>> On Mon, Nov 19, 2012 at 1:30 PM, Timothy Potter <thelabdude@gmail.com>wrote:
>>
>>> More to this ... finally tracked down the hive server log and am seeing
>>> this:
>>>
>>> 2012-11-19 19:42:53,700 ERROR server.TThreadPoolServer
>>> (TThreadPoolServer.java:run(182)) - Error occurred during processing of
>>> message.
>>> java.lang.NullPointerException
>>> at
>>> org.datanucleus.store.mapped.mapping.MappingHelper.getMappingIndices(MappingHelper.java:35)
>>> at
>>> org.datanucleus.store.mapped.expression.StatementText.applyParametersToStatement(StatementText.java:194)
>>>  at
>>> org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:233)
>>> at
>>> org.datanucleus.store.rdbms.query.legacy.SQLEvaluator.evaluate(SQLEvaluator.java:115)
>>>  at
>>> org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.performExecute(JDOQLQuery.java:288)
>>> at org.datanucleus.store.query.Query.executeQuery(Query.java:1657)
>>>  at
>>> org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.executeQuery(JDOQLQuery.java:245)
>>> at org.datanucleus.store.query.Query.executeWithMap(Query.java:1526)
>>>  at org.datanucleus.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
>>> at
>>> org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1711)
>>>  at
>>> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1581)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>  at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>  at java.lang.reflect.Method.invoke(Method.java:597)
>>> at
>>> org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
>>>  at $Proxy4.getPartitionsByFilter(Unknown Source)
>>> at
>>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2466)
>>>  at
>>> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.getResult(ThriftHiveMetastore.java:5817)
>>> at
>>> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.getResult(ThriftHiveMetastore.java:5805)
>>>  at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
>>> at
>>> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:115)
>>>  at
>>> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:112)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>>  at javax.security.auth.Subject.doAs(Subject.java:396)
>>> at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>>>  at
>>> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:520)
>>> at
>>> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:123)
>>>  at
>>> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>  at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>> at java.lang.Thread.run(Thread.java:662)
>>>
>>>
>>> On Mon, Nov 19, 2012 at 12:53 PM, Timothy Potter <thelabdude@gmail.com>wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm using HCatalog 0.4.0 with Pig 0.10 and am not having success using
>>>> an operator other than (==) with my partition field.
>>>>
>>>> For example, the following works (day is my partition field):
>>>>
>>>> signals = load 'signals' using org.apache.hcatalog.pig.HCatLoader();
>>>>
>>>> signals_for_day = filter signals by (day == '2012-10-30_1200' ANDservice
>>>> IS NOT NULL);
>>>>
>>>> samp1 = sample signals_for_day 0.01;
>>>>
>>>> dump samp1;
>>>>
>>>>
>>>> but, if I change my filter to: signals_for_day = filter signals by (day
>>>> >= '2012-10-30_1200' AND service IS NOT NULL);
>>>>
>>>> Then I get the following error:
>>>>
>>>> Caused by: java.io.IOException:
>>>> org.apache.thrift.transport.TTransportException
>>>> at
>>>> org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:42)
>>>>  at org.apache.hcatalog.pig.HCatLoader.setLocation(HCatLoader.java:90)
>>>> at
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:380)
>>>>  ... 19 more
>>>> Caused by: org.apache.thrift.transport.TTransportException
>>>> at
>>>> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>>>>  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>> at
>>>> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>>>>  at
>>>> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>>>> at
>>>> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>>>>  at
>>>> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>>>> at
>>>> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partitions_by_filter(ThriftHiveMetastore.java:1511)
>>>>  at
>>>> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions_by_filter(ThriftHiveMetastore.java:1495)
>>>> at
>>>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:691)
>>>>  at
>>>> org.apache.hcatalog.mapreduce.InitializeInput.getSerializedHcatKeyJobInfo(InitializeInput.java:98)
>>>> at
>>>> org.apache.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:73)
>>>>  at
>>>> org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:40)
>>>> ... 21 more
>>>>
>>>> I can start debugging but would like to know if HCatalog is supposed to
>>>> support this type of filtering by partition fields?
>>>>
>>>> Thanks.
>>>> Tim
>>>>
>>>>
>>>
>>
>
>
> --
> "...:::Aniket:::... Quetzalco@tl"
>

Mime
View raw message