hive-user mailing list archives

From Eugene Koifman <ekoif...@hortonworks.com>
Subject Re: hcatalog takes minutes talking to mysql metadata
Date Wed, 28 Aug 2013 18:43:13 GMT
Perhaps HIVE-4914 is relevant.


On Wed, Aug 28, 2013 at 3:11 AM, Michał Czerwiński <michal@qubitproducts.com> wrote:

> It is also worth mentioning that I tried the 0.4.0-cdh4.3.0-SNAPSHOT jars
> (from
> https://repository.cloudera.com/content/groups/public/org/apache/hcatalog/hcatalog-core/)
> and hit exactly the same issue. That could indicate that the problem is
> related to the hive-metastore component itself and the way it interacts
> with the metastore. Thoughts?
>
>
> On 27 August 2013 18:41, Michał Czerwiński <michal@qubitproducts.com> wrote:
>
>> In Pig I am running a query like this:
>>
>> sdp1 = load 'db1.table1' using org.apache.hcatalog.pig.HCatLoader;
>> sdp = FILTER sdp1 BY key1=='value1' AND key2=='value2';
>> ll = LIMIT sdp 100;
>> dump ll;
>>
>> and hcatalog then spends a few minutes querying mysql for metadata.
>> Meanwhile, after a few seconds, pig fails with:
>> org.apache.thrift.transport.TTransportException:
>> java.net.SocketTimeoutException: Read timed out
>>
>> Number of partitions I have:
>> hive -e 'use db1; show partitions table1' |wc -l
>> Time taken: 1.467 seconds
>> 37748
>>
>> When I run the same query in a different environment where I have only
>> ~1000 partitions, everything works fine.
>>
>> Also, the problem does not exist on cdh3 and hcatalog-0.4.0.
>>
>> In hcatalog's logs I can see:
>> (note the timestamps; I ran the query at 17:10:45,216)
>>
>> 2013-08-27 17:10:46,275 INFO  DataNucleus.MetaData
>> (Log4JLogger.java:info(77)) - Listener found initialisation for persistable
>> class org.apache.hadoop.hive.metastore.model.MPartition
>>
>> 2013-08-27 17:14:23,661 DEBUG metastore.ObjectStore
>> (ObjectStore.java:listMPartitionsByFilter(1832)) - Done retrieving all
>> objects for listMPartitionsByFilter
>>
>> 2013-08-27 17:22:32,410 INFO  metastore.ObjectStore
>> (ObjectStore.java:getPartitionsByFilter(1699)) - # parts after pruning =
>> 37748
>>
>> After that, hcatalog carries on until:
>> 2013-08-27 17:30:14,631 DEBUG DataNucleus.Transaction
>> (Log4JLogger.java:debug(58)) - Transaction committed in 462221 ms
>>
>> Please note that I have datanucleus logging set to DEBUG, which slows
>> things down significantly. Without it, hcatalog still takes around
>> 7 minutes to settle.
>>
>> Here are the datanucleus settings from hcatalog's logs:
>>
>>  datanucleus.autoStartMechanismMode = checked
>>  javax.jdo.option.Multithreaded = true
>>  datanucleus.identifierFactory = datanucleus
>>  datanucleus.transactionIsolation = read
>>  datanucleus.validateTables = false
>>  javax.jdo.option.ConnectionURL = jdbc:mysql://XXX
>>  javax.jdo.option.DetachAllOnCommit = true
>>  javax.jdo.option.NonTransactionalRead = true
>>  datanucleus.validateConstraints = false
>>  javax.jdo.option.ConnectionDriverName = com.mysql.jdbc.Driver
>>  javax.jdo.option.ConnectionUserName = hive
>>  datanucleus.validateColumns = false
>>  datanucleus.cache.level2 = false
>>  datanucleus.plugin.pluginRegistryBundleCheck = LOG
>>  datanucleus.cache.level2.type = none
>>  javax.jdo.PersistenceManagerFactoryClass = org.datanucleus.jdo.JDOPersistenceManagerFactory
>>  datanucleus.autoCreateSchema = true
>>  datanucleus.storeManagerType = rdbms
>>  datanucleus.connectionPoolingType = DBCP
>>
>> This runs on CDH4 4.3.0
>> hcatalog version: 0.5.0+9-1.cdh4.3.0.p0.12~precise-cdh4.3.0
>>
>> Ideas?
>>
>
>
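
To put numbers on the log excerpts quoted above, the following is a minimal Python sketch that computes the gap between consecutive log lines. The timestamps, the partition count (37748) and the commit duration (462221 ms) are copied from the quoted message. The event labels and the function names `parse` and `gaps_ms` are only illustrative. The per-partition figure is a rough estimate and is inflated by the DEBUG logging mentioned in the message.

```python
from datetime import datetime, timedelta

# Timestamps copied from the quoted log excerpts; the labels are illustrative.
EVENTS = [
    ("MPartition metadata initialised", "2013-08-27 17:10:46,275"),
    ("listMPartitionsByFilter done", "2013-08-27 17:14:23,661"),
    ("parts after pruning logged", "2013-08-27 17:22:32,410"),
    ("transaction committed", "2013-08-27 17:30:14,631"),
]
PARTITIONS = 37748   # from "# parts after pruning = 37748"
COMMIT_MS = 462221   # from "Transaction committed in 462221 ms"


def parse(ts):
    """Parse a log4j-style timestamp with comma-separated milliseconds."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S,%f")


def gaps_ms(events):
    """Return the gap in whole milliseconds between consecutive events."""
    times = [parse(ts) for _, ts in events]
    one_ms = timedelta(milliseconds=1)
    return [(later - earlier) // one_ms for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    gaps = gaps_ms(EVENTS)
    for (label, _), gap in zip(EVENTS[1:], gaps):
        print(f"{label}: +{gap} ms")
    print(f"total: {sum(gaps)} ms")
    # Rough cost per partition over the reported commit duration.
    print(f"commit time per partition: {COMMIT_MS / PARTITIONS:.2f} ms")
```

The last gap (17:22:32,410 to 17:30:14,631) comes out equal to the reported 462221 ms commit duration, which suggests those two log lines bracket the same transaction. That still leaves roughly 3.6 and 8.1 minutes spent in the two earlier steps.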
