spark-dev mailing list archives

From "james.green9@baesystems.com" <james.gre...@baesystems.com>
Subject RE: spark hivethriftserver problem on 1.5.0 -> 1.6.0 upgrade
Date Wed, 03 Feb 2016 12:17:35 GMT
I have a workaround for this issue, which is to go back to single-session mode for the Thrift
server:

conf.set("spark.sql.hive.thriftServer.singleSession", "true")

This seems to mean that temp tables can be registered in 1.6.0 with a remote metastore.
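For context, here is a minimal sketch of how the setting might be combined with the startup code quoted further down in this thread. It assumes Spark 1.6.0 with the hive and hive-thriftserver modules on the classpath. The object name, the buildConf helper and the application name are made up for illustration; the parquet path is the one from my original message.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

// Hypothetical wrapper object, not part of the original application.
object SingleSessionThriftServer {

  // Build the SparkConf with the single-session workaround applied.
  def buildConf(): SparkConf =
    new SparkConf()
      .setAppName("thrift-server-single-session") // assumed name
      .set("spark.sql.hive.thriftServer.singleSession", "true")

  def main(args: Array[String]): Unit = {
    val conf = buildConf()
    val hiveContext = new HiveContext(SparkContext.getOrCreate(conf))

    // Register the temp table on the same HiveContext that is
    // handed to the thrift server below.
    val thing = hiveContext.read.parquet(
      "hdfs://dkclusterm1.imp.net:8020/user/jegreen1/ex208")
    thing.registerTempTable("thing")

    HiveThriftServer2.startWithContext(hiveContext)
  }
}
```

The setting has to be on the conf before the SparkContext is created, which is why it is applied in buildConf rather than after getOrCreate.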

Cheers

James



From: Yin Huai [mailto:yhuai@databricks.com]
Sent: 26 January 2016 17:48
To: Green, James (UK Guildford)
Cc: dev@spark.apache.org
Subject: Re: spark hivethriftserver problem on 1.5.0 -> 1.6.0 upgrade

Can you post more logs, especially the lines around "Initializing execution hive ..." (this is
for an internally used fake metastore, which is Derby) and "Initializing HiveMetastoreConnection
version ..." (this is for the real metastore, which should be your remote one)? Also, temp
tables are stored in memory and are associated with a HiveContext. If you cannot see
temp tables, it usually means that the HiveContext you used with JDBC was different from
the one used to create the temp table. However, in your case, you are using HiveThriftServer2.startWithContext(hiveContext).
So it would be good to provide more logs and see what happened.

Thanks,

Yin

On Tue, Jan 26, 2016 at 1:33 AM, james.green9@baesystems.com wrote:
Hi

I posted this on the user list yesterday. I am posting it here now because, on further investigation,
I am pretty sure this is a bug:


On upgrading from 1.5.0 to 1.6.0 I have a problem with HiveThriftServer2. I have this code:

val hiveContext = new HiveContext(SparkContext.getOrCreate(conf));

val thing = hiveContext.read.parquet("hdfs://dkclusterm1.imp.net:8020/user/jegreen1/ex208")

thing.registerTempTable("thing")

HiveThriftServer2.startWithContext(hiveContext)


When I start things up on the cluster my hive-site.xml is found – I can see that the metastore
connects:


INFO  metastore - Trying to connect to metastore with URI thrift://dkclusterm2.imp.net:9083
INFO  metastore - Connected to metastore.


But then later on the thrift server seems not to connect to the remote hive metastore but
to start a derby instance instead:

INFO  AbstractService - Service:CLIService is started.
INFO  ObjectStore - ObjectStore, initialize called
INFO  Query - Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
INFO  MetaStoreDirectSql - Using direct SQL, underlying DB is DERBY
INFO  ObjectStore - Initialized ObjectStore
INFO  HiveMetaStore - 0: get_databases: default
INFO  audit - ugi=jegreen1      ip=unknown-ip-addr      cmd=get_databases: default
INFO  HiveMetaStore - 0: Shutting down the object store...
INFO  audit - ugi=jegreen1      ip=unknown-ip-addr      cmd=Shutting down the object store...
INFO  HiveMetaStore - 0: Metastore shutdown complete.
INFO  audit - ugi=jegreen1      ip=unknown-ip-addr      cmd=Metastore shutdown complete.
INFO  AbstractService - Service:ThriftBinaryCLIService is started.
INFO  AbstractService - Service:HiveServer2 is started.

On 1.5.0 the same bit of the log reads:

INFO  AbstractService - Service:CLIService is started.
INFO  metastore - Trying to connect to metastore with URI thrift://dkclusterm2.imp.net:9083
     ******* i.e. 1.5.0 connects to the remote Hive metastore
INFO  metastore - Connected to metastore.
INFO  AbstractService - Service:ThriftBinaryCLIService is started.
INFO  AbstractService - Service:HiveServer2 is started.
INFO  ThriftCLIService - Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads



So if I connect to this with JDBC I can see all the tables on the hive server – but not
anything temporary – I guess they are going to derby.

I see someone on the databricks website is also having this problem.


Thanks

James
Please consider the environment before printing this email. This message should be regarded
as confidential. If you have received this email in error please notify the sender and destroy
it immediately. Statements of intent shall only become binding when confirmed in hard copy
by an authorised signatory. The contents of this email may relate to dealings with other companies
under the control of BAE Systems Applied Intelligence Limited, details of which can be found
at http://www.baesystems.com/Businesses/index.htm.