hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Szehon Ho <sze...@cloudera.com>
Subject Re: FW: HiveServer2 VS HiveServer1 Logging
Date Wed, 11 Jun 2014 20:14:18 GMT
Sorry I missed the last part mentioning that it messes up logs of "show
tables" after "set".  That's strange, I tried on latest trunk and I don't
see that happening, show tables still shows the "perf" logs.


On Wed, Jun 11, 2014 at 1:06 PM, Szehon Ho <szehon@cloudera.com> wrote:

> I think that's expected.  SQL Operations like "show tables" will reach
> Driver, which has perf and detailed logs about execution.
>
> Other operations like "set" or "add" are not SQL Operations, so in HS2
> they don't hit the Driver and don't generate the logs.  They are pretty
> simple ops that just set some state.  Did those show in HS1?  If so, maybe
> the implementation changed.
>
> Thanks
> Szehon
>
>
> On Wed, Jun 11, 2014 at 4:40 AM, Dima Machlin <Dima.Machlin@pursway.com>
> wrote:
>
>> Any change somebody has a clue about this?
>>
>> From: Dima Machlin [mailto:Dima.Machlin@pursway.com]
>> Sent: Sunday, May 25, 2014 1:54 PM
>> To: user@hive.apache.org
>> Subject: RE: HiveServer2 VS HiveServer1 Logging
>>
>> I’ve made some progress in investigating this.
>> It seems that this behavior happens on certain conditions.
>>
>> As long as i’m running any query that isn’t “set” or “add” command the
>> logging is fine.
>> For example “show table” :
>>
>> 14/05/25 13:47:17 INFO cli.CLIService: SessionHandle
>> [2db07453-2235-4f22-ab72-4a27c1b1457d]: openSession()
>> 14/05/25 13:47:17 INFO cli.CLIService: SessionHandle
>> [2db07453-2235-4f22-ab72-4a27c1b1457d]: getInfo()
>> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=Driver.run>
>> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=TimeToSubmit>
>> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=compile>
>> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=parse>
>> 14/05/25 13:47:18 INFO parse.ParseDriver: Parsing command: show tables
>> 14/05/25 13:47:18 INFO parse.ParseDriver: Parse Completed
>> 14/05/25 13:47:18 INFO ql.Driver: </PERFLOG method=parse
>> start=1401014838047 end=1401014838376 duration=329>
>> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=semanticAnalyze>
>> 14/05/25 13:47:18 INFO ql.Driver: Semantic Analysis Completed
>> 14/05/25 13:47:18 INFO ql.Driver: </PERFLOG method=semanticAnalyze
>> start=1401014838376 end=1401014838453 duration=77>
>> 14/05/25 13:47:18 INFO exec.ListSinkOperator: Initializing Self 0 OP
>> 14/05/25 13:47:18 INFO exec.ListSinkOperator: Operator 0 OP initialized
>> 14/05/25 13:47:18 INFO exec.ListSinkOperator: Initialization Done 0 OP
>> 14/05/25 13:47:18 INFO ql.Driver: Returning Hive schema:
>> Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from
>> deserializer)], properties:null)
>> 14/05/25 13:47:18 INFO ql.Driver: </PERFLOG method=compile
>> start=1401014838011 end=1401014838521 duration=510>
>> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=Driver.execute>
>> 14/05/25 13:47:18 INFO ql.Driver: Starting command: show tables
>> 14/05/25 13:47:18 INFO ql.Driver: </PERFLOG method=TimeToSubmit
>> start=1401014838011 end=1401014838531 duration=520>
>> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=runTasks>
>> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=task.DDL.Stage-0>
>> 14/05/25 13:47:18 INFO hive.metastore: Trying to connect to metastore
>> with URI thrift://localhost:9083
>> 14/05/25 13:47:18 INFO hive.metastore: Waiting 1 seconds before next
>> connection attempt.
>> 14/05/25 13:47:19 INFO hive.metastore: Connected to metastore.
>> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=task.DDL.Stage-0
>> start=1401014838531 end=1401014839627 duration=1096>
>> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=runTasks
>> start=1401014838531 end=1401014839627 duration=1096>
>> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=Driver.execute
>> start=1401014838521 end=1401014839627 duration=1106>
>> OK
>> 14/05/25 13:47:19 INFO ql.Driver: OK
>> 14/05/25 13:47:19 INFO ql.Driver: <PERFLOG method=releaseLocks>
>> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=releaseLocks
>> start=1401014839627 end=1401014839627 duration=0>
>> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=Driver.run
>> start=1401014838011 end=1401014839627 duration=1616>
>> 14/05/25 13:47:19 INFO cli.CLIService: SessionHandle
>> [2db07453-2235-4f22-ab72-4a27c1b1457d]: executeStatement()
>> 14/05/25 13:47:19 INFO cli.CLIService: OperationHandle
>> [opType=EXECUTE_STATEMENT,
>> getHandleIdentifier()=0628b8f8-01de-4397-8279-a314cf553d7f]:
>> getResultSetMetadata()
>> 14/05/25 13:47:19 WARN snappy.LoadSnappy: Snappy native library not loaded
>> 14/05/25 13:47:19 INFO mapred.FileInputFormat: Total input paths to
>> process : 1
>> 14/05/25 13:47:19 INFO cli.CLIService: OperationHandle
>> [opType=EXECUTE_STATEMENT,
>> getHandleIdentifier()=0628b8f8-01de-4397-8279-a314cf553d7f]: fetchResults()
>> 14/05/25 13:47:19 INFO cli.CLIService: OperationHandle
>> [opType=EXECUTE_STATEMENT,
>> getHandleIdentifier()=0628b8f8-01de-4397-8279-a314cf553d7f]: fetchResults()
>> 14/05/25 13:47:19 INFO exec.ListSinkOperator: 0 finished. closing...
>> 14/05/25 13:47:19 INFO exec.ListSinkOperator: 0 forwarded 0 rows
>> 14/05/25 13:47:19 INFO ql.Driver: <PERFLOG method=releaseLocks>
>> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=releaseLocks
>> start=1401014839857 end=1401014839857 duration=0>
>> 14/05/25 13:47:19 INFO cli.CLIService: OperationHandle
>> [opType=EXECUTE_STATEMENT,
>> getHandleIdentifier()=0628b8f8-01de-4397-8279-a314cf553d7f]: closeOperation
>>
>>
>> Now running : “set hive.enforce.bucketing = true;”
>>
>> 14/05/25 13:48:07 INFO operation.Operation: Putting temp output to file
>> /tmp/hadoop/2db07453-2235-4f22-ab72-4a27c1b1457d2566159976359370628.pipeout
>> 14/05/25 13:48:07 INFO cli.CLIService: SessionHandle
>> [2db07453-2235-4f22-ab72-4a27c1b1457d]: executeStatement()
>> 14/05/25 13:48:07 INFO cli.CLIService: OperationHandle
>> [opType=EXECUTE_STATEMENT,
>> getHandleIdentifier()=7b13a3e2-e0ea-4dae-b693-0d456519fc66]:
>> getOperationStatus()
>>
>> First thing that happens is : “Putting temp output to file” and from now
>> on, nothing is shown in the console.
>>
>> Running again “show tables” generates no logging at all in the console.
>>
>> The hiveserver is started like this :
>> /opt/mapr/hive/hive-0.12/bin/hive --service hiveserver2 --hiveconf
>> hive.server2.thrift.port=26006
>>
>> The connection is made from JDBC.
>>
>> Executing the same procedure from the CLI generates all the expected
>> logging.
>>
>>
>> From: Dima Machlin [mailto:Dima.Machlin@pursway.com]
>> Sent: Wednesday, May 21, 2014 5:06 PM
>> To: user@hive.apache.org<mailto:user@hive.apache.org>
>> Subject: HiveServer2 VS HiveServer1 Logging
>>
>>
>> Hi all,
>>
>> It seems that for some reason HS2 outputs far less logging than HS1 in
>> hive 0.12 for example, starting HS1 in the following way : hive --service
>> hiveserver
>> and executing show tables produces this :
>>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO service.HiveServer: Running the
>> query: show tables
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO exec.ListSinkOperator: 1
>> finished. closing...
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO exec.ListSinkOperator: 1
>> forwarded 0 rows
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG
>> method=Driver.run>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG
>> method=TimeToSubmit>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG
>> method=compile>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG method=parse>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO parse.ParseDriver: Parsing
>> command: show tables
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO parse.ParseDriver: Parse
>> Completed
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG
>> method=parse start=1398867256379 end=1398867256380 duration=1>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG
>> method=semanticAnalyze>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: Semantic Analysis
>> Completed
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG
>> method=semanticAnalyze start=1398867256380 end=1398867256381 duration=1>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO exec.ListSinkOperator:
>> Initializing Self 2 OP
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO exec.ListSinkOperator: Operator
>> 2 OP initialized
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO exec.ListSinkOperator:
>> Initialization Done 2 OP
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: Returning Hive
>> schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string,
>> comment:from deserializer)], properties:null)
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG
>> method=compile start=1398867256379 end=1398867256384 duration=5>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG
>> method=Driver.execute>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: Starting command:
>> show tables
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG
>> method=TimeToSubmit start=1398867256378 end=1398867256384 duration=6>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG
>> method=runTasks>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG
>> method=task.DDL.Stage-0>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG
>> method=task.DDL.Stage-0 start=1398867256384 end=1398867256398 duration=14>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG
>> method=runTasks start=1398867256384 end=1398867256398 duration=14>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG
>> method=Driver.execute start=1398867256384 end=1398867256398 duration=14> OK
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: OK
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG
>> method=releaseLocks>
>> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG
>> method=releaseLocks start=1398867256398 end=1398867256398 duration=0>
>>
>> while when running on HS2 (hive --service hiveserver2), the only logging
>> data is from the initialization of the connection:
>>
>> 14/04/30 17:18:54 [pool-2-thread-1] INFO cli.CLIService: SessionHandle
>> [4b0a96cb-626b-45ae-ae0d-6a959e84e5a4]: openSession()
>> 14/04/30 17:18:55 [pool-2-thread-1] INFO operation.Operation: Putting
>> temp output to file
>> /tmp/hadoop/4b0a96cb-626b-45ae-ae0d-6a959e84e5a44568051681296760332.pipeout
>> 14/04/30 17:18:55 [pool-2-thread-1] INFO cli.CLIService: SessionHandle
>> [4b0a96cb-626b-45ae-ae0d-6a959e84e5a4]: executeStatement()
>> 14/04/30 17:18:55 [pool-2-thread-1] INFO cli.CLIService: OperationHandle
>> [opType=EXECUTE_STATEMENT,
>> getHandleIdentifier()=01a12804-8f77-4e8f-b17a-de4c39e481c8]:
>> getOperationStatus()
>>
>> And no sign for the executed query.
>> The show tables is just an example, it also doesn’t show any queries
>> (both mapreduce and direct)
>>
>> Settings :
>> hive.root.logger in hive-log4j.properties is set to INFO,console
>>
>> Any reason for this?
>>
>>
>>
>>
>>
>>
>> ************************************************************************************
>> This footnote confirms that this email message has been scanned by
>> PineApp Mail-SeCure for the presence of malicious code, vandals &
>> computer viruses.
>>
>> ************************************************************************************
>>
>>
>>
>>
>>
>> ************************************************************************************
>> This footnote confirms that this email message has been scanned by
>> PineApp Mail-SeCure for the presence of malicious code, vandals &
>> computer viruses.
>>
>> ************************************************************************************
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message