hive-user mailing list archives

From Margus Roo <mar...@roo.ee>
Subject Re: Create external table
Date Thu, 12 May 2016 07:41:52 GMT
Now I got closer and discovered that my problem is related to permissions.

For example:

drwxr-xr-x   - margusja  hdfs          0 2016-05-12 03:33 /tmp/files_10k

...

-rw-r--r--   3 margusja hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1959.txt
-rw-r--r--   3 margusja hdfs          4 2016-05-12 02:01 /tmp/files_10k/f196.txt
-rw-r--r--   3 margusja hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1960.txt
-rw-r--r--   3 margusja hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1961.txt
-rw-r--r--   3 margusja hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1962.txt
-rw-r--r--   3 margusja hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1963.txt
-rw-r--r--   3 margusja hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1964.txt
-rw-r--r--   3 margusja hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1965.txt
-rw-r--r--   3 margusja hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1966.txt

...

Connected to: Apache Hive (version 1.2.1.2.3.4.0-3485)
Driver: Hive JDBC (version 1.2.1.2.3.4.0-3485)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://bigdata29.webmedia.int:10000/> create external table files_10k (i int) row format delimited fields terminated by '\t' location '/tmp/files_10k';
No rows affected (3.184 seconds)
0: jdbc:hive2://bigdata29.webmedia.int:10000/>


Now I change the owner to flume, for example.

drwxr-xr-x   - flume     hdfs          0 2016-05-12 03:33 /tmp/files_10k

...

-rw-r--r--   3 flume hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1968.txt
-rw-r--r--   3 flume hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1969.txt
-rw-r--r--   3 flume hdfs          4 2016-05-12 02:01 /tmp/files_10k/f197.txt
-rw-r--r--   3 flume hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1970.txt
-rw-r--r--   3 flume hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1971.txt
-rw-r--r--   3 flume hdfs          5 2016-05-12 02:01 /tmp/files_10k/f1972.txt

...

Others can read. For example, user margusja can read:

[margusja@bigdata29 ~]$ hdfs dfs -ls /tmp/files_10k
Found 1112 items
-rw-r--r--   3 flume hdfs          2 2016-05-12 01:59 /tmp/files_10k/f1.txt
-rw-r--r--   3 flume hdfs          3 2016-05-12 01:59 /tmp/files_10k/f10.txt
-rw-r--r--   3 flume hdfs          4 2016-05-12 01:59 /tmp/files_10k/f100.txt
-rw-r--r--   3 flume hdfs          5 2016-05-12 01:59 /tmp/files_10k/f1000.txt
-rw-r--r--   3 flume hdfs          6 2016-05-12 01:59 /tmp/files_10k/f10000.txt
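
For readers following along, here is a minimal Python sketch of how the mode column in a listing like the one above maps to read access under the HDFS owner/group/other model. The helper names `parse_ls_line` and `can_read` are hypothetical, and ACLs, superuser status, and Ranger policies are deliberately ignored, so this only approximates the check the NameNode performs.

```python
import re

# One `hdfs dfs -ls` entry: mode, replication, owner, group, size, date, time, path.
LS_LINE = re.compile(
    r"^(?P<mode>[d-][rwxsStT-]{9})\s+(?P<repl>\S+)\s+(?P<owner>\S+)\s+(?P<group>\S+)"
    r"\s+(?P<size>\d+)\s+\S+\s+\S+\s+(?P<path>\S+)$"
)


def parse_ls_line(line):
    """Split a single (unwrapped) listing line into its named fields."""
    match = LS_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a listing line: %r" % line)
    return match.groupdict()


def can_read(entry, user, groups=()):
    """Pick the permission class that applies to `user` and test its read bit.

    Assumption: plain permission bits only (no ACLs, no superuser shortcut).
    """
    mode = entry["mode"]
    if user == entry["owner"]:
        return mode[1] == "r"   # owner class
    if entry["group"] in groups:
        return mode[4] == "r"   # group class
    return mode[7] == "r"       # other class


entry = parse_ls_line(
    "-rw-r--r--   3 flume hdfs          2 2016-05-12 01:59 /tmp/files_10k/f1.txt"
)
print(entry["owner"], can_read(entry, "margusja"))
```

With the `-rw-r--r--` mode shown in the listing, a user who is neither the owner nor in the group still gets read access through the "other" class, which matches the successful `hdfs dfs -ls` as margusja.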

Now I try to create the table:

0: jdbc:hive2://bigdata29.webmedia.int:10000/> create external table files_10k (i int) row format delimited fields terminated by '\t' location '/tmp/files_10k';
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [margusja] does not have [READ] privilege on [hdfs://mycluster/tmp/files_10k] (state=42000,code=40000)
0: jdbc:hive2://bigdata29.webmedia.int:10000/>

In Hiveserver2.log:

2016-05-12 03:38:58,111 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command: 
create external table files_10k (i int) row format delimited fields 
terminated by '\t' location '/tmp/files_10k'
2016-05-12 03:38:58,112 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
parse.ParseDriver (ParseDriver.java:parse(209)) - Parse Completed
2016-05-12 03:38:58,112 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
log.PerfLogger (PerfLogger.java:PerfLogEnd(162)) - </PERFLOG 
method=parse start=1463038738111 end=1463038738112 duration=1 
from=org.apache.hadoop.hive.ql.Driver>
2016-05-12 03:38:58,112 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG 
method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
2016-05-12 03:38:58,112 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
parse.CalcitePlanner (SemanticAnalyzer.java:analyzeInternal(10114)) - 
Starting Semantic Analysis
2016-05-12 03:38:58,113 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
parse.CalcitePlanner (SemanticAnalyzer.java:analyzeCreateTable(10776)) - 
Creating table default.files_10k position=22
2016-05-12 03:38:58,113 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
metastore.HiveMetaStore (HiveMetaStore.java:logInfo(747)) - 1: 
get_database: default
2016-05-12 03:38:58,113 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(372)) - 
ugi=hive/bigdata29.webmedia.int@TESTHADOOP.COM ip=unknown-ip-addr      
cmd=get_database: default
2016-05-12 03:38:58,118 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
ql.Driver (Driver.java:compile(466)) - Semantic Analysis Completed
2016-05-12 03:38:58,118 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
log.PerfLogger (PerfLogger.java:PerfLogEnd(162)) - </PERFLOG 
method=semanticAnalyze start=1463038738112 end=1463038738118 duration=6 
from=org.apache.hadoop.hive.ql.Driver>
2016-05-12 03:38:58,118 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
ql.Driver (Driver.java:getSchema(246)) - Returning Hive schema: 
Schema(fieldSchemas:null, properties:null)
2016-05-12 03:38:58,118 INFO  [HiveServer2-Handler-Pool: Thread-69]: 
log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG 
method=doAuthorization from=org.apache.hadoop.hive.ql.Driver>
2016-05-12 03:39:00,148 INFO 
[org.apache.hadoop.util.JvmPauseMonitor$Monitor@53bb71e5]: 
util.JvmPauseMonitor (JvmPauseMonitor.java:run(195)) - Detected pause in 
JVM or host machine (eg GC): pause of approximately 1916ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=2002ms
2016-05-12 03:39:01,733 INFO 
[org.apache.hadoop.util.JvmPauseMonitor$Monitor@53bb71e5]: 
util.JvmPauseMonitor (JvmPauseMonitor.java:run(195)) - Detected pause in 
JVM or host machine (eg GC): pause of approximately 1081ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=1455ms
2016-05-12 03:39:20,984 ERROR [HiveServer2-Handler-Pool: Thread-69]: 
authorizer.RangerHiveAuthorizer 
(RangerHiveAuthorizer.java:isURIAccessAllowed(755)) - Error getting 
permissions for hdfs://mycluster/tmp/files_10k
java.io.IOException: Couldn't create proxy provider class 
org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider


I am confused. What extra rights does Hive expect?
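
One possible reading, offered only as a guess from the frame names: the stack trace quoted further down shows `RangerHiveAuthorizer.isURIAccessAllowed` calling `FileUtils.isOwnerOfFileHierarchy`, which suggests the authorizer may walk every entry under the table location and compare owners, rather than only testing the read bits. The Python sketch below illustrates that kind of recursive ownership check. It is not the Hive or Ranger code; the `Node` type and the function name are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for one file or directory under the location."""
    path: str
    owner: str
    children: List["Node"] = field(default_factory=list)


def is_owner_of_hierarchy(node: Node, user: str) -> bool:
    """Return True only if `user` owns this entry and everything below it."""
    if node.owner != user:
        return False
    return all(is_owner_of_hierarchy(child, user) for child in node.children)


def build(owner: str) -> Node:
    """Build a tiny tree shaped like /tmp/files_10k with a few files."""
    files = [Node("/tmp/files_10k/f%d.txt" % i, owner) for i in (1, 10, 100)]
    return Node("/tmp/files_10k", owner, files)


owned_by_margusja = build("margusja")
owned_by_flume = build("flume")

print(is_owner_of_hierarchy(owned_by_margusja, "margusja"))  # True
print(is_owner_of_hierarchy(owned_by_flume, "margusja"))     # False
```

If the real check behaves like this, it would be consistent with the statement succeeding while margusja owned the directory and failing after the change to flume, and also with the cost growing with the number of files under the location. Both points remain assumptions until confirmed against the Hive and Ranger sources.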


Margus (margusja) Roo
http://margus.roo.ee
skype: margusja
+372 51 48 780

On 11/05/16 14:17, Margus Roo wrote:
>
> One more example:
>
> [hdfs@hadoopnn1 ~]$ hdfs dfs -count -h /user/margusja/files_10k/
>            1        9.8 K             47.7 K /user/margusja/files_10k
> [hdfs@hadoopnn1 ~]$ hdfs dfs -count -h /datasource/dealgate/
>           53        7.9 K              8.5 G /datasource/dealgate
>
> 2: jdbc:hive2://hadoopnn1.estpak.ee:10000/def> create external table 
> files_10k (i int) row format delimited fields terminated by '\t' 
> location '/user/margusja/files_10k';
> No rows affected (0.197 seconds)
> 2: jdbc:hive2://hadoopnn1.estpak.ee:10000/def> drop table files_10k;
> No rows affected (0.078 seconds)
> 2: jdbc:hive2://hadoopnn1.estpak.ee:10000/def> create external table 
> files_10k (i int) row format delimited fields terminated by '\t' 
> location '/datasource/dealgate';
> Error: org.apache.thrift.transport.TTransportException 
> (state=08S01,code=0)
> 2: jdbc:hive2://hadoopnn1.estpak.ee:10000/def>
>
>
> So from my point of view, beeline for some reason looks at the data and 
> the old hive client does not.
>
> Margus (margusja) Roo
> http://margus.roo.ee
> skype: margusja
> +372 51 48 780
> On 11/05/16 13:35, Margus Roo wrote:
>>
>> More information:
>>
>> 2016-05-11 13:31:17,086 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: parse.ParseDriver (ParseDriver.java:parse(185)) - 
>> Parsing command: create external table files_10k (i int) row format 
>> delimited fields terminated by '\t' location '/user/margusja/files_10k'
>> 2016-05-11 13:31:17,089 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: parse.ParseDriver (ParseDriver.java:parse(209)) - Parse 
>> Completed
>> 2016-05-11 13:31:17,089 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: log.PerfLogger (PerfLogger.java:PerfLogEnd(162)) - 
>> </PERFLOG method=parse start=1462962677086 end=1462962677089 
>> duration=3 from=org.apache.hadoop.hive.ql.Driver>
>> 2016-05-11 13:31:17,089 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - 
>> <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
>> 2016-05-11 13:31:17,090 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: parse.CalcitePlanner 
>> (SemanticAnalyzer.java:analyzeInternal(10114)) - Starting Semantic 
>> Analysis
>> 2016-05-11 13:31:17,093 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: parse.CalcitePlanner 
>> (SemanticAnalyzer.java:analyzeCreateTable(10776)) - Creating table 
>> default.files_10k position=22
>> 2016-05-11 13:31:17,094 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: metastore.HiveMetaStore 
>> (HiveMetaStore.java:logInfo(747)) - 2: get_database: default
>> 2016-05-11 13:31:17,094 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: HiveMetaStore.audit 
>> (HiveMetaStore.java:logAuditEvent(372)) - 
>> ugi=hive/hadoopnn1.estpak.ee@TESTHADOOP.COM ip=unknown-ip-addr      
>> cmd=get_database: default
>> 2016-05-11 13:31:17,098 WARN  [HiveServer2-Handler-Pool: 
>> Thread-5867]: security.UserGroupInformation 
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available 
>> for user hive
>> 2016-05-11 13:31:17,098 WARN  [HiveServer2-Handler-Pool: 
>> Thread-5867]: security.UserGroupInformation 
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available 
>> for user hive
>> 2016-05-11 13:31:17,099 WARN  [HiveServer2-Handler-Pool: 
>> Thread-5867]: security.UserGroupInformation 
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available 
>> for user hive
>> 2016-05-11 13:31:17,099 WARN  [HiveServer2-Handler-Pool: 
>> Thread-5867]: security.UserGroupInformation 
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available 
>> for user hive
>> 2016-05-11 13:31:17,099 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: metadata.HiveUtils 
>> (HiveUtils.java:getMetaStoreAuthorizeProviderManagers(353)) - Adding 
>> metastore authorization provider: 
>> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
>> 2016-05-11 13:31:17,102 WARN  [HiveServer2-Handler-Pool: 
>> Thread-5867]: security.UserGroupInformation 
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available 
>> for user hive
>> 2016-05-11 13:31:17,102 WARN  [HiveServer2-Handler-Pool: 
>> Thread-5867]: security.UserGroupInformation 
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available 
>> for user hive
>> 2016-05-11 13:31:17,106 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: ql.Driver (Driver.java:compile(466)) - Semantic 
>> Analysis Completed
>> 2016-05-11 13:31:17,106 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: log.PerfLogger (PerfLogger.java:PerfLogEnd(162)) - 
>> </PERFLOG method=semanticAnalyze start=1462962677089 
>> end=1462962677106 duration=17 from=org.apache.hadoop.hive.ql.Driver>
>> 2016-05-11 13:31:17,106 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: ql.Driver (Driver.java:getSchema(246)) - Returning Hive 
>> schema: Schema(fieldSchemas:null, properties:null)
>> 2016-05-11 13:31:17,106 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - 
>> <PERFLOG method=doAuthorization from=org.apache.hadoop.hive.ql.Driver>
>> 2016-05-11 13:31:17,107 WARN  [HiveServer2-Handler-Pool: 
>> Thread-5867]: security.UserGroupInformation 
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available 
>> for user margusja
>> 2016-05-11 13:31:18,289 INFO 
>> [org.apache.hadoop.util.JvmPauseMonitor$Monitor@59f45950]: 
>> util.JvmPauseMonitor (JvmPauseMonitor.java:run(195)) - Detected pause 
>> in JVM or host machine (eg GC): pause of approximately 1092ms
>> 2016-05-11 13:31:29,547 INFO  [HiveServer2-Handler-Pool: 
>> Thread-5867]: retry.RetryInvocationHandler 
>> (RetryInvocationHandler.java:invoke(144)) - Exception while invoking 
>> getListing of class ClientNamenodeProtocolTranslatorPB over 
>> hadoopnn1.estpak.ee/88.196.164.42:8020. Trying to fail over immediately.
>> java.io.IOException: com.google.protobuf.ServiceException: 
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>         at 
>> org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47)
>>         at 
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:580)
>>         at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
>>         at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:497)
>>         at 
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
>>         at 
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>>         at com.sun.proxy.$Proxy16.getListing(Unknown Source)
>>         at 
>> org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2094)
>>         at 
>> org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2077)
>>         at 
>> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:832)
>>         at 
>> org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:106)
>>         at 
>> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:863)
>>         at 
>> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:859)
>>         at 
>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>         at 
>> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:859)
>>         at 
>> org.apache.hadoop.hive.common.FileUtils.isOwnerOfFileHierarchy(FileUtils.java:481)
>>         at 
>> org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.isURIAccessAllowed(RangerHiveAuthorizer.java:749)
>>         at 
>> org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.checkPrivileges(RangerHiveAuthorizer.java:252)
>>         at 
>> org.apache.hadoop.hive.ql.Driver.doAuthorizationV2(Driver.java:817)
>>         at 
>> org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:608)
>>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:499)
>>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:314)
>>         at 
>> org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1164)
>>         at 
>> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1158)
>>         at 
>> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
>>         at 
>> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
>>         at 
>> org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
>>         at 
>> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:410)
>>         at 
>> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:397)
>>         at 
>> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:274)
>>         at 
>> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
>>         at 
>> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
>>         at 
>> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
>>         at 
>> org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>>         at 
>> org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>>         at 
>> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
>>         at 
>> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>>         at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>         at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>         at java.lang.Thread.run(Thread.java:745)
>> Caused by: com.google.protobuf.ServiceException: 
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>         at 
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:271)
>>         at com.sun.proxy.$Proxy15.getListing(Unknown Source)
>>         at 
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:573)
>>         ... 39 more
>>
>> I have HDFS namenode high availability configured and automatic 
>> failover enabled. I can see that the active namenode does not change 
>> during the table creation process.
>>
>> I also have Hive high availability configured.
>>
>> Margus (margusja) Roo
>> http://margus.roo.ee
>> skype: margusja
>> +372 51 48 780
>> On 11/05/16 12:26, Margus Roo wrote:
>>>
>>> Sadly in our environment:
>>>
>>>
>>> Generated files like you did.
>>>
>>> Connected to: Apache Hive (version 1.2.1.2.3.4.0-3485)
>>> Driver: Hive JDBC (version 1.2.1.2.3.4.0-3485)
>>> Transaction isolation: TRANSACTION_REPEATABLE_READ
>>> 0: jdbc:hive2://hadoopnn1.estpak.ee:2181,hado> create external table 
>>> files_10k (i int) row format delimited fields terminated by '\t' 
>>> location '/user/margusja/files_10k';
>>> Error: Shutdown in progress, cannot remove a shutdownHook 
>>> (state=,code=0)
>>> 0: jdbc:hive2://hadoopnn1.estpak.ee:2181,hado>
>>>
>>> Using just hive:
>>>
>>> [margusja@hadoopnn1 ~]$ hive
>>> WARNING: Use "yarn jar" to launch YARN applications.
>>> log4j:WARN No such property [maxBackupIndex] in 
>>> org.apache.log4j.DailyRollingFileAppender.
>>>
>>> Logging initialized using configuration in 
>>> file:/etc/hive/2.3.4.0-3485/0/hive-log4j.properties
>>> hive> create external table files_10k (i int) row format delimited 
>>> fields terminated by '\t' location '/user/margusja/files_10k';
>>> OK
>>> Time taken: 1.255 seconds
>>> hive>
>>>
>>>
>>> Margus (margusja) Roo
>>> http://margus.roo.ee
>>> skype: margusja
>>> +372 51 48 780
>>> On 11/05/16 10:16, Markovitz, Dudu wrote:
>>>> create external table files_10k (i int) row format delimited fields 
>>>> terminated by '\t' location '/tmp/files_10k';
>>>
>>
>

