hive-user mailing list archives

From Mafish Liu <maf...@gmail.com>
Subject Re: help!
Date Thu, 28 Jan 2010 07:32:23 GMT
You can avoid moving the data by creating an external table, for example:

CREATE EXTERNAL TABLE collect_info (
  id string,
  t1 string,
  t2 string,
  t3 string,
  t4 string,
  t5 string,
  collector string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
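
As written, the statement above has no LOCATION clause, so Hive would create the table's directory under the warehouse path and the files would still have to be loaded into it. Below is a minimal sketch of the same idea with LOCATION pointing at the directory from the original LOAD DATA command. The table name collect_info_ext is hypothetical, chosen only to avoid clashing with an existing collect_info; the column list is copied unchanged from above.

```sql
-- Sketch only: collect_info_ext is a hypothetical name.
-- The LOCATION path is the HDFS directory from the quoted LOAD DATA command.
CREATE EXTERNAL TABLE collect_info_ext (
  id string,
  t1 string,
  t2 string,
  t3 string,
  t4 string,
  t5 string,
  collector string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/group/taobao/taobao/dw/stb/20100125/collect_info/';

-- Quick check that Hive can read the files in place.
SELECT id, collector FROM collect_info_ext LIMIT 10;
```

With this form Hive reads the files where they are, and dropping the external table removes only the metadata, not the files. The user running Hive still needs at least read access to that directory.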


2010/1/28 Fu Ecy <fuzhijie1985@gmail.com>:
> <property>
>   <name>hive.metastore.warehouse.dir</name>
>   <value>/group/tbdev/kunlun/henshao/hive/</value>
>   <description>location of default database for the warehouse</description>
> </property>
>
> <property>
>   <name>hive.exec.scratchdir</name>
>   <value>/group/tbdev/kunlun/henshao/hive/temp</value>
>   <description>Scratch space for Hive jobs</description>
> </property>
>
> [kunlun@gate2 ~]$ hive --config config/ -u root -p root
> Hive history file=/tmp/kunlun/hive_job_log_kunlun_201001281514_422659187.txt
> hive> create table pokes (foo int, bar string);
> OK
> Time taken: 0.825 seconds
>
> Yes, I have permission for Hive's warehouse directory and tmp
> directory.
>
> 2010/1/28 김영우 <warwithin@gmail.com>
>>
>> Hi Fu,
>>
>> Your query seems correct, but I think it's a problem related to HDFS
>> permissions.
>> Did you set the right permissions for Hive's warehouse directory and tmp
>> directory?
>> It seems user 'kunlun' does not have WRITE permission for the Hive
>> warehouse directory.
>>
>> Youngwoo
>>
>> 2010/1/28 Fu Ecy <fuzhijie1985@gmail.com>
>>>
>>> 2010-01-27 12:58:22,182 ERROR ql.Driver
>>> (SessionState.java:printError(303)) - FAILED: Parse Error: line 2:10
>>> cannot recognize input ',' in column type
>>>
>>> org.apache.hadoop.hive.ql.parse.ParseException: line 2:10 cannot
>>> recognize input ',' in column type
>>>
>>>         at
>>> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:357)
>>>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:249)
>>>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:290)
>>>         at
>>> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:163)
>>>         at
>>> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:221)
>>>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:335)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>         at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>>>         at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>         at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>>>
>>> 2010-01-27 12:58:40,394 ERROR hive.log
>>> (MetaStoreUtils.java:logAndThrowMetaException(570)) - Got exception:
>>> org.apache.hadoop.security.AccessControlException
>>> org.apache.hadoop.security.AccessControlException: Permission denied:
>>> user=kunlun, access=WRITE, inode="user":hadoop:cug-admin:rwxr-xr-x
>>> 2010-01-27 12:58:40,395 ERROR hive.log
>>> (MetaStoreUtils.java:logAndThrowMetaException(571)) -
>>> org.apache.hadoop.security.AccessControlException:
>>> org.apache.hadoop.security.AccessControlException:
>>> Permission denied: user=kunlun, access=WRITE,
>>> inode="user":hadoop:cug-admin:rwxr-xr-x
>>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>>> Method)
>>>         at
>>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>>         at
>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>>         at
>>> java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>>         at
>>> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
>>>         at
>>> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
>>>         at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:831)
>>>         at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:257)
>>>         at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1118)
>>>         at
>>> org.apache.hadoop.hive.metastore.Warehouse.mkdirs(Warehouse.java:123)
>>>         at
>>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:505)
>>>         at
>>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:256)
>>>         at
>>> org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:254)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:883)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:105)
>>>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:388)
>>>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:294)
>>>         at
>>> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:163)
>>>         at
>>> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:221)
>>>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:335)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>         at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>>>         at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>         at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>>> Caused by: org.apache.hadoop.ipc.RemoteException:
>>> org.apache.hadoop.security.AccessControlException: Permission denied:
>>> user=kunlun, access=WRITE, inode="user":hadoop:cug-admin:rwxr-xr-x
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:176)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:157)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.checkPermission(PermissionChecker.java:105)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4400)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4370)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:1771)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:1740)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.mkdirs(NameNode.java:471)
>>>         at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
>>>         at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
>>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
>>>
>>>         at org.apache.hadoop.ipc.Client.call(Client.java:697)
>>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>         at $Proxy4.mkdirs(Unknown Source)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>         at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>>>         at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>>>         at $Proxy4.mkdirs(Unknown Source)
>>>         at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:829)
>>>         ... 22 more
>>>
>>> Is there any problem with the input data format?
>>>
>>> CREATE TABLE collect_info (
>>>   id string,
>>>   t1 string,
>>>   t2 string,
>>>   t3 string,
>>>   t4 string,
>>>   t5 string,
>>>   collector string)
>>> ROW FORMAT DELIMITED
>>> FIELDS TERMINATED BY '\t'
>>> STORED AS TEXTFILE;
>>>
>>> 5290086045      330952255       1       2010-01-26 02:41:27
>>> 0       196050201       2010-01-26 02:41:27     2010-01-26 02:41:27
>>> qijansher93771          0       1048
>>>
>>> Fields are separated by '\t'. I want to get the fields marked in red.
>>>
>>> 2010/1/28 Eric Sammer <eric@lifeless.net>
>>>>
>>>> On 1/27/10 10:59 PM, Fu Ecy wrote:
>>>> > I want to load some files on HDFS into a Hive table, but there is
>>>> > an exception as follows:
>>>> > hive> load data inpath
>>>> > '/group/taobao/taobao/dw/stb/20100125/collect_info/*' into table
>>>> > collect_info;
>>>> > Loading data to table collect_info
>>>> > Failed with exception addFiles: error while moving files!!!
>>>> > FAILED: Execution Error, return code 1 from
>>>> > org.apache.hadoop.hive.ql.exec.MoveTask
>>>> >
>>>> > But, when I download the files from HDFS to local machine, then load
>>>> > them into the table, it works.
>>>> > Data in '/group/taobao/taobao/dw/stb/20100125/collect_info/*' is a
>>>> > little more than 200GB.
>>>> >
>>>> > I need to use Hive to compute some statistics.
>>>> > much thanks :-)
>>>>
>>>> The size of the files shouldn't really matter (move operations affect
>>>> metadata only - the blocks aren't rewritten or anything like that).
>>>> Check in your Hive log files (by default in /tmp/<user>/hive.log on
>>>> the local machine you run Hive on, I believe) and you should see a
>>>> stack
>>>> trace with additional information.
>>>>
>>>> Regards.
>>>> --
>>>> Eric Sammer
>>>> eric@lifeless.net
>>>> http://esammer.blogspot.com
>>>
>>
>
>



-- 
Mafish@gmail.com
