hive-user mailing list archives

From Fu Ecy <fuzhijie1...@gmail.com>
Subject Re: help!
Date Thu, 28 Jan 2010 07:48:28 GMT
Many thanks to all :-)

2010/1/28 Zheng Shao <zshao9@gmail.com>

> Please see http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL for
> how to use an "External" table.
> You don't need to "load" into an external table because an external table
> can point directly to your data directory.
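For reference, a minimal sketch of that approach, assuming the data stays in
its current HDFS directory (the table name collect_info_ext is only
illustrative, and the LOCATION simply reuses the directory from the failed
LOAD below):

CREATE EXTERNAL TABLE collect_info_ext (
  id string,
  t1 string,
  t2 string,
  t3 string,
  t4 string,
  t5 string,
  collector string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/group/taobao/taobao/dw/stb/20100125/collect_info/';

With an external table the files are left in place, so no MoveTask is
involved.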
>
> Zheng
>
> On Wed, Jan 27, 2010 at 11:38 PM, Fu Ecy <fuzhijie1985@gmail.com> wrote:
> > hive> CREATE EXTERNAL TABLE collect_info (
> >     >
> >     >  id string,
> >     >  t1 string,
> >     >  t2 string,
> >     >  t3 string,
> >     >  t4 string,
> >     >  t5 string,
> >     >  collector string)
> >     > ROW FORMAT DELIMITED
> >     > FIELDS TERMINATED BY '\t'
> >     > STORED AS TEXTFILE;
> > OK
> > Time taken: 0.234 seconds
> >
> > hive> load data inpath
> > '/group/taobao/taobao/dw/stb/20100125/collect_info/coll_9.collect_info575'
> > overwrite into table collect_info;
> > Loading data to table collect_info
> > Failed with exception replaceFiles: error while moving files!!!
> > FAILED: Execution Error, return code 1 from
> > org.apache.hadoop.hive.ql.exec.MoveTask
> >
> > It doesn't work.
> >
> > 2010/1/28 Fu Ecy <fuzhijie1985@gmail.com>
> >>
> >> I think this is the problem: I don't have write permissions on the
> >> source files/directories. Thank you, Shao :-)
> >>
> >> 2010/1/28 Zheng Shao <zshao9@gmail.com>
> >>>
> >>> When Hive loads data from HDFS, it moves the files instead of copying
> >>> the files.
> >>>
> >>> That means the current user should have write permissions to the
> >>> source files/directories as well.
> >>> Can you check that?
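A quick sketch of that check (the paths just reuse the source directory from
the LOAD above; the chmod is only one possible fix and usually has to be run
by the directory owner or the HDFS superuser):

hadoop fs -ls /group/taobao/taobao/dw/stb/20100125/
hadoop fs -ls /group/taobao/taobao/dw/stb/20100125/collect_info/
# if the loading user lacks write access, one option is to grant group write:
hadoop fs -chmod -R g+w /group/taobao/taobao/dw/stb/20100125/collect_info/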
> >>>
> >>> Zheng
> >>>
> >>> > On Wed, Jan 27, 2010 at 11:18 PM, Fu Ecy <fuzhijie1985@gmail.com> wrote:
> >>> > <property>
> >>> >   <name>hive.metastore.warehouse.dir</name>
> >>> >   <value>/group/tbdev/kunlun/henshao/hive/</value>
> >>> >   <description>location of default database for the
> >>> > warehouse</description>
> >>> > </property>
> >>> >
> >>> > <property>
> >>> >   <name>hive.exec.scratchdir</name>
> >>> >   <value>/group/tbdev/kunlun/henshao/hive/temp</value>
> >>> >   <description>Scratch space for Hive jobs</description>
> >>> > </property>
> >>> >
> >>> > [kunlun@gate2 ~]$ hive --config config/ -u root -p root
> >>> > Hive history
> >>> > file=/tmp/kunlun/hive_job_log_kunlun_201001281514_422659187.txt
> >>> > hive> create table pokes (foo int, bar string);
> >>> > OK
> >>> > Time taken: 0.825 seconds
> >>> >
> >>> > Yes, I have permissions on Hive's warehouse directory and tmp
> >>> > directory.
> >>> >
> >>> > 2010/1/28 김영우 <warwithin@gmail.com>
> >>> >>
> >>> >> Hi Fu,
> >>> >>
> >>> >> Your query seems correct, but I think it's a problem related to HDFS
> >>> >> permissions.
> >>> >> Did you set the right permissions for Hive's warehouse directory and
> >>> >> tmp directory?
> >>> >> It seems user 'kunlun' does not have WRITE permission on the Hive
> >>> >> warehouse directory.
> >>> >>
> >>> >> Youngwoo
> >>> >>
> >>> >> 2010/1/28 Fu Ecy <fuzhijie1985@gmail.com>
> >>> >>>
> >>> >>> 2010-01-27 12:58:22,182 ERROR ql.Driver
> >>> >>> (SessionState.java:printError(303)) - FAILED: Parse Error: line 2:10
> >>> >>> cannot recognize input ',' in column type
> >>> >>>
> >>> >>> org.apache.hadoop.hive.ql.parse.ParseException: line 2:10 cannot
> >>> >>> recognize input ',' in column type
> >>> >>>
> >>> >>>         at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:357)
> >>> >>>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:249)
> >>> >>>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:290)
> >>> >>>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:163)
> >>> >>>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:221)
> >>> >>>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:335)
> >>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>> >>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>> >>>         at java.lang.reflect.Method.invoke(Method.java:597)
> >>> >>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
> >>> >>>         at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
> >>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>> >>>         at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
> >>> >>>
> >>> >>> 2010-01-27 12:58:40,394 ERROR hive.log
> >>> >>> (MetaStoreUtils.java:logAndThrowMetaException(570)) - Got exception:
> >>> >>> org.apache.hadoop.security.AccessControlException
> >>> >>> org.apache.hadoop.security.AccessControlException: Permission denied:
> >>> >>> user=kunlun, access=WRITE, inode="user":hadoop:cug-admin:rwxr-xr-x
> >>> >>> 2010-01-27 12:58:40,395 ERROR hive.log
> >>> >>> (MetaStoreUtils.java:logAndThrowMetaException(571)) -
> >>> >>> org.apache.hadoop.security.AccessControlException:
> >>> >>> org.apache.hadoop.security.AccessControlException:
> >>> >>> Permission denied: user=kunlun, access=WRITE, inode="user":hadoop:cug-admin:rwxr-xr-x
> >>> >>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> >>> >>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> >>> >>>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> >>> >>>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> >>> >>>         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
> >>> >>>         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
> >>> >>>         at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:831)
> >>> >>>         at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:257)
> >>> >>>         at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1118)
> >>> >>>         at org.apache.hadoop.hive.metastore.Warehouse.mkdirs(Warehouse.java:123)
> >>> >>>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:505)
> >>> >>>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:256)
> >>> >>>         at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:254)
> >>> >>>         at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:883)
> >>> >>>         at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:105)
> >>> >>>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:388)
> >>> >>>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:294)
> >>> >>>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:163)
> >>> >>>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:221)
> >>> >>>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:335)
> >>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>> >>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>> >>>         at java.lang.reflect.Method.invoke(Method.java:597)
> >>> >>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
> >>> >>>         at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
> >>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>> >>>         at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
> >>> >>> Caused by: org.apache.hadoop.ipc.RemoteException:
> >>> >>> org.apache.hadoop.security.AccessControlException: Permission denied:
> >>> >>> user=kunlun, access=WRITE, inode="user":hadoop:cug-admin:rwxr-xr-x
> >>> >>>         at org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:176)
> >>> >>>         at org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:157)
> >>> >>>         at org.apache.hadoop.hdfs.server.namenode.PermissionChecker.checkPermission(PermissionChecker.java:105)
> >>> >>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4400)
> >>> >>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4370)
> >>> >>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:1771)
> >>> >>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:1740)
> >>> >>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.mkdirs(NameNode.java:471)
> >>> >>>         at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
> >>> >>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>> >>>         at java.lang.reflect.Method.invoke(Method.java:597)
> >>> >>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
> >>> >>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
> >>> >>>
> >>> >>>         at org.apache.hadoop.ipc.Client.call(Client.java:697)
> >>> >>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
> >>> >>>         at $Proxy4.mkdirs(Unknown Source)
> >>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>> >>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>> >>>         at java.lang.reflect.Method.invoke(Method.java:597)
> >>> >>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> >>> >>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> >>> >>>         at $Proxy4.mkdirs(Unknown Source)
> >>> >>>         at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:829)
> >>> >>>         ... 22 more
> >>> >>>
> >>> >>> Is there any problem with the input data format?
> >>> >>>
> >>> >>> CREATE TABLE collect_info (
> >>> >>>   id string,
> >>> >>>   t1 string,
> >>> >>>   t2 string,
> >>> >>>   t3 string,
> >>> >>>   t4 string,
> >>> >>>   t5 string,
> >>> >>>   collector string)
> >>> >>> ROW FORMAT DELIMITED
> >>> >>> FIELDS TERMINATED BY '\t'
> >>> >>> STORED AS TEXTFILE;
> >>> >>>
> >>> >>> 5290086045      330952255       1       2010-01-26 02:41:27
> >>> >>> 0       196050201       2010-01-26 02:41:27     2010-01-26 02:41:27
> >>> >>> qijansher93771          0       1048
> >>> >>>
> >>> >>> Fields are separated by '\t'; I want to get the fields marked in red.
> >>> >>>
> >>> >>> 2010/1/28 Eric Sammer <eric@lifeless.net>
> >>> >>>>
> >>> >>>> On 1/27/10 10:59 PM, Fu Ecy wrote:
> >>> >>>> > I want to load some files on HDFS into a Hive table, but there is
> >>> >>>> > an exception as follows:
> >>> >>>> > hive> load data inpath
> >>> >>>> > '/group/taobao/taobao/dw/stb/20100125/collect_info/*' into table
> >>> >>>> > collect_info;
> >>> >>>> > Loading data to table collect_info
> >>> >>>> > Failed with exception addFiles: error while moving files!!!
> >>> >>>> > FAILED: Execution Error, return code 1 from
> >>> >>>> > org.apache.hadoop.hive.ql.exec.MoveTask
> >>> >>>> >
> >>> >>>> > But when I download the files from HDFS to the local machine and
> >>> >>>> > then load them into the table, it works.
> >>> >>>> > Data in '/group/taobao/taobao/dw/stb/20100125/collect_info/*' is a
> >>> >>>> > little more than 200GB.
> >>> >>>> >
> >>> >>>> > I need to use Hive to compute some statistics.
> >>> >>>> > much thanks :-)
> >>> >>>>
> >>> >>>> The size of the files shouldn't really matter (move operations affect
> >>> >>>> metadata only - the blocks aren't rewritten or anything like that).
> >>> >>>> Check in your Hive log files (by default in /tmp/<user>/hive.log on the
> >>> >>>> local machine you run Hive on, I believe) and you should see a stack
> >>> >>>> trace with additional information.
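For example, assuming the default location Eric mentions, on the machine where
the CLI was run:

tail -n 100 /tmp/$USER/hive.log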
> >>> >>>>
> >>> >>>> Regards.
> >>> >>>> --
> >>> >>>> Eric Sammer
> >>> >>>> eric@lifeless.net
> >>> >>>> http://esammer.blogspot.com
> >>> >>>
> >>> >>
> >>> >
> >>> >
> >>>
> >>>
> >>>
> >>> --
> >>> Yours,
> >>> Zheng
> >>
> >
> >
>
>
>
> --
> Yours,
> Zheng
>
