hive-user mailing list archives

From Prasad Chakka <pcha...@facebook.com>
Subject Re: partitions not being created
Date Thu, 30 Jul 2009 18:57:09 GMT
Bill,

The real error is happening on the Hive Metastore Server or the Hive Server (depending on the setup you are using). Its error logs should have a different stack trace. From the information below, I am guessing that the destination table's HDFS directories got created with some problems. Can you drop that table (and make sure that there is no corresponding HDFS directory left for either the integer- or string-type partitions that you created) and retry the query?

If you don't want to drop the destination table, then send me the logs from the Hive Server.
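
For reference, the cleanup I have in mind would look something like this (a sketch only; the warehouse path below is the default and may differ in your setup):

```sql
-- Drop the destination table so its partition metadata is recreated cleanly.
DROP TABLE ApiUsage;
-- Then, from a shell, remove any stale warehouse directory left behind
-- (hypothetical default path; adjust to your warehouse location):
--   hadoop fs -rmr /user/hive/warehouse/apiusage
```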

Prasad


________________________________
From: Bill Graham <billgraham@gmail.com>
Reply-To: <billgraham@gmail.com>
Date: Thu, 30 Jul 2009 11:47:41 -0700
To: Prasad Chakka <pchakka@facebook.com>
Cc: <hive-user@hadoop.apache.org>
Subject: Re: partitions not being created

That file contains a similar error as the Hive Server logs:

2009-07-30 11:44:21,095 WARN  mapred.JobClient (JobClient.java:configureCommandLineOptions(510))
- Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for
the same.
2009-07-30 11:44:48,070 WARN  mapred.JobClient (JobClient.java:configureCommandLineOptions(510))
- Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for
the same.
2009-07-30 11:45:27,796 ERROR metadata.Hive (Hive.java:getPartition(588)) - org.apache.thrift.TApplicationException:
get_partition failed: unknown result
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition(ThriftHiveMetastore.java:784)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition(ThriftHiveMetastore.java:752)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartition(HiveMetaStoreClient.java:415)
        at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:579)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:466)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:135)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:335)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:241)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:122)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:165)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:258)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

2009-07-30 11:45:27,797 ERROR exec.MoveTask (SessionState.java:printError(279)) - Failed with
exception org.apache.thrift.TApplicationException: get_partition failed: unknown result
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.TApplicationException:
get_partition failed: unknown result
        at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:589)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:466)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:135)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:335)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:241)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:122)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:165)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:258)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
Caused by: org.apache.thrift.TApplicationException: get_partition failed: unknown result
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition(ThriftHiveMetastore.java:784)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition(ThriftHiveMetastore.java:752)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartition(HiveMetaStoreClient.java:415)
        at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:579)
        ... 16 more

2009-07-30 11:45:27,798 ERROR ql.Driver (SessionState.java:printError(279)) - FAILED: Execution
Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

On Thu, Jul 30, 2009 at 11:33 AM, Prasad Chakka <pchakka@facebook.com> wrote:

The Hive logs go into /tmp/$USER/hive.log, not hive_job_log*.txt.


________________________________
From: Bill Graham <billgraham@gmail.com>
Reply-To: <billgraham@gmail.com>
Date: Thu, 30 Jul 2009 10:52:06 -0700
To: Prasad Chakka <pchakka@facebook.com>
Cc: <hive-user@hadoop.apache.org>, Zheng Shao <zshao9@gmail.com>

Subject: Re: partitions not being created

I'm trying to set a string to a string and I'm seeing this error. In an earlier attempt it was a string to an int, and I saw the same error.

The /tmp/$USER/hive_job_log*.txt file doesn't contain any exceptions, but I've included its output below. Only the Hive Server logs show the exceptions listed above. (Note that the table I'm loading from in this log output is ApiUsageSmall, which is identical to ApiUsageTemp. For some reason the data from ApiUsageTemp is now gone.)

QueryStart QUERY_STRING="INSERT OVERWRITE TABLE ApiUsage PARTITION (dt = "20090518") SELECT
`(requestDate)?+.+` FROM ApiUsageSmall WHERE requestDate = '2009/05/18'" QUERY_ID="app_20090730104242"
TIME="1248975752235"
TaskStart TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_ID="Stage-1" QUERY_ID="app_20090730104242"
TIME="1248975752235"
TaskProgress TASK_HADOOP_PROGRESS="2009-07-30 10:42:34,783 map = 0%,  reduce =0%" TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver"
TASK_COUNTERS="Job Counters .Launched map tasks:1,Job Counters .Data-local map tasks:1" TASK_ID="Stage-1"
QUERY_ID="app_20090730104242" TASK_HADOOP_ID="job_200906301559_0409" TIME="1248975754785"
TaskProgress ROWS_INSERTED="apiusage~296" TASK_HADOOP_PROGRESS="2009-07-30 10:42:43,031 map
= 40%,  reduce =0%" TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_COUNTERS="File
Systems.HDFS bytes read:23019,File Systems.HDFS bytes written:19178,Job Counters .Rack-local
map tasks:2,Job Counters .Launched map tasks:5,Job Counters .Data-local map tasks:3,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.PASSED:592,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.FILTERED:6,org.apache.hadoop.hive.ql.exec.FileSinkOperator$TableIdEnum.TABLE_ID_1_ROWCOUNT:296,org.apache.hadoop.hive.ql.exec.MapOperator$Counter.DESERIALIZE_ERRORS:0,Map-Reduce
Framework.Map input records:302,Map-Reduce Framework.Map input bytes:23019,Map-Reduce Framework.Map
output records:0" TASK_ID="Stage-1" QUERY_ID="app_20090730104242" TASK_HADOOP_ID="job_200906301559_0409"
TIME="1248975763033"
TaskProgress ROWS_INSERTED="apiusage~1471" TASK_HADOOP_PROGRESS="2009-07-30 10:42:44,068 map
= 100%,  reduce =100%" TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_COUNTERS="File
Systems.HDFS bytes read:114068,File Systems.HDFS bytes written:95275,Job Counters .Rack-local
map tasks:2,Job Counters .Launched map tasks:5,Job Counters .Data-local map tasks:3,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.PASSED:2942,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.FILTERED:27,org.apache.hadoop.hive.ql.exec.FileSinkOperator$TableIdEnum.TABLE_ID_1_ROWCOUNT:1471,org.apache.hadoop.hive.ql.exec.MapOperator$Counter.DESERIALIZE_ERRORS:0,Map-Reduce
Framework.Map input records:1498,Map-Reduce Framework.Map input bytes:114068,Map-Reduce Framework.Map
output records:0" TASK_ID="Stage-1" QUERY_ID="app_20090730104242" TASK_HADOOP_ID="job_200906301559_0409"
TIME="1248975764071"
TaskEnd ROWS_INSERTED="apiusage~1471" TASK_RET_CODE="0" TASK_HADOOP_PROGRESS="2009-07-30 10:42:44,068
map = 100%,  reduce =100%" TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_COUNTERS="File
Systems.HDFS bytes read:114068,File Systems.HDFS bytes written:95275,Job Counters .Rack-local
map tasks:2,Job Counters .Launched map tasks:5,Job Counters .Data-local map tasks:3,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.PASSED:2942,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.FILTERED:27,org.apache.hadoop.hive.ql.exec.FileSinkOperator$TableIdEnum.TABLE_ID_1_ROWCOUNT:1471,org.apache.hadoop.hive.ql.exec.MapOperator$Counter.DESERIALIZE_ERRORS:0,Map-Reduce
Framework.Map input records:1498,Map-Reduce Framework.Map input bytes:114068,Map-Reduce Framework.Map
output records:0" TASK_ID="Stage-1" QUERY_ID="app_20090730104242" TASK_HADOOP_ID="job_200906301559_0409"
TIME="1248975764199"
TaskStart TASK_NAME="org.apache.hadoop.hive.ql.exec.ConditionalTask" TASK_ID="Stage-4" QUERY_ID="app_20090730104242"
TIME="1248975764199"
TaskEnd TASK_RET_CODE="0" TASK_NAME="org.apache.hadoop.hive.ql.exec.ConditionalTask" TASK_ID="Stage-4"
QUERY_ID="app_20090730104242" TIME="1248975782277"
TaskStart TASK_NAME="org.apache.hadoop.hive.ql.exec.MoveTask" TASK_ID="Stage-0" QUERY_ID="app_20090730104242"
TIME="1248975782277"
TaskEnd TASK_RET_CODE="1" TASK_NAME="org.apache.hadoop.hive.ql.exec.MoveTask" TASK_ID="Stage-0"
QUERY_ID="app_20090730104242" TIME="1248975782473"
QueryEnd ROWS_INSERTED="apiusage~1471" QUERY_STRING="INSERT OVERWRITE TABLE ApiUsage PARTITION
(dt = "20090518") SELECT `(requestDate)?+.+` FROM ApiUsageSmall WHERE requestDate = '2009/05/18'"
QUERY_ID="app_20090730104242" QUERY_NUM_TASKS="2" TIME="1248975782474"



On Thu, Jul 30, 2009 at 10:09 AM, Prasad Chakka <pchakka@facebook.com> wrote:
Are you sure you are getting the same error even with the schema below (i.e. trying to set a string to an int column)? Can you give the full stack trace that you might see in /tmp/$USER/hive.log?


________________________________
From: Bill Graham <billgraham@gmail.com>
Reply-To: <hive-user@hadoop.apache.org>, <billgraham@gmail.com>

Date: Thu, 30 Jul 2009 10:02:54 -0700
To: Zheng Shao <zshao9@gmail.com>
Cc: <hive-user@hadoop.apache.org>

Subject: Re: partitions not being created


Based on these describe statements, is what I'm trying to do feasible? I'm basically trying to load the contents of ApiUsageTemp into ApiUsage, with the ApiUsageTemp.requestdate column becoming the ApiUsage.dt partition.
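
For concreteness, the pattern I'm attempting is one static-partition insert per date value, along the lines of:

```sql
-- The partition value is a literal; the requestDate column itself is
-- excluded from the select list via the regex column spec.
INSERT OVERWRITE TABLE ApiUsage PARTITION (dt = '20090518')
SELECT `(requestDate)?+.+` FROM ApiUsageTemp
WHERE requestDate = '2009/05/18';
```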


On Wed, Jul 29, 2009 at 9:28 AM, Bill Graham <billgraham@gmail.com> wrote:
Sure. The only difference I see is that ApiUsage has a dt partition instead of the requestdate column:

hive> describe extended ApiUsage;
OK
user    string
restresource    string
statuscode      int
requesthour     int
numrequests     string
responsetime    string
numslowrequests string
dt      string

Detailed Table Information      Table(tableName:apiusage, dbName:default, owner:grahamb, createTime:1248884801,
lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:user, type:string,
comment:null), FieldSchema(name:restresource, type:string, comment:null), FieldSchema(name:statuscode,
type:int, comment:null), FieldSchema(name:requesthour, type:int, comment:null), FieldSchema(name:numrequests,
type:string, comment:null), FieldSchema(name:responsetime, type:string, comment:null), FieldSchema(name:numslowrequests,
type:string, comment:null)], location:hdfs://xxxxxxx:9000/user/hive/warehouse/apiusage
, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
parameters:{field.delim= , serialization.format= }), bucketCols:[], sortCols:[], parameters:{}),
partitionKeys:[FieldSchema(name:dt, type:string, comment:null)], parameters:{})

Time taken: 0.277 seconds
hive> describe extended ApiUsageTemp;
OK
user    string
restresource    string
statuscode      int
requestdate     string
requesthour     int
numrequests     string
responsetime    string
numslowrequests string

Detailed Table Information      Table(tableName:apiusagetemp, dbName:default, owner:grahamb,
createTime:1248466925, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:user,
type:string, comment:null), FieldSchema(name:restresource, type:string, comment:null), FieldSchema(name:statuscode,
type:int, comment:null), FieldSchema(name:requestdate, type:string, comment:null), FieldSchema(name:requesthour,
type:int, comment:null), FieldSchema(name:numrequests, type:string, comment:null), FieldSchema(name:responsetime,
type:string, comment:null), FieldSchema(name:numslowrequests, type:string, comment:null)],
location:hdfs://xxxxxxx:9000/user/hive/warehouse/apiusage
, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
parameters:{field.delim= , serialization.format= }), bucketCols:[], sortCols:[], parameters:{}),
partitionKeys:[], parameters:{last_modified_time=1248826696, last_modified_by=app})

Time taken: 0.235 seconds



On Tue, Jul 28, 2009 at 9:03 PM, Zheng Shao <zshao9@gmail.com> wrote:
Can you send the output of these 2 commands?

describe extended ApiUsage;
describe extended ApiUsageTemp;


Zheng

On Tue, Jul 28, 2009 at 6:29 PM, Bill Graham <billgraham@gmail.com> wrote:
> Thanks for the tip, but it fails in the same way when I use a string.
>
> On Tue, Jul 28, 2009 at 6:21 PM, David Lerman <dlerman@videoegg.com> wrote:
>>
>> >> hive> create table partTable (a string, b int) partitioned by (dt int);
>>
>> > INSERT OVERWRITE TABLE ApiUsage PARTITION (dt = "20090518")
>> > SELECT `(requestDate)?+.+` FROM ApiUsageTemp WHERE requestDate =
>> > '2009/05/18'
>>
>> The table has an int partition column (dt), but you're trying to set a
>> string value (dt = "20090518").
>>
>> Try :
>>
>> create table partTable (a string, b int) partitioned by (dt string);
>>
>> and then do your insert.
>>
>
>



--
Yours,
Zheng







