hadoop-hive-dev mailing list archives

From Matt Pestritto <m...@pestritto.com>
Subject Re: Trunk runtime errors
Date Wed, 13 May 2009 16:32:54 GMT
Prasad,

My query is pretty complex, so I put together a simple test case for you.  I
first tried a table with only one partition column, and that succeeded.  I then
tried a table with two partition columns, and the data was not copied.  So the
problem seems to affect only tables with more than one partition column.

I ran this in the CLI.

drop table hive_test_src;
create table hive_test_src ( col1 string ) stored as textfile ;
load data local inpath '/home/mpestritto/hive_test/data.dat' overwrite into
table hive_test_src ;

drop table hive_test_dst;
create table hive_test_dst ( col1 string ) partitioned by ( pcol1 string ,
pcol2 string) stored as sequencefile;

insert overwrite table hive_test_dst partition ( pcol1='test_part' ,
pcol2='test_part') select col1 from hive_test_src ;
select count(1) from hive_test_dst where pcol1='test_part' and
pcol2='test_part';
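To see whether the INSERT actually moved any files, you can list the partition
directory on HDFS directly. This is just a sketch: the path below assumes the
default warehouse location (hive.metastore.warehouse.dir = /user/hive/warehouse),
so adjust it if your setup differs.

```shell
# Check whether data files actually landed in the partition directory
# after the INSERT. The warehouse path assumes the default
# hive.metastore.warehouse.dir; adjust for a non-default setup.
hadoop fs -ls /user/hive/warehouse/hive_test_dst/pcol1=test_part/pcol2=test_part
```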

mpestritto@mustique:~/hive_test$ cat data.dat
1
2
3
4
5
6


CLI - OUTPUT:

hive> drop table hive_test_src;
OK
Time taken: 0.188 seconds
hive> create table hive_test_src ( col1 string ) stored as textfile ;
OK
Time taken: 0.098 seconds
hive> load data local inpath '/home/mpestritto/hive_test/data.dat' overwrite
into table hive_test_src ;
Copying data from file:/home/mpestritto/hive_test/data.dat
Loading data to table hive_test_src
OK
Time taken: 0.36 seconds
hive>
    > drop table hive_test_dst;
OK
Time taken: 0.124 seconds
hive> create table hive_test_dst ( col1 string ) partitioned by ( pcol1
string , pcol2 string) stored as sequencefile;
OK
Time taken: 0.084 seconds
hive>
    > insert overwrite table hive_test_dst partition ( pcol1='test_part' ,
pcol2='test_part') select col1 from hive_test_src ;
Total MapReduce jobs = 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_200905111618_0098, Tracking URL =
http://mustique.ps.tld:50030/jobdetails.jsp?jobid=job_200905111618_0098
Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
-Dmapred.job.tracker=mustique.ps.tld:9001 -kill job_200905111618_0098
 map = 0%,  reduce =0%
 map = 100%,  reduce =100%
Ended Job = job_200905111618_0098
Loading data to table hive_test_dst partition {pcol1=test_part,
pcol2=test_part}
6 Rows loaded to hive_test_dst
OK
Time taken: 5.687 seconds
hive> select count(1) from hive_test_dst where pcol1='test_part' and
pcol2='test_part';
Total MapReduce jobs = 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Job need not be submitted: no output: Success
OK
Time taken: 0.41 seconds
hive>
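For the second problem (the "Wrong FS" exception below), a couple of quick
checks can narrow down whether a scratch path is being resolved against the
local filesystem instead of HDFS. This is only a sketch, assuming a standard
0.19/0.20-era install at /usr/local/hadoop; adjust paths for your layout.

```shell
# The "Wrong FS: file:/tmp/hive-hive/1" error usually means a path was
# built against the local filesystem while the job expects HDFS.

# 1. Confirm what the cluster thinks the default filesystem is.
grep -A1 'fs.default.name' /usr/local/hadoop/conf/hadoop-site.xml

# 2. Check whether the scratch dir exists on HDFS, locally, or both.
hadoop fs -ls /tmp/hive-hive
ls -l /tmp/hive-hive
```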



On Wed, May 13, 2009 at 12:15 PM, Prasad Chakka <pchakka@facebook.com> wrote:

> Matt,
>
> Can you send me the query for the first problem? Also, does the directory
> for the partition exist before the query is issued?
>
> Thanks,
> Prasad
>
>
> ________________________________
> From: Matt Pestritto <matt@pestritto.com>
> Reply-To: <hive-dev@hadoop.apache.org>
> Date: Wed, 13 May 2009 09:04:38 -0700
> To: <hive-dev@hadoop.apache.org>
> Subject: Trunk runtime errors
>
> All -
>
> 1st problem.
> I was having a problem loading data into partitions when the partition did
> not exist and traced the problem to revision 772746.  Trunk also has the
> same error.
> Revision 772746 SVN comment:  HIVE-442. Create partitions after data is
> moved in the query in order to close out an inconsistent window. (Prasad
> Chakka via athusoo)
> Revision 772012 works fine for me.
>
> Essentially the partition directories are created but the data is never
> copied over.  If I run the same job again, the data is copied to the target
> directory in HDFS.
>
> 2nd problem.
> When I try to do a select count(1) from a table, I get the following
> exception, and I'm not sure what the cause is.  Again, this works fine if I
> roll back to revision 772012.
> Job Submission failed with exception
> 'java.lang.IllegalArgumentException(Wrong FS: file:/tmp/hive-hive/1,
> expected: hdfs://mustique.ps.tld:9000)'
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.ExecDriver
>
> Let me know if I can facilitate further.
>
> Thanks
> -Matt
>
>
