hive-dev mailing list archives

From ameet chaubal <ameetchau...@yahoo.com>
Subject Re: large sql file creating large num of columns
Date Mon, 16 Jan 2012 16:40:20 GMT
thanks,

Running it with debug logging, it prints the following:
stored as textfile location '<myfile>'
12/01/16 11:28:54 INFO parse.ParseDriver: Parse Completed
12/01/16 11:28:54 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
12/01/16 11:28:54 INFO parse.SemanticAnalyzer: Creating table my_table position=22

and it's waiting there at SemanticAnalyzer...

 
Sincerely,


Ameet


________________________________
 From: Bejoy Ks <bejoy_ks@yahoo.com>
To: "user@hive.apache.org" <user@hive.apache.org>; Edward Capriolo <edlinuxguru@gmail.com>;
ameet chaubal <ameetchaubal@yahoo.com> 
Sent: Monday, January 16, 2012 10:24 AM
Subject: Re: large sql file creating large num of columns
 

Hey Ameet
       Please find some pointers inline.

All that hive is supposed to do is to load the definition into mysql, right? 
[Bejoy] Yes you are right

Are you suggesting that it's reading the datafile in HDFS?
[Bejoy] AFAIK it won't do that at the time of table creation. Only metadata entries are written
at this stage.


 That should not be happening since the "external table" does not need the data to be present,
right? 

[Bejoy]  Again your understanding is right.


I can't give you much of a hint on why the query takes 5 hrs because I've never tried out such
a large number of columns. What you can do at this point is enable DEBUG logging in hive and
try to get a stack trace, then see whether the issue is with parsing the DDL or
with the database calls that store the metadata.
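
One way to narrow this down, sketched here as an assumption and not something anyone in the thread has run: generate the same kind of DDL at much smaller column counts and time each one with hive -f. If the time grows much faster than the column count, that points at the parse/analysis step rather than the metastore. The helper names, the c<N> column names, the string type, and the /tmp/my_table location below are all placeholders.

```python
import os


def build_create_table(num_columns, table="my_table", location="/tmp/my_table"):
    """Return a CREATE EXTERNAL TABLE statement with num_columns string columns.

    Column names (c0, c1, ...), the string type, and the location are
    placeholders, not taken from the real 800,000-column DDL.
    """
    columns = ",\n".join(f"  c{i} string" for i in range(num_columns))
    return (
        f"create external table {table} (\n{columns}\n)\n"
        f"stored as textfile location '{location}';\n"
    )


def write_ddl_files(sizes, out_dir):
    """Write one create_<n>.sql file per size into out_dir and return the paths."""
    paths = []
    for n in sizes:
        path = os.path.join(out_dir, f"create_{n}.sql")
        with open(path, "w") as fh:
            # Give each table its own name so the runs do not collide.
            fh.write(build_create_table(n, table=f"my_table_{n}"))
        paths.append(path)
    return paths
```

For example, write_ddl_files([1000, 10000, 100000], ".") produces three files that can each be run with hive -f and timed separately before attempting the full-size statement.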

Regards
Bejoy.K.S


________________________________
 From: ameet chaubal <ameetchaubal@yahoo.com>
To: Edward Capriolo <edlinuxguru@gmail.com>; "user@hive.apache.org" <user@hive.apache.org>

Sent: Monday, January 16, 2012 8:44 PM
Subject: Re: large sql file creating large num of columns
 

thanks,

this is an external table; so at the DDL stage, there is no data loading that is happening.
All that hive is supposed to do is to load the definition into mysql, right? Are you suggesting
that it's reading the datafile in HDFS? That should not be happening since the "external table"
does not need the data to be present, right?
 
Sincerely,


Ameet


________________________________
 From: Edward Capriolo <edlinuxguru@gmail.com>
To: user@hive.apache.org; ameet chaubal <ameetchaubal@yahoo.com> 
Sent: Monday, January 16, 2012 10:06 AM
Subject: Re: large sql file creating large num of columns
 

I highly doubt this will work. I think that many things in Hadoop and Hive will try to buffer
an entire row, so even if you make it past the metastore I do not think it will be of any use.

On Mon, Jan 16, 2012 at 9:42 AM, ameet chaubal <ameetchaubal@yahoo.com> wrote:

>Hi All,
>
>I have a 30 MB SQL file containing a single create table statement with about 800,000
>columns, hence the size.
>
>I am trying to execute it using hive -f <file>. Initially, hive ran the command
>with a 256 MB heap and gave me an OOM error. I increased the heap size using export
>HADOOP_HEAPSIZE to 1 GB and eventually 2 GB, which made the OOM error go away. However, the
>hive command ran for 5 hours without actually creating the table. The JVM was running.
>However,
>1. running strace on the process showed that it was stuck on a futex call.
>2. I am using mysql for the metastore, and no rows were added to either the TBLS or the
>COLUMNS table.
>
>Questions:
>1. Can hive create a table with 800k columns from a 30 MB SQL file?
>2. If it is theoretically possible, what could be taking it over 5 hours and
>still not succeeding?
>
>Any insight is much appreciated.
>
>Sincerely,
>
>Ameet
>