hive-user mailing list archives

From hadoop n00b <new2h...@gmail.com>
Subject Re: Hive Error on medium sized dataset
Date Thu, 27 Jan 2011 05:09:18 GMT
We typically get this error while running complex queries on our 4-node
setup when the child JVM runs out of heap space. I would be interested in
what the experts have to say about this error.
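
If heap exhaustion is the cause here as well, one thing to try is raising the
child JVM heap for the session before re-running the query. This is only a
minimal sketch: it assumes the classic Hadoop property mapred.child.java.opts
applies to this setup, and the 512 MB value is an illustrative guess for a
1.5GB virtual machine, not a tested recommendation.

```sql
-- Assumed, illustrative value: cap each map/reduce child JVM at 512 MB of heap.
SET mapred.child.java.opts=-Xmx512m;

-- Re-run the query that failed in the quoted message below.
SELECT count(1) FROM medium_table;
```

If the query then succeeds, that points to memory rather than the data. If it
still fails, the task logs for the failed job should show the underlying
exception.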

On Thu, Jan 27, 2011 at 7:27 AM, Ajo Fod <ajo.fod@gmail.com> wrote:

> Any chance you can convert the data to a tab separated text file and try
> the same query?
>
> It may not be the SerDe, but it would be good to rule it out as a
> potential source of the problem.
>
> -Ajo.
>
>
> On Wed, Jan 26, 2011 at 5:47 PM, Christopher, Pat <
> patrick.christopher@hp.com> wrote:
>
>>  Hi,
>>
>> I’m attempting to load a small to medium sized log file, ~250MB, and
>> produce some basic reports from it, counts etc.  Nothing fancy.  However,
>> whenever I try to read the entire dataset, ~330k rows, I get the following
>> error:
>>
>>
>>
>>   FAILED: Execution Error, return code 2 from
>> org.apache.hadoop.hive.ql.exec.MapRedTask
>>
>>
>>
>> This result gets produced with basic queries like:
>>
>>
>>
>>   SELECT count(1) FROM medium_table;
>>
>>
>>
>> However, if I do the following:
>>
>>
>>
>>   SELECT count(1) FROM ( SELECT col1 FROM medium_table LIMIT 70000 ) tbl;
>>
>>
>>
>> It works okay until the LIMIT gets to around 70,800, then I get the first
>> error message again.  I’m running my HDFS system in single-node,
>> pseudo-distributed mode with 1.5GB of memory and 20 GB of disk as a virtual
>> machine.  And I am using a custom SerDe.  I don’t think it’s the SerDe but
>> I’m open to suggestions for how I can check if it is causing the problem.  I
>> can’t see anything in the data that would be causing it though.
>>
>>
>>
>> Anyone have any ideas of what might be causing this or something I can
>> check?
>>
>>
>>
>> Thanks,
>>
>> Pat
>>
>
>
