hive-user mailing list archives

From "Christopher, Pat" <>
Subject Hive Error on medium sized dataset
Date Thu, 27 Jan 2011 01:47:03 GMT
I'm attempting to load a small-to-medium-sized log file (~250 MB) and produce some basic reports
from it: counts, etc.  Nothing fancy.  However, whenever I try to read the entire dataset
(~330k rows), I get the following error:

  FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

The error occurs even with basic queries such as:

  SELECT count(1) FROM medium_table;

However, if I do the following:

  SELECT count(1) FROM ( SELECT col1 FROM medium_table LIMIT 70000 ) tbl;

It works okay until the LIMIT reaches roughly 70,800; beyond that I get the first error message
again.  I'm running HDFS on a single node in pseudo-distributed mode, on a virtual machine with
1.5 GB of memory and 20 GB of disk.  I am also using a custom SerDe.  I don't think the SerDe is
at fault, but I'm open to suggestions for how I can check whether it is causing the problem.  I
can't see anything in the data that would be causing it, though.
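
One way to pin down the exact threshold, rather than the "around 70,800" estimate, would be to bisect on the LIMIT value. The following is only a minimal sketch under stated assumptions: `query_succeeds` and `first_failing_limit` are hypothetical helper names, the sketch assumes the `hive` CLI is on the PATH and exits with a non-zero status when a query fails, and it assumes that once a LIMIT fails every larger LIMIT fails too. Hive does not guarantee which rows a LIMIT returns, so the result is a hint about where to look, not proof that a particular row is bad.

```python
import subprocess


def query_succeeds(limit, table="medium_table", column="col1"):
    """Run the LIMIT query through the Hive CLI.

    Assumption: `hive -e` exits with status 0 on success and non-zero on failure.
    """
    sql = "SELECT count(1) FROM ( SELECT %s FROM %s LIMIT %d ) tbl;" % (
        column, table, limit)
    return subprocess.call(["hive", "-e", sql]) == 0


def first_failing_limit(lo, hi, succeeds):
    """Return the smallest limit in (lo, hi] for which succeeds(limit) is False.

    Assumes succeeds(lo) is True, succeeds(hi) is False, and that failures are
    monotonic (every limit above the first failing one also fails).
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if succeeds(mid):
            lo = mid   # mid still works, so the threshold is above it
        else:
            hi = mid   # mid fails, so the threshold is at or below it
    return hi
```

Calling `first_failing_limit(70000, 330000, query_succeeds)` would need fewer than twenty Hive runs. Comparing the rows just before and after the reported limit might then show whether a specific record, and therefore the SerDe, is involved.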

Does anyone have any ideas about what might be causing this, or something I can check?

