hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Created: (HIVE-1050) Reduce the memory foot-print of HiveInputSplit
Date Wed, 13 Jan 2010 08:38:54 GMT
Reduce the memory foot-print of HiveInputSplit
----------------------------------------------

                 Key: HIVE-1050
                 URL: https://issues.apache.org/jira/browse/HIVE-1050
             Project: Hadoop Hive
          Issue Type: Improvement
            Reporter: Zheng Shao


{{HiveInputSplit}} currently inherits from {{FileSplit}} only because we want {{MapTask}} to forward
the mapper's input file name. This makes {{HiveInputSplit}} big (see MAPREDUCE-1374).
The relevant check:

{code}
  private void updateJobWithSplit(final JobConf job, InputSplit inputSplit) {
    if (inputSplit instanceof FileSplit) {
      FileSplit fileSplit = (FileSplit) inputSplit;
      job.set("map.input.file", fileSplit.getPath().toString());
      job.setLong("map.input.start", fileSplit.getStart());
      job.setLong("map.input.length", fileSplit.getLength());
      LOG.info("split: " + job.get("map.input.file")+", range: "
               + job.getLong("map.input.start", 0) + "-"
               + job.getLong("map.input.length", 0));
    }
  }

{code}

Once we move to the new MapReduce framework, we should be able to have {{HiveInputFormat}} produce
smaller splits, which will reduce the amount of memory needed on the {{JobClient}}.
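
To illustrate why the inheritance is needed today, here is a minimal, self-contained sketch. It does not use the Hadoop classes: {{InputSplit}}, {{FileSplit}}, {{InheritingSplit}} and {{WrappingSplit}} below are hypothetical stand-ins, and a plain {{Map}} stands in for {{JobConf}}. Only the {{instanceof}} check mirrors the snippet above; the rest is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitForwardingSketch {
    // Hypothetical stand-in for the InputSplit type.
    public interface InputSplit {}

    // Hypothetical stand-in for FileSplit: carries path, start and length.
    public static class FileSplit implements InputSplit {
        private final String path;
        private final long start;
        private final long length;

        public FileSplit(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        public String getPath() { return path; }
        public long getStart() { return start; }
        public long getLength() { return length; }
    }

    // Inheriting approach: the split duplicates the FileSplit fields in
    // its superclass and also keeps the wrapped split.
    public static class InheritingSplit extends FileSplit {
        private final InputSplit wrapped;

        public InheritingSplit(FileSplit wrapped) {
            super(wrapped.getPath(), wrapped.getStart(), wrapped.getLength());
            this.wrapped = wrapped;
        }
    }

    // Smaller alternative: only wraps the underlying split, no FileSplit fields.
    public static class WrappingSplit implements InputSplit {
        private final InputSplit wrapped;

        public WrappingSplit(InputSplit wrapped) {
            this.wrapped = wrapped;
        }
    }

    // Same instanceof check as the snippet above, with a Map in place of JobConf.
    public static void updateJobWithSplit(Map<String, String> job, InputSplit split) {
        if (split instanceof FileSplit) {
            FileSplit fileSplit = (FileSplit) split;
            job.put("map.input.file", fileSplit.getPath());
            job.put("map.input.start", Long.toString(fileSplit.getStart()));
            job.put("map.input.length", Long.toString(fileSplit.getLength()));
        }
    }

    public static void main(String[] args) {
        FileSplit file = new FileSplit("/data/part-00000", 0L, 128L);

        Map<String, String> withInheritance = new HashMap<>();
        updateJobWithSplit(withInheritance, new InheritingSplit(file));
        // The file name is forwarded because the split is a FileSplit.
        System.out.println(withInheritance.get("map.input.file"));

        Map<String, String> withWrapper = new HashMap<>();
        updateJobWithSplit(withWrapper, new WrappingSplit(file));
        // Nothing is forwarded: the wrapper fails the instanceof check.
        System.out.println(withWrapper.get("map.input.file"));
    }
}
```

In this sketch the wrapping split is smaller but loses the forwarded file name, which is the trade-off this issue describes.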


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

