hadoop-hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] Created: (HIVE-1641) add map joined table to distributed cache
Date Wed, 15 Sep 2010 22:01:37 GMT
add map joined table to distributed cache

                 Key: HIVE-1641
                 URL: https://issues.apache.org/jira/browse/HIVE-1641
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Namit Jain
             Fix For: 0.7.0

Currently, the mappers read the map-joined table directly from HDFS, which makes it difficult
to scale.
We end up getting lots of timeouts once the number of mappers exceeds a few thousand, due
to the many mappers reading the same table concurrently.

It would be a good idea to put the mapped file into the distributed cache and read it from there instead.
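
A rough sketch of the read side of this proposal, in plain Java with no Hadoop dependency. It is not Hive's implementation. It assumes the small table has already been copied to the task's local disk (in a real job this would presumably be done by registering the file with Hadoop's DistributedCache, e.g. DistributedCache.addCacheFile, and that wiring is omitted here). The class name LocalMapJoinSketch, the tab-separated "key, value" row layout, and both helper methods are hypothetical, chosen only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LocalMapJoinSketch {

    // Load the small (map-joined) table into memory once per task.
    // Assumed row layout: key <TAB> value. Rows without a tab are skipped.
    static Map<String, String> loadSmallTable(Reader source) {
        Map<String, String> table = new HashMap<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab < 0) {
                    continue;
                }
                table.put(line.substring(0, tab), line.substring(tab + 1));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return table;
    }

    // Inner join: probe the in-memory table for each row of the big table
    // and emit only the rows whose key has a match.
    static List<String> joinRows(List<String> bigRows, Map<String, String> smallTable) {
        List<String> joined = new ArrayList<>();
        for (String row : bigRows) {
            int tab = row.indexOf('\t');
            String key = tab < 0 ? row : row.substring(0, tab);
            String match = smallTable.get(key);
            if (match != null) {
                joined.add(row + "\t" + match);
            }
        }
        return joined;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the node-local copy that the distributed cache would provide.
        Path localCopy = Files.createTempFile("small_table", ".tsv");
        Files.write(localCopy, Arrays.asList("1\tapple", "2\tbanana"), StandardCharsets.UTF_8);

        Map<String, String> smallTable;
        try (Reader reader = Files.newBufferedReader(localCopy, StandardCharsets.UTF_8)) {
            smallTable = loadSmallTable(reader);
        }

        List<String> bigRows = Arrays.asList("1\torder-a", "3\torder-b", "2\torder-c");
        for (String out : joinRows(bigRows, smallTable)) {
            System.out.println(out);
        }
        Files.delete(localCopy);
    }
}
```

The point of the sketch is that each task reads only a local file, so the number of HDFS reads of the small table would grow with the number of nodes rather than with the number of mappers.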

