hbase-dev mailing list archives

From "Liyin Tang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-15482) Provide an option to skip calculating block locations for SnapshotInputFormat
Date Fri, 18 Mar 2016 04:19:33 GMT
Liyin Tang created HBASE-15482:
----------------------------------

             Summary: Provide an option to skip calculating block locations for SnapshotInputFormat
                 Key: HBASE-15482
                 URL: https://issues.apache.org/jira/browse/HBASE-15482
             Project: HBase
          Issue Type: Improvement
          Components: mapreduce
            Reporter: Liyin Tang
            Priority: Minor


When an MR job reads from SnapshotInputFormat, it calculates the splits based on the
block locations in order to get the best locality. However, this process may take a long
time for large snapshots.

In some setups, the computing layer (Spark, Hive, or Presto) may run outside of the HBase
cluster. In these scenarios, block locality doesn't matter. Therefore, it would be great to
have an option to skip calculating the block locations for every job. That would be very
useful for the Hive/Presto/Spark connectors.
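
As a rough illustration of the requested behavior, here is a minimal, self-contained sketch of split
computation with a switch that bypasses the location lookup. It is not HBase code: the class
SnapshotSplitSketch, the LocationLookup interface, and the localityEnabled flag are all hypothetical
names chosen for this example, and the real option name and wiring would be decided in the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these names come from HBase itself.
final class SnapshotSplitSketch {

    /** One input split: a region name plus its preferred hosts (possibly none). */
    static final class Split {
        final String region;
        final List<String> hosts;

        Split(String region, List<String> hosts) {
            this.region = region;
            this.hosts = hosts;
        }
    }

    /** Stand-in for the expensive per-region block-location calculation. */
    interface LocationLookup {
        List<String> bestHosts(String region);
    }

    /**
     * Builds one split per region. When localityEnabled is false, the lookup
     * is never called and every split carries an empty host list.
     */
    static List<Split> computeSplits(List<String> regions,
                                     boolean localityEnabled,
                                     LocationLookup lookup) {
        List<Split> splits = new ArrayList<>();
        for (String region : regions) {
            List<String> hosts = localityEnabled
                    ? lookup.bestHosts(region)
                    : Collections.<String>emptyList();
            splits.add(new Split(region, hosts));
        }
        return splits;
    }
}
```

A connector running outside the HBase cluster would pass false for the flag, so the cost of split
planning no longer depends on how many blocks the snapshot contains.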





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
