hive-dev mailing list archives

From "Gopal V (JIRA)" <>
Subject [jira] [Created] (HIVE-13161) ORC: Always do sloppy overlaps for DiskRanges
Date Fri, 26 Feb 2016 00:36:18 GMT
Gopal V created HIVE-13161:

             Summary: ORC: Always do sloppy overlaps for DiskRanges
                 Key: HIVE-13161
             Project: Hive
          Issue Type: Bug
    Affects Versions: 1.3.0, 2.1.0
            Reporter: Gopal V
            Assignee: Prasanth Jayachandran

The selected columns are sometimes only a few bytes apart (particularly for nulls, which compress
tightly), and the reads aren't merged.

The WORST_UNCOMPRESSED_SLOP is only applied in the PPD (predicate pushdown) case, and it is applied
more for safety than to reduce the total number of round-trip calls to the filesystem.

  /**
   * Update the disk ranges to collapse adjacent or overlapping ranges. It
   * assumes that the ranges are sorted.
   * @param ranges the list of disk ranges to merge
   */
  static void mergeDiskRanges(List<DiskRange> ranges) {
    DiskRange prev = null;
    for(int i=0; i < ranges.size(); ++i) {
      DiskRange current = ranges.get(i);
      if (prev != null && overlap(prev.offset, prev.end,
          current.offset, current.end)) {
        prev.offset = Math.min(prev.offset, current.offset);
        prev.end = Math.max(prev.end, current.end);
        // drop the range just folded into prev, then revisit index i
        ranges.remove(i);
        i -= 1;
      } else {
        prev = current;
      }
    }
  }

  private static boolean overlap(long leftA, long rightA, long leftB, long rightB) {
    if (leftA <= leftB) {
      return rightA >= leftB;
    }
    return rightB >= leftA;
  }
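
For illustration only, here is a minimal sketch of what a "sloppy" merge could look like: ranges are collapsed when the gap between them is at most some slop, not only when they touch or overlap. The names Range, SloppyMerge, mergeWithSlop and the slop parameter are hypothetical and are not Hive's API; the value 16 in the example is arbitrary and is not the value of WORST_UNCOMPRESSED_SLOP.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a disk range; not Hive's DiskRange class. */
final class Range {
  long offset;
  long end;

  Range(long offset, long end) {
    this.offset = offset;
    this.end = end;
  }
}

final class SloppyMerge {
  /**
   * Collapse ranges whose gap is at most slop bytes. Assumes the ranges are
   * sorted by offset, so prev.offset never needs to be adjusted.
   */
  static void mergeWithSlop(List<Range> ranges, long slop) {
    Range prev = null;
    for (int i = 0; i < ranges.size(); ++i) {
      Range current = ranges.get(i);
      if (prev != null && current.offset <= prev.end + slop) {
        // Close enough: extend prev and drop current from the list.
        prev.end = Math.max(prev.end, current.end);
        ranges.remove(i);
        i -= 1;
      } else {
        prev = current;
      }
    }
  }

  public static void main(String[] args) {
    List<Range> ranges = new ArrayList<>();
    ranges.add(new Range(0, 100));
    ranges.add(new Range(104, 200));   // starts 4 bytes after the first range ends
    ranges.add(new Range(5000, 5100)); // far beyond the slop
    mergeWithSlop(ranges, 16);
    for (Range r : ranges) {
      System.out.println(r.offset + "-" + r.end);
    }
  }
}
```

With a slop of 0 this sketch behaves like the strict overlap check quoted above and leaves the first two ranges separate; with a slop of 16 they become a single read, at the cost of fetching the 4 unused bytes in between.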


This message was sent by Atlassian JIRA
