hbase-issues mailing list archives

From "Chia-Ping Tsai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
Date Wed, 04 Oct 2017 09:13:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191023#comment-16191023 ]

Chia-Ping Tsai commented on HBASE-18752:
----------------------------------------

bq.  after this change the min and max timeRange both will be same?
No. What this patch fixes is the {{TimeRange}} recorded in the hfile. See {{TestHStore#testTimeRangeIfSomeCellsAreDroppedInFlush}}:
{code}
+  @Test
+  public void testTimeRangeIfSomeCellsAreDroppedInFlush() throws IOException {
+    init(this.name.getMethodName(), TEST_UTIL.getConfiguration(),
+        ColumnFamilyDescriptorBuilder.newBuilder(family).setMaxVersions(1).build());
+    long currentTs = 100;
+    final long minTs = currentTs;
+    // this cell won't be flushed to disk
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    // this cell won't be flushed to disk
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    flushStore(store, id++);
+
+    Collection<HStoreFile> files = store.getStorefiles();
+    assertEquals(1, files.size());
+    HStoreFile f = files.iterator().next();
+    f.initReader();
+    StoreFileReader reader = f.getReader();
+    assertEquals(currentTs - 1, reader.timeRange.getMin());
+    assertEquals(currentTs - 1, reader.timeRange.getMax());
+  }
{code}
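Before this change, the min of the time range was the initial {{currentTs}} ({{minTs}} in the
test), but the cell with that timestamp is not stored in the hfile because it is dropped during
the flush. That is a bug: it prevents us from filtering out unnecessary files before we start
reading the data blocks. After this patch, the hfile gets the correct min of the time range.
In this test the min and max happen to be equal only because a single cell survives the flush.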
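
To make the effect concrete, here is a minimal, self-contained sketch of the idea. It is not the patch and uses no HBase classes: {{Cell}}, {{Range}}, {{rangeOf}} and {{flush}} are hypothetical stand-ins, and {{flush}} only imitates max versions = 1 by keeping the newest cell per qualifier. It compares the range taken from the whole snapshot with the range recalculated from the cells that would actually be written.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: Cell, Range and flush() are hypothetical stand-ins, not HBase classes.
final class TimeRangeSketch {

  static final class Cell {
    final String qualifier;
    final long timestamp;

    Cell(String qualifier, long timestamp) {
      this.qualifier = qualifier;
      this.timestamp = timestamp;
    }
  }

  // Inclusive [min, max] range of timestamps.
  static final class Range {
    final long min;
    final long max;

    Range(long min, long max) {
      this.min = min;
      this.max = max;
    }

    // True if this range shares at least one timestamp with [from, to].
    boolean overlaps(long from, long to) {
      return min <= to && from <= max;
    }

    @Override
    public String toString() {
      return "[" + min + ", " + max + "]";
    }
  }

  // Computes the range by reading the timestamp of every cell; this per-cell
  // read is the extra cost of recalculating. For an empty list this returns
  // min > max; the sketch does not handle that case further.
  static Range rangeOf(List<Cell> cells) {
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    for (Cell c : cells) {
      min = Math.min(min, c.timestamp);
      max = Math.max(max, c.timestamp);
    }
    return new Range(min, max);
  }

  // Imitates a flush with max versions = 1: keeps only the newest cell per qualifier.
  static List<Cell> flush(List<Cell> snapshot) {
    Map<String, Cell> newest = new LinkedHashMap<>();
    for (Cell c : snapshot) {
      newest.merge(c.qualifier, c, (a, b) -> b.timestamp >= a.timestamp ? b : a);
    }
    return new ArrayList<>(newest.values());
  }

  public static void main(String[] args) {
    List<Cell> snapshot = new ArrayList<>();
    snapshot.add(new Cell("qf1", 100L)); // dropped by the flush
    snapshot.add(new Cell("qf1", 101L)); // dropped by the flush
    snapshot.add(new Cell("qf1", 102L)); // the only cell written

    Range fromSnapshot = rangeOf(snapshot);
    Range recalculated = rangeOf(flush(snapshot));

    System.out.println("range from snapshot: " + fromSnapshot);
    System.out.println("recalculated range:  " + recalculated);
    // A read restricted to timestamps [100, 101] can skip the file only if
    // the recorded range does not overlap the requested one.
    System.out.println("must open file (snapshot range):     " + fromSnapshot.overlaps(100L, 101L));
    System.out.println("must open file (recalculated range): " + recalculated.overlaps(100L, 101L));
  }
}
```

With the three cells from the test, the range taken from the snapshot is [100, 102], while the recalculated one is [102, 102]. A read for timestamps 100 to 101 therefore has to open the file in the first case and can skip it in the second.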


> Recalculate the TimeRange in flushing snapshot to store file
> ------------------------------------------------------------
>
>                 Key: HBASE-18752
>                 URL: https://issues.apache.org/jira/browse/HBASE-18752
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>             Fix For: 2.0.0-beta-1
>
>         Attachments: HBASE-18752.v0.patch
>
>
> We drop superfluous cells in flushing, hence the TimeRange from snapshot is inaccurate
> for the storefile. We should recalculate the TimeRange for the storefile, but the side-effect
> is the extra cost - we need to extract the timestamp from cell (ByteBufferCell).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
