hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2248) Provide new non-copy mechanism to assure atomic reads in get and scan
Date Thu, 08 Apr 2010 21:22:36 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855160#action_12855160

stack commented on HBASE-2248:

Why do this?

-      List<KeyValue> results = new ArrayList<KeyValue>();
-      store.get(get, null, results);
-      return new Result(results);
+      get.addFamily(family);
+      return get(get, null);

Is it because of this....up in HRegion:

+      // The reason why we set it up high is so that each HRegionScanner only
+      // has a single read point for all its sub-StoreScanners.
+      ReadWriteConsistencyControl.resetThreadReadPoint(rwcc);

This bit of the patch looks like its breaking the accumulation of qualifiers:

-            List<KeyValue> result = new ArrayList<KeyValue>(1);
-            Get g = new Get(kv.getRow());
-            g.setMaxVersions(count);
-            NavigableSet<byte []> qualifiers =
-              new TreeSet<byte []>(Bytes.BYTES_COMPARATOR);
-            qualifiers.add(qual);
-            get(store, g, qualifiers, result);
+            Get get = new Get(kv.getRow());
+            get.setMaxVersions(count);
+            get.addColumn(family, qual);
+            List<KeyValue> result = get(get);

Or is this no longer needed because we go back into Region.get rather than do a Store.get?

I don't like this if clause style:

+      if (w != null)
+      rwcc.completeMemstoreInsert(w);

I'd suggest you either wrap it in params or put it all on the one line. 

For example, undo changes like this I'd say:

-      if(lockid == null) releaseRowLock(lid);
+      if(lockid == null)
+        releaseRowLock(lid);


I think this needs a comment:

+    private int isScan;

or maybe where its assigned so its clear why it can't be a boolean though its named as though
it were one... maybe change its name?

I don't get this:

+      // TODO call the proper GET API
       // Get the old value:
       Get get = new Get(row);

or this:

+    //noinspection SuspiciousMethodCalls

Why this change Ryan?

-class KeyValueSkipListSet implements NavigableSet<KeyValue>, Cloneable {
+class KeyValueSkipListSet implements NavigableSet<KeyValue> {

Its crazy how much code you've removed from MemStore around #195.

Ok, thats enough for now

> Provide new non-copy mechanism to assure atomic reads in get and scan
> ---------------------------------------------------------------------
>                 Key: HBASE-2248
>                 URL: https://issues.apache.org/jira/browse/HBASE-2248
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3
>            Reporter: Dave Latham
>            Priority: Blocker
>             Fix For: 0.20.4
>         Attachments: HBASE-2248-demonstrate-previous-impl-bugs.patch, HBASE-2248-GetsAsScans3.patch,
HBASE-2248-rr-alpha1.txt, HBASE-2248-rr-alpha2.txt, HBASE-2248-rr-alpha3.txt, HBASE-2248-ryan.patch,
hbase-2248.gc, HBASE-2248.patch, hbase-2248.txt, readownwrites-lost.2.patch, readownwrites-lost.patch,
Screen shot 2010-02-23 at 10.33.38 AM.png, threads.txt
> HBASE-2037 introduced a new MemStoreScanner which triggers a ConcurrentSkipListMap.buildFromSorted
clone of the memstore and snapshot when starting a scan.
> After upgrading to 0.20.3, we noticed a big slowdown in our use of short scans.  Some
of our data repesent a time series.   The data is stored in time series order, MR jobs often
insert/update new data at the end of the series, and queries usually have to pick up some
or all of the series.  These are often scans of 0-100 rows at a time.  To load one page, we'll
observe about 20 such scans being triggered concurrently, and they take 2 seconds to complete.
 Doing a thread dump of a region server shows many threads in ConcurrentSkipListMap.biuldFromSorted
which traverses the entire map of key values to copy it.  

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message