hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2248) Provide new non-copy mechanism to assure atomic reads in get and scan
Date Thu, 08 Apr 2010 21:22:36 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855160#action_12855160
] 

stack commented on HBASE-2248:
------------------------------

Why do this?

{code}
-      List<KeyValue> results = new ArrayList<KeyValue>();
-      store.get(get, null, results);
-      return new Result(results);
+      get.addFamily(family);
+      return get(get, null);
{code}

Is it because of this....up in HRegion:

{code}
+      // The reason why we set it up high is so that each HRegionScanner only
+      // has a single read point for all its sub-StoreScanners.
+      ReadWriteConsistencyControl.resetThreadReadPoint(rwcc);
{code}

This bit of the patch looks like its breaking the accumulation of qualifiers:

{code}
-            List<KeyValue> result = new ArrayList<KeyValue>(1);
-            Get g = new Get(kv.getRow());
-            g.setMaxVersions(count);
-            NavigableSet<byte []> qualifiers =
-              new TreeSet<byte []>(Bytes.BYTES_COMPARATOR);
-            qualifiers.add(qual);
-            get(store, g, qualifiers, result);
+            Get get = new Get(kv.getRow());
+            get.setMaxVersions(count);
+            get.addColumn(family, qual);
+
+            List<KeyValue> result = get(get);
+
{code}

Or is this no longer needed because we go back into Region.get rather than do a Store.get?

I don't like this if clause style:

+      if (w != null)
+      rwcc.completeMemstoreInsert(w);

I'd suggest you either wrap it in params or put it all on the one line. 

For example, undo changes like this I'd say:

{code}
-      if(lockid == null) releaseRowLock(lid);
+      if(lockid == null)
+        releaseRowLock(lid);

{code}

I think this needs a comment:

+    private int isScan;

or maybe where its assigned so its clear why it can't be a boolean though its named as though
it were one... maybe change its name?

I don't get this:

{code}
+      // TODO call the proper GET API
       // Get the old value:
       Get get = new Get(row);
{code}

or this:

{code}
+    //noinspection SuspiciousMethodCalls
{code}

Why this change Ryan?

{code}
-class KeyValueSkipListSet implements NavigableSet<KeyValue>, Cloneable {
+class KeyValueSkipListSet implements NavigableSet<KeyValue> {
{code}

Its crazy how much code you've removed from MemStore around #195.

Ok, thats enough for now

> Provide new non-copy mechanism to assure atomic reads in get and scan
> ---------------------------------------------------------------------
>
>                 Key: HBASE-2248
>                 URL: https://issues.apache.org/jira/browse/HBASE-2248
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3
>            Reporter: Dave Latham
>            Priority: Blocker
>             Fix For: 0.20.4
>
>         Attachments: HBASE-2248-demonstrate-previous-impl-bugs.patch, HBASE-2248-GetsAsScans3.patch,
HBASE-2248-rr-alpha1.txt, HBASE-2248-rr-alpha2.txt, HBASE-2248-rr-alpha3.txt, HBASE-2248-ryan.patch,
hbase-2248.gc, HBASE-2248.patch, hbase-2248.txt, readownwrites-lost.2.patch, readownwrites-lost.patch,
Screen shot 2010-02-23 at 10.33.38 AM.png, threads.txt
>
>
> HBASE-2037 introduced a new MemStoreScanner which triggers a ConcurrentSkipListMap.buildFromSorted
clone of the memstore and snapshot when starting a scan.
> After upgrading to 0.20.3, we noticed a big slowdown in our use of short scans.  Some
of our data repesent a time series.   The data is stored in time series order, MR jobs often
insert/update new data at the end of the series, and queries usually have to pick up some
or all of the series.  These are often scans of 0-100 rows at a time.  To load one page, we'll
observe about 20 such scans being triggered concurrently, and they take 2 seconds to complete.
 Doing a thread dump of a region server shows many threads in ConcurrentSkipListMap.biuldFromSorted
which traverses the entire map of key values to copy it.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message