hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1025) Reconstruction log playback has no bounds on memory used
Date Thu, 24 Jun 2010 06:14:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882061#action_12882061
] 

HBase Review Board commented on HBASE-1025:
-------------------------------------------

Message from: stack@duboce.net


bq.  On 2010-06-22 16:07:23, Nicolas wrote:
bq.  > couple questions after reviewing.  didn't look at previous reviews first, so sorry
bq.  > if I duplicated commentary

Thanks for reviewing Nicolas.


bq.  On 2010-06-22 16:07:23, Nicolas wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/HConstants.java, line 143
bq.  > <http://review.hbase.org/r/179/diff/2/?file=1356#file1356line143>
bq.  >
bq.  >     don't you still want to keep around code to read oldlogfile.log & just remove
bq.  >     write path?  We're not changing Log file format between 0.20=>0.21, so a customer should
bq.  >     be able to cleanly upgrade.

I think the format has changed between 0.20 and 0.21, no? (we envelope all edits on a row
now, for example, whereas in 0.20 we just did edits as they came in).

So, reading in old WAL logs means migration: reading with a class that understands the
old format and converting to the new one.  But, at least in the past, the first requirement
for migrating has been a clean shutdown of the old HBase cluster.  On a clean shutdown, there
should be no WALs present.  In other words, we've always gone out of our way to avoid needing
to migrate WALs across *major* versions.


bq.  On 2010-06-22 16:07:23, Nicolas wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1897
bq.  > <http://review.hbase.org/r/179/diff/2/?file=1358#file1358line1897>
bq.  >
bq.  >     technically, it's -1 if no outstanding log edits exist.  you store the max sequence
bq.  >     ID even if you skip all the edits.

Good point.


bq.  On 2010-06-22 16:07:23, Nicolas wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1921
bq.  > <http://review.hbase.org/r/179/diff/2/?file=1358#file1358line1921>
bq.  >
bq.  >     is there a use case for putting HDFS in safe mode, then running HBase with hbase.skip.errors
bq.  >     to see the state of the cluster?  If so, fs.delete + fs.rename will both assert when this
bq.  >     is played on cluster restart.  Maybe you want to catch both and print errors?

Let me add the suggested print.

Regarding what HBase does when the FS under it flips out, there is
https://issues.apache.org/jira/browse/HBASE-2183 for looking into this.


bq.  On 2010-06-22 16:07:23, Nicolas wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1981
bq.  > <http://review.hbase.org/r/179/diff/2/?file=1358#file1358line1981>
bq.  >
bq.  >     do you want to update the currentEditSeqId even if it's from the wrong family?
bq.  >     just making sure.

Yes.  I think that's the right thing to do.  As we move through the log, the seqid is
increasing regardless.
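
To illustrate the point, here is a minimal sketch. It is not the HRegion code under
review: the Edit class, the replay method, and the starting value of -1 are all
hypothetical, chosen only to show the sequence id advancing before the family check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from HRegion.
final class ReplaySeqIdSketch {
    /** Stand-in for one log edit: a sequence id plus the column family it targets. */
    static final class Edit {
        final long seqId;
        final String family;

        Edit(long seqId, String family) {
            this.seqId = seqId;
            this.family = family;
        }
    }

    /** Sequence ids of the edits that were applied, in log order. */
    final List<Long> applied = new ArrayList<>();

    /** Highest sequence id seen so far, whether or not the edit was applied. */
    long currentEditSeqId = -1; // assumed starting value for this sketch

    void replay(List<Edit> log, Set<String> knownFamilies) {
        for (Edit e : log) {
            // Advance first: the id grows as we move through the log, regardless of family.
            currentEditSeqId = Math.max(currentEditSeqId, e.seqId);
            if (!knownFamilies.contains(e.family)) {
                continue; // edit targets a family this region does not have; skip it
            }
            applied.add(e.seqId);
        }
    }
}
```

With a log holding edits 5 (family "a"), 6 and 7 (an unknown family), only edit 5 is
applied but currentEditSeqId still ends at 7.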


bq.  On 2010-06-22 16:07:23, Nicolas wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 2002
bq.  > <http://review.hbase.org/r/179/diff/2/?file=1358#file1358line2002>
bq.  >
bq.  >     do we want the option to store this HLog for post-mortem in this case?  we're
bq.  >     talking about CF-level, so this couldn't happen because of region splitting, right?

This condition should never happen.  The only reason it might is if the schema was edited
between log creation and the new deploy.  It'd be cumbersome to add a keep-log option at this
stage of the processing.  Should I open an issue?


bq.  On 2010-06-22 16:07:23, Nicolas wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 2018
bq.  > <http://review.hbase.org/r/179/diff/2/?file=1358#file1358line2018>
bq.  >
bq.  >     would it make more sense to have the interval be in seconds instead of count,
bq.  >     then have the update give the edit count?  Or is the difference in restoring large edits (~50k)
bq.  >     versus small ones inconsequential?

You are right.
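
A rough sketch of what a time-based interval could look like, assuming nothing about
the actual patch: the class name, the replay method, and the injected clock are
hypothetical. The report fires when enough time has passed and carries the running
edit count, so large and small edits produce updates at the same cadence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical sketch; not taken from the patch.
final class ReplayProgressSketch {
    /**
     * Walks totalEdits edits and records a progress message whenever at least
     * intervalMs milliseconds have passed since the previous message.
     * The clock is injected so the behaviour can be exercised without sleeping.
     */
    static List<String> replay(int totalEdits, long intervalMs, LongSupplier clockMs) {
        List<String> reports = new ArrayList<>();
        long lastReport = clockMs.getAsLong();
        for (int i = 0; i < totalEdits; i++) {
            // Applying edit i would happen here.
            long now = clockMs.getAsLong();
            if (now - lastReport >= intervalMs) {
                reports.add("applied " + (i + 1) + " edits");
                lastReport = now;
            }
        }
        return reports;
    }
}
```

In real use the clock argument would be something like System::currentTimeMillis.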


- stack


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/179/#review269
-----------------------------------------------------------





> Reconstruction log playback has no bounds on memory used
> --------------------------------------------------------
>
>                 Key: HBASE-1025
>                 URL: https://issues.apache.org/jira/browse/HBASE-1025
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.21.0
>
>         Attachments: 1025-v2.txt, 1025-v3.txt, 1025-v5.patch, 1025-v8.txt, 1025.txt
>
>
> Makes a TreeMap and just keeps adding edits without regard for size of edits applied;
> could cause OOME (I've not seen a definitive case though have seen an OOME around time of
> a reconstructionlog replay -- perhaps this is the straw that broke the flea's antlers?)
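
For illustration only, one way to bound the memory such a replay uses is to track the
bytes held in the map and flush once a threshold is crossed. This is a minimal sketch
and not the fix attached to the issue; the class name, the byte threshold, and the
flush behaviour are assumptions.

```java
import java.util.TreeMap;

// Hypothetical sketch; not the code attached to HBASE-1025.
final class BoundedReplaySketch {
    private final TreeMap<Long, byte[]> buffer = new TreeMap<>();
    private final long maxBufferedBytes;
    private long bufferedBytes = 0;
    private int flushes = 0;

    BoundedReplaySketch(long maxBufferedBytes) {
        this.maxBufferedBytes = maxBufferedBytes;
    }

    /** Buffers one edit and flushes once the buffered size reaches the limit. */
    void add(long seqId, byte[] edit) {
        byte[] previous = buffer.put(seqId, edit);
        if (previous != null) {
            bufferedBytes -= previous.length; // replaced entry no longer counts
        }
        bufferedBytes += edit.length;
        if (bufferedBytes >= maxBufferedBytes) {
            flush();
        }
    }

    /** A real implementation would persist the buffered edits before clearing. */
    void flush() {
        buffer.clear();
        bufferedBytes = 0;
        flushes++;
    }

    long bufferedBytes() {
        return bufferedBytes;
    }

    int flushes() {
        return flushes;
    }
}
```

With a 100-byte limit, three 40-byte edits trigger one flush on the third add, so the
buffer never grows past the limit by more than one edit.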

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

