hbase-issues mailing list archives

From "nkeywal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7948) client doesn't need to refresh meta while the region is opening
Date Tue, 05 Mar 2013 21:24:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593930#comment-13593930 ]

nkeywal commented on HBASE-7948:
--------------------------------

bq. Can you please elaborate about more dangerous parts? 
I was thinking about the code that we're slowly removing with HBASE-8002. It had three side effects:
1) It was hurting performance. That has been fixed in numerous patches, but there are
still scary comments and issues (HBASE-7247).
2) It was hiding issues. In the tests we had a very low timeout, so master failover scenarios
seemed to be working. In production, we were depending on a 10-minute timeout without
knowing it.
3) It was causing double assignment issues, i.e. data corruption.

This was exactly the same logic (don't trust the RS), with more dramatic consequences.

bq.  In my experience it's not safe to trust anything forever as a general principle, not
because I think RS code is unreliable.
I'm not against this, but in this case we need to tackle it the standard way: watchdog the
process, and exclude the fuzzy ones from the group. Before doing this, I would like to see
the chaos monkey test working with kill -9 for a while (I doubt it does today :-( ).
But I agree with your point, and we will have this sooner or later (BTW, it's exactly why there
are checksums in hdfs: because you can't trust the storage).

bq.  But for client there's no data loss potential from flushing the cache, but there's potential
to be stuck forever in case of abnormal RS behavior. With remote things I prefer to be defensive
on all sides if practical 

Yeah. I'm likely biased. So, imho
- the patch is an improvement. HBase is better with this patch than without.
- it would be simpler without the RS-trust part
- to me, at the margin and under degraded conditions, it would also be more efficient without
the RS-trust part.

As we need to make progress :-), I propose:
1) If you're not against the idea of removing the RS-trust part, we're done.
2) If you really want to keep it, let's wait a few days in case someone wants to weigh in. If no
one does, let's commit on Friday.

What do you think?


                
> client doesn't need to refresh meta while the region is opening
> ---------------------------------------------------------------
>
>                 Key: HBASE-7948
>                 URL: https://issues.apache.org/jira/browse/HBASE-7948
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HBASE-7948-v0.patch, HBASE-7948-v1.patch, HBASE-7948-v1.patch, HBASE-7948-v2.patch
>
>


