hbase-issues mailing list archives

From "Himanshu Vashishtha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6774) Immediate assignment of regions that don't have entries in HLog
Date Tue, 30 Apr 2013 21:26:17 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646005#comment-13646005 ]

Himanshu Vashishtha commented on HBASE-6774:
--------------------------------------------

Thanks Enis.

Yes, the WAL approach is also an option, but I think both have their own pluses and minuses.
I proposed the ServerLoad approach because it is self-contained, doesn't involve any changes
to WAL/SequenceFile, etc., and reuses the existing ServerLoad object.

In the WAL metadata case, some metadata has to be appended at the end of each WAL file. This involves
adding a custom key-value while closing the WAL file, and a check while reading every record
(whether it is a meta record or not, etc.).
Since it is added at the end, the master needs to open a reader and seek to the end of
the file. This metadata has to be read for all the log files, sequentially, starting
from the oldest WAL file, in order to track a region timeline. This is in addition to reading
the last WAL file.
For an application with a high write rate, a regionserver may have a large number of WALs to
replay.
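
To make the cost of the WAL metadata approach concrete, here is a minimal in-memory sketch of the master-side pass described above. It is only an illustration under stated assumptions: the class WalTrailerScan, the ClosedWal type, and the idea that the trailer holds the set of regions with edits in that file are all hypothetical and are not HBase APIs. The last WAL has no trailer, so its entries are assumed to come from a full read.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory model; none of these names are HBase APIs. */
class WalTrailerScan {

    /** A closed WAL whose (assumed) trailer lists the regions with edits in that file. */
    static final class ClosedWal {
        final String name;
        final Set<String> regionsInTrailer;

        ClosedWal(String name, String... regions) {
            this.name = name;
            this.regionsInTrailer = new TreeSet<>(Arrays.asList(regions));
        }
    }

    /**
     * Returns the regions of the dead server that appear in no WAL at all.
     * Closed WALs are consulted through their trailers, oldest first. The last
     * WAL has no trailer (the server crashed before closing it), so the regions
     * of its entries are passed in as obtained from a full read.
     */
    static Set<String> regionsWithoutEdits(Set<String> regionsOnDeadServer,
                                           List<ClosedWal> closedWalsOldestFirst,
                                           List<String> regionsOfLastWalEntries) {
        Set<String> withEdits = new TreeSet<>();
        for (ClosedWal wal : closedWalsOldestFirst) {
            // Stands in for one open-and-seek-to-end read per closed file.
            withEdits.addAll(wal.regionsInTrailer);
        }
        // Stands in for the full read of the unclosed last file.
        withEdits.addAll(regionsOfLastWalEntries);

        Set<String> withoutEdits = new TreeSet<>(regionsOnDeadServer);
        withoutEdits.removeAll(withEdits);
        return withoutEdits;
    }

    public static void main(String[] args) {
        Set<String> hosted = new TreeSet<>(Arrays.asList("r1", "r2", "r3", "r4"));
        List<ClosedWal> closed = Arrays.asList(
                new ClosedWal("wal.1", "r1"),
                new ClosedWal("wal.2", "r1", "r2"));
        List<String> lastWal = Arrays.asList("r2", "r2");
        System.out.println(regionsWithoutEdits(hosted, closed, lastWal));
    }
}
```

The loop over the closed WALs is where the per-file reader open and seek would happen, so the work grows with the number of WALs the dead regionserver left behind.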

Another point is that, IMHO, this feature should be configurable, as there might be some workloads
that do not require it (writes distributed over the whole key space, etc.). With the WAL approach,
it becomes a bit tricky to make this feature optional, as it inserts metadata into
the WAL. With a meta entry in a WAL file, LogReader must always be aware of such entries,
be it ReplicationLogReaders or LogSplitter, as they might be reading some old logs, etc.

bq. It seems that this can work, but the relative gain may not be that much to justify it.
This is just an alternative approach to the WAL one, and I think it is less intrusive. But
I am open to both and would like to hear more of your opinions on the above points. 

                
> Immediate assignment of regions that don't have entries in HLog
> ---------------------------------------------------------------
>
>                 Key: HBASE-6774
>                 URL: https://issues.apache.org/jira/browse/HBASE-6774
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, regionserver
>    Affects Versions: 0.95.2
>            Reporter: Nicolas Liochon
>            Assignee: Himanshu Vashishtha
>         Attachments: HBase-6774-approach.pdf
>
>
> Today, after failure detection, the algorithm is:
> - split the logs
> - when all the logs are split, assign the regions
> But some regions can have no entries at all in the HLog. There are many reasons for this:
> - reference or historical tables: bulk written occasionally, then read only.
> - sequential rowkeys. In this case, most of the regions will be read only, but they can
> be on a regionserver with a lot of writes.
> - tables flushed often for safety reasons. I'm thinking about meta here.
> For meta, we can imagine flushing very often. Hence the recovery time for meta will, in many
> cases, be the failure detection time.
> There are different possible algorithms:
> Option 1)
>  A new task is added, in parallel with the split. This task reads all the HLogs. If there
> is no entry for a region, this region is assigned.
>  Pro: simple
>  Cons: We will need to read all the files. Adds a read.
> Option 2)
>  The master writes in ZK the number of log files, per region.
>  When the regionserver starts the split, it reads the full block (64M) and decreases the
> log file counter of the region. If it reaches 0, the assignment starts. At the end of its split,
> the region server decreases the counter as well. This allows the assignment to start even if not
> all the HLogs are finished. It would make some regions available even if we have an
> issue in one of the log files.
>  Pro: parallel
>  Cons: adds something to do for the region server. Requires reading the whole file before
> starting to write.
> Option 3)
>  Add some metadata at the end of the log file. The last log file won't have metadata,
> because if we are recovering, the server crashed. But the others will. And the last log
> file should be smaller (half a block on average).
> Option 4) Still some metadata, but in a different file. Cons: writes are increased (but
> not by much, we just need to write the region once). Pros: if we lose the HLog files (major
> failure, no replica available) we can still continue with the regions that were not written
> at this stage.
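
One possible reading of the Option 2 counters, sketched in memory rather than in ZooKeeper. This is an interpretation, not the proposed implementation: the class RegionLogCounters and its method names are hypothetical. It assumes each region's counter starts at the number of log files, is decremented right after a log has been read if the region has no entry in it, and is otherwise decremented when the split of that log finishes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for the per-region counters kept in ZK. */
class RegionLogCounters {
    private final Map<String, Integer> remaining = new LinkedHashMap<>();
    private final List<String> assigned = new ArrayList<>();

    RegionLogCounters(Collection<String> regions, int logFileCount) {
        for (String region : regions) {
            if (logFileCount <= 0) {
                assigned.add(region);          // nothing to split, assign at once
            } else {
                remaining.put(region, logFileCount);
            }
        }
    }

    /** Called once a log file has been read fully, before any edits are written. */
    void onLogRead(Set<String> regionsSeenInLog) {
        for (String region : new ArrayList<>(remaining.keySet())) {
            if (!regionsSeenInLog.contains(region)) {
                decrement(region);             // this log holds nothing for the region
            }
        }
    }

    /** Called once the edits of that log file have been written out. */
    void onLogSplitDone(Set<String> regionsSeenInLog) {
        for (String region : regionsSeenInLog) {
            decrement(region);
        }
    }

    private void decrement(String region) {
        Integer left = remaining.get(region);
        if (left == null) {
            return;                            // unknown or already assigned
        }
        if (left == 1) {
            remaining.remove(region);
            assigned.add(region);              // counter reached 0: assignment can start
        } else {
            remaining.put(region, left - 1);
        }
    }

    List<String> assignedInOrder() {
        return assigned;
    }

    public static void main(String[] args) {
        RegionLogCounters counters = new RegionLogCounters(Arrays.asList("r1", "r2", "r3"), 2);
        Set<String> log1 = new TreeSet<>(Arrays.asList("r1"));
        Set<String> log2 = new TreeSet<>(Arrays.asList("r1", "r2"));
        counters.onLogRead(log1);
        counters.onLogRead(log2);
        System.out.println("assignable before any split finished: " + counters.assignedInOrder());
        counters.onLogSplitDone(log1);
        counters.onLogSplitDone(log2);
        System.out.println("assignment order: " + counters.assignedInOrder());
    }
}
```

In this model a region with no entries in any log becomes assignable as soon as every log has been read, while regions with edits wait only for the logs that actually contain them.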
> I think it should be done, even if none of the algorithms above is totally convincing
> yet. It's linked as well to locality and short-circuit reads: with these two points, reading
> the file twice becomes much less of an issue, for example. My current preference would be to
> open the file twice in the region server, once for splitting as today, once for a quick
> read looking for unused regions. Who knows, maybe it would even be faster this way: the quick
> read thread would warm up the different caches for the splitting thread.
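
The "open the file twice" preference above could look roughly like the following sketch, which runs a quick read and the regular split as two tasks over the same log. Everything here is assumed for illustration: QuickReadPass, LogEntry, unusedRegions and split are hypothetical names, the log is an in-memory list instead of a file, and any cache warm-up effect is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical model of a quick read running next to the regular split. */
class QuickReadPass {

    static final class LogEntry {
        final String region;
        final String edit;

        LogEntry(String region, String edit) {
            this.region = region;
            this.edit = edit;
        }
    }

    /** Quick read: looks only at region names and returns the regions with no entry. */
    static Set<String> unusedRegions(Set<String> hostedRegions, List<LogEntry> log) {
        Set<String> unused = new TreeSet<>(hostedRegions);
        for (LogEntry entry : log) {
            unused.remove(entry.region);
        }
        return unused;
    }

    /** Regular split: groups the edits per region, keeping log order. */
    static Map<String, List<String>> split(List<LogEntry> log) {
        Map<String, List<String>> editsByRegion = new TreeMap<>();
        for (LogEntry entry : log) {
            editsByRegion.computeIfAbsent(entry.region, k -> new ArrayList<>()).add(entry.edit);
        }
        return editsByRegion;
    }

    public static void main(String[] args) throws Exception {
        Set<String> hosted = new TreeSet<>(Arrays.asList("r1", "r2", "r3"));
        List<LogEntry> log = Arrays.asList(
                new LogEntry("r1", "put a"),
                new LogEntry("r1", "put b"),
                new LogEntry("r3", "put c"));

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Both tasks read the same log independently, like two readers on one file.
            Future<Set<String>> quick = pool.submit(() -> unusedRegions(hosted, log));
            Future<Map<String, List<String>>> full = pool.submit(() -> split(log));
            System.out.println("assign immediately: " + quick.get());
            System.out.println("recovered edits: " + full.get());
        } finally {
            pool.shutdown();
        }
    }
}
```

The quick read does no writing, so its result can be acted on without waiting for the split task to finish.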

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
