hbase-issues mailing list archives

From "andychen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3040) BlockIndex readIndex too slowly in heavy write scenario
Date Sun, 26 Sep 2010 04:30:32 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914918#action_12914918 ]

andychen commented on HBASE-3040:
---------------------------------

In HFile.loadFileInfo, we can calculate the total size of all indices (both the data block
index and the meta block index) from the trailer's information:
int allIndexSize = (int)(this.fileSize - this.trailer.dataIndexOffset - FixedFileTrailer.trailerSize());

Then we add a new function, readAllIndex, which loads the data and meta block indices
with a single DFS read:
byte[] dataAndMetaIndex = readAllIndex(this.istream, this.trailer.dataIndexOffset, allIndexSize);
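
The two steps above can be sketched as follows. This is a minimal illustration, not the actual patch: the class name ReadAllIndexSketch is hypothetical, and a local java.io.RandomAccessFile stands in for the DFS input stream so the example runs without Hadoop. It assumes, as the size calculation above implies, that the indices occupy the contiguous range between dataIndexOffset and the fixed-size trailer.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch; names and the RandomAccessFile stand-in are assumptions.
class ReadAllIndexSketch {

    // Size of the region holding every index: everything from the start of
    // the data index up to (but not including) the fixed-size trailer.
    static int allIndexSize(long fileSize, long dataIndexOffset, int trailerSize) {
        // toIntExact fails loudly instead of silently truncating on overflow.
        return Math.toIntExact(fileSize - dataIndexOffset - trailerSize);
    }

    // Fetches the whole index region with one seek and one bulk read,
    // instead of issuing a separate read per index entry.
    static byte[] readAllIndex(RandomAccessFile in, long offset, int size)
            throws IOException {
        byte[] buf = new byte[size];
        in.seek(offset);
        in.readFully(buf); // blocks until all `size` bytes have arrived
        return buf;
    }
}
```

In a region server, a positioned read on the DFS input stream would presumably play the role of the seek plus readFully shown here.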

Now we can extract all index data from local memory instead of from a remote datanode.
Previously, the region server used readIndex to load index data from the datanode; with that
function, there may be 10000 network round trips when one storefile has 10000 blocks.
So we add another function, readIndexEx, which reads from the local buffer returned by
readAllIndex above.
In our test case, a region server loading about 1000 block indices consistently spends only
several microseconds.
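
To illustrate the idea behind readIndexEx, here is a minimal sketch of parsing index entries out of an in-memory buffer with no further I/O. The class name, the IndexEntry type, and the entry layout (8-byte block offset, 4-byte block size, 4-byte key length, key bytes) are all hypothetical and chosen for readability; this is not HFile's actual on-disk index format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the entry layout below is an assumption for illustration.
class ReadIndexExSketch {

    static final class IndexEntry {
        final long blockOffset;
        final int blockSize;
        final byte[] firstKey;

        IndexEntry(long blockOffset, int blockSize, byte[] firstKey) {
            this.blockOffset = blockOffset;
            this.blockSize = blockSize;
            this.firstKey = firstKey;
        }
    }

    // Parses `count` entries from `buf`, starting at byte `start`.
    // Every read is served from local memory, so no network round trip occurs.
    static List<IndexEntry> readIndexEx(byte[] buf, int start, int count)
            throws IOException {
        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buf, start, buf.length - start));
        List<IndexEntry> entries = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            long offset = in.readLong();
            int size = in.readInt();
            byte[] key = new byte[in.readInt()];
            in.readFully(key);
            entries.add(new IndexEntry(offset, size, key));
        }
        return entries;
    }

    // Demo helper: encodes entries in the same hypothetical layout.
    static byte[] encode(List<IndexEntry> entries) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (IndexEntry e : entries) {
            out.writeLong(e.blockOffset);
            out.writeInt(e.blockSize);
            out.writeInt(e.firstKey.length);
            out.write(e.firstKey);
        }
        out.flush();
        return bytes.toByteArray();
    }
}
```

With this shape, the per-entry cost is a few memory reads, while the single bulk read done earlier is the only operation that touches the datanode.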

> BlockIndex readIndex too slowly in heavy write scenario
> -------------------------------------------------------
>
>                 Key: HBASE-3040
>                 URL: https://issues.apache.org/jira/browse/HBASE-3040
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.20.6
>         Environment: 1 master, 7 region servers, 4 * 7 clients (all clients run on region server hosts), sequential put
>            Reporter: andychen
>
> Region size is configured as 128 MB, block size is 64 KB, and the table has 5 column families.
> At the beginning, when a region splits, the master assigns the daughters to new region servers and the new region server opens the region; readIndex on this region's storefile (about 1000 blocks) takes 30~50 ms. As the data import continues, the region server spends more and more time (sometimes up to several seconds) loading 1000 block indices.
> For now, we resolve this issue by getting all indices of one HFile within one DFS read instead of 1000 reads.
> Is there a better solution?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

