hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-12295) Prevent block eviction under us if reads are in progress from the BBs
Date Fri, 29 May 2015 12:15:17 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564311#comment-14564311 ]

Anoop Sam John edited comment on HBASE-12295 at 5/29/15 12:14 PM:
------------------------------------------------------------------

bq.We'll have to dig in on why. You'd think w/ less intermediaries that it would be faster.
It is probably the cost at the socket layer, since we would need N transfers instead of one.
The one-time transfer looked better even though it needs a temp copy.

Regarding knowing whether a block is from L1 or L2 by looking at the key: this information
is actually a state of HFileBlock.  We have added it as an enum, L1/L2/NOT_CACHED.  Based on
this type, we decide at the HFileScanner layer (on close) whether to call return on the
BlockCache.  Within the BlockCache impl we might also need to know the type; this is for
CombinedBC.  If it is L2, we call the BucketCache return, else we call the LRU cache return.
If we add the L1/L2 info to BlockCacheKey as well, I am not sure it looks clean. BlockCacheKey
is something we create while fetching the block from the BC. On return, we could pass the
info by setting it in BlockCacheKey, but the key would then act only as a carrier.
Or maybe we can use the HFileBlock object alone in the return API? Using a key we got an
object from a cache, and we return *that* object back to the cache.  It is always possible
to make the BlockCacheKey from the HFileBlock.
bq. You going to mark the object as from L2 or something
Yes. HFileBlock will contain state info on whether it is from L1, from L2, or NOT_CACHED.
With CombinedBC, the HFileReader asks the cache for a block and gets back the HFileBlock, so
it cannot tell whether the block came from L1 or L2. So it is better to set this as state
info in HFileBlock.
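
A minimal Java sketch of this idea, not the actual patch: the block carries its cache origin
as state, the key is rebuilt from the block, and a combined cache routes the return call to
the matching tier. `CacheOrigin`, `Block`, `CombinedCacheSketch`, and `returnBlock` are
hypothetical names used only for illustration; they are not HBase classes or methods.

```java
import java.util.ArrayList;
import java.util.List;

/** Where a block was served from; mirrors the L1/L2/NOT_CACHED enum described above. */
enum CacheOrigin { L1, L2, NOT_CACHED }

/** Hypothetical stand-in for HFileBlock that carries its cache origin as state. */
final class Block {
    final String hfileName;
    final long offset;
    final CacheOrigin origin;

    Block(String hfileName, long offset, CacheOrigin origin) {
        this.hfileName = hfileName;
        this.offset = offset;
        this.origin = origin;
    }

    /** The cache key can always be rebuilt from the block itself. */
    String cacheKey() {
        return hfileName + "_" + offset;
    }
}

/** Hypothetical combined cache that routes a returned block to the tier it came from. */
final class CombinedCacheSketch {
    final List<String> l1Returns = new ArrayList<>();
    final List<String> l2Returns = new ArrayList<>();

    /** Returns true if a cache tier was notified, false for NOT_CACHED blocks. */
    boolean returnBlock(Block block) {
        switch (block.origin) {
            case L2:
                l2Returns.add(block.cacheKey()); // stands in for the BucketCache return
                return true;
            case L1:
                l1Returns.add(block.cacheKey()); // stands in for the LRU cache return
                return true;
            default:
                return false;                    // nothing to hand back
        }
    }
}
```

With this shape the caller never builds a key just to carry the L1/L2 flag; it passes back
the same object it received.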

On carrying the cellBlock in Result, I am not sure.  At the HRegion level, get() returns a
Result but the scanner returns a List of Cells.  Then at the RSRpcServices level, we call it
in a loop to build as many rows/results as the caching/max size limit allows.  Even if we
make the scan path return a Result too, it adds the overhead of creating a smaller cellBlock
buffer for each row, so in the end we have to deal with many more small block buffers.
It would be better to collect all rows and then make a single cellBlock at once for the scan
case. Does that make sense?  I agree with your point of not passing RPC stuff down to the
HRegion level. We have to see what else we can do to return this payload.
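
To illustrate the single-cellBlock point, here is a rough Java sketch, assuming cells are
plain byte arrays and using a simple length-prefixed layout. `CellBlockSketch` and
`buildSingleCellBlock` are hypothetical names, and the layout is not the real HBase cellBlock
encoding; the only point is one buffer allocation for the whole scan batch instead of one
per row.

```java
import java.nio.ByteBuffer;
import java.util.List;

/** Hypothetical helper: encode all rows of a scan batch into one buffer. */
final class CellBlockSketch {

    /** Writes every cell of every row as [int length][bytes] into a single buffer. */
    static ByteBuffer buildSingleCellBlock(List<List<byte[]>> rows) {
        // First pass: size the buffer once for the whole batch.
        int total = 0;
        for (List<byte[]> row : rows) {
            for (byte[] cell : row) {
                total += Integer.BYTES + cell.length;
            }
        }
        // Second pass: one allocation, then sequential writes.
        ByteBuffer buf = ByteBuffer.allocate(total);
        for (List<byte[]> row : rows) {
            for (byte[] cell : row) {
                buf.putInt(cell.length);
                buf.put(cell);
            }
        }
        buf.flip(); // make the buffer readable from the start
        return buf;
    }
}
```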

I think I now get what you have in mind with finalize/close on Result and handling things
that way.  Right now, when we get a block from the BC, we increase its ref count by 1, meaning
one scanner is working on it. With this suggestion, whenever we create a cell from this
block, we would have to increment the ref count again, something like Java-style reference
counting.  The only question is that Result/Cell is a client-side thing, and I am not sure
how we can add server-only BlockCache/HFileBlock...  But this would avoid the most copying.
Thinking more...
When the cell is written to the stream, we have to close/finalize it. If the cell is filtered
out, we have to do the same.  It can also get filtered out of the output result by
filterRow(List<Cell>)...  If the cell is transformed by Filter#transformCell(Cell v), we have
to do the same on the old cell.  I would say this adds more complexity and a good chance that
we miss a close.  What do you think?
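
A small Java sketch of the per-cell ref counting discussed above, under the assumption that
each cell created from a block takes one reference and gives it back on close.
`RefCountedBlock` and `BlockBackedCell` are hypothetical names, not HBase classes; a real
version would also have to cover every path where a cell is dropped (written to the stream,
filtered, or replaced by a transformed cell).

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical cached block that may be evicted only when nobody holds a reference. */
final class RefCountedBlock {
    private final AtomicInteger refCount = new AtomicInteger();

    void retain() {
        refCount.incrementAndGet();
    }

    void release() {
        if (refCount.decrementAndGet() < 0) {
            throw new IllegalStateException("release without matching retain");
        }
    }

    boolean evictable() {
        return refCount.get() == 0;
    }

    int count() {
        return refCount.get();
    }
}

/** Hypothetical cell backed by a block; holds one reference until it is closed. */
final class BlockBackedCell implements AutoCloseable {
    private final RefCountedBlock block;
    private boolean closed;

    BlockBackedCell(RefCountedBlock block) {
        this.block = block;
        block.retain(); // one extra reference per cell created from the block
    }

    /** Idempotent, so a double close on one path does not corrupt the count. */
    @Override
    public void close() {
        if (!closed) {
            closed = true;
            block.release();
        }
    }
}
```

The block stays pinned until the scanner and every cell have released it, which is exactly
where a missed close would leave the block unevictable.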





> Prevent block eviction under us if reads are in progress from the BBs
> ---------------------------------------------------------------------
>
>                 Key: HBASE-12295
>                 URL: https://issues.apache.org/jira/browse/HBASE-12295
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 2.0.0
>
>         Attachments: HBASE-12295.pdf, HBASE-12295_trunk.patch
>
>
> While we try to serve the reads from the BBs directly from the block cache, we need to
ensure that the blocks do not get evicted under us while reading.  This JIRA is to discuss
and implement a strategy for the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
