hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11425) Cell/DBB end-to-end on the read-path
Date Tue, 10 Mar 2015 18:53:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355463#comment-14355463
] 

Anoop Sam John commented on HBASE-11425:
----------------------------------------

bq.This ain't right, is it? Usually we have folks hover just below 32G so can do compressed
pointers.
I think I have seen in mails that some users run with 48G as well, at least some users who
were trying out PoCs (whom I met offline). Anyway, we can change it to ~32G.  :-)
bq.Should bucket size be same as the hfile block size?
I wanted to come to this topic. This is hard coded now. Can this be made configurable? If
this can be a larger value, like the block size, that is better for our changes.
bq.Can MBB be developed in isolation with tests and refcounting tests apart from main code
base? Is that being done?
Yep. When we put up patches, we can make sure to do it this way. Those will be in sub-tasks.
bq.The eviction is now made more complicated because have to check for non-zero refcount?
And what if can't find necessary memory? What happens?
The eviction tries to evict some unused blocks. If all blocks are in use by reads (worst case),
the new block cannot be cached. Maybe that should be retried after a delay?
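
A minimal sketch of the refcount-aware eviction described above, assuming a simple
insertion-ordered cache. The class and method names (RefCountedCacheSketch, getAndRetain,
release, cache) are hypothetical and are not HBase APIs; the real BucketCache logic may differ.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch, not an HBase class: eviction that skips blocks under read. */
class RefCountedCacheSketch {
    static final class Block {
        final byte[] data;
        int refCount; // number of in-progress reads holding this block
        Block(int size) { this.data = new byte[size]; }
    }

    private final long capacity;
    private long used;
    // Insertion-ordered map, so the oldest blocks are visited first during eviction.
    private final Map<String, Block> blocks = new LinkedHashMap<>();

    RefCountedCacheSketch(long capacity) { this.capacity = capacity; }

    /** Looks up a block and marks it as being read; returns null on a miss. */
    synchronized Block getAndRetain(String key) {
        Block b = blocks.get(key);
        if (b != null) b.refCount++;
        return b;
    }

    /** Called by a reader once it is done with the block. */
    synchronized void release(Block b) { b.refCount--; }

    /** Returns false when not enough unreferenced blocks could be evicted. */
    synchronized boolean cache(String key, int size) {
        if (blocks.containsKey(key)) return true;
        Iterator<Block> it = blocks.values().iterator();
        while (used + size > capacity && it.hasNext()) {
            Block b = it.next();
            if (b.refCount == 0) { // blocks with a non-zero refcount are skipped
                used -= b.data.length;
                it.remove();
            }
        }
        if (used + size > capacity) return false; // worst case: caller may retry later
        blocks.put(key, new Block(size));
        used += size;
        return true;
    }
}
```

In this sketch a failed cache() call simply reports false, which leaves the "retry after a
delay" decision to the caller.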
bq.Why not? We copy from the LRU blocks to Cell arrays? Couldn't Cells go against the LRU
blocks directly too? Or I have it wrong?
In the LRU we cache the block object itself. It has its own underlying memory. Even if a
block with a read in progress is evicted, the memory area it refers to is not freed. The only
difference is that after this read the block will no longer be referenced, and so neither will
its data area. Am I making it clear?
bq.I don't see a downside listing that we'll be doubling the objects made when offheap reading.
Is that right?
Say a read deals with N HFileBlocks; we will have extra MBB objects created for each block.
But per cell we won't create any new objects. In comparators etc., we check hasArray() and
based on that use the buffer-based or array-based APIs.  When creating BB-backed cells from
an HFileBlock which is backed by an MBB, we try our best to refer to the original BB (an item
in the MBB) and not create/duplicate extra BBs.  But yes, some extra objects will be there
(duplicated BBs).  I can give a count based on a test scenario, Stack. I was in the middle of
something else and missed doing this.
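
To illustrate the hasArray() check in comparators, here is a rough sketch that uses plain
java.nio.ByteBuffer. The class HasArrayCompareSketch and its methods are made up for this
example and are not the CellComparator code.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: pick the array-based or buffer-based path via hasArray(). */
class HasArrayCompareSketch {

    /** Unsigned lexicographic compare of two ranges, without creating new objects. */
    static int compare(ByteBuffer a, int aOff, int aLen, ByteBuffer b, int bOff, int bLen) {
        if (a.hasArray() && b.hasArray()) {
            // Both sides expose a backing array: use the array-based path.
            return compareArrays(a.array(), a.arrayOffset() + aOff, aLen,
                                 b.array(), b.arrayOffset() + bOff, bLen);
        }
        // At least one side has no accessible array (e.g. a direct buffer):
        // use absolute gets, which do not touch position or limit.
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a.get(aOff + i) & 0xff;
            int y = b.get(bOff + i) & 0xff;
            if (x != y) return x - y;
        }
        return aLen - bLen;
    }

    static int compareArrays(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff;
            int y = b[bOff + i] & 0xff;
            if (x != y) return x - y;
        }
        return aLen - bLen;
    }
}
```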
bq. have to read from the MemStore so this means that read path can be a mix of onheap and
offheap results?
yes
bq. or maybe the holes have been plugged by 'Using getXXXArray() would throw UnSupportedOperationException.
'? And....
Yep.  If the Cell impl is backed by a BB (on heap/off heap), its getXXXArray APIs will throw
UnsupportedOperationException.
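
A rough sketch of that contract, assuming a cut-down cell interface. SketchCell and
BufferBackedCell are hypothetical names for illustration only; the real Cell interface has
many more getters, and the final API in the patches may look different.

```java
import java.nio.ByteBuffer;

/** Hypothetical, cut-down cell interface; not the HBase Cell API. */
interface SketchCell {
    boolean hasArray();
    byte[] getValueArray();       // only valid when hasArray() is true
    ByteBuffer getValueBuffer();  // buffer-based getter
    int getValueOffset();
    int getValueLength();
}

/** A cell backed by a ByteBuffer, whether that buffer is on heap or off heap. */
class BufferBackedCell implements SketchCell {
    private final ByteBuffer buf;
    private final int offset;
    private final int length;

    BufferBackedCell(ByteBuffer buf, int offset, int length) {
        this.buf = buf;
        this.offset = offset;
        this.length = length;
    }

    // Buffer-backed cells report false here even when the buffer is a heap buffer.
    @Override public boolean hasArray() { return false; }

    @Override public byte[] getValueArray() {
        throw new UnsupportedOperationException("buffer-backed cell: use getValueBuffer()");
    }

    @Override public ByteBuffer getValueBuffer() { return buf; }
    @Override public int getValueOffset() { return offset; }
    @Override public int getValueLength() { return length; }
}
```

Callers are expected to check hasArray() first and only then decide between the array and
buffer getters.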
bq.So, you might want to underline this point. Its BB but WE are managing the position and
length to save on object creation and to bypass BB range checking, etc
Yes, correct.
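
As a small illustration of managing offset and length ourselves, the sketch below reads
through a shared buffer with absolute gets, so no slice()/duplicate() objects are created and
the buffer's own position is never moved. OffsetLengthViewSketch is a hypothetical name.
Note that absolute gets still do the BB bounds checks; bypassing those is what the Unsafe
based reads are for, and that part is not shown here.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: a (buffer, offset, length) view read via absolute gets. */
class OffsetLengthViewSketch {
    private final ByteBuffer buf; // shared buffer; its position/limit are never modified
    private final int offset;     // start of this view inside the shared buffer
    private final int length;     // number of bytes that belong to this view

    OffsetLengthViewSketch(ByteBuffer buf, int offset, int length) {
        this.buf = buf;
        this.offset = offset;
        this.length = length;
    }

    /** Reads an int at a position relative to the start of this view. */
    int readInt(int rel) {
        if (rel < 0 || rel + Integer.BYTES > length) {
            throw new IndexOutOfBoundsException("rel=" + rel + ", length=" + length);
        }
        return buf.getInt(offset + rel); // absolute get: no position change
    }

    /** Reads a single byte at a position relative to the start of this view. */
    byte readByte(int rel) {
        if (rel < 0 || rel >= length) {
            throw new IndexOutOfBoundsException("rel=" + rel + ", length=" + length);
        }
        return buf.get(offset + rel);
    }
}
```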
bq.Client won't be offheaping? If so, could the BB APIs be mixed in to Cell on the server
only?
Something like a ServerCell which extends Cell? Sounds reasonable. We have had some discussion
along these lines as well.
bq.So, why have the switch at all? The hasArray switch? Why not BB it all the time? It would
simplify the read path. Disadvantage would be it'd be extra objects?
Yes, the extra BB wrapper which has to be created every time one calls getXXXArray().  It is
an extra object creation plus some ops (like limit and pos checks) which happen in the BB
classes. That is a bit costly. We had done some unit tests. Ram, do you have the numbers?
bq.When you say this: "Note that even if the HFileBlock is on heap BB we do not support getXXXArray()
APIs. " This is only if hasArray returns false, right?
Yes, when hasArray returns false. The point is that when the Cell is backed by a buffer,
hasArray will be false (whether DBB or HBB).
bq.Tell us more about the unsafe manipulation of BBs? How's that work?
It reads data from the BB bypassing the BB APIs, reading directly from memory. HBASE-12345 has
a patch which adds an Unsafe-based compare for data in a BB.  Reading int/long etc. was added
in a similar way.  We do the same for bytes in Bytes.java.


> Cell/DBB end-to-end on the read-path
> ------------------------------------
>
>                 Key: HBASE-11425
>                 URL: https://issues.apache.org/jira/browse/HBASE-11425
>             Project: HBase
>          Issue Type: Umbrella
>          Components: regionserver, Scanners
>    Affects Versions: 0.99.0
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>         Attachments: Offheap reads in HBase using BBs_final.pdf
>
>
> Umbrella jira to make sure we can have blocks cached in offheap backed cache. In the
entire read path, we can refer to this offheap buffer and avoid onheap copying.
> The high level items I can identify as of now are
> 1. Avoid the array() call on BB in read path.. (This is there in many classes. We can
handle class by class)
> 2. Support Buffer based getter APIs in cell.  In read path we will create a new Cell
with backed by BB. Will need in CellComparator, Filter (like SCVF), CPs etc.
> 3. Avoid KeyValue.ensureKeyValue() calls in read path - This make byte copy.
> 4. Remove all CP hooks (which are already deprecated) which deal with KVs.  (In read
path)
> Will add subtasks under this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
