hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16372) References to previous cell in read path should be avoided
Date Fri, 19 Aug 2016 12:32:21 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428111#comment-15428111
] 

Anoop Sam John commented on HBASE-16372:
----------------------------------------

I am thinking about a simple way that needs no copying. I have not checked the code in
detail. Here it is.
We keep referring to the current block in HFileReaderImpl, and once we move out of it we add
it to the prevBlocks list and point curBlock at the new one. That is where the issue comes
from: the prevCell we refer to somewhere else in the flow might be from the block we just
moved off, and that block might get evicted if it is returned by a shipped call in between.
So how about keeping refs to curBlock, prevBlock and oldBlocks? When curBlock moves to the
next block, we change curBlock and move the old current block to prevBlock. If prevBlock
already pointed to a block, we move that one to oldBlocks. Then, when the call
returnBlocks(boolean returnAll) comes, we return only oldBlocks if the param is false. If it
is true we return from all 3 refs.
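
A minimal sketch of the three-reference scheme described above, not taken from the HBase
code. The class BlockTrackerSketch, its nested Block type and the method updateCurrentBlock
are hypothetical names made up for illustration; only curBlock, prevBlock, oldBlocks and
returnBlocks(boolean returnAll) come from the comment. Marking a block as "returned" stands
in for whatever the real code does to hand a block back to the cache.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not the HFileReaderImpl code.
class BlockTrackerSketch {
    /** Hypothetical stand-in for a ref-counted block from the block cache. */
    static final class Block {
        final String name;
        boolean returned = false; // true once handed back to the cache
        Block(String name) { this.name = name; }
    }

    private Block curBlock;   // block the scanner is positioned on
    private Block prevBlock;  // block we just moved off; may still back prevCell
    private final List<Block> oldBlocks = new ArrayList<>(); // safe to return

    /** Called when the scanner moves to a new block. */
    void updateCurrentBlock(Block next) {
        if (next == curBlock) {
            return; // still on the same block, nothing to rotate
        }
        if (prevBlock != null) {
            oldBlocks.add(prevBlock); // the older block is no longer needed
        }
        prevBlock = curBlock;
        curBlock = next;
    }

    /**
     * Returns only oldBlocks when returnAll is false; returns oldBlocks,
     * prevBlock and curBlock when returnAll is true.
     */
    List<Block> returnBlocks(boolean returnAll) {
        List<Block> released = new ArrayList<>(oldBlocks);
        oldBlocks.clear();
        if (returnAll) {
            if (prevBlock != null) {
                released.add(prevBlock);
            }
            if (curBlock != null) {
                released.add(curBlock);
            }
            prevBlock = null;
            curBlock = null;
        }
        for (Block b : released) {
            b.returned = true; // placeholder for the real return to the cache
        }
        return released;
    }
}
```

With this shape, a shipped call in between that invokes returnBlocks(false) never touches
the block that may still back the prevCell, as long as the prevCell lives in prevBlock or
curBlock.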

Ideally, once the read flow moves to the second cell of the current block (meaning the
prevCell is now the 1st cell of the current block), the prevBlock is ready for return. That
might be hard to implement. If we do not do it, we delay the return of one block until the
current block is completed. But that might also be OK IMO, because even if we return it, the
chance of this block getting evicted is low as eviction follows LRU. This block was very
recently used anyway.

One more thing to note: in the read flow, when a seek or next call results in jumping over
many blocks in between, are we assigning curBlock to each of the in-between blocks? If so,
there is a chance that the prevCell is not in prevBlock but in some older block. Yes, we may
read many blocks in between and skip all of them, maybe because all cells in them are deleted
or so. In that case we need some refactoring in how we handle the curBlock ref.

> References to previous cell in read path should be avoided
> ----------------------------------------------------------
>
>                 Key: HBASE-16372
>                 URL: https://issues.apache.org/jira/browse/HBASE-16372
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Scanners
>    Affects Versions: 2.0.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16372_testcase.patch, HBASE-16372_testcase_1.patch
>
>
> Came up as part of the review discussion in HBASE-15554. If references are kept to previous
> cells in the read path, then with the ref-count-based eviction mechanism in trunk there is
> a chance that the block backing the previous cell gets evicted while the read path still
> does some operations on that garbage collected previous cell, leading to incorrect results.
> Areas to target
> -> StoreScanner
> -> Bloom filters (particularly in compaction path)
> Thanks to [~anoop.hbase] for pointing this out in the bloom filter path. But we found it
> could be in other areas also.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
