hbase-issues mailing list archives

From "Vladimir Rodionov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7336) HFileBlock.readAtOffset does not work well with multiple threads
Date Wed, 02 Jul 2014 22:03:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050777#comment-14050777 ]

Vladimir Rodionov commented on HBASE-7336:
------------------------------------------

I was wrong, Lars. *DFSInputStream* overrides positional read - no locks. But there is
something else ...

There is not much sense in allowing one random scanner to run in stream mode, since there
is no guarantee that the next call to read an HFile block from the "lucky" scanner will use
the same streaming API and that the pre-cached data will still be valid. Some other scanner
might have discarded this data in the meantime. Correct?

You could try *pread* for all scanners and compare performance. I bet it will be close
to what we have right now.
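
To make the seek+read versus *pread* distinction concrete, here is a minimal sketch. It is
not HBase or HDFS code: it assumes a local file and uses java.nio's FileChannel as a
stand-in for the DFS input stream, and the names PreadSketch, seekRead, pread and demo are
hypothetical. The seek+read path moves the shared stream position, so it has to hold a lock
for the whole call; the positional path passes the offset with each read and takes no lock.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PreadSketch {
    static final int BLOCK = 4096;  // hypothetical block size, not an HBase default
    static final int BLOCKS = 8;

    private final FileChannel channel;

    PreadSketch(FileChannel channel) { this.channel = channel; }

    /** Seek + read: moves the shared position, so all callers are serialized. */
    synchronized byte[] seekRead(long offset, int len) {
        try {
            channel.position(offset);
            ByteBuffer buf = ByteBuffer.allocate(len);
            while (buf.hasRemaining() && channel.read(buf) >= 0) { /* keep reading */ }
            return buf.array();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Positional read: leaves the shared position untouched, so no lock is taken. */
    byte[] pread(long offset, int len) {
        try {
            ByteBuffer buf = ByteBuffer.allocate(len);
            long pos = offset;
            while (buf.hasRemaining()) {
                int n = channel.read(buf, pos);
                if (n < 0) break;  // end of file
                pos += n;
            }
            return buf.array();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes BLOCKS blocks (block b is filled with byte b) and reads them from 4 threads. */
    static boolean demo(boolean usePread) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Path file = null;
        try {
            file = Files.createTempFile("pread-sketch", ".bin");
            byte[] data = new byte[BLOCK * BLOCKS];
            for (int i = 0; i < data.length; i++) data[i] = (byte) (i / BLOCK);
            Files.write(file, data);
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                PreadSketch reader = new PreadSketch(ch);
                List<Future<Boolean>> results = new ArrayList<>();
                for (int t = 0; t < 4; t++) {
                    results.add(pool.submit(() -> {
                        for (int b = 0; b < BLOCKS; b++) {
                            long off = (long) b * BLOCK;
                            byte[] got = usePread ? reader.pread(off, BLOCK)
                                                  : reader.seekRead(off, BLOCK);
                            for (byte x : got) if (x != (byte) b) return false;
                        }
                        return true;
                    }));
                }
                for (Future<Boolean> f : results) if (!f.get()) return false;
                return true;
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
            if (file != null) {
                try { Files.deleteIfExists(file); } catch (IOException ignored) { }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("pread consistent: " + demo(true));
        System.out.println("seek+read consistent: " + demo(false));
    }
}
```

Both paths return the same bytes; the difference is only that seekRead admits one thread at
a time. The sketch does not measure throughput, so it says nothing about which is faster on
a real cluster.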

> HFileBlock.readAtOffset does not work well with multiple threads
> ----------------------------------------------------------------
>
>                 Key: HBASE-7336
>                 URL: https://issues.apache.org/jira/browse/HBASE-7336
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Performance
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Critical
>             Fix For: 0.94.4, 0.95.0
>
>         Attachments: 7336-0.94.txt, 7336-0.96.txt
>
>
> HBase grinds to a halt when many threads scan along the same set of blocks and neither
> short-circuit reads nor block caching are enabled for the dfs client ... disabling the block
> cache makes sense on very large scans.
> It turns out that synchronizing on istream in HFileBlock.readAtOffset is the culprit.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
