hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12411) Avoid seek + read completely?
Date Sun, 16 Nov 2014 23:35:33 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214133#comment-14214133 ]

stack commented on HBASE-12411:
-------------------------------

Is that all it takes?

What about your trick of flipping to pread if we are already seek+reading? That will still work,
right?  Because compactions have their own file, they'll seek+read?

So, seek + read makes (slight) sense (10 or 20% better throughput?) for a long scan when only
one scan is going on at a time.  Otherwise, if there are lots of small gets and scans, pread
makes more sense.

seek + read blocks out preads while it's running, until HDFS-6735 is fixed.
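
For illustration, the two read modes can be sketched with plain java.nio instead of the HDFS
client. This is only an analogy under stated assumptions: ReadModes, seekRead, pread and the
explicit lock object are hypothetical names, not HBase or HDFS code. The point is that
seek + read mutates the stream's shared position, so the seek-then-read pair has to be
serialized, while a positional read carries its own offset and leaves that state alone.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, for illustration only (not HBase or HDFS code).
final class ReadModes {

    // seek + read: moves the channel's shared position, so callers that share
    // the channel must hold the lock across the whole seek-then-read pair.
    static byte[] seekRead(FileChannel ch, Object lock, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        synchronized (lock) {
            ch.position(pos);
            while (buf.hasRemaining() && ch.read(buf) >= 0) {
                // keep reading until the buffer is full or EOF is reached
            }
        }
        return toArray(buf);
    }

    // pread: the offset is passed with every call and the channel's own
    // position is left untouched, so no lock on shared position is needed.
    static byte[] pread(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        long off = pos;
        while (buf.hasRemaining()) {
            int n = ch.read(buf, off);
            if (n < 0) {
                break; // EOF
            }
            off += n;
        }
        return toArray(buf);
    }

    // Copy out only the bytes that were actually read.
    private static byte[] toArray(ByteBuffer buf) {
        buf.flip();
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }
}
```

In this sketch a long sequential reader would call seekRead once and then keep reading, while
many small concurrent gets would each call pread with their own offset.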

Compactions are long reads.  It makes sense to do seek + read for these.  Giving them their own
file means they won't interfere with ongoing gets/scans.  I suppose we'll open a lot of
files with the NN when doing a compaction.  That could take a while if there are a bunch of
files to open.  Do we open them in parallel?  So, this could make compactions take a bit
longer... but compactions are a background task, so ok?

Add a toggle so it's easy to flip it on, and then let's try to get some numbers?
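
A minimal sketch of what such a toggle could look like, assuming a boolean setting that forces
pread everywhere. The class ReadModePolicy and the property name example.scan.pread.only are
made up for illustration; they are not existing HBase configuration or the contents of the
attached patch.

```java
import java.util.Properties;

// Hypothetical policy class, for illustration only.
final class ReadModePolicy {

    enum Mode { PREAD, SEEK_READ }

    // Hypothetical property name; not a real HBase configuration key.
    static final String PREAD_ONLY_KEY = "example.scan.pread.only";

    // With the toggle on, always use pread. Otherwise use seek + read only
    // for long sequential reads (e.g. compactions) and pread for the rest.
    static Mode choose(Properties conf, boolean longSequentialRead) {
        boolean preadOnly = Boolean.parseBoolean(conf.getProperty(PREAD_ONLY_KEY, "false"));
        if (preadOnly) {
            return Mode.PREAD;
        }
        return longSequentialRead ? Mode.SEEK_READ : Mode.PREAD;
    }
}
```

With the toggle defaulting to off, the same workload can be run both ways and the numbers
compared.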

Good stuff [~lhofhansl]

> Avoid seek + read completely?
> -----------------------------
>
>                 Key: HBASE-12411
>                 URL: https://issues.apache.org/jira/browse/HBASE-12411
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: Performance
>            Reporter: Lars Hofhansl
>         Attachments: 12411.txt
>
>
> In light of HDFS-6735 we might want to consider refraining from seek + read completely
> and only performing preads.
> For example, currently a compaction can lock out every other scanner over the file that
> the compaction is currently reading.
> At the very least we can introduce an option to avoid seek + read, so we can test
> this in various scenarios.
> This will definitely be of great importance for projects like Phoenix, which parallelize
> queries intra-region (and hence readers will, with high likelihood, be used concurrently
> by multiple scanners).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
