hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15536) Make AsyncFSWAL as our default WAL
Date Thu, 20 Oct 2016 12:44:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15591721#comment-15591721
] 

ramkrishna.s.vasudevan commented on HBASE-15536:
------------------------------------------------

Some more updates here. I tried running with less data, about 15G with 50 threads. Even there
AsyncWAL is performing slower.
These are probably some of the reasons for the slowness:
-> The FanOutOneBlockAsyncDFSOutput is a much more sophisticated DFS client model that works
with Netty ByteBuf. Here we hold on to the connections to the datanodes using
Netty Channels, and the idea is to write data directly to these channels.
AsyncWAL gets an append call. AsyncWAL uses HBase's own ByteArrayOutputStream, so
the content of the cell is written to this BAOS and is then copied again
to the Netty ByteBuf in the FanOutOneBlockAsyncDFSOutput.
When the sync call happens, FanOutOneBlockAsyncDFSOutput does the checksum calculation itself and
then writes the content of this buffer directly to the DN channel.

-> In the case of FSHLog this is different. When an append call comes we directly write the
content to the FSDataOutputStream (it is copied to this stream).
The checksum calculation happens internally there. When a sync call happens
there is nothing to do except notify the NN to flush the latest
data.

As we can see from the above, there are two copies in AsyncWAL:

-> From the Cell to the BAOS
-> From the BAOS to the Netty ByteBuf
-> On the sync() call, do the checksum and finally flush the Netty ByteBuf to the DN channel

In the case of FSHLog:
-> From the Cell to the FSDataOutputStream. Data is copied, and the checksum happens here.
-> The sync call just tries to notify the NN.
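
To make the difference in copy counts concrete, here is a minimal sketch of the two paths as
described above. It uses only JDK classes as stand-ins: java.nio.ByteBuffer plays the role of
the Netty ByteBuf, a java.util.zip.CheckedOutputStream plays the role of the checksumming
FSDataOutputStream, and CRC32 is an assumed checksum. The class and method names
(WalCopySketch, asyncStyle, streamStyle) are hypothetical and are not HBase or HDFS code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;
import java.util.zip.CheckedOutputStream;

public class WalCopySketch {
    /** Bytes copied in user space for one append + sync, plus the resulting checksum. */
    public static final class Result {
        public final long bytesCopied;
        public final long checksum;
        Result(long bytesCopied, long checksum) {
            this.bytesCopied = bytesCopied;
            this.checksum = checksum;
        }
    }

    /** A BAOS that exposes its backing array so reading it back does not add a hidden copy. */
    static final class ExposedBaos extends ByteArrayOutputStream {
        byte[] buffer() { return buf; }
        int length() { return count; }
    }

    /** Async-style path: cell -> BAOS -> ByteBuffer, checksum computed only at sync time. */
    public static Result asyncStyle(byte[] cell) {
        long copied = 0;
        ExposedBaos baos = new ExposedBaos();
        baos.write(cell, 0, cell.length);            // copy 1: cell -> BAOS
        copied += cell.length;

        ByteBuffer buf = ByteBuffer.allocate(baos.length());
        buf.put(baos.buffer(), 0, baos.length());    // copy 2: BAOS -> buffer (ByteBuf stand-in)
        copied += baos.length();
        buf.flip();

        // "sync": checksum the staged bytes; the write to the DN channel is omitted here.
        CRC32 crc = new CRC32();
        crc.update(buf.duplicate());
        return new Result(copied, crc.getValue());
    }

    /** Stream-style path: cell -> checksumming stream, checksum updated as bytes are written. */
    public static Result streamStyle(byte[] cell) {
        CRC32 crc = new CRC32();
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the DFS stream
        try (CheckedOutputStream out = new CheckedOutputStream(sink, crc)) {
            out.write(cell, 0, cell.length);         // the only copy: cell -> stream
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // "sync": nothing left to compute in this model.
        return new Result(cell.length, crc.getValue());
    }
}
```

For the same cell bytes both methods produce the same checksum, but asyncStyle reports twice
as many copied bytes as streamStyle, which is the extra copy being pointed out here.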

Along with this there is some thread contention on 'waitingConsumePayloads' on every
append call, whereas the ring buffer is better here. I have not checked the internal implementation
of the RingBuffer yet. Will check that too.
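
As a rough illustration of why a ring buffer can hand off appends with less contention than a
shared locked collection, here is a minimal sketch of a bounded ring where each producer claims
a slot with a single CAS on a counter and publishes through a per-slot sequence number. This is
an assumption-laden stand-in, not the Disruptor RingBuffer internals (which are not checked
here) and not HBase code; the class name RingHandoffSketch and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

public class RingHandoffSketch {
    private final Object[] slots;
    private final AtomicLongArray seq;   // per-slot sequence: tells who may use the slot next
    private final int mask;
    private final AtomicLong tail = new AtomicLong(); // next position for producers
    private final AtomicLong head = new AtomicLong(); // next position for consumers

    public RingHandoffSketch(int capacity) {
        if (capacity < 2 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        slots = new Object[capacity];
        seq = new AtomicLongArray(capacity);
        for (int i = 0; i < capacity; i++) seq.set(i, i);
        mask = capacity - 1;
    }

    /** Returns false if the ring is full. Producers contend only on one CAS, not on a lock. */
    public boolean offer(Object item) {
        long pos = tail.get();
        for (;;) {
            int idx = (int) (pos & mask);
            long dif = seq.get(idx) - pos;
            if (dif == 0) {
                if (tail.compareAndSet(pos, pos + 1)) {
                    slots[idx] = item;
                    seq.set(idx, pos + 1);   // publish: slot is now readable
                    return true;
                }
                pos = tail.get();            // lost the race, retry with the new tail
            } else if (dif < 0) {
                return false;                // slot from the previous lap not consumed yet
            } else {
                pos = tail.get();
            }
        }
    }

    /** Returns null if nothing is published at the head position yet. */
    public Object poll() {
        long pos = head.get();
        for (;;) {
            int idx = (int) (pos & mask);
            long dif = seq.get(idx) - (pos + 1);
            if (dif == 0) {
                if (head.compareAndSet(pos, pos + 1)) {
                    Object item = slots[idx];
                    slots[idx] = null;
                    seq.set(idx, pos + mask + 1); // free the slot for the next lap
                    return item;
                }
                pos = head.get();
            } else if (dif < 0) {
                return null;
            } else {
                pos = head.get();
            }
        }
    }

    /** Demo: several producer threads append 1..perProducer; the caller consumes and sums. */
    public static long run(int producers, int perProducer) {
        RingHandoffSketch ring = new RingHandoffSketch(1024);
        Thread[] threads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            threads[p] = new Thread(() -> {
                for (int i = 1; i <= perProducer; i++) {
                    while (!ring.offer(Long.valueOf(i))) Thread.yield();
                }
            });
            threads[p].start();
        }
        long total = (long) producers * perProducer, seen = 0, sum = 0;
        while (seen < total) {
            Object o = ring.poll();
            if (o == null) { Thread.yield(); continue; }
            sum += (Long) o;
            seen++;
        }
        for (Thread t : threads) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return sum;
    }
}
```

Calling RingHandoffSketch.run(4, 1000) drives four producer threads against one consumer.
Whether this pattern actually explains the gap seen in the profiles would still need to be
confirmed against the real RingBuffer and the 'waitingConsumePayloads' code.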

Will be back here after reading some Netty code.
Above all, please do correct me if I am wrong on any of these points.



> Make AsyncFSWAL as our default WAL
> ----------------------------------
>
>                 Key: HBASE-15536
>                 URL: https://issues.apache.org/jira/browse/HBASE-15536
>             Project: HBase
>          Issue Type: Sub-task
>          Components: wal
>    Affects Versions: 2.0.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15536-v1.patch, HBASE-15536-v2.patch, HBASE-15536-v3.patch,
> HBASE-15536-v4.patch, HBASE-15536-v5.patch, HBASE-15536.patch, latesttrunk_asyncWAL_50threads_10cols.jfr,
> latesttrunk_defaultWAL_50threads_10cols.jfr
>
>
> As it should be predicated on passing basic cluster ITBLL



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
