Return-Path: X-Original-To: apmail-hbase-issues-archive@www.apache.org Delivered-To: apmail-hbase-issues-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 776901015F for ; Sun, 26 Jan 2014 08:01:00 +0000 (UTC) Received: (qmail 7284 invoked by uid 500); 26 Jan 2014 08:00:58 -0000 Delivered-To: apmail-hbase-issues-archive@hbase.apache.org Received: (qmail 7030 invoked by uid 500); 26 Jan 2014 08:00:55 -0000 Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list issues@hbase.apache.org Received: (qmail 6966 invoked by uid 99); 26 Jan 2014 08:00:53 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 26 Jan 2014 08:00:53 +0000 Date: Sun, 26 Jan 2014 08:00:53 +0000 (UTC) From: "Hudson (JIRA)" To: issues@hbase.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882220#comment-13882220 ] Hudson commented on HBASE-8755: ------------------------------- FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #65 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/65/]) HBASE-10156 FSHLog Refactor (WAS -> Fix up the HBASE-8755 slowdown when low contention) (stack: rev 1561450) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FailedLogCloseException.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FailedSyncBeforeLogCloseException.java * /hbase/trunk/hbase-server/pom.xml * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSWALEntry.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/RingBufferTruck.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SyncFuture.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCoprocessorHost.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestDurability.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRollAbort.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRollingNoCluster.java * /hbase/trunk/pom.xml > A new write thread model for HLog to improve the overall HBase write throughput > ------------------------------------------------------------------------------- > > Key: HBASE-8755 > URL: https://issues.apache.org/jira/browse/HBASE-8755 > Project: HBase > Issue Type: Improvement > Components: Performance, wal > Reporter: Feng Honghua > Assignee: Feng Honghua > Priority: Critical > Fix For: 0.98.0, 0.99.0 > > Attachments: 8755-syncer.patch, 8755trunkV2.txt, 8755v8.txt, 8755v9.txt, HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-0.96-v0.patch, HBASE-8755-trunk-V0.patch, HBASE-8755-trunk-V1.patch, HBASE-8755-trunk-v4.patch, HBASE-8755-trunk-v6.patch, HBASE-8755-trunk-v7.patch, HBASE-8755-v5.patch, thread.out > > > In current write model, each write handler thread (executing put()) will individually go through a full 'append (hlog local buffer) => HLog writer append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, which incurs heavy race condition on updateLock and flushLock. > The only optimization where checking if current syncTillHere > txid in expectation for other thread help write/sync its own txid to hdfs and omitting the write/sync actually help much less than expectation. > Three of my colleagues(Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi proposed a new write thread model for writing hdfs sequence file and the prototype implementation shows a 4X improvement for throughput (from 17000 to 70000+). > I apply this new write thread model in HLog and the performance test in our test cluster shows about 3X throughput improvement (from 12150 to 31520 for 1 RS, from 22000 to 70000 for 5 RS), the 1 RS write throughput (1K row-size) even beats the one of BigTable (Precolator published in 2011 says Bigtable's write throughput then is 31002). I can provide the detailed performance test results if anyone is interested. > The change for new write thread model is as below: > 1> All put handler threads append the edits to HLog's local pending buffer; (it notifies AsyncWriter thread that there is new edits in local buffer) > 2> All put handler threads wait in HLog.syncer() function for underlying threads to finish the sync that contains its txid; > 3> An single AsyncWriter thread is responsible for retrieve all the buffered edits in HLog's local pending buffer and write to the hdfs (hlog.writer.append); (it notifies AsyncFlusher thread that there is new writes to hdfs that needs a sync) > 4> An single AsyncFlusher thread is responsible for issuing a sync to hdfs to persist the writes by AsyncWriter; (it notifies the AsyncNotifier thread that sync watermark increases) > 5> An single AsyncNotifier thread is responsible for notifying all pending put handler threads which are waiting in the HLog.syncer() function > 6> No LogSyncer thread any more (since there is always AsyncWriter/AsyncFlusher threads do the same job it does) -- This message was sent by Atlassian JIRA (v6.1.5#6160)