Date: Thu, 13 Aug 2015 12:04:45 +0000 (UTC)
From: "Enis Soztutar (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

[ https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695115#comment-14695115 ]

Enis Soztutar commented on HBASE-11368:
---------------------------------------

I was reading HBASE-4552 and the RegionScannerImpl code again to try to understand why we need the write lock for multi-CF bulk loads in the first place. It seems that it was put there to ensure atomicity, but that should be guaranteed by the seqId / mvcc combination and not via the region write lock. However, the bulk load files obtain a seqId, and acquiring the region write lock blocks all flushes, which may be the reason.

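For readers following along, the lock interaction discussed here (and in the issue description quoted below) can be reproduced with a plain java.util.concurrent read/write lock. This is only a minimal sketch and does not use any HBase API: the class RegionLockSketch, its methods, and the 50 ms timeout are hypothetical names and values chosen for illustration. A "compaction" thread holds the read lock, and a "bulk load" tries to take the write lock with a timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region; not HBase code.
public class RegionLockSketch {
    // Stand-in for the region-level lock discussed in the comment.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Tries to take the write lock within timeoutMs; returns true if the "bulk load" could run. */
    boolean tryBulkLoad(long timeoutMs) throws InterruptedException {
        if (!lock.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
            return false; // a reader (e.g. a compaction) still holds the lock
        }
        try {
            return true; // the multi-CF load would happen here
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Starts a thread that holds the read lock until 'finish' is counted down. */
    Thread startCompaction(CountDownLatch started, CountDownLatch finish) {
        Thread t = new Thread(() -> {
            lock.readLock().lock();
            try {
                started.countDown(); // signal that the read lock is held
                finish.await();      // simulate a long-running compaction
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        });
        t.start();
        return t;
    }

    /** Returns { result during compaction, result after compaction }. */
    static boolean[] demo() throws InterruptedException {
        RegionLockSketch region = new RegionLockSketch();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finish = new CountDownLatch(1);
        Thread compaction = region.startCompaction(started, finish);
        started.await();
        boolean duringCompaction = region.tryBulkLoad(50);
        finish.countDown();
        compaction.join();
        boolean afterCompaction = region.tryBulkLoad(50);
        return new boolean[] { duringCompaction, afterCompaction };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = demo();
        System.out.println("bulk load during compaction: " + r[0]);
        System.out.println("bulk load after compaction:  " + r[1]);
    }
}
```

The same lock works in the other direction too: while the write lock is held, any thread that needs the read lock (a flush, in the scenario above) has to wait, which is the blocking behavior the comment speculates about.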
On bulk load, we call HStore.notifyChangedReadersObservers(), which resets the KVHeap, but from my reading of the code we never reset the RegionScanner. Is this a bug? The current scanners should not see the newly bulk-loaded data (via mvcc), so maybe it is ok?

> Multi-column family BulkLoad fails if compactions go on too long
> ----------------------------------------------------------------
>
>                 Key: HBASE-11368
>                 URL: https://issues.apache.org/jira/browse/HBASE-11368
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Qiang Tian
>         Attachments: hbase-11368-0.98.5.patch, hbase11368-master.patch, key_stacktrace_hbase10882.TXT, performance_improvement_verification_98.5.patch
>
>
> Compactions take a read lock. For a multi-column family region, we want to take a write lock on the region before bulk loading. If the compaction takes too long, the bulk load fails.
> Various recipes include:
> + Making smaller regions (lame)
> + [~victorunique] suggests major compacting just before bulk loading over in HBASE-10882 as a workaround.
> Does the compaction need a read lock for that long? Does the bulk load need a full write lock when there are multiple column families? Can we fail more gracefully at least?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)