Date: Wed, 30 Sep 2015 07:24:04 +0000 (UTC)
From: "Bhupendra Kumar Jain (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Created] (HBASE-14520) Optimize the number of calls for tags creation in bulk load

Bhupendra Kumar Jain created HBASE-14520:
--------------------------------------------

             Summary: Optimize the number of calls for tags creation in bulk load
                 Key: HBASE-14520
                 URL: https://issues.apache.org/jira/browse/HBASE-14520
             Project: HBase
          Issue Type: Improvement
    Affects Versions: 2.0.0
            Reporter: Bhupendra Kumar Jain
            Assignee: Bhupendra Kumar Jain

At present, the TTL and visibility expression are specified once per TSV line, i.e. the values and the tags are the same for all the columns in that line. In the current code, a list of tags is created for each cell. Instead of creating new tags for each cell, the tags created once for the line can be reused by the other cells.

Assume 1 million rows and 1000 columns.
Currently, tag creation happens 1M * 1000 = 1 billion times. If the tags are reused, tag creation drops to 1M times (i.e. once per TSV line). This applies to both the TsvImporterMapper and TextSortReducer logic.
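The difference between the two approaches can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the actual TsvImporterMapper or TextSortReducer code: the `Tag` and `Cell` classes, the tag type constants, and the method names below are hypothetical stand-ins and do not use the real HBase API. A counter tracks how many times the tag list is built for one TSV line.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TagReuseSketch {
    // Hypothetical stand-in for an HBase tag (not the real HBase Tag class).
    static final class Tag {
        final byte type;
        final String value;
        Tag(byte type, String value) { this.type = type; this.value = value; }
    }

    // Hypothetical stand-in for a cell carrying a column, a value and its tags.
    static final class Cell {
        final String column;
        final String value;
        final List<Tag> tags;
        Cell(String column, String value, List<Tag> tags) {
            this.column = column;
            this.value = value;
            this.tags = tags;
        }
    }

    // Arbitrary illustrative type codes; assumptions, not HBase constants.
    static final byte TTL_TYPE = 1;
    static final byte VISIBILITY_TYPE = 2;

    // Counts how many times a tag list was built.
    static long tagListBuilds = 0;

    // Builds the tag list for a given TTL and visibility expression.
    static List<Tag> buildTags(long ttl, String visibility) {
        tagListBuilds++;
        List<Tag> tags = new ArrayList<>(2);
        tags.add(new Tag(TTL_TYPE, Long.toString(ttl)));
        tags.add(new Tag(VISIBILITY_TYPE, visibility));
        return Collections.unmodifiableList(tags);
    }

    // Behaviour described as current: a fresh tag list for every cell.
    static List<Cell> cellsWithTagsPerCell(String[] columns, String[] values,
                                           long ttl, String visibility) {
        List<Cell> cells = new ArrayList<>(columns.length);
        for (int i = 0; i < columns.length; i++) {
            cells.add(new Cell(columns[i], values[i], buildTags(ttl, visibility)));
        }
        return cells;
    }

    // Proposed behaviour: build the tag list once per line and share it.
    static List<Cell> cellsWithTagsPerLine(String[] columns, String[] values,
                                           long ttl, String visibility) {
        List<Tag> sharedTags = buildTags(ttl, visibility);
        List<Cell> cells = new ArrayList<>(columns.length);
        for (int i = 0; i < columns.length; i++) {
            cells.add(new Cell(columns[i], values[i], sharedTags));
        }
        return cells;
    }

    public static void main(String[] args) {
        String[] columns = {"cf:a", "cf:b", "cf:c"};
        String[] values = {"1", "2", "3"};

        tagListBuilds = 0;
        cellsWithTagsPerCell(columns, values, 60000L, "secret");
        System.out.println("tag list builds, per cell: " + tagListBuilds);

        tagListBuilds = 0;
        cellsWithTagsPerLine(columns, values, 60000L, "secret");
        System.out.println("tag list builds, per line: " + tagListBuilds);
    }
}
```

The shared list is wrapped as unmodifiable so that one cell cannot alter the tags seen by the others. Whether sharing is safe in the real code depends on how the cells and their tags are consumed downstream, which this sketch does not model.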