Date: Thu, 8 Sep 2016 08:15:20 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@carbondata.incubator.apache.org
Reply-To: dev@carbondata.incubator.apache.org
Subject: [jira] [Commented] (CARBONDATA-229) Array Index of bound exception thrown from dictionary look up while writing sort index file

    [ https://issues.apache.org/jira/browse/CARBONDATA-229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15473172#comment-15473172 ]

ASF GitHub Bot commented on CARBONDATA-229:
-------------------------------------------

GitHub user manishgupta88 opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/143

    [CARBONDATA-229] Array Index of bound exception thrown from dictionary look up while writing sort index file

    Problem: An index-out-of-bounds exception is thrown from the dictionary lookup while writing the sort index file.

    Analysis: When dictionary data is loaded into memory and the reverse dictionary object is populated, a chunk with no values is sometimes added to the dictionary chunk list. This happens because the chunk distribution logic used for the forward dictionary is not implemented for the reverse dictionary, so zero-size dictionary chunks are not removed before being added to the list of dictionary chunks.

    Solution: Apply the same chunk distribution logic used for the forward dictionary when populating the reverse dictionary object.

    Impact area: Sort index generation

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/incubator-carbondata dictionary_chunk_addition_issue

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/143.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #143

----
commit 1102eed75692b6c0d4094a6e7f71e83464453d13
Author: manishgupta88
Date:   2016-09-08T07:38:56Z

    Problem: An index-out-of-bounds exception is thrown from the dictionary lookup while writing the sort index file.

    Analysis: When dictionary data is loaded into memory and the reverse dictionary object is populated, a chunk with no values is sometimes added to the dictionary chunk list. This happens because the chunk distribution logic used for the forward dictionary is not implemented for the reverse dictionary, so zero-size dictionary chunks are not removed before being added to the list of dictionary chunks.

    Solution: Apply the same chunk distribution logic used for the forward dictionary when populating the reverse dictionary object.

    Impact area: Sort index generation

----

> Array Index of bound exception thrown from dictionary look up while writing sort index file
> -------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-229
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-229
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Manish Gupta
>            Assignee: Manish Gupta
>
> When dictionary data is loaded into memory and the reverse dictionary object is populated, a chunk with no values is sometimes added to the dictionary chunk list.
> This happens because the chunk distribution logic used for the forward dictionary is not implemented for the reverse dictionary, so zero-size dictionary chunks are not removed before being added to the list of dictionary chunks.
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> 	at java.util.ArrayList.rangeCheck(ArrayList.java:653)
> 	at java.util.ArrayList.get(ArrayList.java:429)
> 	at org.apache.carbondata.core.cache.dictionary.DictionaryChunksWrapper.next(DictionaryChunksWrapper.java:92)
> 	at org.apache.carbondata.core.writer.sortindex.CarbonDictionarySortInfoPreparator.prepareDictionarySortModels(CarbonDictionarySortInfoPreparator.java:120)
> 	at org.apache.carbondata.core.writer.sortindex.CarbonDictionarySortInfoPreparator.getDictionarySortInfo(CarbonDictionarySortInfoPreparator.java:51)
> 	at org.apache.carbondata.spark.tasks.SortIndexWriterTask.execute(SortIndexWriterTask.scala:44)
> 	at org.apache.carbondata.spark.rdd.CarbonGlobalDictionaryGenerateRDD$$anon$1.<init>(CarbonGlobalDictionaryRDD.scala:387)
> 	at org.apache.carbondata.spark.rdd.CarbonGlobalDictionaryGenerateRDD.compute(CarbonGlobalDictionaryRDD.scala:294)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
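
For illustration, the failure mode in the analysis (a zero-size chunk in the dictionary chunk list, followed by a read of its first element) can be reproduced in isolation. The sketch below is a simplified assumption, not the CarbonData code or the change in the pull request: the class DictionaryChunkSketch and the methods addNonEmptyChunk and firstValues are hypothetical names, chunks are modelled as plain lists of strings, and only the "skip empty chunks" part of the fix is shown, not the forward-dictionary distribution logic itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; these are not CarbonData's actual classes.
final class DictionaryChunkSketch {

    // Adds a chunk to the chunk list only when it holds at least one value,
    // so that later reads never encounter a zero-size chunk.
    static void addNonEmptyChunk(List<List<String>> chunks, List<String> chunk) {
        if (chunk != null && !chunk.isEmpty()) {
            chunks.add(chunk);
        }
    }

    // Reads the first value of every chunk. This is the kind of access that
    // throws IndexOutOfBoundsException when an empty chunk is in the list.
    static List<String> firstValues(List<List<String>> chunks) {
        List<String> result = new ArrayList<>();
        for (List<String> chunk : chunks) {
            result.add(chunk.get(0));
        }
        return result;
    }

    public static void main(String[] args) {
        // Chunk list built without filtering: the empty chunk slips in.
        List<List<String>> unfiltered = new ArrayList<>();
        unfiltered.add(new ArrayList<>(Arrays.asList("a", "b")));
        unfiltered.add(new ArrayList<>());
        try {
            firstValues(unfiltered);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unfiltered chunk list failed: " + e.getMessage());
        }

        // Same chunks added through the guard: the empty chunk is dropped.
        List<List<String>> filtered = new ArrayList<>();
        for (List<String> chunk : unfiltered) {
            addNonEmptyChunk(filtered, chunk);
        }
        System.out.println(firstValues(filtered));
    }
}
```

Guarding at insertion time keeps the invariant "every chunk in the list has at least one value" in one place, so readers of the list do not each need their own emptiness check.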