Date: Sun, 23 Oct 2016 13:37:58 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@carbondata.incubator.apache.org
Reply-To: dev@carbondata.incubator.apache.org
Subject: [jira] [Commented] (CARBONDATA-221) Store inverted index in metadata

[ https://issues.apache.org/jira/browse/CARBONDATA-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15599701#comment-15599701 ]

ASF GitHub Bot commented on CARBONDATA-221:
-------------------------------------------

GitHub user Zhangshunyu reopened a pull request:

https://github.com/apache/incubator-carbondata/pull/222

[CARBONDATA-221] Fix the inverted index bug by storing the inverted index setting in metadata using Encoding.INVERTED_INDEX.

## Why raise this PR?
1. Problem: In the current code, the inverted index setting from the DDL is not persisted to the store, so after a cluster restart queries might produce mismatched results.
2. To work around problem 1, the current code always enables the inverted index, so it can no longer be configured. We should fix this problem at its root cause.

## How to solve?
Use the Encoding as the identifier for whether a column uses an inverted index. Encoding is already part of the thrift format, so the thrift format does not need to be modified.
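As a rough illustration of the idea, the sketch below records the DDL choice in a column's encoding list when writing and derives it from the same list when reading. It is a minimal, self-contained sketch and not CarbonData code: the class `InvertedIndexEncodingSketch`, the method `encodingsFor`, and the local `Encoding` enum (including its `DICTIONARY` member) are hypothetical stand-ins, and `hasEncoding` only imitates the `CarbonUtil.hasEncoding` call named in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are CarbonData's own classes.
final class InvertedIndexEncodingSketch {

  // Stand-in for the thrift Encoding enum. Only INVERTED_INDEX is named in
  // this PR; DICTIONARY is an assumed placeholder for "some other encoding".
  enum Encoding { DICTIONARY, INVERTED_INDEX }

  // Write side: persist the DDL choice by adding (or omitting) the
  // INVERTED_INDEX marker in the column's encoding list.
  static List<Encoding> encodingsFor(boolean useInvertedIndex) {
    List<Encoding> encodings = new ArrayList<>();
    encodings.add(Encoding.DICTIONARY);
    if (useInvertedIndex) {
      encodings.add(Encoding.INVERTED_INDEX);
    }
    return encodings;
  }

  // Read side: the setting is recovered from the stored encoding list, so it
  // survives a restart without adding a new field to the stored format.
  static boolean hasEncoding(List<Encoding> encodings, Encoding wanted) {
    return encodings.contains(wanted);
  }

  public static void main(String[] args) {
    List<Encoding> withIndex = encodingsFor(true);
    List<Encoding> withoutIndex = encodingsFor(false);
    System.out.println("with index:    "
        + hasEncoding(withIndex, Encoding.INVERTED_INDEX));
    System.out.println("without index: "
        + hasEncoding(withoutIndex, Encoding.INVERTED_INDEX));
  }
}
```

The point of the design is that the flag rides on a list the format already stores, which is why no thrift change is needed.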
The query logic in CompressedDimensionChunkFileBasedReader already makes the same check:
```
if (CarbonUtil.hasEncoding(dimensionColumnChunk.get(blockIndex).getEncodingList(),
    Encoding.INVERTED_INDEX)) {
  invertedIndexes = CarbonUtil
      .getUnCompressColumnIndex(dimensionColumnChunk.get(blockIndex).getRowIdPageLength(),
          fileReader.readByteArray(filePath,
              dimensionColumnChunk.get(blockIndex).getRowIdPageOffset(),
              dimensionColumnChunk.get(blockIndex).getRowIdPageLength()), numberComressor);
  // get the reverse index
  invertedIndexesReverse = getInvertedReverseIndex(invertedIndexes);
}
```
It also uses Encoding.INVERTED_INDEX to check whether a column uses an inverted index.

## How to test?
Pass all the test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Zhangshunyu/incubator-carbondata fix_index

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/222.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #222

----
commit c27a8a9e33529e53020c477c70d0c079724070d2
Author: Zhangshunyu
Date:   2016-09-08T07:48:03Z

    Save useInvertedIndex info into thrift store

commit 3c8da81869e1a8eca8bdde3d82bc0a9d185bdc3d
Author: Zhangshunyu
Date:   2016-09-08T07:48:15Z

    Save useInvertedIndex info into thrift store

commit b834e4889f5c5eadcee1c232c1a6070df0c1bf60
Author: Zhangshunyu
Date:   2016-09-08T09:46:12Z

    Fix the judge of no_dic_col

commit e8b338c2a7a9e3e28a591bdfe57a5f704f1496d6
Author: Zhangshunyu
Date:   2016-09-08T10:04:20Z

    add commont

----

> Store inverted index in metadata
> --------------------------------
>
>                 Key: CARBONDATA-221
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-221
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: kumar vishal
>            Assignee: zhangshunyu
>
> Store useInvertedIndex in the CarbonData metadata. While reading, it needs to be set in the column schema, and the filter needs to handle the no-inverted-index scenario.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)