Date: Wed, 20 Sep 2017 08:48:00 +0000 (UTC)
From: "Venkata Ramana G (JIRA)" 
To: issues@carbondata.apache.org
Reply-To: dev@carbondata.apache.org
Subject: [jira] [Updated] (CARBONDATA-45) Support MAP type

[ https://issues.apache.org/jira/browse/CARBONDATA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Venkata Ramana G updated CARBONDATA-45:
---------------------------------------
    Description: 
{code:sql}
>>CREATE TABLE table1 (
    deviceInformationId int,
    channelsId string,
    props map<int,string>)
  STORED BY 'org.apache.carbondata.format'
>>insert into table1 select 10,'channel1', map(1,'user1',101,'root')
{code}

Format of the data to be read from CSV, with '$' as the level 1 delimiter and map keys terminated by '#':

{code:sql}
>>load data local inpath '/tmp/data.csv' into table1 options ('COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':', 'COMPLEX_DELIMITER_FOR_KEY'='#')
20,channel2,2#user2$100#usercommon
30,channel3,3#user3$100#usercommon
40,channel4,4#user3$100#usercommon
>>select channelId, props[100] from table1 where deviceInformationId > 10;
20, usercommon
30, usercommon
40, usercommon
>>select channelId, props from table1 where props[2] == 'user2';
20, {2:'user2', 100:'usercommon'}
{code}

The following cases need to be handled:
||Sub feature||Pending activity||Remarks||
|Basic Maptype support|Develop|Create table DDL, load map data from CSV, select * from maptable|
|Maptype lookup in projection and filter|Develop|Projections and filters need execution at Spark|
|NULL values, UDFs, Describe support|Develop| |
|Compaction support|Test + fix|As compaction works at the byte level, no changes are required; test cases need to be added|
|Insert into table|Develop|Source table data containing Map data needs conversion from the Spark datatype to string, as Carbon takes a string as the input row|
|Support DDL for Map fields in Dictionary Include and Dictionary Exclude|Develop|CarbonDictionaryDecoder also needs to handle the same|
|Support multilevel Map|Develop|Currently the DDL is validated to allow only 2 levels; remove this restriction|
|Support Map value as a measure|Develop|Currently array and struct support only dimensions; this needs to change|
|Support Alter table to add and remove a Map column|Develop|Implement the DDL; requires default-value handling|
|Projection of Map lookup pushed down to Carbon|Develop|An optimization for when a large number of values are present in the Map|
|Filter on Map lookup pushed down to Carbon|Develop|An optimization for when a large number of values are present in the Map|
|Update Map values|Develop|Update a map value|

h4. Design suggestion:
A Map can internally be stored as Array<Struct<key,value>>, so that conversion to the Map data type is required only when handing data to Spark. The schema will have a new column of map type, similar to Array.
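For illustration, the complex-delimiter encoding shown above ('$' between map entries, '#' between key and value) can be decoded with a small sketch. `parse_map_field` is a hypothetical helper for explanation only, not part of CarbonData's code:

```python
def parse_map_field(raw, entry_delim='$', key_delim='#'):
    """Decode a CSV map field such as '2#user2$100#usercommon' into a dict,
    mirroring COMPLEX_DELIMITER_LEVEL_1 ('$') and COMPLEX_DELIMITER_FOR_KEY
    ('#') from the load options above."""
    result = {}
    for entry in raw.split(entry_delim):
        key, value = entry.split(key_delim, 1)
        result[int(key)] = value
    return result

# Third field of the sample row "20,channel2,2#user2$100#usercommon":
print(parse_map_field('2#user2$100#usercommon'))
# {2: 'user2', 100: 'usercommon'}
```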
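The design suggestion, storing a Map as an array of key/value structs and converting back to a map only when returning rows to Spark, can be sketched as follows. This is an illustrative data model only, not CarbonData's actual implementation:

```python
def map_to_struct_array(m):
    # Store a map internally as Array<Struct<key,value>>: one
    # {key, value} struct per entry, sorted by key so the on-disk
    # layout is deterministic.
    return [{'key': k, 'value': v} for k, v in sorted(m.items())]

def struct_array_to_map(arr):
    # Convert the internal array form back to a map when handing
    # rows to Spark.
    return {e['key']: e['value'] for e in arr}

props = {2: 'user2', 100: 'usercommon'}
print(map_to_struct_array(props))
# [{'key': 2, 'value': 'user2'}, {'key': 100, 'value': 'usercommon'}]
```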
> Support MAP type
> ----------------
>
>                 Key: CARBONDATA-45
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-45
>             Project: CarbonData
>          Issue Type: New Feature
>          Components: core, sql
>            Reporter: cen yuhai
>            Assignee: Venkata Ramana G
>             Fix For: 1.3.0

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)