carbondata-issues mailing list archives

From "Liang Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CARBONDATA-888) Dictionary include / exclude option in dataframe writer
Date Sun, 09 Apr 2017 03:45:41 GMT

    [ https://issues.apache.org/jira/browse/CARBONDATA-888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962030#comment-15962030 ]

Liang Chen commented on CARBONDATA-888:
---------------------------------------

Sure, please let me know your JIRA account email ID, and I will give you the rights.

> Dictionary include / exclude option in dataframe writer
> -------------------------------------------------------
>
>                 Key: CARBONDATA-888
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-888
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: spark-integration
>    Affects Versions: 1.2.0-incubating
>         Environment: HDP 2.5, Spark 1.6
>            Reporter: Sanoj MG
>            Priority: Minor
>             Fix For: 1.2.0-incubating
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> While creating a CarbonData table from a dataframe, it is currently not possible to specify
> the columns that need to be included in or excluded from the dictionary. An option is required
> to specify this, as below:
> df.write.format("carbondata")
>   .option("tableName", "test")
>   .option("compress","true")
>   .option("dictionary_include","intcol1,intcol2")
>   .option("dictionary_exclude","stringcol1,stringcol2")
>   .mode(SaveMode.Overwrite)
>   .save()
> We have a lot of integer columns that are dimensions. dataframe.save is used to quickly
> create tables instead of writing DDLs, and it would be nice to have this feature for running
> POCs.
>  
>  
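
For context, the DDL route that the description wants to avoid sets the same choice through the DICTIONARY_INCLUDE and DICTIONARY_EXCLUDE table properties. The sketch below is only an illustration of how the two requested writer options might map onto that DDL. The object DictionaryDdlSketch and the function createTableDdl are hypothetical names and are not part of CarbonData; the column types are assumed, and the code only builds a string rather than running anything against Spark.

```scala
// Hypothetical helper, not part of CarbonData: it only builds a DDL string.
object DictionaryDdlSketch {

  // Render a CREATE TABLE statement that carries the dictionary table properties.
  def createTableDdl(
      tableName: String,
      columns: Seq[(String, String)], // (column name, SQL type); types are assumed
      dictionaryInclude: Seq[String] = Nil,
      dictionaryExclude: Seq[String] = Nil): String = {
    val columnList = columns
      .map { case (name, sqlType) => name + " " + sqlType }
      .mkString(", ")

    // Emit a property only when at least one column is listed for it.
    val properties = Seq(
      "DICTIONARY_INCLUDE" -> dictionaryInclude,
      "DICTIONARY_EXCLUDE" -> dictionaryExclude
    ).collect {
      case (key, cols) if cols.nonEmpty =>
        val joined = cols.mkString(",")
        s"'$key'='$joined'"
    }

    val propertyClause =
      if (properties.isEmpty) ""
      else properties.mkString(" TBLPROPERTIES (", ", ", ")")

    s"CREATE TABLE IF NOT EXISTS $tableName ($columnList) STORED BY 'carbondata'$propertyClause"
  }
}
```

Calling createTableDdl with the table name "test", the integer columns as dictionaryInclude and the string columns as dictionaryExclude would yield a statement that could be passed to a SQL context, which is the manual step the requested dataframe writer options are meant to replace.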



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
