carbondata-issues mailing list archives

From "Liang Chen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CARBONDATA-465) Spark streaming dataframe support
Date Tue, 06 Dec 2016 11:14:58 GMT

     [ https://issues.apache.org/jira/browse/CARBONDATA-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liang Chen updated CARBONDATA-465:
----------------------------------
    Description: 
CarbonData 1.0.0 supports loading data through the Spark DataFrame API. One limitation is that
Kettle is still required, because DataFrameLoaderRDD still depends on Kettle. We provide
NewDataFrameLoaderRDD to load data with the new flow.

We also discovered some bugs:

1. CarbonMetastoreCatalog.createTableFromThrift

```
/**
     * A schemaFilePath that starts with file:// will not create the meta files successfully,
     * while thriftWriter will not complain.
     * This causes confusing errors, e.g. "No table found".
     */
    val thriftWriter = new ThriftWriter(schemaFilePath, false)
    thriftWriter.open()
    thriftWriter.write(thriftTableInfo)
    thriftWriter.close()
``` 

2. Some exceptions are raised even when useKettle is set to false.
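
As an illustration of bug 1, here is a minimal sketch of one possible workaround: strip the file: scheme from the path before it reaches ThriftWriter. The object SchemaPathSketch and the helper toLocalPath are hypothetical names, not part of CarbonData, and the sketch assumes the input is a well-formed URI (no unescaped spaces). Whether this matches the root cause inside ThriftWriter is an assumption.

```scala
import java.net.URI

object SchemaPathSketch {
  // Hypothetical helper, not a CarbonData API.
  // Turns "file:///tmp/x" into "/tmp/x"; any other path (plain local
  // paths, hdfs://... and so on) is returned unchanged.
  def toLocalPath(schemaFilePath: String): String =
    if (schemaFilePath.startsWith("file:")) new URI(schemaFilePath).getPath
    else schemaFilePath
}
```

The result of SchemaPathSketch.toLocalPath(schemaFilePath) would then be passed to the ThriftWriter constructor in place of the raw path. Note that a two-slash form such as file://tmp/x is parsed with "tmp" as the URI authority, so only the file:/ and file:/// forms map cleanly to a local path.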

  was:
CarbonData 0.3.0 supports loading data through the Spark DataFrame API. One limitation is that
Kettle is still required, because DataFrameLoaderRDD still depends on Kettle. We provide
NewDataFrameLoaderRDD to load data with the new flow.

We also discovered some bugs:

1. CarbonMetastoreCatalog.createTableFromThrift

```
/**
     * A schemaFilePath that starts with file:// will not create the meta files successfully,
     * while thriftWriter will not complain.
     * This causes confusing errors, e.g. "No table found".
     */
    val thriftWriter = new ThriftWriter(schemaFilePath, false)
    thriftWriter.open()
    thriftWriter.write(thriftTableInfo)
    thriftWriter.close()
``` 

2. Some exceptions are raised even when useKettle is set to false.


> Spark streaming dataframe support
> ---------------------------------
>
>                 Key: CARBONDATA-465
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-465
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: data-load
>    Affects Versions: 1.0.0-incubating
>            Reporter: WilliamZhu
>            Assignee: WilliamZhu
>            Priority: Minor
>             Fix For: 1.0.0-incubating
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> CarbonData 1.0.0 supports loading data through the Spark DataFrame API. One limitation is that
> Kettle is still required, because DataFrameLoaderRDD still depends on Kettle. We provide
> NewDataFrameLoaderRDD to load data with the new flow.
> We also discovered some bugs:
> 1. CarbonMetastoreCatalog.createTableFromThrift
> ```
> /**
>      * A schemaFilePath that starts with file:// will not create the meta files successfully,
>      * while thriftWriter will not complain.
>      * This causes confusing errors, e.g. "No table found".
>      */
>     val thriftWriter = new ThriftWriter(schemaFilePath, false)
>     thriftWriter.open()
>     thriftWriter.write(thriftTableInfo)
>     thriftWriter.close()
> ``` 
> 2. Some exceptions are raised even when useKettle is set to false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
