phoenix-dev mailing list archives

From "Josh Mahonin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-2745) The spark savemode not work correctly
Date Fri, 04 Mar 2016 16:56:40 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180137#comment-15180137 ]

Josh Mahonin commented on PHOENIX-2745:
---------------------------------------

Very interesting [~lichenglingl]. Do you have any examples of other data sources exhibiting
this 'drop' -> 'reload' behaviour for SaveMode.Overwrite?

When I'd first written the integration, 'Overwrite' seemed to me the most correct, based on
this documentation of the SaveMode class [1]:
"if data/table already exists, existing data is expected to be overwritten by the contents
of the DataFrame"

However, if other data sources use 'Append' to the same effect, it might be best to use that
as the default behaviour. Follow-up work would then be to look at doing a DROP or DELETE in
the case of SaveMode.Overwrite.

[1] https://spark.apache.org/docs/1.6.0/api/java/org/apache/spark/sql/SaveMode.html
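To illustrate the difference being discussed: Phoenix writes rows via UPSERT keyed on the primary key, so a save merges into the existing table instead of replacing it, which is why Overwrite currently behaves like Append. A minimal Python sketch of the two semantics (a hypothetical table modeled as a dict keyed by primary key, not actual connector code):

```python
# Rows already in the table, keyed by primary key.
existing = {1: "old-a", 2: "old-b", 3: "old-c"}
# The DataFrame being saved: updates key 2, adds key 4.
new_frame = {2: "new-b", 4: "new-d"}

def upsert(table, rows):
    """Phoenix-style write: insert-or-update by primary key.

    Rows with matching keys are replaced; all other existing rows survive.
    """
    merged = dict(table)
    merged.update(rows)
    return merged

def drop_and_reload(table, rows):
    """Drop-then-reload semantics: only the new rows remain."""
    return dict(rows)

print(upsert(existing, new_frame))
# {1: 'old-a', 2: 'new-b', 3: 'old-c', 4: 'new-d'}  -- Overwrite acting like Append
print(drop_and_reload(existing, new_frame))
# {2: 'new-b', 4: 'new-d'}  -- what the reporter expects from Overwrite
```

The sketch shows why a DROP or DELETE step would be needed before the UPSERT to make SaveMode.Overwrite match the documented behaviour.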

> The spark savemode not work correctly
> -------------------------------------
>
>                 Key: PHOENIX-2745
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2745
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>            Reporter: lichenglin
>
> When saving a DataFrame to Phoenix with the mode SaveMode.Overwrite,
> Spark should drop the table first and then load the new DataFrame,
> but Phoenix just replaces the old data with the new data according to the primary key,
> so the old data still exists.
> Overwrite actually works as Append.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
