spark-issues mailing list archives

From "Sean Owen (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-20794) Spark show() command on dataset does not retrieve consistent rows from DASHDB data source
Date Thu, 18 May 2017 07:26:04 GMT

     [ https://issues.apache.org/jira/browse/SPARK-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-20794.
-------------------------------
    Resolution: Invalid

This is a question, so it belongs on the mailing list rather than in JIRA. I think it is a
DASHDB question: show() just picks rows from the first partition of the underlying data source.
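
The point about show() can be illustrated with a minimal sketch. It is not taken from the issue: it assumes Spark 2.x on the classpath, a local SparkSession, and a small in-memory stand-in for the DASHDB table built from a few rows of the repro output quoted below. The object name DeterministicShow and the helper firstOrderNumbers are hypothetical, and treating CUST_ORDER_NUMBER as a usable sort key is an assumption. The idea is that an explicit ordering makes the "first n rows" well defined, whereas a bare show(n) returns whatever the source hands back first.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical example object; not part of Spark or of the original report.
object DeterministicShow {

  // Returns the n smallest CUST_ORDER_NUMBER values.
  // The explicit orderBy makes the result independent of partition layout.
  def firstOrderNumbers(df: DataFrame, n: Int): Seq[Int] =
    df.orderBy("CUST_ORDER_NUMBER")
      .select("CUST_ORDER_NUMBER")
      .take(n)
      .map(_.getInt(0))
      .toSeq

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("deterministic-show")
      .master("local[2]")
      .getOrCreate()
    import spark.implicits._

    // In-memory stand-in for the DASHDB table (assumption: three columns only).
    val orders = Seq(
      ("Eyewear", 107861, "Rutland"),
      ("Lanterns", 189003, "Sydney"),
      ("Cooking Gear", 107863, "Sydney"),
      ("Tools", 112835, "Portsmouth"),
      ("Cooking Gear", 193902, "Jacksonville"),
      ("Putters", 112839, "Jacksonville")
    ).toDF("PRODUCT_TYPE", "CUST_ORDER_NUMBER", "CITY")

    // No ordering requested: the rows shown depend on how the source returns data.
    orders.show(5)

    // Ordering requested: the same five rows are shown on every run.
    orders.orderBy("CUST_ORDER_NUMBER").show(5)

    println(firstOrderNumbers(orders, 3))

    spark.stop()
  }
}
```

Sorting the whole table just to preview it can be expensive on large data, so this is a way to get repeatable output, not a general recommendation.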

> Spark show() command on dataset does not retrieve consistent rows from DASHDB data source
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-20794
>                 URL: https://issues.apache.org/jira/browse/SPARK-20794
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Core
>    Affects Versions: 2.0.0
>            Reporter: Sahana HA
>            Priority: Minor
>
> When the user creates a DataFrame from the DASHDB data source (a relational database) and
> executes df.show(5), it returns different rows on each execution. We are aware that show(5)
> picks the first 5 rows from an existing partition, and hence the result is not guaranteed
> to be consistent across executions.
> However, when we run the same show(5) command against S3 storage or the Bluemix object store
> (non-relational data sources), we always get the same rows, in the same order, on every
> execution.
> We would like to confirm why DASHDB differs from other data sources such as S3 and the
> Bluemix object store. Is the issue in Spark or in DASHDB alone? Or do all relational data
> sources show this inconsistent-row behavior?
> Repro snippet:
> // Load the data from DASHDB
> val dashdb = sqlContext.read.format("packageName").options(dashdbreadOptions).load
> // execution #1
> dashdb.show(5)
> +--------------------+------------+-----------------+-------+-----+-------------+------+---+--------------+------------+
> |        PRODUCT_LINE|PRODUCT_TYPE|CUST_ORDER_NUMBER|   CITY|STATE|      COUNTRY|GENDER|AGE|MARITAL_STATUS|  PROFESSION|
> +--------------------+------------+-----------------+-------+-----+-------------+------+---+--------------+------------+
> |Personal Accessories|     Eyewear|           107861|Rutland|   VT|United States|     F| 39|       Married|       Sales|
> |   Camping Equipment|    Lanterns|           189003| Sydney|  NSW|    Australia|     F| 20|        Single| Hospitality|
> |   Camping Equipment|Cooking Gear|           107863| Sydney|  NSW|    Australia|     F| 20|        Single| Hospitality|
> |Personal Accessories|     Eyewear|           189005|Villach|   NA|      Austria|     F| 37|       Married|Professional|
> |Personal Accessories|     Eyewear|           107865|Villach|   NA|      Austria|     F| 37|       Married|Professional|
> +--------------------+------------+-----------------+-------+-----+-------------+------+---+--------------+------------+
> only showing top 5 rows
> // execution #2
> dashdb.show(5)
> +--------------------+------------+-----------------+------------+-----+--------------+------+---+--------------+-----------+
> |        PRODUCT_LINE|PRODUCT_TYPE|CUST_ORDER_NUMBER|        CITY|STATE|       COUNTRY|GENDER|AGE|MARITAL_STATUS| PROFESSION|
> +--------------------+------------+-----------------+------------+-----+--------------+------+---+--------------+-----------+
> |Mountaineering Eq...|       Tools|           112835|  Portsmouth|   NA|United Kingdom|     M| 24|        Single|      Other|
> |   Camping Equipment|Cooking Gear|           193902|Jacksonville|   FL| United States|     F| 22|        Single|Hospitality|
> |   Camping Equipment|       Packs|           112837|Jacksonville|   FL| United States|     F| 22|        Single|Hospitality|
> |Mountaineering Eq...|        Rope|           193904|Jacksonville|   FL| United States|     F| 31|       Married|      Other|
> |      Golf Equipment|     Putters|           112839|Jacksonville|   FL| United States|     F| 31|       Married|      Other|
> +--------------------+------------+-----------------+------------+-----+--------------+------+---+--------------+-----------+
> only showing top 5 rows

