Date: Thu, 18 May 2017 07:20:04 +0000 (UTC)
From: "Sahana HA (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Created] (SPARK-20794) Spark show() command on dataset does not retrieve consistent rows from DASHDB data source

Sahana HA created SPARK-20794:
---------------------------------

             Summary: Spark show() command on dataset does not retrieve consistent rows from DASHDB data source
                 Key: SPARK-20794
                 URL: https://issues.apache.org/jira/browse/SPARK-20794
             Project: Spark
          Issue Type: Question
          Components: Spark Core
    Affects Versions: 2.0.0
            Reporter: Sahana HA
            Priority: Minor

When the user creates a DataFrame from a DASHDB data source (a relational database) and executes df.show(5), it returns a different set of rows on each execution. We are aware that show(5) picks the first 5 rows from an existing partition and is therefore not guaranteed to be consistent across executions.

However, when we run the same show(5) command against S3 storage or the Bluemix object store (non-relational data sources), we always get the same rows, in the same order, on every execution.

We just wanted to confirm why DASHDB behaves differently from other data sources such as S3 or the Bluemix object store. Is the issue with Spark, or with DASHDB alone? Or does this inconsistent-row behavior occur for all relational data sources?

Repro snippet:

-- Load the data from dashdb
val dashdb = sqlContext.read.format("packageName").options(dashdbreadOptions).load

-- execution #1
dashdb.show(5)
+--------------------+------------+-----------------+-------+-----+-------------+------+---+--------------+------------+
|        PRODUCT_LINE|PRODUCT_TYPE|CUST_ORDER_NUMBER|   CITY|STATE|      COUNTRY|GENDER|AGE|MARITAL_STATUS|  PROFESSION|
+--------------------+------------+-----------------+-------+-----+-------------+------+---+--------------+------------+
|Personal Accessories|     Eyewear|           107861|Rutland|   VT|United States|     F| 39|       Married|       Sales|
|   Camping Equipment|    Lanterns|           189003| Sydney|  NSW|    Australia|     F| 20|        Single| Hospitality|
|   Camping Equipment|Cooking Gear|           107863| Sydney|  NSW|    Australia|     F| 20|        Single| Hospitality|
|Personal Accessories|     Eyewear|           189005|Villach|   NA|      Austria|     F| 37|       Married|Professional|
|Personal Accessories|     Eyewear|           107865|Villach|   NA|      Austria|     F| 37|       Married|Professional|
+--------------------+------------+-----------------+-------+-----+-------------+------+---+--------------+------------+
only showing top 5 rows

-- execution #2
dashdb.show(5)
+--------------------+------------+-----------------+------------+-----+--------------+------+---+--------------+-----------+
|        PRODUCT_LINE|PRODUCT_TYPE|CUST_ORDER_NUMBER|        CITY|STATE|       COUNTRY|GENDER|AGE|MARITAL_STATUS| PROFESSION|
+--------------------+------------+-----------------+------------+-----+--------------+------+---+--------------+-----------+
|Mountaineering Eq...|       Tools|           112835|  Portsmouth|   NA|United Kingdom|     M| 24|        Single|      Other|
|   Camping Equipment|Cooking Gear|           193902|Jacksonville|   FL| United States|     F| 22|        Single|Hospitality|
|   Camping Equipment|       Packs|           112837|Jacksonville|   FL| United States|     F| 22|        Single|Hospitality|
|Mountaineering Eq...|        Rope|           193904|Jacksonville|   FL| United States|     F| 31|       Married|      Other|
|      Golf Equipment|     Putters|           112839|Jacksonville|   FL| United States|     F| 31|       Married|      Other|
+--------------------+------------+-----------------+------------+-----+--------------+------+---+--------------+-----------+
only showing top 5 rows

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org