spark-issues mailing list archives

From "Keiji Yoshida (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-26335) Add an option for Dataset#show not to care about wide characters when padding them
Date Wed, 12 Dec 2018 07:53:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-26335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keiji Yoshida updated SPARK-26335:
----------------------------------
    Description: 
h2. Issue

https://issues.apache.org/jira/browse/SPARK-25108 makes Dataset#show account for wide characters
when padding cells. That makes the output of Dataset#show easier for humans to read. On the other
hand, it makes the output impossible for programs to parse, because each cell's length in
characters can differ from its header's length. My company develops and manages
a Jupyter/Apache Zeppelin-like visualization tool named "OASIS" ([https://databricks.com/session/oasis-collaborative-data-analysis-platform-using-apache-spark]).
In this application, the output of Dataset#show from a Scala or Python process is parsed and
rendered as an HTML table.
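
The mismatch described above can be reproduced outside Spark. The following is a minimal Python sketch, not Spark's actual implementation: `display_width` and `pad_left` are hypothetical helpers, and using `unicodedata.east_asian_width` to detect wide characters is an assumption made for illustration.

```python
import unicodedata


def display_width(text: str) -> int:
    """Count terminal columns, treating East Asian wide/full-width characters as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1 for ch in text
    )


def pad_left(cell: str, col_width: int, wide_aware: bool) -> str:
    """Right-align `cell` in `col_width` columns (hypothetical helper)."""
    used = display_width(cell) if wide_aware else len(cell)
    return " " * max(col_width - used, 0) + cell


header, value = "name", "日本"

# Wide-aware padding: the column is 4 display columns wide.
col_width = max(display_width(header), display_width(value))
header_cell = pad_left(header, col_width, wide_aware=True)  # "name"
aware_cell = pad_left(value, col_width, wide_aware=True)    # "日本"

# Character-count padding: the column is 4 characters wide.
plain_width = max(len(header), len(value))
plain_cell = pad_left(value, plain_width, wide_aware=False)  # "  日本"

print(len(header_cell), len(aware_cell), len(plain_cell))  # 4 2 4
```

With wide-aware padding the header cell and the data cell line up on screen but differ in character length (4 versus 2), which is what trips up a parser that relies on fixed character offsets.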
h2. Solution

Add an option for Dataset#show not to care about wide characters when padding them.
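
To illustrate why such an option would help a program, here is a rough Python sketch of an offset-based parser. It is a hypothetical parser written for this example, not the one OASIS uses, and the table rows are hand-written strings rather than real Dataset#show output.

```python
def column_spans(border: str):
    """Return (start, end) character offsets of each cell from a border like '+----+---+'."""
    pluses = [i for i, ch in enumerate(border) if ch == "+"]
    return [(a + 1, b) for a, b in zip(pluses, pluses[1:])]


def parse_row(row: str, spans):
    """Slice a row at the offsets taken from the border line and trim the padding."""
    return [row[a:b].strip() for a, b in spans]


border = "+----+---+"
spans = column_spans(border)  # [(1, 5), (6, 9)]

# Row padded by character count: every row is as long as the border line.
char_padded = "|  日本|  1|"
# Row padded by display width: the wide characters consume the padding,
# so the row is shorter than the border line.
width_padded = "|日本|  1|"

print(parse_row(char_padded, spans))   # ['日本', '1']
print(parse_row(width_padded, spans))  # ['日本|', '1|']
```

When cells are padded by character count, the offsets derived from the border line match every row. When they are padded by display width, the same offsets cut across cell boundaries.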

  was:
https://issues.apache.org/jira/browse/SPARK-25108 makes Dataset#show care about wide characters
when padding them. That is useful for humans to read a result of Dataset#show. On the other
hand, that makes it impossible for programs to parse a result of Dataset#show because each
cell's length can be difference from its header's length. My company develops and manages
a Jupyter/Apache Zeppelin-like visualization tool named "OASIS" ([https://databricks.com/session/oasis-collaborative-data-analysis-platform-using-apache-spark]).
On this application, a result of Dataset#show is parsed to visualize it as an HTML table
format.

So, it is preferable to add an option for Dataset#show not to care about wide characters
when padding them by adding a parameter such as "fixedColLength" to Dataset#show.


> Add an option for Dataset#show not to care about wide characters when padding them
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-26335
>                 URL: https://issues.apache.org/jira/browse/SPARK-26335
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Keiji Yoshida
>            Priority: Major
>         Attachments: Screen Shot 2018-12-11 at 17.53.54.png
>
>
> h2. Issue
> https://issues.apache.org/jira/browse/SPARK-25108 makes Dataset#show account for wide
> characters when padding cells. That makes the output of Dataset#show easier for humans to read.
> On the other hand, it makes the output impossible for programs to parse, because each cell's
> length in characters can differ from its header's length. My company develops
> and manages a Jupyter/Apache Zeppelin-like visualization tool named "OASIS" ([https://databricks.com/session/oasis-collaborative-data-analysis-platform-using-apache-spark]).
> In this application, the output of Dataset#show from a Scala or Python process is parsed and
> rendered as an HTML table.
> h2. Solution
> Add an option for Dataset#show not to care about wide characters when padding them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

