spark-issues mailing list archives

From "A Bradbury (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-23792) Documentation improvements for datetime functions
Date Sun, 25 Mar 2018 08:52:00 GMT
A Bradbury created SPARK-23792:
----------------------------------

             Summary: Documentation improvements for datetime functions
                 Key: SPARK-23792
                 URL: https://issues.apache.org/jira/browse/SPARK-23792
             Project: Spark
          Issue Type: Documentation
          Components: Documentation, SQL
    Affects Versions: 2.3.0
            Reporter: A Bradbury


Added details about the supported column input types, the column return type, behaviour on
invalid input, supporting examples and clarifications to the datetime functions in `org.apache.spark.sql.functions`
for Java/Scala. 

These changes stemmed from confusion over the behaviour of the `date_add` method. On first use
I thought it would add the specified number of days to the input timestamp, but it also truncated (cast)
the input timestamp to a date, losing the time part.
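
To make the difference concrete, here is a minimal sketch in plain `java.time` that models the two behaviours described above. It is not Spark code: the class and method names (`DateAddModel`, `dateAdd`, `addDaysKeepingTime`) are hypothetical and only mirror the behaviour as reported in this issue.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

// Hypothetical model of the behaviour described in this issue; not Spark's implementation.
final class DateAddModel {

    // Reported behaviour: the timestamp is first truncated to a date,
    // so the time of day is lost before the days are added.
    static LocalDate dateAdd(LocalDateTime start, int days) {
        return start.toLocalDate().plusDays(days);
    }

    // What the reporter initially expected: the time of day is preserved.
    static LocalDateTime addDaysKeepingTime(LocalDateTime start, int days) {
        return start.plusDays(days);
    }
}
```

For an input of 2018-03-25 08:52:00 and 7 days, `dateAdd` yields the date 2018-04-01, while `addDaysKeepingTime` yields 2018-04-01 08:52:00.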

Some examples:
 * Noted that the week definition for the `dayofweek` method starts on a Sunday
 * Corrected documentation for methods such as `last_day` that only listed one type of input
i.e. "date column" changed to "date, timestamp or string"
 * Renamed the parameters of the `months_between` method to match those of the `datediff`
method and to indicate which parameter is expected to be before the other chronologically
 * `from_unixtime` documentation referenced the "given format" when there was no format parameter
 * Documentation for `to_timestamp` methods detailed that a unix timestamp in seconds would
be returned (implying 1521926327) when they would actually return the input cast to a timestamp
type 
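
The `to_timestamp` point above comes down to two representations of the same instant: a count of seconds since the Unix epoch versus a timestamp-typed value. The sketch below shows that distinction with plain `java.time`; it is not Spark code, and the names `UnixTimestampModel`, `asUnixSeconds` and `asTimestamp` are hypothetical.

```java
import java.time.Instant;

// Hypothetical illustration of "unix seconds" versus "timestamp value"; not Spark's API.
final class UnixTimestampModel {

    // A unix timestamp in seconds: a plain number such as 1521926327.
    static long asUnixSeconds(String isoText) {
        return Instant.parse(isoText).getEpochSecond();
    }

    // A timestamp-typed value: the same instant, kept as a date-time object.
    static Instant asTimestamp(String isoText) {
        return Instant.parse(isoText);
    }
}
```

The number 1521926327 mentioned above corresponds to 2018-03-24T21:18:47Z, so `asUnixSeconds("2018-03-24T21:18:47Z")` gives that number, whereas `asTimestamp` keeps the value as a timestamp.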

Some observations:
 * The first day of the week according to the `dayofweek` method is a Sunday, but according to the
`weekofyear` method it is a Monday
 * The `datediff` method returns an integer value, even with timestamp input, whereas the `months_between`
method returns a double, which seems inconsistent
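
The two week definitions in the first observation can be sketched with plain `java.time`. This is an illustration only, not Spark code: the names `WeekDefinitions`, `sundayFirstDayOfWeek` and `isoWeekOfYear` are hypothetical, and using ISO-8601 week numbering for the Monday-based week is an assumption.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;

// Hypothetical model of the two week conventions; not Spark's implementation.
final class WeekDefinitions {

    // Sunday-first numbering: Sunday = 1, Monday = 2, ..., Saturday = 7.
    // java.time numbers Monday = 1 ... Sunday = 7, hence the remapping.
    static int sundayFirstDayOfWeek(LocalDate date) {
        return date.getDayOfWeek().getValue() % 7 + 1;
    }

    // Monday-first week numbering, assumed here to follow ISO-8601.
    static int isoWeekOfYear(LocalDate date) {
        return date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    }
}
```

Under these definitions, Sunday 2018-03-25 is day 1 of a Sunday-first week but still falls in ISO week 12, while Monday 2018-03-26 is day 2 and starts ISO week 13. A Sunday and the Monday that follows it therefore share a week under one convention and not under the other.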

 



