spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-23291) SparkR : substr : In SparkR dataframe , starting and ending position arguments in "substr" is giving wrong result when the position is greater than 1
Date Sun, 06 May 2018 09:05:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-23291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465030#comment-16465030
] 

Apache Spark commented on SPARK-23291:
--------------------------------------

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/21250

> SparkR : substr : In SparkR dataframe , starting and ending position arguments in "substr"
is giving wrong result  when the position is greater than 1
> ------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23291
>                 URL: https://issues.apache.org/jira/browse/SPARK-23291
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 2.1.2, 2.2.0, 2.2.1, 2.3.0
>            Reporter: Narendra
>            Assignee: Liang-Chi Hsieh
>            Priority: Major
>             Fix For: 2.4.0
>
>
> Defect Description :
> -----------------------------
> For example ,an input string "2017-12-01" is read into a SparkR dataframe "df" with column
name "col1".
>  The target is to create a a new column named "col2" with the value "12" which is inside
the string ."12" can be extracted with "starting position" as "6" and "Ending position" as
"7"
>  (the starting position of the first character is considered as "1" )
> But,the current code that needs to be written is :
>  
>  df <- withColumn(df,"col2",substr(df$col1,7,8)))
> Observe that the first argument in the "substr" API , which indicates the 'starting position',
is mentioned as "7" 
>  Also, observe that the second argument in the "substr" API , which indicates the 'ending
position', is mentioned as "8"
> i.e the number that should be mentioned to indicate the position should be the "actual
position + 1"
> Expected behavior :
> ----------------------------
> The code that needs to be written is :
>  
>  df <- withColumn(df,"col2",substr(df$col1,6,7)))
> Note :
> -----------
>  This defect is observed with only when the starting position is greater than 1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org


Mime
View raw message