spark-reviews mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [spark] cloud-fan commented on issue #26626: [SPARK-29986][SQL] Introduce java like string trim to UTF8String
Date Thu, 21 Nov 2019 15:21:48 GMT
cloud-fan commented on issue #26626: [SPARK-29986][SQL] Introduce java like string trim to UTF8String
URL: https://github.com/apache/spark/pull/26626#issuecomment-557132578
 
 
   From the SQL standard:
   ```
   let SRC be <trim source>. TRIM ( SRC ) is equivalent to TRIM ( BOTH ' ' FROM SRC ).
   ```
   ```
   cast specification
   If SD is character string, then SV is replaced by SV with any leading or trailing <space>s removed.
   ```
   Some related information:
   ```
   L( <left bracket> <colon> SPACE <colon> <right bracket> )
   is the set of all character strings of length 1 (one) that are the <space> character.
   r) L( <left bracket> <colon> WHITESPACE <colon> <right bracket> )
   is the set of all character strings of length 1 (one) that are white space characters.
   ...
   white space
   consecutive sequences of one or more characters that have no glyphs
   ```
   
   So <space> means only `' '`, while white space means all characters whose ASCII code is <= 32. By the standard, `trim` and `cast` should remove only spaces.
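
To make the distinction above concrete, here is a minimal sketch in plain Java. It is not the `UTF8String` code from the PR; the class `TrimSketch` and the helpers `trimSpaces` and `trimWhitespace` are hypothetical names used only for illustration. It contrasts removing only `' '` with removing every character whose code is <= 32, which is also what `String.trim` does.

```java
class TrimSketch {
    // Hypothetical helper: strips only leading and trailing ' ' (U+0020),
    // which is the behavior the quoted standard text describes for TRIM.
    static String trimSpaces(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && s.charAt(start) == ' ') {
            start++;
        }
        while (end > start && s.charAt(end - 1) == ' ') {
            end--;
        }
        return s.substring(start, end);
    }

    // Hypothetical helper: strips leading and trailing characters whose
    // code is <= 32, the same rule java.lang.String.trim applies.
    static String trimWhitespace(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && s.charAt(start) <= ' ') {
            start++;
        }
        while (end > start && s.charAt(end - 1) <= ' ') {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        String src = " \t1.5\n ";
        // Only the outer spaces go away; the tab and newline remain.
        System.out.println("[" + trimSpaces(src) + "]");
        // Everything with code <= 32 at either end goes away.
        System.out.println("[" + trimWhitespace(src) + "]");
    }
}
```

With the input `" \t1.5\n "`, the first helper leaves the tab and newline in place, while the second returns just `1.5`.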
   
   However, it seems that most DBs don't follow the cast part, and because we rely on `Double.valueOf`, which itself ignores leading and trailing whitespace, this behavior is hard to change. I think it's OK to trim white space in cast.
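
The dependence on `Double.valueOf` mentioned above can be sketched as follows. This is a simplified stand-in, not Spark's actual cast implementation; the class `CastSketch` and the method `castToDouble` are hypothetical. It only illustrates that `Double.valueOf` already accepts input padded with tabs or newlines, so a cast built on it trims white space rather than only spaces.

```java
class CastSketch {
    // Hypothetical stand-in for a string-to-double cast built on
    // Double.valueOf. Returns null when the input is not a valid number.
    static Double castToDouble(String s) {
        try {
            // Double.valueOf ignores leading and trailing whitespace,
            // as if String.trim had been applied first.
            return Double.valueOf(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Padded with a tab, a newline, and spaces: still parses.
        System.out.println(castToDouble(" \t1.5\n "));
        // Whitespace inside the number is not tolerated.
        System.out.println(castToDouble("1 .5"));
    }
}
```

Restricting the cast to the standard's space-only rule would therefore need an extra check before the call, which is the cost the comment weighs against simply accepting white space.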

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

