spark-reviews mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [spark] srowen commented on a change in pull request #26027: [SPARK-24540][SQL] Support for multiple delimiter in Spark CSV read
Date Fri, 04 Oct 2019 19:51:10 GMT
srowen commented on a change in pull request #26027: [SPARK-24540][SQL] Support for multiple delimiter in Spark CSV read
URL: https://github.com/apache/spark/pull/26027#discussion_r331660072
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtils.scala
 ##########
 @@ -79,4 +81,39 @@ object CSVExprUtils {
         throw new IllegalArgumentException(s"Delimiter cannot be more than one character: $str")
     }
   }
+
+  /**
+   * Helper method that converts string representation of a character sequence to actual
+   * delimiter characters. The input is processed in "chunks", and each chunk is converted
+   * by calling [[CSVExprUtils.toChar()]].  A chunk is either:
+   * <ul>
+   *   <li>a backslash followed by another character</li>
+   *   <li>a non-backslash character by itself</li>
+   * </ul>
+   * , in that order of precedence. The result of the converting all chunks is returned as
+   * a [[String]]
+   *
+   * @param str the string representing the sequence of separator characters
+   * @return a [[String]] representing the multi-character delimiter
+   * @throws IllegalArgumentException if any of the individual input chunks are illegal
+   */
+  def toDelimiterStr(str: String): String = {
+    import scala.collection.mutable.StringBuilder
+    var idx = 0
+
+    val delimiters = new StringBuilder()
 
 Review comment:
   To clarify, this doesn't allow multiple delimiters, only a single multi-character delimiter,
right? If so, I wonder whether the PR description and docs should be updated to say that. This
val should probably just be named `delimiter`, right?
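
For context, the chunking rule described in the Scaladoc above can be sketched roughly as follows. This is not the code from the pull request (the quoted diff stops at the commented line). `DelimiterSketch` and `toCharSketch` are hypothetical names, and `toCharSketch` is only a simplified stand-in for `CSVExprUtils.toChar` that handles a handful of escapes. The builder is named `delimiter`, as suggested in the comment.

```scala
object DelimiterSketch {
  // Hypothetical stand-in for CSVExprUtils.toChar: converts one chunk to one character.
  // Assumption: only \t, \r, \n and \\ are recognised as escapes in this sketch.
  def toCharSketch(chunk: String): Char = chunk match {
    case s if s.length == 1 && s != "\\" => s.charAt(0)
    case "\\t" => '\t'
    case "\\r" => '\r'
    case "\\n" => '\n'
    case "\\\\" => '\\'
    case other =>
      throw new IllegalArgumentException(s"Unsupported delimiter chunk: $other")
  }

  // Walks the input chunk by chunk and concatenates the converted characters
  // into one multi-character delimiter string.
  def toDelimiterStr(str: String): String = {
    val delimiter = new StringBuilder
    var idx = 0
    while (idx < str.length) {
      // A backslash followed by another character takes precedence as a two-character chunk;
      // anything else is a one-character chunk.
      val chunkLen = if (str.charAt(idx) == '\\' && idx + 1 < str.length) 2 else 1
      delimiter.append(toCharSketch(str.substring(idx, idx + chunkLen)))
      idx += chunkLen
    }
    delimiter.toString
  }
}
```

Under this sketch, `DelimiterSketch.toDelimiterStr("""\t|""")` returns a tab followed by a pipe, that is, one two-character delimiter rather than two alternative delimiters. A trailing lone backslash is rejected with an `IllegalArgumentException`.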

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services


