commons-dev mailing list archives

From sebb <seb...@gmail.com>
Subject Re: [CSV] Lexer and Character conversion
Date Sat, 13 Oct 2012 13:31:31 GMT
On 13 October 2012 12:55, sebb <sebbaz@gmail.com> wrote:
> Before r1397883, the Lexer operated only on char fields; it's now been
> converted to use Character, which means that unboxing is needed.
>
> Also, the Character fields need to be checked for null before use.
>
> It has just occurred to me that there is a genuine illegal char value
> for everything except the delimiter - that is, the delimiter itself.
> It does not make sense for there to be no delimiter, nor does it make
> sense for any other meta-character to be the same as the delimiter.
>
> So rather than having
>
>     Character escape;
> ...
>     boolean isEscape(final int c) {
>         return escape != null && c == escape.charValue();
>     }
>
> one could use the simpler (and more efficient)
>
>     char escape;
> ...
>     boolean isEscape(final int c) { // similarly for isEncapsulator etc.
>         return escape != delimiter;
>     }

Sorry, that's rubbish; it needs to be:

     boolean isEscape(final int c) { // similarly for isEncapsulator etc.
         return escape != delimiter && c == escape;
     }

which is hardly better than before.

However, if the Lexer ctor ensures that escape (etc) can never be the
same as the delimiter, then the check can be simplified to:

     boolean isEscape(final int c) { // similarly for isEncapsulator etc.
         return c == escape;
     }

> This would have the added bonus of automatically disallowing delimiter
> as the escape (or encapsulator etc.) because they would not be
> recognised.
> [At present the code does not check this]
>
> The Lexer ctor would need to be changed to convert a null escape
> Character (comment etc) to the delimiter.

and it would need to throw IllegalArgumentException (IAE) or similar if
any of the non-null meta chars match the delimiter.
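
A minimal sketch of the constructor-side checks described above, assuming a hypothetical MetaChars class (the class name, field names and the toChar helper are illustrative and are not taken from the Commons CSV source). A null Character is mapped to the delimiter as the "disabled" sentinel, and an explicit meta char equal to the delimiter is rejected with an IllegalArgumentException. The sketch keeps the two-part check, because under the sentinel convention a bare `c == escape` would also return true for the delimiter itself whenever escaping is disabled.

```java
// Hypothetical sketch: names are illustrative, not the actual Lexer code.
final class MetaChars {
    private final char delimiter;
    private final char escape;       // equals delimiter when escaping is disabled
    private final char encapsulator; // equals delimiter when encapsulation is disabled

    MetaChars(final char delimiter, final Character escape, final Character encapsulator) {
        this.delimiter = delimiter;
        this.escape = toChar(escape, "escape");
        this.encapsulator = toChar(encapsulator, "encapsulator");
    }

    // Maps null (feature disabled) to the delimiter sentinel and
    // rejects an explicitly supplied char that clashes with the delimiter.
    private char toChar(final Character c, final String name) {
        if (c == null) {
            return delimiter;
        }
        if (c.charValue() == delimiter) {
            throw new IllegalArgumentException(
                    "The " + name + " character must differ from the delimiter");
        }
        return c.charValue();
    }

    boolean isDelimiter(final int c) {
        return c == delimiter;
    }

    // No unboxing and no null check; the sentinel comparison replaces both.
    boolean isEscape(final int c) {
        return escape != delimiter && c == escape;
    }

    boolean isEncapsulator(final int c) {
        return encapsulator != delimiter && c == encapsulator;
    }
}
```

With this arrangement, new MetaChars(',', '\\', null) recognises a backslash as the escape and never reports an encapsulator, while new MetaChars(',', ',', null) fails at construction time. Dropping the `escape != delimiter` half of the check would additionally require the caller to test isDelimiter first, which is an assumption about the Lexer's control flow that this sketch does not make.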
