commons-user mailing list archives

From "Haswell, Joe"
Subject [Commons-Lang] StrTokenizer behavior on CSVs
Date Wed, 06 Oct 2010 22:07:42 GMT

StrTokenizer produces an unexpected result on a CSV line, and I'm wondering whether this is
a defect or intended behavior.
If it is intended, I hope someone can point me to a workaround.

Consider the CSV line:

Field1, field2,{x}"this should
All, be escaped"

If x is empty, the line is tokenized as expected and produces:
Field1,field2,"this should{newline}All, be escaped"

If x is whitespace, the quoted text is split at the comma it contains:
Field1, field2,["this should All],[be escaped"]
(brackets mark a complete field; the quotation marks end up in separate fields)
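
For comparison, the expected result can be sketched without StrTokenizer at all. The
following is a minimal hand-rolled splitter, not the Commons Lang implementation and not a
statement about how StrTokenizer works internally. The class name QuotedCsvSketch and the
method split are hypothetical, and the sketch assumes a comma delimiter, a double-quote
quote character, and that whitespace before an opening quote should be dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Commons Lang.
final class QuotedCsvSketch {

    // Splits one CSV record on commas. A double quote opens a quoted field
    // even when only whitespace precedes it; commas and newlines inside the
    // quotes are kept as field content. Unquoted fields are trimmed.
    static List<String> split(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;   // currently between an opening and closing quote
        boolean wasQuoted = false;  // the current field was a quoted field
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < record.length() && record.charAt(i + 1) == '"') {
                        field.append('"'); // doubled quote is a literal quote
                        i++;
                    } else {
                        inQuotes = false;  // closing quote
                    }
                } else {
                    field.append(c);
                }
            } else if (c == ',') {
                fields.add(wasQuoted ? field.toString() : field.toString().trim());
                field.setLength(0);
                wasQuoted = false;
            } else if (wasQuoted) {
                // ignore anything between a closing quote and the next delimiter
            } else if (c == '"' && field.toString().trim().isEmpty()) {
                field.setLength(0); // drop whitespace seen before the opening quote
                inQuotes = true;
                wasQuoted = true;
            } else {
                field.append(c);
            }
        }
        fields.add(wasQuoted ? field.toString() : field.toString().trim());
        return fields;
    }
}
```

Under these assumptions the sample line yields the same three fields whether x is empty or
whitespace, with the embedded newline and comma kept inside the third field.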

Any clarification would be greatly appreciated.  The tokenizer is configured with the
appropriate delimiter and quote characters.


Joe Haswell | HP Software