commons-user mailing list archives

From "Haswell, Joe" <josiah.d.hasw...@hp.com>
Subject [Commons-Lang] StrTokenizer behavior on CSVs
Date Wed, 06 Oct 2010 22:07:42 GMT
Hello,

StrTokenizer produces an unexpected result on some CSV lines, and I'm wondering
whether this is a defect or intended behavior.
If it is intended, I hope someone can point me to a workaround.


Consider the following CSV record, where {x} is a placeholder and the quoted
field spans two lines:

Field1, field2,{x}"this should
All, be escaped"

If {x} is empty, the record is tokenized as expected and produces:
Field1,field2,"this should{newline}All, be escaped"

If {x} is whitespace, the quoted field is split unexpectedly:
Field1, field2,["this should All],[be escaped"]
(Brackets mark each complete field. The quoted text is split at the embedded
comma, and the quotation marks are left in the two fragments.)

The tokenizer is configured with the appropriate delimiter and quote
characters. Any clarification would be greatly appreciated.
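
To make the expected result concrete, here is a minimal plain-Java sketch of
the tokenization I would expect in both cases. It is not StrTokenizer and does
not claim to reflect its internals; the class name CsvSketch and the method
tokenize are hypothetical, and the sketch assumes a comma delimiter, a double
quote as the quote character, and no escaped quotes inside quoted fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a tiny quote-aware splitter, not StrTokenizer.
class CsvSketch {

    // Splits one CSV record into fields. A double quote opens a quoted section
    // when it is the first non-whitespace character of a field, so whitespace
    // between the delimiter and the opening quote is tolerated.
    // Assumption: doubled or escaped quotes are not handled.
    static List<String> tokenize(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;   // currently inside a quoted section
        boolean wasQuoted = false;  // the current field had a quoted section

        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    inQuotes = false;          // closing quote
                } else {
                    field.append(c);           // commas and newlines kept as-is
                }
            } else if (c == '"' && field.toString().trim().isEmpty()) {
                field.setLength(0);            // drop whitespace before the quote
                inQuotes = true;
                wasQuoted = true;
            } else if (c == ',') {
                fields.add(finish(field, wasQuoted));
                field.setLength(0);
                wasQuoted = false;
            } else if (wasQuoted && Character.isWhitespace(c)) {
                // ignore whitespace after a closing quote
            } else {
                field.append(c);
            }
        }
        fields.add(finish(field, wasQuoted));
        return fields;
    }

    // Unquoted fields are trimmed; quoted content is returned verbatim.
    private static String finish(StringBuilder field, boolean wasQuoted) {
        return wasQuoted ? field.toString() : field.toString().trim();
    }

    public static void main(String[] args) {
        String noSpace = "Field1, field2,\"this should\nAll, be escaped\"";
        String withSpace = "Field1, field2, \"this should\nAll, be escaped\"";
        System.out.println(tokenize(noSpace).size());
        System.out.println(tokenize(withSpace).size());
    }
}
```

With this sketch both variants of the record yield the same three fields,
which is the behavior I was hoping to get from the tokenizer.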

Thanks!


Joe Haswell | HP Software


