commons-user mailing list archives

From sebb <seb...@gmail.com>
Subject Re: StrTokenizer not handling quotes correctly?
Date Wed, 15 Apr 2009 14:32:05 GMT
On 15/04/2009, Jacek Furmankiewicz <jacek99@gmail.com> wrote:
> I am trying to use StrTokenizer for some parsing and I am probably not using
>  it correctly.
>
>  Let's say I have this string:
>
>  11"a,b"11,22"c,d"22
>
>  I would like to split it by the comma ",", but ignoring any commas embedded
>  in quotes. I try this:
>
>         String test = "11\"a,b\"11,22\"c,d\"22";
>         StrTokenizer str = new StrTokenizer(test,',','"');
>         String[] tokens = str.getTokenArray();
>
>         for(String t: tokens) {
>             System.out.println(t);
>         }
>
>  and expect to have two strings print out:
>
>  11"a,b"11
>  22"c,d"22
>
>  but instead I get 4:
>
>  11"a
>  b"11
>  22"c
>  d"22
>
>  It seems the tokenizer is splitting on the comma, even if it is embedded in
>  quotes.

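Quotes are only allowed in quoted strings. In your input the quotes sit
in the middle of unquoted tokens, so they are treated as ordinary
characters and the commas still split. From the Javadoc: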

"Each token may be surrounded by quotes. The quote matcher specifies
the quote character(s). A quote may be escaped within a quoted section
by duplicating itself. "
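
As a rough illustration of that rule, here is a simplified model in
plain Java. It is not the StrTokenizer source: the class and method
names (QuotedTokenModel, tokenize) are made up for this sketch, and it
ignores trimming, ignored characters and empty-token handling. It
assumes, consistent with the output reported above, that quotes are
only honoured when the token starts with one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the quoting rule; not part of Commons Lang.
final class QuotedTokenModel {

    static List<String> tokenize(String s, char delim, char quote) {
        List<String> tokens = new ArrayList<String>();
        int pos = 0;
        while (pos <= s.length()) {
            StringBuilder tok = new StringBuilder();
            // Assumption: quotes matter only if the token starts with one.
            boolean quoteAware = pos < s.length() && s.charAt(pos) == quote;
            boolean quoting = quoteAware;
            if (quoteAware) {
                pos++; // skip the opening quote
            }
            while (pos < s.length()) {
                char c = s.charAt(pos);
                if (quoting) {
                    if (c != quote) {
                        tok.append(c);
                        pos++;
                    } else if (pos + 1 < s.length() && s.charAt(pos + 1) == quote) {
                        tok.append(quote); // doubled quote = one literal quote
                        pos += 2;
                    } else {
                        quoting = false;   // closing quote
                        pos++;
                    }
                } else if (c == delim) {
                    break;                 // delimiter outside quotes ends the token
                } else if (quoteAware && c == quote) {
                    quoting = true;        // re-enter a quoted section
                    pos++;
                } else {
                    tok.append(c);         // includes quotes in unquoted tokens
                    pos++;
                }
            }
            tokens.add(tok.toString());
            pos++; // step over the delimiter, or past the end of the input
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Unquoted tokens: the embedded quotes are plain characters.
        System.out.println(tokenize("11\"a,b\"11,22\"c,d\"22", ',', '"'));
        // prints: [11"a, b"11, 22"c, d"22]

        // Quoted tokens with doubled inner quotes.
        System.out.println(tokenize("\"11\"\"a,b\"\"11\",\"22\"\"c,d\"\"22\"", ',', '"'));
        // prints: [11"a,b"11, 22"c,d"22]
    }
}
```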

>  I tried different options on the StrTokenizer, but have not been able to
>  get it to work correctly.
>
>  Any idea what I am doing wrong? Using latest version 2.4.

To get those two tokens, each token has to be wrapped in quotes and
each embedded quote doubled, so the input needs to look like this:

"11""a,b""11","22""c,d""22"

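If the input cannot be rewritten into that quoted form, one alternative
is to skip StrTokenizer and split by hand. The following is only a
minimal sketch, not part of Commons Lang; the names QuoteAwareSplit and
splitOutsideQuotes are made up for illustration. It assumes quotes are
always balanced, flips an in-quotes flag at every quote character,
keeps the quote characters in the output, and splits only on
delimiters seen outside quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Commons Lang.
final class QuoteAwareSplit {

    // Splits on delim, except where delim falls between a pair of quote
    // characters. Quote characters are copied to the output unchanged.
    static List<String> splitOutsideQuotes(String input, char delim, char quote) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == quote) {
                inQuotes = !inQuotes;      // entering or leaving a quoted section
                current.append(c);
            } else if (c == delim && !inQuotes) {
                tokens.add(current.toString());
                current.setLength(0);      // start the next token
            } else {
                current.append(c);
            }
        }
        tokens.add(current.toString());    // last token (may be empty)
        return tokens;
    }

    public static void main(String[] args) {
        String test = "11\"a,b\"11,22\"c,d\"22";
        for (String t : splitOutsideQuotes(test, ',', '"')) {
            System.out.println(t);
        }
        // prints:
        // 11"a,b"11
        // 22"c,d"22
    }
}
```

Unlike StrTokenizer, this does no trimming and does not drop empty
tokens, so adjust as needed.
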
>  Thanks, Jacek
>
