lucene-dev mailing list archives

From Yonik Seeley
Subject Re: "Advanced" query language
Date Tue, 06 Dec 2005 18:26:09 GMT
> Are you aware, though, of an existing Unicode serialization/markup
> mechanism without XML's gaps?

No, but I'm not advocating anything other than XML.  I'm just pointing
out a problem that needs to be solved.

> Base64 is frequently used as an escape mechanism for binary data in XML.

Yeah, but it's not necessarily binary data.  I just want to be able to
express all of Unicode.

> One possible solution to the escaping issue is a standard optional
> attribute named "encoding",

It's an application-level convention, not a standard, and it's still
not clear what is being encoded in base64.  Is it UTF-8, Java
characters, or true binary?

For normal text data containing valid Unicode characters that aren't
legal in XML, I'd rather have a simple escaping mechanism.  Something
like backslash escaping that is easily understood.  Maybe something as
simple as \00 for &#0; (a backslash followed by two hex digits).
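
To make the idea concrete, here is a rough sketch of what such an
escaping scheme might look like.  It is only an illustration, not an
agreed format: the function names are hypothetical, escaping the
backslash itself as \5c is an assumption (the proposal above doesn't
say how a literal backslash would be written), and since two hex digits
can only cover code points up to U+00FF, the sketch simply rejects
other XML-illegal characters such as U+FFFE rather than guess at a
longer form.  The legality check follows the XML 1.0 Char production.

```python
import string


def is_legal_xml_char(cp: int) -> bool:
    """Return True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def escape(text: str) -> str:
    """Replace XML-illegal characters with a backslash and two hex digits.

    Assumption: a literal backslash is escaped too (as \\5c) so that the
    encoding can be reversed unambiguously.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\" or not is_legal_xml_char(cp):
            if cp > 0xFF:
                # Two hex digits cannot represent this code point.
                raise ValueError(f"cannot escape U+{cp:04X} with two hex digits")
            out.append("\\%02x" % cp)
        else:
            out.append(ch)
    return "".join(out)


def unescape(text: str) -> str:
    """Reverse escape(): turn each backslash + two hex digits into a character."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\":
            digits = text[i + 1 : i + 3]
            if len(digits) != 2 or any(d not in string.hexdigits for d in digits):
                raise ValueError(f"malformed escape at offset {i}")
            out.append(chr(int(digits, 16)))
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this sketch, escape("a\x00b") yields the six characters a\00b,
which can be placed in XML text as-is, and unescape() restores the
original string.  Tab, LF and CR are left alone because XML already
permits them.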

