lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: Problems Indexing/Parsing Tibetan Text
Date Fri, 30 Mar 2012 17:09:16 GMT
On Fri, Mar 30, 2012 at 1:03 PM, Denis Brodeur <denisbrodeur@gmail.com> wrote:
> Thanks Robert.  That makes sense.  Do you have a link handy where I can
> find this information? i.e. word boundary/punctuation for any unicode
> character set?
>

Yeah, usually I use
http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[\u0f10-\u0f19]&g=

You can then click on any character in the result to see all of its properties.

(The site seems to be having some issues today.)
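
If the site is unreachable, a rough local check is possible with nothing but the JDK. The sketch below is only an illustration, not part of Lucene: the class name `TibetanProps` and its two helper methods are made up for this example, and it assumes Java 7 or later (for `Character.getName` and `Character.UnicodeScript`). It prints the name, script, and general-category code for the same range, U+0F10 to U+0F19. Note that the JDK's Unicode tables depend on the JDK version, so they may lag behind what the CLDR page shows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for this example only (not a Lucene API).
public class TibetanProps {

    // Describe one code point using only java.lang.Character.
    // "type" is the int constant returned by Character.getType,
    // e.g. Character.NON_SPACING_MARK or Character.OTHER_PUNCTUATION.
    static String describe(int cp) {
        return String.format("U+%04X %s script=%s type=%d letterOrDigit=%b",
                cp,
                Character.getName(cp),
                Character.UnicodeScript.of(cp),
                Character.getType(cp),
                Character.isLetterOrDigit(cp));
    }

    // Describe every code point from first to last, inclusive.
    static List<String> describeRange(int first, int last) {
        List<String> lines = new ArrayList<String>();
        for (int cp = first; cp <= last; cp++) {
            lines.add(describe(cp));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Same range as in the URL above.
        for (String line : describeRange(0x0F10, 0x0F19)) {
            System.out.println(line);
        }
    }
}
```

This only covers what `java.lang.Character` exposes. It does not report the Word_Break property, which is what actually drives word-boundary rules; for that you would need the site above or a library such as ICU4J.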

-- 
lucidimagination.com
