incubator-lucy-user mailing list archives

From goran kent
Subject Re: [lucy-user] Different UTF-8 behaviour between perl 5.8.8 (indexes ok) and 5.10.1 (indexing fails)
Date Wed, 12 Oct 2011 15:57:33 GMT
On Wed, Oct 12, 2011 at 4:37 PM, Peter Karman wrote:
> If you really don't need to preserve your UTF-8 text, look at
> Search::Tools::Transliterate. Search::Tools::UTF8 is also helpful for
> debugging these kinds of issues.

Thanks for the suggestions - you've given me food for thought.

> It sounds like, without seeing a reproducible test case, that Lucy is
> choking appropriately on malformed UTF-8.

Absolutely.  What's interesting is that the same Lucy code does not
choke on the other machines with the older Perl.  Of course, the Perl
version may not be the only difference, just the most obvious one
(e.g., Perl modules, libraries, etc. will also differ).

Anyway, I like the idea of rolling my own Perl to be absolutely sure
of consistency across my machines.  This is something I've avoided up
until now.
