lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ankur Goel" <>
Subject CJK Analyzer in lucene 1.3 final
Date Fri, 27 Feb 2004 12:12:08 GMT

In the lucene-1.3-final version's CHANGES.txt it is written that "Fix
StandardTokenizer's handling of CJK characters (Chinese, Japanese and Korean

Does it mean that for CJK characters we now do not need to use any separate
analyzer, standard analyzer will be sufficient?? 


-----Original Message-----
From: Otis Gospodnetic [] 
Sent: Friday, February 20, 2004 6:56 PM
To: Lucene Users List
Subject: Re: Concurrency

> Ive just got a couple of questions which i cant quite work
> out...wondered if 
> someone could help me with them:
> 1. What happens if i make a backup (copy) of an index while documents
> are 
> being added? Can it cause problems, and if so is there a way to
> safely do 
> this?

You should be okay.  When new documents are added, they are added to
new segments.  A 'table of contents' of all valid segments is in
'segments' file.  Even if you copy extra segments, your index will
still work, it's just that your searches may not search newly created
segments, whose existence was not registered in segments file, when you
copied the index.

> 2. When I create a new IndexSearcher, what method does Lucene use to
> take a 
> 'snapshot' of the index (because if i add documents after the search
> object 
> is created they dont appear in the search results)?

This is related to the answer under 1.  New documents are not seen with
an old IndexSearcher, because the old IndexSearcher is not aware of new
It would have to re-read the segments file and read any new segments
found, in order to become aware of new segments and documents in them.


To unsubscribe, e-mail:
For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message