From: "Ankur Goel"
Organization: Brickred
To: "'Lucene Users List'"
Subject: CJK Analyzer in lucene 1.3 final
Date: Fri, 27 Feb 2004 17:42:08 +0530

Hi,

In the lucene-1.3-final version's CHANGES.txt it is written: "Fix StandardTokenizer's handling of CJK characters (Chinese, Japanese and Korean ideograms)."
Does this mean that CJK characters no longer need a separate analyzer, and that the standard analyzer will be sufficient?

Regards,
Ankur

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
Sent: Friday, February 20, 2004 6:56 PM
To: Lucene Users List
Subject: Re: Concurrency

> I've just got a couple of questions which I can't quite work out.
> I wondered if someone could help me with them:
>
> 1. What happens if I make a backup (copy) of an index while documents
> are being added? Can it cause problems, and if so is there a way to
> safely do this?

You should be okay. When new documents are added, they are added to new segments. A 'table of contents' of all valid segments is kept in the 'segments' file. Even if you copy extra segments, your index will still work; it's just that searches on the copy may not cover newly created segments whose existence was not yet registered in the segments file when you copied the index.

> 2. When I create a new IndexSearcher, what method does Lucene use to
> take a 'snapshot' of the index (because if I add documents after the
> search object is created they don't appear in the search results)?

This is related to the answer to question 1. New documents are not seen by an old IndexSearcher, because the old IndexSearcher is not aware of new segments. It would have to re-read the segments file and read any new segments found in order to become aware of the new segments and the documents in them.

Otis

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
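
The snapshot behaviour described in the quoted reply (a searcher only knows the segments that were registered when it was opened) can be illustrated with a toy model. This is a minimal sketch in plain Java, not Lucene code: ToyIndex, ToySearcher, and SnapshotDemo are hypothetical names invented for illustration, and the in-memory list only stands in for the 'segments' file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an index directory. The outer list plays the
// role of the 'segments' table of contents; each inner list is one segment.
class ToyIndex {
    private final List<List<String>> segments = new ArrayList<>();

    // Adding documents never rewrites an old segment; it appends a new one.
    void addDocuments(List<String> docs) {
        segments.add(new ArrayList<>(docs));
    }

    // Returns a copy of the table of contents as it stands right now.
    List<List<String>> readSegments() {
        return new ArrayList<>(segments);
    }
}

// Hypothetical stand-in for a searcher: it reads the table of contents
// once, when it is opened, and never looks at it again.
class ToySearcher {
    private final List<List<String>> snapshot;

    ToySearcher(ToyIndex index) {
        this.snapshot = index.readSegments();
    }

    // Counts documents containing the term in the segments known at open time.
    int count(String term) {
        int hits = 0;
        for (List<String> segment : snapshot) {
            for (String doc : segment) {
                if (doc.contains(term)) {
                    hits++;
                }
            }
        }
        return hits;
    }
}

class SnapshotDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocuments(Arrays.asList("lucene index", "lucene search"));

        ToySearcher oldSearcher = new ToySearcher(index);   // sees one segment
        index.addDocuments(Arrays.asList("lucene backup")); // new segment added
        ToySearcher newSearcher = new ToySearcher(index);   // sees two segments

        System.out.println(oldSearcher.count("lucene")); // prints 2
        System.out.println(newSearcher.count("lucene")); // prints 3
    }
}
```

In this model, the only way for the old searcher to see the third document is to open a new searcher, which mirrors the point that an existing IndexSearcher would have to re-read the segments file to become aware of new segments.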