From: andi rexha
To: "java-user@lucene.apache.org"
Subject: Getting new token stream from analyzer for legacy projects!
Date: Fri, 12 Dec 2014 16:29:33 +0100

Hi,

I have a legacy problem with the token stream. In my application I create a batch of documents using a single analyzer (this is due to configuration). I add each field using the tokenStream from the analyzer (for internal reasons). In pseudocode this translates to:

    Analyzer analyzer = getFromConfig();
    Collection<Document> docsToIndex;

    // Build the whole batch first; every document takes its stream from the same analyzer.
    for (int i = 0; i < batchSize(); i++) {
        Document doc = new Document();
        doc.add(field, analyzer.tokenStream("fieldName", currentReader));
        docsToIndex.add(doc);
    }

    // Only afterwards are the documents handed to the writer.
    for (Document d : docsToIndex) {
        indexWriter.addDocument(d);
    }

I always get an exception:

    TokenStream contract violation: reset()/close() call missing, reset() called multiple times....

I understand that the analyzer creates one TokenStream per thread and that the TokenStream is consumed by the DefaultIndexingChain when the documents are added, so the same TokenStream instance is shared by all documents in the batch.

Is there a clean way to overcome this problem? One possible way, of course, would be to get each token stream from a separate thread, but that would be a dirty solution.
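
To make the sharing described above concrete, here is a minimal sketch in plain Java that only imitates per-thread stream reuse. It does not use Lucene: `ReusableStream`, `PerThreadAnalyzer`, and `consume` are hypothetical stand-ins, and the pending/active bookkeeping is a simplified assumption about how a reused stream behaves, not the library's actual code.

```java
import java.util.Arrays;
import java.util.List;

class SharedStreamSketch {

    /** Hypothetical stand-in for a reusable token stream (not a Lucene class). */
    static final class ReusableStream {
        private String pending; // input handed over by the analyzer, not yet activated
        private String active;  // input currently being consumed

        void setInput(String text) {
            if (active != null) {
                throw new IllegalStateException("contract violation: close() call missing");
            }
            pending = text; // silently replaces any input that was never consumed
        }

        void reset() {
            active = pending;
            pending = null;
        }

        List<String> readAll() {
            if (active == null) {
                throw new IllegalStateException("contract violation: no input after reset()");
            }
            return Arrays.asList(active.split("\\s+"));
        }

        void close() {
            active = null;
        }
    }

    /** Hypothetical analyzer that caches one stream per thread. */
    static final class PerThreadAnalyzer {
        private final ThreadLocal<ReusableStream> cached =
                ThreadLocal.withInitial(ReusableStream::new);

        ReusableStream tokenStream(String text) {
            ReusableStream stream = cached.get(); // same instance for every call on this thread
            stream.setInput(text);
            return stream;
        }
    }

    /** Plays the role of the indexing code: reset, read everything, close. */
    static List<String> consume(ReusableStream stream) {
        stream.reset();
        try {
            return stream.readAll();
        } finally {
            stream.close();
        }
    }

    public static void main(String[] args) {
        PerThreadAnalyzer analyzer = new PerThreadAnalyzer();

        // Batch pattern: request both streams first, consume them later.
        ReusableStream first = analyzer.tokenStream("alpha beta");
        ReusableStream second = analyzer.tokenStream("gamma delta");
        System.out.println("same instance: " + (first == second)); // prints "same instance: true"

        System.out.println(consume(first)); // prints "[gamma, delta]": the first input was overwritten
        try {
            consume(second);
        } catch (IllegalStateException e) {
            System.out.println("second document fails: " + e.getMessage());
        }
    }
}
```

In this model the failure disappears when each stream is consumed before the next one is requested. That suggests, as an untested assumption, that adding each document to the writer inside the first loop might avoid the exception in the real code as well.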