Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: java-user@lucene.apache.org
Received-SPF: pass (nike.apache.org: local policy)
From: "Newman, Billy" <Billy.Newman@itt.com>
To: "java-user@lucene.apache.org" <java-user@lucene.apache.org>
Date: Tue, 21 Apr 2009 09:12:52 -0400
Subject: RE: IndexWriter update method
Thread-Topic: IndexWriter update method
Thread-Index: AcnCYORmKehOL0/CS9GpRbqU6KfyLwAIagBg
Message-ID: 
 <F0DB6EA8DC0F594B83F2CA60C7E4E8BA2D368AB247@IIWCBO-EMB1.iiw.de.ittind.com>
References: 
 <F0DB6EA8DC0F594B83F2CA60C7E4E8BA2D3697D79B@IIWCBO-EMB1.iiw.de.ittind.com>
	 <499888440904171705s467e3bap95598f3f870c20b3@mail.gmail.com>
	 <F0DB6EA8DC0F594B83F2CA60C7E4E8BA2D3697D79D@IIWCBO-EMB1.iiw.de.ittind.com>
	 <359a92830904171907j2b8fdb91kbd762c905240b3df@mail.gmail.com>
	 <F0DB6EA8DC0F594B83F2CA60C7E4E8BA2D368AB230@IIWCBO-EMB1.iiw.de.ittind.com>
	 <359a92830904201628o2fa6684cr99561a97f0442002@mail.gmail.com>
 <e05f0fd10904210209obecd8c1s56731755d90ba963@mail.gmail.com>
In-Reply-To: <e05f0fd10904210209obecd8c1s56731755d90ba963@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

Yeah I was hoping to change the code to use the update method after I upgra=
ded from 1.4.3 but doesn't look feasible.   I will just continue to find th=
e doc and delete it, then re-insert it.  Thanks for all the help guys!

-----Original Message-----
From: Doron Cohen [mailto:cdoronc@gmail.com]=20
Sent: Tuesday, April 21, 2009 3:09 AM
To: java-user@lucene.apache.org
Subject: Re: IndexWriter update method

*IndexWriter.deleteDocuments<http://lucene.apache.org/java/2_4_1/api/core/o=
rg/apache/lucene/index/IndexWriter.html#deleteDocuments%28org.apache.lucene=
.search.Query%29>
*(Query<http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/sear=
ch/Query.html>
 query) may be handy too
(but note that it will delete *all* docs that match the query).
Doron

On Tue, Apr 21, 2009 at 2:28 AM, Erick Erickson <erickerickson@gmail.com>wr=
ote:

> I don't think you *can* create a Term that spans two fields. Perhaps
> you'd be better off just doing a search, getting the doc ID back then
> adding a new version of the document.
>
> You *could* think about reindexing your corpus and indexing an
> additional field that was the concatenation of the two fields you
> want to update by if that's a better solution.
>
> Best
> Erick
>
> On Mon, Apr 20, 2009 at 11:40 AM, Newman, Billy <Billy.Newman@itt.com
> >wrote:
>
> > What if you're unique id is a composite of two field when you create th=
e
> > document?
> >
> > I.E.
> > doc.add(new Field("partno", "123345",
> > Field.Store.whatever, Field.Index.UN_TOKENIZED);
> > doc.add(new Field("storeLoc", "Springfield",
> > Field.Store.whatever, Field.Index.UN_TOKENIZED);
> >
> > How do you create a Term for this?  Is this possible?  Is this the
> correct
> > way to create a document that has two fields?  If so I am a little lost
> on
> > how to create a term to correctly find this.
> >
> > Thanks,
> > Billy
> >
> > -----Original Message-----
> > From: Erick Erickson [mailto:erickerickson@gmail.com]
> > Sent: Friday, April 17, 2009 8:08 PM
> > To: java-user@lucene.apache.org
> > Subject: Re: IndexWriter update method
> >
> > What you're missing is that the example has no unique ID, it wasn't
> created
> > with update in mind.
> >
> > There's no hidden magic for Lucene knowing *what* document you want
> > to have updated, you have to provide it yourself, and it should be
> unique.
> >
> > Imagine a parts catalog, or an index of a directory tree. In the parts
> > catalog,
> > you could identify the document by its part number, so you'd probably
> index
> > it something like doc.add(new Field("partno", "123345",
> > Field.Store.whatever, Field.Index.UN_TOKENIZED);
> > Indexing a directory tree you could use the complete file path similarl=
y.
> >
> > Now, each document will have one (and only one) partno, and it'll be
> unique
> > (you really
> > don't want to tokenize this).
> >
> > To update, you'd form your term on the field "partno" and value "123345=
",
> > thus uniquely
> > identifying the document you want replaced, and use that term in your
> > update
> > statement.
> > Think of the Term as a unique key for the document that *you've*
> > deliberately put there.
> >
> > I'm pretty sure (but not positive) that if you update a document where
> the
> > term doesn't
> > have any matches, you'll get a simple insert, but I won't guarantee it.
> >
> > HTH
> > Erick
> >
> >
> > On Fri, Apr 17, 2009 at 9:28 PM, Newman, Billy <Billy.Newman@itt.com>
> > wrote:
> >
> > > Ok I am still confused.
> > >
> > > Looking at the examples to index a document I would do something like
> the
> > > following:
> > >        Document document =3D new Document();
> > >        document.add(Field.UnStored("article", article));
> > >        document.add(Field.Text("comments", comments));
> > >        Analyzer analyzer  =3D new StandardAnalyzer();
> > >        IndexWriter writer =3D new IndexWriter(indexDirectory, analyze=
r,
> > > false);
> > >        writer.addDocument(document);
> > >        writer.optimize();
> > >        writer.close();
> > >
> > > Now lets say that the comments can change and when they do I want to
> > update
> > > that document to contain the newly updated comments.
> > >
> > > So I would have to go back and check my index to see if that book
> already
> > > exists.
> > > Query q =3D new QueryParser("article", analyzer).parse(querystr);
> > > int hitsPerPage =3D 10;
> > > IndexSearcher searcher =3D new IndexSearcher(index);
> > > TopDocCollector collector =3D new TopDocCollector(hitsPerPage);
> > > searcher.search(q, collector);
> > > ScoreDoc[] hits =3D collector.topDocs().scoreDocs;
> > > if (hits!=3D null && hits.length > 0) {
> > >  // ?
> > >  // Then this already exists and I just want to update the comments
> > section
> > > }
> > >
> > > Does that make sense?  Am I going about this wrong?
> > >
> > > Billy
> > >
> > >
> > > ________________________________________
> > > From: Tim Williams [williamstw@gmail.com]
> > > Sent: Friday, April 17, 2009 6:05 PM
> > > To: java-user@lucene.apache.org
> > > Subject: Re: IndexWriter update method
> > >
> > > On Fri, Apr 17, 2009 at 7:27 PM, Newman, Billy <Billy.Newman@itt.com>
> > > wrote:
> > > > I am looking for info on how to use the IndexWriter.update method. =
 A
> > > short example of how to add a document and then later update would
> > > > be very helpful.  I get lost because I can add a document with just
> the
> > > document, but I need a document and a Term.  I am not really sure
> > > > what a Term is since I did not use a Term to create the document no=
r
> do
> > I
> > > see it in any of the examples of searching/adding.
> > >
> > > When you index the document, add an ID field that is unique.  Then
> > > when you go to update the document the "Term" will be the ID of the
> > > document you wish to update.  For example, you might add a URL as the
> > > unique ID, then to update it might look something like:
> > >
> > > writer.update(new Term("id","http://apache.org/lucene/index.htm"),
> doc)
> > >
> > >
> > > --tim
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> > > This e-mail and any files transmitted with it may be proprietary and
> are
> > > intended solely for the use of the individual or entity to whom they
> are
> > > addressed. If you have received this e-mail in error please notify th=
e
> > > sender.
> > > Please note that any views or opinions presented in this e-mail are
> > solely
> > > those of the author and do not necessarily represent those of ITT
> > > Corporation. The recipient should check this e-mail and any attachmen=
ts
> > for
> > > the presence of viruses. ITT accepts no liability for any damage caus=
ed
> > by
> > > any virus transmitted by this e-mail.
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org