lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: How to modify a document Field before the document is indexed?
Date Tue, 20 Jul 2010 00:07:58 GMT
One subtlety you might be able to use to your advantage... This is
where getPositionIncrementGap in your analyzer can be used
to separate the two bits of data in the same field. Suppose I have my
own analyzer (which could be a trivial override of an existing one)
that returns, say, 10,000 from getPositionIncrementGap. Now, if
you wanted to ensure that proximity queries only matched within a
particular add to your "content" field, you could specify that all the
terms had to occur within 10,000 positions of each other...
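
To make the idea concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene's API: the class PositionGapSketch and its methods index and within are hypothetical names invented for illustration, tokens are split on whitespace, and "within N" is simplified to the smallest difference between two terms' positions rather than Lucene's actual slop calculation. It only shows how a large gap between successive adds to the same field pushes their token positions apart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (hypothetical, not Lucene code) of token positions across
// several values added to one field, separated by a position gap.
class PositionGapSketch {

    // Assigns a position to every whitespace-separated token. Before each
    // value after the first, the position counter is advanced by `gap`.
    static Map<String, List<Integer>> index(List<String> values, int gap) {
        Map<String, List<Integer>> positions = new HashMap<>();
        int next = 0;
        boolean seenToken = false;
        for (String value : values) {
            if (seenToken) {
                next += gap;
            }
            for (String token : value.toLowerCase().trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                positions.computeIfAbsent(token, k -> new ArrayList<>()).add(next++);
                seenToken = true;
            }
        }
        return positions;
    }

    // True if some occurrence of `a` and some occurrence of `b` are at most
    // `maxDistance` positions apart (a simplification of a proximity query).
    static boolean within(Map<String, List<Integer>> positions,
                          String a, String b, int maxDistance) {
        for (int pa : positions.getOrDefault(a, Collections.<Integer>emptyList())) {
            for (int pb : positions.getOrDefault(b, Collections.<Integer>emptyList())) {
                if (Math.abs(pa - pb) <= maxDistance) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two separate adds to the same field: extracted text, then keywords.
        List<String> adds = Arrays.asList("annual report text", "budget keywords");
        Map<String, List<Integer>> gapped = index(adds, 10000);
        // Terms from the same add are close together.
        System.out.println(within(gapped, "annual", "text", 9999));
        // Terms from different adds are more than 10,000 positions apart.
        System.out.println(within(gapped, "text", "budget", 9999));
    }
}
```

With a gap of 0 the same two adds would sit at adjacent positions, so a proximity match could straddle the boundary between them; the large gap is what rules that out.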

FWIW
Erick

On Mon, Jul 19, 2010 at 7:56 PM, Joe Hansen <joe.hansen.at@gmail.com> wrote:

> Thanks for your reply, Koji! Your suggestion worked fine. I thought
> that adding a field named "contents" to a document that already
> contains a field named "contents" would NOT do anything. But it looks
> like I was wrong!
>
> Thank you for your kind help! :)
>
> Regards,
> Joe
>
> On Mon, Jul 19, 2010 at 5:12 PM, Koji Sekiguchi <koji@r.email.ne.jp> wrote:
> > (10/07/20 7:31), Joe Hansen wrote:
> >>
> >> Hey All,
> >>
> >> I am using Apache Lucene (2.9.1) and it's fast and it works great! I
> >> have a question in connection with Apache PDFBox.
> >>
> >> The following command creates a Lucene Document from a PDF file:
> >> Document document = org.apache.pdfbox.searchengine.lucene.LucenePDFDocument.getDocument(docFile);
> >>
> >> The Lucene Document, document, has a bunch of fields. Among those
> >> fields is one named "content". I need to add some more data to
> >> that field. For example, I would like to add some description and
> >> keywords. How do I go about doing that? Any pointers would be greatly
> >> welcome! :)
> >>
> >> Thanks for your time!
> >>
> >> Regards,
> >> Joe
> >>
> >>
> >
> > Joe,
> >
> > You can add your data to the document object:
> >
> > Document document = org.apache.pdfbox.searchengine.lucene.LucenePDFDocument.getDocument(docFile);
> > document.add( new Field( "content", "your data", Store.YES, Index.ANALYZED ) );
> >
> >
> > http://lucene.apache.org/java/2_9_3/api/all/org/apache/lucene/document/Document.html#add%28org.apache.lucene.document.Fieldable%29
> >
> > Koji
> >
> > --
> > http://www.rondhuit.com/en/
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
