lucene-java-user mailing list archives

From sol myr <solmy...@yahoo.com>
Subject Re: [Lucene] Frequencies and positions - are they stored per field?
Date Mon, 10 Oct 2011 10:10:04 GMT
Thanks so much, this helped a lot :)



----- Original Message -----
From: Uwe Schindler <uwe@thetaphi.de>
To: java-user@lucene.apache.org; 'sol myr' <solmyr72@yahoo.com>
Cc: 
Sent: Tuesday, October 4, 2011 12:14 PM
Subject: RE: [Lucene] Frequencies and positions - are they stored per field?

Hi,

Term vectors are, in a sense, duplicate information. They are used to quickly get, *per document*, all
vectors for *one field*. This means you get the positions, offsets, and frequencies for the
requested document as one blob, like a stored field, which can be used e.g. for MoreLikeThis
or highlighting (FastVectorHighlighter also needs term vectors).
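
To make the "one blob per document and field" idea concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene's API or file format: the class `TermVectorSketch`, its `tokenize` helper (a stand-in for an analyzer) and `termVector` are hypothetical names, and offsets are left out for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model only: NOT Lucene's API or on-disk format.
class TermVectorSketch {

    // Hypothetical stand-in for an analyzer: lowercase, split on non-alphanumerics.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Term vector for ONE field of ONE document: term -> positions.
    // The frequency of a term is the size of its position list.
    static Map<String, List<Integer>> termVector(String fieldText) {
        Map<String, List<Integer>> vector = new LinkedHashMap<>();
        List<String> tokens = tokenize(fieldText);
        for (int pos = 0; pos < tokens.size(); pos++) {
            vector.computeIfAbsent(tokens.get(pos), k -> new ArrayList<>()).add(pos);
        }
        return vector;
    }

    public static void main(String[] args) {
        // Field text taken from the "body" field in the original question.
        Map<String, List<Integer>> body =
                termVector("Attached is an Urgent requisition request");
        System.out.println(body);
    }
}
```

The point of the sketch is the lookup direction: everything about one document's field comes back in a single structure, without scanning the terms of the whole index.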

It's the same as the difference between indexed fields and stored fields (in fact, the information
stored if you enable term vectors during indexing is similar to stored fields; think of it as
a binary stored field containing all vectors for the corresponding document).
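
For contrast, the indexed side can be pictured as a map from field to term to matching documents. The following is again only a toy sketch in plain Java, under the assumption that a simple nested map is enough to show the idea; `FieldedIndexSketch`, `addField` and `search` are hypothetical names, not Lucene classes or methods.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model only: NOT Lucene's API or on-disk format.
class FieldedIndexSketch {

    // field -> term -> ids of the documents that contain the term in that field
    private final Map<String, Map<String, SortedSet<Integer>>> postings = new HashMap<>();

    // Index the text of one field of one document (naive lowercase tokenization).
    void addField(int docId, String field, String text) {
        Map<String, SortedSet<Integer>> terms =
                postings.computeIfAbsent(field, f -> new HashMap<>());
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                terms.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // A lookup always names a field; there is no "any field" entry.
    SortedSet<Integer> search(String field, String term) {
        Map<String, SortedSet<Integer>> terms =
                postings.getOrDefault(field, Collections.emptyMap());
        return terms.getOrDefault(term, Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        FieldedIndexSketch index = new FieldedIndexSketch();
        // Document 0 uses the field texts from the original question.
        index.addField(0, "subject", "Requisition request");
        index.addField(0, "body", "Attached is an Urgent requisition request");

        System.out.println(index.search("subject", "urgent")); // no match: "urgent" is only in body
        System.out.println(index.search("body", "urgent"));    // matches document 0
    }
}
```

In this model a search starts from a (field, term) pair and ends at documents, whereas a term vector starts from a document and ends at its terms, which is why the two structures hold overlapping information.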

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: sol myr [mailto:solmyr72@yahoo.com]
> Sent: Tuesday, October 04, 2011 12:08 PM
> To: java-user@lucene.apache.org
> Subject: Re: [Lucene] Frequencies and positions - are they stored per field?
> 
> Thanks a lot.
> But then what's the added value of Field.TermVector?
> 
> Can't it be deduced from the overall Lucene index? Or is it just inefficient to
> deduce?
> 
> Thanks again :)
> 
> 
> 
> ----- Original Message -----
> From: Uwe Schindler <uwe@thetaphi.de>
> To: java-user@lucene.apache.org; 'sol myr' <solmyr72@yahoo.com>
> Cc:
> Sent: Tuesday, October 4, 2011 11:53 AM
> Subject: RE: [Lucene] Frequencies and positions - are they stored per field?
> 
> Lucene always searches a field; a query using a term without a field is impossible.
> Think of each field as a parallel inverted index; all statistics are per field, too. If you
> pass a query without a field name to QueryParser, it will choose the default field
> that was given when creating the QueryParser.
> 
> Uwe
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
> > -----Original Message-----
> > From: sol myr [mailto:solmyr72@yahoo.com]
> > Sent: Tuesday, October 04, 2011 11:46 AM
> > To: lucene
> > Subject: [Lucene] Frequencies and positions - are they stored per field?
> >
> >
> >
> > Hi,
> >
> > I use Lucene, but am not familiar with its internals.
> > I'd appreciate help understanding whether Term Frequencies and
> > Positions are stored per Document or per Field?
> > On the one hand, I never ask for "Field.TermVector" because I read
> > it's only required for "MoreLikeThis" (which I don't need).
> > On the other hand, my searches *are* based on fields...
> >
> > Here's my code:
> > // Write (without Field.TermVector):
> >
> > Document doc=new Document();
> > doc.add(new Field("subject",  "Requisition request", Store.YES, Index.ANALYZED));
> > doc.add(new Field("body",  "Attached is an Urgent requisition request", Store.YES, Index.ANALYZED));
> > write.addDocument(doc);
> >
> > // And my Query:
> > Query query=parser.parse("subject : urgent");
> >
> > Now how does Lucene manage this query?
> > I asked it to search the "subject" Field.
> > But if the "inverted index" doesn't keep fields, it would only
> > remember that "The term 'Urgent' appears in SOME FIELD of document#1"...
> > Isn't it true?
> >
> > If so, how would it make sure to retrieve only documents that match in
> > the Subject ?
> >
> > Thanks.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 

