lucene-dev mailing list archives

From "Jochen" <lucenel...@quontis.com>
Subject RE: New Query Type(s)
Date Wed, 07 Jan 2004 18:01:47 GMT
Please disregard my prior post. I see that I outed myself as stupid.

Thanks, and sorry for the traffic.

> -----Original Message-----
> From: Jochen [mailto:lucenelist@quontis.com]
> Sent: Wednesday, January 07, 2004 8:48 AM
> To: 'Lucene Developers List'
> Subject: New Query Type(s)
> 
> Lucene Gurus:
> 
> 	After looking at and trying out Lucene for quite some time (and
> liking it), I would like to create some advanced queries to speed up our
> system. The first one I need is as follows:
> 
> 	(+"a b c" +"d e")~10
> 
> 	In other words, I need to run a query in which two phrases (for
> now, an exact match will be fine) are within some defined proximity (in
> this example, I need "a b c" somewhere close to "d e").
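
To make the intended semantics concrete, here is a minimal, self-contained
sketch of "two exact phrases within N positions of each other" over a plain
token array. It does not use any Lucene classes; PhraseProximity,
phraseStarts, and near are hypothetical names chosen for illustration, and
the gap is assumed to be the number of tokens between the end of one phrase
and the start of the other, in either order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PhraseProximity {

    // Returns every start position at which `phrase` occurs exactly in `tokens`.
    static List<Integer> phraseStarts(String[] tokens, String[] phrase) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i + phrase.length <= tokens.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(tokens, i, i + phrase.length), phrase)) {
                starts.add(i);
            }
        }
        return starts;
    }

    // True if some occurrence of `first` and some occurrence of `second` do not
    // overlap and have at most `slop` tokens between them, in either order.
    // (Assumption: this is one plausible reading of "~10", not Lucene's definition.)
    static boolean near(String[] tokens, String[] first, String[] second, int slop) {
        for (int a : phraseStarts(tokens, first)) {
            for (int b : phraseStarts(tokens, second)) {
                int gap = (a < b) ? b - (a + first.length) : a - (b + second.length);
                if (gap >= 0 && gap <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] doc = "x a b c y z d e w".split(" ");
        String[] abc = {"a", "b", "c"};
        String[] de = {"d", "e"};
        System.out.println(near(doc, abc, de, 10)); // two tokens apart: true
        System.out.println(near(doc, abc, de, 1));  // two tokens apart: false
    }
}
```

A real query type would work from the index's stored term positions rather
than rescanning tokens, but the matching condition would be the same.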
> 
> 	The indexes created nicely support this kind of functionality, and
> the pieces are all implemented (PhraseQuery, BooleanQuery, PhraseQuery
> with Slop). However, I believe that they cannot be strung together with
> the current Lucene version to give me what I need.
> 
> 	I have studied the code and I will write the code to create this
> type of query (and make it available, if I get it working), but I would
> very much appreciate a high level roadmap from more experienced people
> (i.e. create a new Query Object, change this and that object to do such
> and such ...).
> 
> 	Cheers!
> 		Jochen
> 
> > -----Original Message-----
> > From: Robert Engels [mailto:rengels@ix.netcom.com]
> > Sent: Tuesday, January 06, 2004 1:17 PM
> > To: Lucene-Dev
> > Subject: normalization BAD DESIGN ?
> >
> > The design & implementation of the document/field normalization is very
> > poor.
> >
> > It requires a byte[] with (number of documents * number of fields)
> > elements!
> >
> > With a document store of 100 million documents, with multiple fields,
> > the memory required is staggering.
> >
> > IndexReader has the following method definition,
> >
> > public abstract byte[] norms(String field) throws IOException;
> >
> > which is the source of the problem.
> >
> > Even returning null from this method does not help, as PhraseScorer
> > and its derived classes maintain a reference and do not perform a
> > null check.
> >
> > I have modified line 105 of PhraseScorer to be
> >
> > if(norms!=null)
> >     score *= Similarity.decodeNorm(norms[first.doc]); // normalize
> >
> > Would it not be a better design, to define a method in IndexReader
> >
> > float getNorm(String fieldname,int docnum);
> >
> > so an implementation could cache this information in some fashion, or
> > always return 1.0 if it didn't care?
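
As a rough illustration of the per-document accessor proposed above, the
sketch below defines a stand-alone interface with two implementations: one
that always returns 1.0, and one that stores only the norms that were
explicitly set. None of this is Lucene API; NormSketch, NormSource,
ConstantNorms, SparseNorms, and normalize are hypothetical names, and the
sparse map is just one assumed caching strategy among many.

```java
import java.util.HashMap;
import java.util.Map;

class NormSketch {

    // Hypothetical interface modelled on the proposed getNorm(fieldname, docnum).
    interface NormSource {
        float getNorm(String fieldName, int docNum);
    }

    // For callers that do not care about normalization: no per-document storage.
    static class ConstantNorms implements NormSource {
        public float getNorm(String fieldName, int docNum) {
            return 1.0f;
        }
    }

    // Keeps only explicitly set norms; every other (field, doc) pair is 1.0.
    static class SparseNorms implements NormSource {
        private final Map<String, Map<Integer, Float>> byField = new HashMap<>();

        void setNorm(String fieldName, int docNum, float norm) {
            byField.computeIfAbsent(fieldName, f -> new HashMap<>()).put(docNum, norm);
        }

        public float getNorm(String fieldName, int docNum) {
            Map<Integer, Float> norms = byField.get(fieldName);
            if (norms == null) {
                return 1.0f;
            }
            Float norm = norms.get(docNum);
            return norm == null ? 1.0f : norm;
        }
    }

    // A scorer would multiply its raw score by the norm; no null check is needed.
    static float normalize(float rawScore, NormSource norms, String field, int doc) {
        return rawScore * norms.getNorm(field, doc);
    }

    public static void main(String[] args) {
        SparseNorms sparse = new SparseNorms();
        sparse.setNorm("body", 42, 0.5f);
        System.out.println(normalize(2.0f, sparse, "body", 42));
        System.out.println(normalize(2.0f, sparse, "body", 7));
        System.out.println(normalize(2.0f, new ConstantNorms(), "body", 42));
    }
}
```

With an accessor like this, memory use depends on the implementation chosen
rather than on the number of documents times the number of fields.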
> >
> > Robert Engels
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
> 
