lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Joshua O'Madadhain" <jmad...@ics.uci.edu>
Subject Re: Empty phrase search
Date Tue, 17 Dec 2002 23:10:45 GMT
Minh:

Assuming that the fields aren't binary (and thus capable of having any
arbitrary string in them), you should be able to come up with a short
string that will never appear in the field ("emptystring", "xyxyqu23",
etc.) even if there is no single character that would work as a marker.  
As an extra layer of insurance, you could throw out any documents whose
field only contained that string _as a substring_.

This may not be completely bulletproof, but it's pretty close.  :)

Joshua

 jmadden@ics.uci.edu...Obscurium Per Obscurius...www.ics.uci.edu/~jmadden
  Joshua O'Madadhain: Information Scientist, Musician, Philosopher-At-Tall
 It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.

On Tue, 17 Dec 2002, Minh Kama Yie wrote:

> Yep, thought of that and having argued it out with the lead on this, it
> would be suitable in my opinion but even I would concede that it is a hack
> in our case since there is no limit to the variety of characters that could
> appear as the value for the field which may be an empty string. Hence there
> isn't the 100% guarantee that there will never be the circumstance when this
> special character or String of characters we choose to represent empty
> fields will appear as a valid value.
> 
> Thanks anyway though.
> 
> Minh
> 
> ----- Original Message -----
> From: "Joshua O'Madadhain" <jmadden@ics.uci.edu>
> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> Sent: Tuesday, December 17, 2002 6:03 PM
> Subject: Re: Empty phrase search
> 
> 
> > Minh:
> >
> > Why not just use a special character or string (one that won't appear in a
> > non-empty field) to represent an empty field?  It doesn't have to appear
> > in the user interface, if that's a concern; you could convert a query
> > including "FieldA:NULL" (or whatever) into a query containing
> > "FieldA:emptyfield" automatically.
> >
> > This allows you to finesse the entire issue of adding something to
> > Lucene--which may be for the best anyway, since this is really just a
> > special case of looking for fields whose contents have a specific
> > characteristic.
> >
> > Good luck--
> >
> > Joshua O'Madadhain
> >
> >  jmadden@ics.uci.edu...Obscurium Per Obscurius...www.ics.uci.edu/~jmadden
> >   Joshua O'Madadhain: Information Scientist, Musician, Philosopher-At-Tall
> >  It's that moment of dawning comprehension that I live for--Bill Watterson
> > My opinions are too rational and insightful to be those of any
> organization.
> >
> > On Tue, 17 Dec 2002, Minh Kama Yie wrote:
> >
> > > Thanks for that Peter.
> > >
> > > Unfortunately I'm not looking for "all" documents but rather documents
> where
> > > the fields can be empty.
> > > Hence using the universal field wouldn't quite work.
> > > The "empty:true" approach is interesting however it effectively doubles
> the
> > > number of indexable fields for a document.
> > >
> > > Need to assess whether or not we want to support this feature I guess.
> > >
> > > Currently, considering Lucene's architecture this feature needs to be
> > > inherently supported rather than worked around with various fields I
> think
> > > for it to be used/done properly...
> > >
> > > Would anyone be able to point me in the general direction of where to
> look
> > > in the Lucene code to attempt this?
> > > Hopefully this way I can give a proper cost/benefit analysis as to
> whether
> > > or not to support this feature...and if all goes well, release something
> > > back for the great work you guys do....
> > >
> > >
> > > ----- Original Message -----
> > > From: "Peter Carlson" <carlson@bookandhammer.com>
> > > To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> > > Sent: Tuesday, December 17, 2002 4:25 PM
> > > Subject: Re: Empty phrase search
> > >
> > >
> > > > I don't think so.
> > > >
> > > > One approach to look for everything, or not something is to add a
> field
> > > > to each document which is a constant value. Like a field named exists
> > > > and a value of true.
> > > >
> > > > Then you can do search like
> > > >
> > > > exists:true NOT microsoft
> > > >
> > > > This will find all documents without the term microsoft in them.
> > > >
> > > > Just to have it find documents with nothing might be a little tricky.
> > > > You might want to put a field in the document which indicates the size
> > > > or something like that. Or just create an empty field and look for
> > > >
> > > > empty:true
> > > >
> > > > I hope this rambling helps
> > > >
> > > > --Peter
> > > >
> > > > On Monday, December 16, 2002, at 03:24 PM, Minh Kama Yie wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > Just wondering if lucene indexes empty strings and if so, how to
> > > > > search for this using the query language?
> > > > >
> > > > >
> > > > > Regards,
> > > > >
> > > > > Minh Kama Yie
> > > > >
> > > > > This message is intended only for the named recipient.
> > > > > If you are not the intended recipient you are notified that
> > > > > disclosing, copying, distributing or taking any action
> > > > > in reliance on the contents of this information is strictly
> > > > > prohibited.
> > > >
> > > >
> > > > --
> > > > To unsubscribe, e-mail:
> > > <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > > > For additional commands, e-mail:
> > > <mailto:lucene-user-help@jakarta.apache.org>
> > >
> > >
> > > --
> > > To unsubscribe, e-mail:
> <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > > For additional commands, e-mail:
> <mailto:lucene-user-help@jakarta.apache.org>
> > >
> >
> >
> > --
> > To unsubscribe, e-mail:
> <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > For additional commands, e-mail:
> <mailto:lucene-user-help@jakarta.apache.org>
> 
> 
> --
> To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
> 


--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message