lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Minh Kama Yie" <m...@nuix.com.au>
Subject Re: Empty phrase search
Date Wed, 18 Dec 2002 22:19:33 GMT
Unfortunately bulletproof is exactly what we need :)
But I agree that in the right circumstance this solution would be ideal.

Thanks for all your help guys :)
Looking at Doug's suggestion of constructing the query manually...


----- Original Message -----
From: "Joshua O'Madadhain" <jmadden@ics.uci.edu>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Wednesday, December 18, 2002 10:10 AM
Subject: Re: Empty phrase search


> Minh:
>
> Assuming that the fields aren't binary (and thus capable of having any
> arbitrary string in them), you should be able to come up with a short
> string that will never appear in the field ("emptystring", "xyxyqu23",
> etc.) even if there is no single character that would work as a marker.
> As an extra layer of insurance, you could throw out any documents whose
> field only contained that string _as a substring_.
>
> This may not be completely bulletproof, but it's pretty close.  :)
>
> Joshua
>
>  jmadden@ics.uci.edu...Obscurium Per Obscurius...www.ics.uci.edu/~jmadden
>   Joshua O'Madadhain: Information Scientist, Musician, Philosopher-At-Tall
>  It's that moment of dawning comprehension that I live for--Bill Watterson
> My opinions are too rational and insightful to be those of any
organization.
>
> On Tue, 17 Dec 2002, Minh Kama Yie wrote:
>
> > Yep, thought of that and having argued it out with the lead on this, it
> > would be suitable in my opinion but even I would concede that it is a
hack
> > in our case since there is no limit to the variety of characters that
could
> > appear as the value for the field which may be an empty string. Hence
there
> > isn't the 100% guarantee that there will never be the circumstance when
this
> > special character or String of characters we choose to represent empty
> > fields will appear as a valid value.
> >
> > Thanks anyway though.
> >
> > Minh
> >
> > ----- Original Message -----
> > From: "Joshua O'Madadhain" <jmadden@ics.uci.edu>
> > To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> > Sent: Tuesday, December 17, 2002 6:03 PM
> > Subject: Re: Empty phrase search
> >
> >
> > > Minh:
> > >
> > > Why not just use a special character or string (one that won't appear
in a
> > > non-empty field) to represent an empty field?  It doesn't have to
appear
> > > in the user interface, if that's a concern; you could convert a query
> > > including "FieldA:NULL" (or whatever) into a query containing
> > > "FieldA:emptyfield" automatically.
> > >
> > > This allows you to finesse the entire issue of adding something to
> > > Lucene--which may be for the best anyway, since this is really just a
> > > special case of looking for fields whose contents have a specific
> > > characteristic.
> > >
> > > Good luck--
> > >
> > > Joshua O'Madadhain
> > >
> > >  jmadden@ics.uci.edu...Obscurium Per
Obscurius...www.ics.uci.edu/~jmadden
> > >   Joshua O'Madadhain: Information Scientist, Musician,
Philosopher-At-Tall
> > >  It's that moment of dawning comprehension that I live for--Bill
Watterson
> > > My opinions are too rational and insightful to be those of any
> > organization.
> > >
> > > On Tue, 17 Dec 2002, Minh Kama Yie wrote:
> > >
> > > > Thanks for that Peter.
> > > >
> > > > Unfortunately I'm not looking for "all" documents but rather
documents
> > where
> > > > the fields can be empty.
> > > > Hence using the universal field wouldn't quite work.
> > > > The "empty:true" approach is interesting however it effectively
doubles
> > the
> > > > number of indexable fields for a document.
> > > >
> > > > Need to assess whether or not we want to support this feature I
guess.
> > > >
> > > > Currently, considering Lucene's architecture this feature needs to
be
> > > > inherently supported rather than worked around with various fields I
> > think
> > > > for it to be used/done properly...
> > > >
> > > > Would anyone be able to point me in the general direction of where
to
> > look
> > > > in the Lucene code to attempt this?
> > > > Hopefully this way I can give a proper cost/benefit analysis as to
> > whether
> > > > or not to support this feature...and if all goes well, release
something
> > > > back for the great work you guys do....
> > > >
> > > >
> > > > ----- Original Message -----
> > > > From: "Peter Carlson" <carlson@bookandhammer.com>
> > > > To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> > > > Sent: Tuesday, December 17, 2002 4:25 PM
> > > > Subject: Re: Empty phrase search
> > > >
> > > >
> > > > > I don't think so.
> > > > >
> > > > > One approach to look for everything, or not something is to add a
> > field
> > > > > to each document which is a constant value. Like a field named
exists
> > > > > and a value of true.
> > > > >
> > > > > Then you can do search like
> > > > >
> > > > > exists:true NOT microsoft
> > > > >
> > > > > This will find all documents without the term microsoft in them.
> > > > >
> > > > > Just to have it find documents with nothing might be a little
tricky.
> > > > > You might want to put a field in the document which indicates the
size
> > > > > or something like that. Or just create an empty field and look for
> > > > >
> > > > > empty:true
> > > > >
> > > > > I hope this rambling helps
> > > > >
> > > > > --Peter
> > > > >
> > > > > On Monday, December 16, 2002, at 03:24 PM, Minh Kama Yie wrote:
> > > > >
> > > > > > Hi guys,
> > > > > >
> > > > > > Just wondering if lucene indexes empty strings and if so, how
to
> > > > > > search for this using the query language?
> > > > > >
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Minh Kama Yie
> > > > > >
> > > > > > This message is intended only for the named recipient.
> > > > > > If you are not the intended recipient you are notified that
> > > > > > disclosing, copying, distributing or taking any action
> > > > > > in reliance on the contents of this information is strictly
> > > > > > prohibited.
> > > > >
> > > > >
> > > > > --
> > > > > To unsubscribe, e-mail:
> > > > <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > > > > For additional commands, e-mail:
> > > > <mailto:lucene-user-help@jakarta.apache.org>
> > > >
> > > >
> > > > --
> > > > To unsubscribe, e-mail:
> > <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > > > For additional commands, e-mail:
> > <mailto:lucene-user-help@jakarta.apache.org>
> > > >
> > >
> > >
> > > --
> > > To unsubscribe, e-mail:
> > <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > > For additional commands, e-mail:
> > <mailto:lucene-user-help@jakarta.apache.org>
> >
> >
> > --
> > To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>
> >
>
>
> --
> To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>


--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message