lucene-java-user mailing list archives

From "daniel rosher" <daniel.ros...@hotonline.com>
Subject Re: Search for null
Date Wed, 25 Jul 2007 17:12:20 GMT
In this case you should look at the source for RangeFilter.java. 

Using it as a model, you could write your own filter that walks the
field's terms with TermEnum and TermDocs to find all documents that
have some value for the field.
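
A rough illustration of that term walk, using a toy in-memory map instead of a real index. The class name HasValueSketch, the method docsWithValue, and the nested-map layout are all made up for this sketch; in Lucene the outer loop would be played by a TermEnum and the inner loop by a TermDocs.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

/** Toy stand-in for an inverted index: field -> term text -> ids of documents containing that term. */
class HasValueSketch {

    /** Returns a bit set with one bit set per document that has at least one term in the field. */
    static BitSet docsWithValue(Map<String, Map<String, int[]>> index, String field, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        Map<String, int[]> terms = index.get(field);
        if (terms == null) {
            return bits; // no document has this field at all
        }
        // Walk every term of the field (the TermEnum role) ...
        for (int[] postings : terms.values()) {
            // ... and mark every document listed for that term (the TermDocs role).
            for (int doc : postings) {
                bits.set(doc);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        Map<String, Map<String, int[]>> index = new TreeMap<>();
        Map<String, int[]> dateTerms = new TreeMap<>();
        dateTerms.put("198805", new int[] {0, 3});
        dateTerms.put("198810", new int[] {3, 4});
        index.put("date", dateTerms);
        // With 6 documents in total, documents 1, 2 and 5 have no "date" term.
        System.out.println(docsWithValue(index, "date", 6)); // prints {0, 3, 4}
    }
}
```

The cost grows with the number of distinct terms in the field and the length of their postings, which is why generating such a filter can be slow on a large field.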

You would then flip this filter (perhaps write a FlipFilter.java that
takes an existing filter in its constructor, for reuse) to get all
documents that don't have a value for this field (i.e. null values). 
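
One way such a FlipFilter might look, sketched against a hypothetical DocFilter interface rather than Lucene's own Filter class (so it compiles without Lucene on the classpath). The names FlipFilterSketch and DocFilter are assumptions for illustration only.

```java
import java.util.BitSet;

class FlipFilterSketch {

    /** Hypothetical stand-in for a filter: produces one bit per document id below maxDoc. */
    interface DocFilter {
        BitSet bits(int maxDoc);
    }

    /** Wraps another filter and selects exactly the documents the wrapped filter rejects. */
    static class FlipFilter implements DocFilter {
        private final DocFilter inner;

        FlipFilter(DocFilter inner) {
            this.inner = inner;
        }

        @Override
        public BitSet bits(int maxDoc) {
            // Copy first so a bit set cached by the wrapped filter is not modified.
            BitSet flipped = (BitSet) inner.bits(maxDoc).clone();
            // Flip only the valid document range; BitSet.size() may be larger than maxDoc.
            flipped.flip(0, maxDoc);
            return flipped;
        }
    }

    public static void main(String[] args) {
        // Pretend documents 0, 3 and 4 have a value for the field.
        DocFilter hasValue = maxDoc -> {
            BitSet b = new BitSet(maxDoc);
            b.set(0);
            b.set(3);
            b.set(4);
            return b;
        };
        System.out.println(new FlipFilter(hasValue).bits(6)); // prints {1, 2, 5}
    }
}
```

Cloning before flipping matters whenever the wrapped filter hands out a shared or cached bit set, since flipping it in place would silently invert the original filter too.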

Depending on the time it takes to generate these filters, you could then
cache this filter with CachingWrapperFilter for subsequent searches.
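
The caching idea itself is simple: remember the computed bit set per reader, and recompute only for a reader not seen before. Below is a minimal sketch of that idea, not Lucene's actual implementation; the class CachingSketch, its type parameter R (standing in for the reader type), and the computations counter are assumptions made for the example.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/** Caches one computed bit set per reader; R is a stand-in for the index reader type. */
class CachingSketch<R> {
    private final Function<R, BitSet> compute;
    // Weak keys let an entry disappear once its reader is no longer referenced elsewhere.
    private final Map<R, BitSet> cache = new WeakHashMap<>();
    int computations = 0; // exposed only so the demo can show the cache working

    CachingSketch(Function<R, BitSet> compute) {
        this.compute = compute;
    }

    synchronized BitSet bits(R reader) {
        BitSet cached = cache.get(reader);
        if (cached == null) {
            cached = compute.apply(reader); // the expensive step, done once per reader
            computations++;
            cache.put(reader, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        CachingSketch<String> cachedFilter = new CachingSketch<>(r -> {
            BitSet bits = new BitSet();
            bits.set(1);
            return bits;
        });
        String reader = "reader-1"; // hypothetical stand-in for an open index reader
        cachedFilter.bits(reader);
        cachedFilter.bits(reader);
        System.out.println(cachedFilter.computations); // prints 1
    }
}
```

Note that the cached bit set is shared between callers, so anything that flips bits should work on a copy.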

Dan

On Wed, 2007-07-25 at 08:57 -0700, Jay Yu wrote:
> What if I do not know all possible values of that field, which is a 
> typical case in free-text search?
> 
> daniel rosher wrote:
> > You cannot search for fields that do not exist, which is what
> > you originally wanted to do. Instead you can do something like this:
> > 
> > -Establish the query that will select all non-null values
> > 
> > TermQuery tq1 = new TermQuery(new Term("field","value1"));
> > TermQuery tq2 = new TermQuery(new Term("field","value2"));
> > ...
> > TermQuery tqn = new TermQuery(new Term("field","valuen"));
> > BooleanQuery query = new BooleanQuery();
> > query.add(tq1,BooleanClause.Occur.SHOULD);
> > query.add(tq2,BooleanClause.Occur.SHOULD);
> > ...
> > query.add(tqn,BooleanClause.Occur.SHOULD);
> > 
> > OR perhaps a range query if your values are contiguous
> > 
> > Term start = new Term("field","198805");
> > Term end = new Term("field","198810");
> > Query query = new RangeQuery(start, end, true);
> > 
> > OR just use the QueryParser
> > 
> > Query query = QueryParser.parse(parseCriteria,
> > "field", new StandardAnalyzer());
> > 
> > -Create the QueryFilter
> > 
> > QueryFilter queryFilter = new QueryFilter(query);
> > 
> > -flip the bits
> > 
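> > // Clone first: QueryFilter caches its BitSet, so flipping in place would corrupt the cache.
> > final BitSet filterBitSet = (BitSet) queryFilter.bits(reader).clone();
> > // Flip only valid document ids; BitSet.size() can exceed reader.maxDoc().
> > filterBitSet.flip(0, reader.maxDoc());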
> > 
> > Now you have a filter that contains the documents matching the opposite of
> > what the query specifies, and you can use it in subsequent queries.
> > 
> > Dan
> > 
> > 
> > 
> > On Tue, 2007-07-24 at 09:40 -0700, Jay Yu wrote:
> >> daniel rosher wrote:
> >>> Perhaps you can use a filter in the following way.
> >>>
> >>> -Create a filter (via QueryFilter) that would contain all documents that
> >>> do not have null values for the field
> >> Interesting: what does the QueryFilter look like? Isn't it just as hard 
> >> as finding out what docs have the null values for the field?
> >> I'd really like to know your trick here.
> >>> -flip the bits of the filter so that it now contains documents that have
> >>> null values for a field
> >>> -Use the filter in conjunction with subsequent queries.
> >>>
> >>> This would also help with performance as filters are simply bitsets and
> >>> can cheaply be stored, generated once and used often.
> >>>
> >>> Dan
> >>>
> >>> On Mon, 2007-07-23 at 13:57 -0700, Jay Yu wrote:
> >>>> If you want performance, a better way might be to assign some special
> >>>> string/value (if it's easy to create) to the missing field of docs and
> >>>> index the field without tokenizing it. Then you may search for that
> >>>> special value to find the docs.
> >>>>
> >>>> Jay
> >>>>
> >>>> Les Fletcher wrote:
> >>>>> Does this particular range query have any significant performance
> >>>>> issues?
> >>>>>
> >>>>> Les
> >>>>>
> >>>>> Erik Hatcher wrote:
> >>>>>> On Jul 23, 2007, at 11:32 AM, testn wrote:
> >>>>>>> Is it possible to search for documents in which a specified field
> >>>>>>> doesn't exist or the field's value is null?
> >>>>>> This is from Solr, so I'm not sure off the top of my head if this mojo
> >>>>>> applies by itself, but a search for -fieldname:[* TO *] will result in
> >>>>>> all documents that do not have the specified field.
> >>>>>>
> >>>>>>     Erik
> >>>>>>
> >>>>>>
> >>>>>> ---------------------------------------------------------------------
> >>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>>>>>
> >>> Daniel Rosher
> >>> Developer
> >>>
> >>>
> >>> d: 0207 3489 912
> >>> t: 0870 2020 121
> >>> f: 0870 2020 131
> >>> m: 
> >>> http://www.hotonline.com/
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >>> This message is sent in confidence for the addressee only. It may contain privileged
> >>> information. The contents are not to be disclosed to anyone other than the addressee.
> >>> Unauthorised recipients are requested to preserve this confidentiality and to advise
> >>> us of any errors in transmission. Thank you.
> >>>
> >>> hotonline ltd is registered in England & Wales. Registered office: One Canada Square,
> >>> Canary Wharf, London E14 5AP. Registered No: 1904765.
> >>>
> >>>
> >>> This message has been scanned for viruses by BlackSpider MailControl - www.blackspider.com
> >>>

