lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Christopher Tignor <ctig...@thinkmap.com>
Subject Re: customized SpanQuery Payload usage
Date Wed, 25 Nov 2009 14:26:40 GMT
The problem is that I need to be able to match spans resulting from a a
SpanNearQuery with the Term they came from so I can eliminate using Payloads
from certain Terms on a query-by-query basis.

I still need this term to effect the results of a NearSpanQuery as per the
usual logic, I just need to know when iterating over the resulting Spans
that when I hit one originating from a certain Term not to load it's payload
data.

I recently solved the problem fairly simply after doing much research into
the source.  When I am building the query and encounter a term I don't want
to recover payload data from, I add my own anonymous sub-type of
SpanTermQuery to my developing SpanNearQuery that itself creates an
anonymous sub-type of SpanTerm which simply returns an empty Collection for
it's payload data.

new SpanTermQuery(new Term(QueryVocabTracker.CONTENT_FIELD, tagToken)) {
                                @Override
                                public Spans getSpans(IndexReader reader)
                                        throws IOException {
                                    return new
TermSpans(reader.termPositions(term), term) {

                                        @Override
                                        public Collection getPayload()
                                                throws IOException {
                                            // no payload data for this
TermSpan
                                            return Collections.emptyList();
                                        }
                                    };
                                }
                            }

thanks,

C>T>



On Wed, Nov 25, 2009 at 8:10 AM, Grant Ingersoll <gsingers@apache.org>wrote:

>
> On Nov 24, 2009, at 9:56 AM, Christopher Tignor wrote:
>
> > Hello,
> >
> > For certain span queries I construct problematically by piecing together
> my
> > own SpanTermQueries I would like to enforce that Payload data is not
> > returned for matches on those specific terms used by the constituent
> > SapnTermQueries.
>
> I'm not sure I follow.  For those terms you don't want payloads, why can't
> you just avoid getting payloads?  Span queries themselves do not require
> payloads for execution.  Can you share your code for iterating over the
> spans?
>
> >
> > For exmaple if I search for a position match with a SpanQuery referencing
> > the tokens "_n" and "work" and there is Payload data for each (there
> needs
> > to be for other types of queries) I would like to be able to screen out
> the
> > payload data originating from any matched "_n" tokens.
> >
> > I thought for the tokens I am not interested in receiving payload data
> from
> > I might simply create (anonymously) my own subclass of SpanTermQuery
> which
> > overrides getSpans and returns another custom class which extends
> TermSpans
> > but there simply overrides isPayloadAvailable to return false:
> >
> > new SpanTermQuery(new Term(myField, myTokenString)) {
> >
> >
> >
> >                                public Spans getSpans(IndexReader reader)
> >                                        throws IOException {
> >                                    return new
> > TermSpans(reader.termPositions(term), term) {
> >
> >                                        public boolean
> isPayloadAvailable()
> > {
> >                                            return false;
> >                                        }
> >
> >                                    };
> >                                }
> >                            });
> >
> > This however seems to eliminating payload data for all matches though I'm
> > not sure why and am tracing through the code, looking at
> NearSpansUnordered.
> >
> > Any thoughts?
> >
> > thanks so much,
> >
> > C>T>
> >
> >
> > --
> > TH!NKMAP
> >
> > Christopher Tignor | Senior Software Architect
> > 155 Spring Street NY, NY 10012
> > p.212-285-8600 x385 f.212-285-8999
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
> Solr/Lucene:
> http://www.lucidimagination.com/search
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
TH!NKMAP

Christopher Tignor | Senior Software Architect
155 Spring Street NY, NY 10012
p.212-285-8600 x385 f.212-285-8999

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message