lucene-solr-dev mailing list archives

From Grant Ingersoll
Subject Re: Best way to gather span/token positions from query?
Date Thu, 30 Apr 2009 22:10:05 GMT
I've been thinking about how to add spans to Solr, but haven't
actually coded it up yet.  I see no reason why a query parser can't
support some span syntax, or why the "dump spans" approach can't be
co-opted to write the spans out to the response.  It seems like it
would need to be an additional part of the QueryComponent, plus some
additions to the query parsers.  We can more easily add it to the
Dismax parser, but if we add it to the Lucene one, then we should make
that change in Lucene.
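
To make "write the spans out to the response" a bit more concrete, here is a rough sketch in plain Python rather than Solr code. The function name spans_to_response, the "spans" key, and the (doc_id, start, end) tuple layout are all hypothetical, chosen only for illustration; nothing here is an existing Solr response format.

```python
import json
from collections import defaultdict
from typing import Dict, List, Tuple


def spans_to_response(hits: List[Tuple[int, int, int]]) -> Dict[str, dict]:
    """Group (doc_id, start, end) span tuples under their document id.

    The output layout is a hypothetical response section, not a Solr format.
    """
    by_doc = defaultdict(list)
    for doc_id, start, end in hits:
        # Use string keys so the structure serializes cleanly to JSON.
        by_doc[str(doc_id)].append({"start": start, "end": end})
    return {"spans": dict(by_doc)}


if __name__ == "__main__":
    example_hits = [(0, 1, 4), (1, 0, 3), (1, 6, 8)]
    print(json.dumps(spans_to_response(example_hits), indent=2))
```

Keying the spans by document id would let a client join them against the normal result list, which is roughly how highlighting output is consumed today.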


On Apr 29, 2009, at 7:06 PM, Sean O'Connor wrote:

> Hello,
>   I'm trying to find a decent approach for getting token positions
> out of (or is that into?) Solr query results. Is the best approach
> to extend a QueryComponent and/or HighlightComponent? I'm new to
> Solr, and still on fairly shaky ground, so any pointers or suggestions
> are quite welcome.
>   As a little background:
>   I am trying to migrate a custom Lucene-only content analysis
> project to Solr. The 'old' system programmatically runs a few
> thousand predefined queries against a corpus, and then analyzes the
> results. The Lucene score is good, but the actual position of the
> hits is also quite important.
>   My previous system did some simple query parsing to create
> SpanQuery objects, and then used a modified dumpSpans() to get the
> token positions from the spans. Now I am trying to find out how to use
> Solr's goodness (and the MemoryIndex approach?) to get the span
> positions in a more logical manner. I think the answer is in the
> highlighter, but I'm getting a little twisted around, and could use a
> pointer.
>   I am using a recent Solr nightly snapshot, Grails, Aduna Aperture,
> and IntelliJ (if any of that matters).
> Thanks,
> Sean
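
For readers unfamiliar with the idea of "dumping spans", the following is a minimal sketch of what the quoted message describes: finding the token positions where two terms occur near each other, in order. It is plain Python over whitespace-split tokens and does not use Lucene; the function names, the exclusive end position, and the slop definition (number of tokens allowed between the two terms) are assumptions made for this illustration and may not match SpanNearQuery exactly.

```python
from typing import Dict, List, Tuple


def token_positions(text: str) -> Dict[str, List[int]]:
    """Map each lowercased whitespace token to the positions where it occurs."""
    positions: Dict[str, List[int]] = {}
    for pos, token in enumerate(text.lower().split()):
        positions.setdefault(token, []).append(pos)
    return positions


def near_spans(text: str, first: str, second: str, slop: int) -> List[Tuple[int, int]]:
    """Return (start, end) spans where `first` precedes `second` with at
    most `slop` tokens between them. `end` is exclusive (an assumption)."""
    positions = token_positions(text)
    spans = []
    for start in positions.get(first, []):
        for other in positions.get(second, []):
            gap = other - start - 1  # tokens strictly between the two terms
            if 0 <= gap <= slop:
                spans.append((start, other + 1))
    return spans


def dump_spans(corpus: List[str], first: str, second: str, slop: int) -> List[Tuple[int, int, int]]:
    """Collect (doc_id, start, end) for every matching span in the corpus."""
    hits = []
    for doc_id, text in enumerate(corpus):
        for start, end in near_spans(text, first, second, slop):
            hits.append((doc_id, start, end))
    return hits


if __name__ == "__main__":
    corpus = ["the quick brown fox", "quick red fox jumps over the quick fox"]
    for hit in dump_spans(corpus, "quick", "fox", slop=1):
        print(hit)
```

The point of the sketch is only that each hit carries a document id plus start and end token positions, which is the information the original system extracted alongside the score.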

Grant Ingersoll

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
