lucene-solr-user mailing list archives

From Sean O'Connor <sean...@gmail.com>
Subject Best way to gather span/token positions from query? (mis-posted to dev list...)
Date Thu, 30 Apr 2009 13:56:22 GMT
Hello,
    I'm trying to find a decent approach for getting token positions out
of (or is that into?) Solr query results. Is the best approach to extend
a QueryComponent and/or HighlightComponent? I'm new to Solr and still
on fairly shaky ground, so any pointers or suggestions are quite welcome.

    As a little background:
    I am trying to migrate a custom Lucene-only content analysis
project to Solr. The 'old' system programmatically runs a few thousand
predefined queries against a corpus, and then analyzes the results. The
Lucene score is good, but the actual position of the hits is also quite
important.

    My previous system did some simple query parsing to create SpanQuery
objects, and then used a modified dumpSpans() to get the token positions
from the spans. Now I am trying to find out how to use Solr's goodness
(and the MemoryIndex approach?) to get the span positions in a more
logical manner. I think the answer is in the highlighter, but I'm getting
a little twisted around, and could use a pointer.
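
    To make the goal concrete, here is a minimal plain-Java sketch of the
kind of position data in question. It is only an illustration and does not
use Lucene or Solr: the class SpanPositions, the Span holder, the
tokenize() helper and findOrderedNear() are all hypothetical names made up
for this example, and the tokenizer is a crude stand-in for a real
analyzer. It reports each ordered "near" match as a pair of token
positions, with the end position exclusive, which is the convention
Lucene's Spans use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for span matching; NOT the Lucene or Solr API.
class SpanPositions {

    /** One match, as token positions [start, end) with end exclusive. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Crude analyzer stand-in: lowercase, split on non-alphanumerics.
     *  A token's index in the returned list is its token position. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Finds `first` followed by `second` with at most `slop` tokens in
     *  between. Both terms are expected in lowercase. Each occurrence of
     *  `first` yields at most one span (its nearest following `second`). */
    static List<Span> findOrderedNear(String text, String first,
                                      String second, int slop) {
        List<String> tokens = tokenize(text);
        List<Span> spans = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) {
                continue;
            }
            int limit = Math.min(tokens.size() - 1, i + 1 + slop);
            for (int j = i + 1; j <= limit; j++) {
                if (tokens.get(j).equals(second)) {
                    spans.add(new Span(i, j + 1)); // end is exclusive
                    break;
                }
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String doc = "The quick brown fox jumps over the lazy brown dog";
        for (Span s : findOrderedNear(doc, "brown", "dog", 1)) {
            System.out.println(s);
        }
    }
}
```

    What I am after is the equivalent of that list of (start, end) pairs,
per document and per query, but produced by Solr rather than by hand.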

    I am using a recent Solr nightly snapshot, Grails, Aduna Aperture,
and IntelliJ (if any of that matters). Also, I posted this to the dev
list, incorrectly I believe; apologies for the cross-posting.
Thanks,

Sean


