lucene-dev mailing list archives

From "Joel Bernstein (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-5773) CollapsingQParserPlugin problem with ElevateComponent
Date Mon, 03 Mar 2014 18:26:24 GMT

    [ https://issues.apache.org/jira/browse/SOLR-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918369#comment-13918369 ]

Joel Bernstein commented on SOLR-5773:
--------------------------------------

David,

I agree that the elevated document should become the group head. I'll begin working on a patch
for this. I'm thinking of handling this during the finish() stage rather than the collect
stage. I hope to have something to test this week.
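
To make the idea concrete, here is a minimal sketch of the approach, not the actual patch. It
assumes a simplified, hypothetical collector (ElevatedCollapseSketch, Hit, and the collect()
and finish() signatures below are illustrative names, not Solr classes or APIs). The collect
stage keeps the highest-scoring document per group as usual, and the finish stage lets an
elevated document take over as the head of its group.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for a collapsing collector; this is not Solr code.
class ElevatedCollapseSketch {

    // One candidate hit: document id, collapse group, and relevance score.
    static final class Hit {
        final int docId;
        final String groupId;
        final float score;

        Hit(int docId, String groupId, float score) {
            this.docId = docId;
            this.groupId = groupId;
            this.score = score;
        }
    }

    // Group id -> current group head, kept in first-seen group order.
    private final Map<String, Hit> heads = new LinkedHashMap<String, Hit>();
    // Every hit seen during collection, so finish() can find elevated documents.
    private final List<Hit> seen = new ArrayList<Hit>();

    // Collect stage: keep the highest-scoring hit of each group as its head.
    void collect(Hit hit) {
        seen.add(hit);
        Hit head = heads.get(hit.groupId);
        if (head == null || hit.score > head.score) {
            heads.put(hit.groupId, hit);
        }
    }

    // Finish stage: an elevated document replaces the head its group chose by score.
    // If several elevated documents share a group, the last one collected wins here.
    List<Integer> finish(Set<Integer> elevatedDocIds) {
        for (Hit hit : seen) {
            if (elevatedDocIds.contains(hit.docId)) {
                heads.put(hit.groupId, hit);
            }
        }
        List<Integer> result = new ArrayList<Integer>();
        for (Hit head : heads.values()) {
            result.add(head.docId);
        }
        return result;
    }
}
```

With the example from the issue description (docid 1 and docid 2 both in groupid 'a', docid 2
scoring higher, docid 1 elevated), finish() returns only docid 1.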

Joel 

> CollapsingQParserPlugin problem with ElevateComponent
> -----------------------------------------------------
>
>                 Key: SOLR-5773
>                 URL: https://issues.apache.org/jira/browse/SOLR-5773
>             Project: Solr
>          Issue Type: Improvement
>          Components: query parsers
>    Affects Versions: 4.6.1
>            Reporter: David
>            Assignee: Joel Bernstein
>              Labels: collapse, solr
>             Fix For: 4.8
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Hi Joel,
> I sent you an email, but I'm not sure whether you received it. I ran into a bit of trouble
> using the CollapsingQParserPlugin with elevated documents. To explain it simply, I want to
> exclude grouped documents when one of the members of the group is contained in the elevated
> document set. I'm not sure this is currently possible because, as you explain above, elevated
> documents are added to the request context after the original query is constructed.
> To better illustrate the problem: suppose I have 2 documents, docid=1 and docid=2, and
> both have a groupid of 'a'. If a grouped query scores docid 2 first in the results but I have
> elevated docid 1, then both documents are shown in the results, when I really only want the
> elevated document to be shown.
> Is this something that would be difficult to implement? Any help is appreciated.
> I think the solution would be to remove the documents from liveDocs that share the same
> groupid in the getBoostDocs() function. Let me know if this makes any sense. I'll continue
> working towards a solution in the meantime.
> {code}
> private IntOpenHashSet getBoostDocs(SolrIndexSearcher indexSearcher, Set<String> boosted) throws IOException {
>       IntOpenHashSet boostDocs = null;
>       if(boosted != null) {
>         SchemaField idField = indexSearcher.getSchema().getUniqueKeyField();
>         String fieldName = idField.getName();
>         HashSet<BytesRef> localBoosts = new HashSet<BytesRef>(boosted.size()*2);
>         Iterator<String> boostedIt = boosted.iterator();
>         while(boostedIt.hasNext()) {
>           localBoosts.add(new BytesRef(boostedIt.next()));
>         }
>         boostDocs = new IntOpenHashSet(boosted.size()*2);
>         List<AtomicReaderContext>leaves = indexSearcher.getTopReaderContext().leaves();
>         TermsEnum termsEnum = null;
>         DocsEnum docsEnum = null;
>         for(AtomicReaderContext leaf : leaves) {
>           AtomicReader reader = leaf.reader();
>           int docBase = leaf.docBase;
>           Bits liveDocs = reader.getLiveDocs();
>           Terms terms = reader.terms(fieldName);
>           termsEnum = terms.iterator(termsEnum);
>           Iterator<BytesRef> it = localBoosts.iterator();
>           while(it.hasNext()) {
>             BytesRef ref = it.next();
>             if(termsEnum.seekExact(ref)) {
>               docsEnum = termsEnum.docs(liveDocs, docsEnum);
>               int doc = docsEnum.nextDoc();
>               if(doc != DocIdSetIterator.NO_MORE_DOCS) {
>                 //Found the document.
>                 boostDocs.add(doc+docBase);
>                 // HERE REMOVE ANY DOCUMENTS THAT SHARE THE GROUPID, NOT ONLY THE DOCID
>                 it.remove();
>               }
>             }
>           }
>         }
>       }
>       return boostDocs;
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

