lucene-java-user mailing list archives

From "Ralf Heyde" <ralf.he...@gmx.de>
Subject Lucene 8 / Facets and Nested Documents / ToParentBlockJoinQuery
Date Wed, 10 Apr 2019 12:02:57 GMT
Hey Lucene Geeks,
 
I have a fairly tricky question. I'm currently trying to get my head around facets
and nested / parent-child document relations.
Searching the internet turned up quite a few examples for Solr and Elasticsearch (and
yes, I have used both heavily in the past), but in this project we cannot use either of them.
 
I dug through the Lucene source code and tests on GitHub (https://github.com/apache/lucene-solr/tree/master/lucene)
and the Lucene documentation (https://lucene.apache.org/core/8_0_0/index.html),
but I did not find any hint about which direction to take.
 
What I have:
I am able to query and filter with quite heavy queries, exactly as I want / need.
 
What I don't have:
What I'm now trying to do is get the facets for the parent AND the facets of the child
documents.
For the parent documents, the facets work fine.
For the child documents, null is returned, which is understandable, as those fields do not
appear in the parent document.
 
I have also tried pushing the facet fields into the parent document. This works, but as far
as I understand it will produce wrong results: the facets will appear even though
one of n sub-criteria is not fulfilled.
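
To illustrate the concern, here is a toy model in plain Java (no Lucene involved). The class name, the method names, and the "symbol" / "amount" values are made up for this sketch and are not taken from my real index. It compares "one single child must satisfy all criteria" (what I expect from a block join) with "each criterion may be satisfied by a different child" (what I assume happens once the child fields are copied into the parent).

```java
import java.util.List;
import java.util.Map;

public class FlattenedFacetDemo {

    /** Nested semantics: at least one single child must satisfy every criterion. */
    public static boolean matchesNested(List<Map<String, String>> children,
                                        Map<String, String> criteria) {
        return children.stream().anyMatch(child ->
            criteria.entrySet().stream().allMatch(c ->
                c.getValue().equals(child.get(c.getKey()))));
    }

    /** Flattened semantics: each criterion may be satisfied by a different child. */
    public static boolean matchesFlattened(List<Map<String, String>> children,
                                           Map<String, String> criteria) {
        return criteria.entrySet().stream().allMatch(c ->
            children.stream().anyMatch(child ->
                c.getValue().equals(child.get(c.getKey()))));
    }

    public static void main(String[] args) {
        // Hypothetical parent with two atom children.
        List<Map<String, String>> atoms = List.of(
            Map.of("symbol", "C", "amount", "2"),
            Map.of("symbol", "H", "amount", "4"));

        // No single atom is both symbol=C and amount=4.
        Map<String, String> criteria = Map.of("symbol", "C", "amount", "4");

        System.out.println("nested:    " + matchesNested(atoms, criteria));    // false
        System.out.println("flattened: " + matchesFlattened(atoms, criteria)); // true
    }
}
```

In this sketch the flattened variant reports a match although no individual atom fulfils both criteria, which is the kind of wrong facet result I am worried about.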
 
Question: 
Can anybody point me in the right direction?
 
If I solve this, I'm happy to share the solution so that we can put it on GitHub / into Lucene.
 
 
Thanks and Cheers from Berlin,
Ralf
 
 
Code below:
 
My code looks (more or less; I stripped it down a little) like this:
 
 
// ----- Writing the indices ----- //  
final Iterable<Document> documents = documentMapper.map(indexDocument);

final FacetsConfig facetsConfig = new FacetsConfig();

final Directory taxonomyDir = newDirectory("taxonomy");
final DirectoryTaxonomyWriter taxonomyWriter = new DirectoryTaxonomyWriter(taxonomyDir);

final Directory indexDir = newDirectory("index");
final IndexWriter writer = indexWriter(indexDir);

for (final Document document : documents) {
    // add the document and update the taxonomy
    writer.addDocument(
        facetsConfig.build(taxonomyWriter, document)
    );
}

writer.close();
taxonomyWriter.close();

 
// ----- Reading index and taxo ----- //
final IndexReader reader = indexReader(indexDir);
final DirectoryTaxonomyReader taxonomyReader = new DirectoryTaxonomyReader(taxonomyDir);

final IndexSearcher searcher = new IndexSearcher(reader);

// Create a filter that defines "parent" documents in the index
final BitSetProducer parentsFilter = new QueryBitSetProducer(parentTermQuery());

// Define child document criteria (finds atoms; used as a filter since we don't need a score)
final Query atomQuery =
    new BooleanQuery.Builder()
        .add(atomChildQuery(), BooleanClause.Occur.FILTER)
        .build();

// Define child document criteria (finds details; used as a filter since we don't need a score)
final Query detailQuery =
    new BooleanQuery.Builder()
        .add(detailsChildTermQuery(), BooleanClause.Occur.FILTER)
        .build();

// Wrap the child document query to 'join' any matches 
final ToParentBlockJoinQuery atomChildJoinQuery =
    new ToParentBlockJoinQuery(atomQuery, parentsFilter, ScoreMode.None);
final ToParentBlockJoinQuery detailChildJoinQuery =
    new ToParentBlockJoinQuery(detailQuery, parentsFilter, ScoreMode.None);

// Combine the parent and nested child queries into a single query for a candidate
final BooleanQuery fullQuery =
    new BooleanQuery.Builder()
        .add(new BooleanClause(parentQuery, BooleanClause.Occur.SHOULD))
        .add(new BooleanClause(atomChildJoinQuery, BooleanClause.Occur.FILTER))
        .add(new BooleanClause(detailChildJoinQuery, BooleanClause.Occur.FILTER))
        .build();

final TopDocs topDocs = searcher.search(fullQuery, 2);
// topDocs contains the correct results (not shown here)
// ...

// ----- Facets ----- //
// Taxonomy facets
final FacetsCollector facetsCollector = new FacetsCollector();
final TopDocs search = FacetsCollector.search(searcher, fullQuery, 10, facetsCollector);

// get dimensions
final Facets facets = new FastTaxonomyFacetCounts(taxonomyReader, facetsConfig, facetsCollector);

// --> is working as expected
// Document Type (parent, child, ...) 
results.add(facets.getTopChildren(10, EField._TYPE.fieldName()));
results.add(facets.getTopChildren(10, EField.SYMMETRY_GROUP_ID.fieldName()));
results.add(facets.getTopChildren(10, EField.CHEMICAL_FORMULA_ELEMENT_COUNT.fieldName()));
results.add(facets.getTopChildren(10, EField.CHEMICAL_FORMULA_ELEMENT_SUM.fieldName()));
results.add(facets.getTopChildren(10, EField.TOPOLOGICAL_CLASSIFICATION_NAME.fieldName()));
results.add(facets.getTopChildren(10, EField.TOPOLOGICAL_SUB_CLASSIFICATION_NAME.fieldName()));

// --> returns null
results.add(facets.getTopChildren(10, EField.ATOMS__ID.fieldName()));
results.add(facets.getTopChildren(10, EField.ATOMS__AMOUNT.fieldName()));
results.add(facets.getTopChildren(10, EField.ATOMS__SYMBOL.fieldName()));


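// Release the readers first, then the directories they were opened on
reader.close();
taxonomyReader.close();
indexDir.close();
taxonomyDir.close();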

 
 
 
 
