lucene-dev mailing list archives

From "Gilad Barkai (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3262) Facet benchmarking
Date Sun, 09 Oct 2011 09:26:29 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123650#comment-13123650 ]

Gilad Barkai commented on LUCENE-3262:
--------------------------------------

Doron, great patch!

I ran it and was somewhat surprised by the large overhead of facet indexing. Digging deeper,
I found that each document gets 1-120 random facets, each with a depth of 1-8. I believe
those requirements are overkill. I reduced them to 1-*20* facets per document with a depth of 1-*3*
and got results I could live with.
These numbers are scenario-dependent, but I think most cases I have encountered are closer to my
proposed numbers. What do you think?
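
To put the two settings side by side, here is a rough back-of-the-envelope sketch. It assumes that the number of facets per document and the depth of each facet are drawn independently and uniformly from their ranges, which the patch may or may not do. The class and method names are hypothetical and are not part of the patch. Under that assumption the expected number of category path components per document is 60.5 * 4.5 = 272.25 with the original ranges and 10.5 * 2 = 21 with the proposed ones, roughly a 13x difference.

```java
public class FacetLoadEstimate {

    // Mean of a uniformly distributed integer in [min, max].
    public static double mean(int min, int max) {
        return (min + max) / 2.0;
    }

    // Expected category path components per document, ASSUMING facet count
    // (1..maxFacets) and facet depth (1..maxDepth) are independent and uniform.
    public static double expectedComponents(int maxFacets, int maxDepth) {
        return mean(1, maxFacets) * mean(1, maxDepth);
    }

    public static void main(String[] args) {
        double original = expectedComponents(120, 8); // ranges found in the patch
        double proposed = expectedComponents(20, 3);  // ranges suggested above
        System.out.println("original components/doc: " + original);
        System.out.println("proposed components/doc: " + proposed);
        System.out.println("ratio: " + (original / proposed));
    }
}
```

This only counts path components. Actual indexing cost also depends on how many distinct categories end up in the taxonomy, so the measured overhead need not scale by the same factor.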

Also, I changed the alg to consume the entire content source.

I would suggest renaming max.facet.length (in the alg) & maxFacetLengh (in the code) to
max.facet.*depth* and maxFacetDepth. Depth seems more appropriate. 

Other than that - I'm thrilled to have a working benchmark with facets - thanks!
                
> Facet benchmarking
> ------------------
>
>                 Key: LUCENE-3262
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3262
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/benchmark, modules/facet
>            Reporter: Shai Erera
>            Assignee: Doron Cohen
>         Attachments: CorpusGenerator.java, LUCENE-3262.patch, LUCENE-3262.patch, LUCENE-3262.patch, TestPerformanceHack.java
>
>
> A spin-off from LUCENE-3079. We should define a few benchmarks for faceting scenarios, so we can evaluate the new faceting module as well as any improvement we'd like to consider in the future (such as cutting over to docvalues, implementing FST-based caches, etc.).
> Toke attached a preliminary test case to LUCENE-3079, so I'll attach it here as a starting point.
> We've also done some preliminary work on extending Benchmark for faceting, so I'll attach it here as well.
> We should perhaps create a Wiki page where we clearly describe the benchmark scenarios, then include results of 'default settings' and 'optimized settings', or something like that.
