lucy-user mailing list archives

From: Nick Wellnhofer <wellnho...@aevum.de>
Subject: Re: [lucy-user] Lucy Benchmarking
Date: Wed, 01 Feb 2017 12:42:51 GMT

On 01/02/2017 01:44, Kasi Lakshman Karthi Anbumony wrote:
> (1)  Is Lucy multithreaded or single threaded?

Single-threaded.

> (2) Are "C" runtime and bindings stable?

Yes.

> (2) Is there preexisting benchmark code written in "C" to measure Lucy performance?

No.

> (3) I am seeing one under devel/benchmarks/indexers/LuceneIndexer.java. But this one
> is written in Java and looks like benchmarking Lucene not Lucy. Am I right in my observation?

The corresponding Perl benchmark script for Lucy is lucy_indexer.plx:

 
https://git1-us-west.apache.org/repos/asf?p=lucy.git;a=tree;f=devel/benchmarks/indexers;h=77626c37285602941376c5e5950a20e50683da40;hb=HEAD

> (4) I was thinking of modifying the lucy/c/sample applications as benchmarking application.
> Is this a good strategy?
> Btw is there a good way to build sample files? I have to modify the Makefile in lucy/c/
> directory to build the sample files and I am not sure if this is the correct way.

You can find some guidance on how to compile Lucy applications in the comment 
on top of getting_started.c:

 
https://git1-us-west.apache.org/repos/asf?p=lucy.git;a=blob;f=c/sample/getting_started.c;h=6d6193d772f2ceaac86c67cc49169878b4d4d2f6;hb=HEAD

Basically, you have to run the Clownfish compiler "cfc" to generate header 
files, then you can compile your code and link against libclownfish and liblucy.

Benchmark results for the indexer will largely depend on the particular 
Analyzer chain and the total size of your index. The default EasyAnalyzer 
consists of

- StandardTokenizer
- Unicode Normalizer
- SnowballStemmer

StandardTokenizer is pretty fast, but Normalizer and Stemmer are 
CPU-intensive. Last time I checked, they account for about two-thirds of the 
processing time for small indices.
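
To check how time splits across stages on your own data, a small timing harness
is enough. The following is only a sketch in plain C: fake_normalize and
fake_stem are hypothetical stand-ins, not Lucy API calls, and time_stage,
stage_share and print_stage_report are names made up for this example. It
measures CPU time with the standard clock() function and reports each stage's
share of the total.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-ins for analysis stages. These are NOT Lucy API calls;
 * swap in calls to the real analyzers when adapting this sketch. */
typedef void (*stage_fn)(char *text, size_t len);

/* Keeps the compiler from discarding the work done inside the timed loop. */
static volatile unsigned char sink;

/* Stand-in "normalizer": lowercases ASCII characters in place. */
static void fake_normalize(char *text, size_t len) {
    for (size_t i = 0; i < len; i++) {
        text[i] = (char)tolower((unsigned char)text[i]);
    }
}

/* Stand-in "stemmer": blanks out an 's' that ends a word. */
static void fake_stem(char *text, size_t len) {
    for (size_t i = 0; i < len; i++) {
        int at_word_end = (i + 1 == len) || text[i + 1] == ' ';
        if (text[i] == 's' && at_word_end) {
            text[i] = ' ';
        }
    }
}

/* Runs one stage `iterations` times, each time on a fresh copy of `input`
 * (truncated to 255 bytes), and returns the CPU seconds consumed. */
static double time_stage(stage_fn stage, const char *input, int iterations) {
    char buf[256];
    size_t len = strlen(input);
    if (len >= sizeof buf) {
        len = sizeof buf - 1;
    }
    clock_t start = clock();
    for (int i = 0; i < iterations; i++) {
        memcpy(buf, input, len);
        buf[len] = '\0';
        stage(buf, len);
        sink ^= (unsigned char)buf[0];
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Fraction of the total spent in one stage; 0 if the total is not positive. */
static double stage_share(double stage_seconds, double total_seconds) {
    return total_seconds > 0.0 ? stage_seconds / total_seconds : 0.0;
}

/* Times both stand-in stages and prints CPU seconds and percentage shares. */
static void print_stage_report(const char *input, int iterations) {
    double t_norm = time_stage(fake_normalize, input, iterations);
    double t_stem = time_stage(fake_stem, input, iterations);
    double total = t_norm + t_stem;
    printf("normalize: %.3f s (%.0f%%)\n", t_norm,
           100.0 * stage_share(t_norm, total));
    printf("stem:      %.3f s (%.0f%%)\n", t_stem,
           100.0 * stage_share(t_stem, total));
}
```

Calling print_stage_report() from main() with a sample text and a large
iteration count prints one line per stage. clock() has coarse resolution on
some platforms, so use enough iterations that each stage runs for a
noticeable fraction of a second.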

A better benchmarking framework would be a much-needed contribution.

Nick