lucy-user mailing list archives

From Shahab Mohammed <shahab.n...@gmail.com>
Subject Re: [lucy-user] Is there any benchmarking details about how fast is lucy indexing
Date Thu, 04 Dec 2014 13:42:34 GMT
Dear Nick,
Thank you so much for your reply. This helps me a lot.
Kind Regards
Shahab


On Thu, Dec 4, 2014 at 5:35 AM, Nick Wellnhofer <wellnhofer@aevum.de> wrote:

> On 03/12/2014 16:15, Shahab Mohammed wrote:
>
>> I would like to know the rate of indexing, i.e. how many MB/sec can be
>> indexed. If someone has done such benchmarking, please share the info with
>> me.
>>
>
> This depends on a lot of factors like the schema and analysis chain you
> use, the total size of your index, and the hardware. But if you want a
> ballpark figure, I'd say about 1-2 MB/s.
>
> Here is some data for one of our production systems running on a typical
> VPS:
>
> Total fields: 3
> Full text fields: 2
> Highlightable fields: 2
> Documents: 20,000
> Raw input size: 35 MB
> Index size: 80 MB
> Analysis chain:
>   StandardTokenizer
>   Normalizer
>   SnowballStopFilter
>   SnowballStemmer
> Total time to reindex: 30s
>
> This includes the time to pull all of the data out of a PostgreSQL
> database, prepare it for indexing, and some other unrelated operations
> which shouldn't have a large impact.
>
> Nick
>
>
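
As a rough cross-check of the ballpark figure, the numbers quoted above can be turned into throughput values. This is only a sketch in Python: the variable names are made up for illustration, it assumes 1 MB = 1000 KB, and the 30 s includes the database fetch and other work Nick mentions, so the pure indexing rate is probably somewhat higher.

```python
# Figures taken from the quoted message; names are illustrative only.
raw_input_mb = 35        # raw input size in MB
index_size_mb = 80       # resulting index size in MB
documents = 20_000       # number of documents
reindex_seconds = 30     # total reindex time, including the PostgreSQL fetch

# End-to-end throughput over the whole reindex run.
mb_per_sec = raw_input_mb / reindex_seconds
docs_per_sec = documents / reindex_seconds

# How much larger the index is than the raw input.
expansion = index_size_mb / raw_input_mb

# Average raw document size, assuming 1 MB = 1000 KB.
avg_doc_kb = raw_input_mb * 1000 / documents

print(f"throughput:      {mb_per_sec:.2f} MB/s")
print(f"documents:       {docs_per_sec:.0f} docs/s")
print(f"index expansion: {expansion:.2f}x")
print(f"average doc:     {avg_doc_kb:.2f} KB")
```

The result lands at the lower end of the 1-2 MB/s range, which fits the estimate given that the measured time covers more than indexing alone.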
