lucy-user mailing list archives

From Nick Wellnhofer <wellnho...@aevum.de>
Subject Re: [lucy-user] LUCY_Folder_Open_Out_IMP at core/Lucy/Store/Folder.c line 119
Date Fri, 09 Dec 2016 15:20:34 GMT

On 09/12/2016 15:01, Gupta, Rajiv wrote:
> I'm getting this error very frequently now :(
>
> BasicFlexGroup0_io_workload/pm_io/.lucyindex/1 :  input 47 too high
> S_fibonacci at core/Lucy/Index/IndexManager.c line 129
>
> Is there any workaround?
>
> I'm using LightMergeManager I'm not sure if it is because of that. Should I stop that?
>
> Please help. Very frequently I'm getting it now.

I committed a fix to the 0.4, 0.5, and 0.6 branches. Your best option is to 
get one of these branches with Git and recompile Lucy. If you can't do that, 
either stop using LightMergeManager, or try the following untested workaround.

Modify LightMergeManager to not call SUPER::recycle:

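     package LightMergeManager;
     use base qw( Lucy::Index::IndexManager );

     # Override recycle without delegating to SUPER::recycle.
     sub recycle {
         my ( $self, %args ) = @_;
         # Take the segment readers straight from the supplied reader.
         my $seg_readers = $args{reader}->get_seg_readers;
         # Recycle only segments whose doc_max is below 10.
         @$seg_readers = grep { $_->doc_max < 10 } @$seg_readers;
         return $seg_readers;
     }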

Make BackgroundMerger always "optimize" the index before committing:

     $bg_merger->optimize;
     $bg_merger->commit;
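
For context, a minimal sketch of how the two calls above might sit in a standalone merge script. The index path is a placeholder, and constructing the merger with only an `index` argument is an assumption about your setup; like the workaround itself, this is untested.

```perl
use strict;
use warnings;
use Lucy::Index::BackgroundMerger;

# Hypothetical index location; substitute your own.
my $index_path = '/path/to/index';

my $bg_merger = Lucy::Index::BackgroundMerger->new( index => $index_path );

# Request a full merge so the small segments left behind by
# LightMergeManager are consolidated on every run.
$bg_merger->optimize;
$bg_merger->commit;
```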

> However, the search is now slower (after adding PolyReader/IndexReader). I used PolyReader
> as in one of the forum it was mentioned that PolyReader has protection against some mem leak
> issue.
>
> Any tips I can improve performance while using IndexReader?

Using PolyReader or IndexReader shouldn't make a difference performance-wise. 
The performance drop is probably caused by supplying an IndexManager to 
IndexReader or PolyReader which results in additional overhead from read 
locks. You should move the index to a local filesystem if you're concerned 
about performance.

> However, since I'm searching and indexing the files from the same process and same system
> should they need to be unique? Should I append something like <hostname>_search, <hostname>_index,
> <hostname>_delete?

No, simply use the hostname without a suffix.
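
A rough sketch of what that could look like, assuming the host ID is passed through IndexManager's `host` parameter. The index path and variable names are placeholders, and the indexer and reader each get their own manager object here purely as a precaution.

```perl
use strict;
use warnings;
use Sys::Hostname qw( hostname );
use Lucy::Index::IndexManager;
use Lucy::Index::Indexer;
use Lucy::Index::IndexReader;
use Lucy::Search::IndexSearcher;

# Hypothetical index location; substitute your own.
my $index_path = '/path/to/index';

# The same plain hostname, with no suffix, for indexing and searching.
my $hostname = hostname() or die "Can't get unique hostname";

my $indexer = Lucy::Index::Indexer->new(
    index   => $index_path,
    manager => Lucy::Index::IndexManager->new( host => $hostname ),
);

my $reader = Lucy::Index::IndexReader->open(
    index   => $index_path,
    manager => Lucy::Index::IndexManager->new( host => $hostname ),
);
my $searcher = Lucy::Search::IndexSearcher->new( index => $reader );
```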

Nick
