lucy-user mailing list archives

From "Gupta, Rajiv" <Rajiv.Gu...@netapp.com>
Subject RE: [lucy-user] LUCY_Folder_Open_Out_IMP at core/Lucy/Store/Folder.c line 119
Date Fri, 09 Dec 2016 18:16:44 GMT
Thanks, Nick, for your help and the workaround. I will ask my infra team to pick up the latest
0.6 and install it. I hope 0.6 works out better than 0.4.

I stopped using LightMergeManager and no longer get that error; however, performance is now
even worse. I'm going to try a few things:

1. Try the workaround you provided. (I don't use the background merger.)
2. Try running a background merge in a separate loop with the above option.
3. Try storing the information in memory, in Storable, or in a db instead of searching every
time. I think combining search with doc indexing in the same process is creating problems. When
another system uses search, I don't see any problem.
4. Try serializing access to the index directories to avoid overlap, since they all run as
parallel processes.
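
The idea of serializing access to the index directories could be sketched roughly as below. This is only a minimal sketch in plain Perl using flock; the helper name with_index_lock and the "<index dir>.lock" file placed next to the index directory are hypothetical choices for illustration and are not part of Lucy or its own locking.

```perl
use strict;
use warnings;
use Fcntl qw( :flock );
use File::Temp qw( tempdir );

# Run $work while holding an exclusive lock tied to one index directory,
# so parallel processes take turns instead of overlapping.
# The "<index dir>.lock" name is a hypothetical choice, not a Lucy file.
sub with_index_lock {
    my ( $index_dir, $work ) = @_;
    my $lock_path = "$index_dir.lock";
    open( my $fh, '>>', $lock_path ) or die "Can't open $lock_path: $!";
    flock( $fh, LOCK_EX ) or die "Can't lock $lock_path: $!";  # blocks until free
    my $result = eval { $work->() };
    my $err    = $@;
    close($fh);    # closing the handle releases the lock
    die $err if $err;
    return $result;
}

# Demonstration with a throwaway directory standing in for an index.
my $base      = tempdir( CLEANUP => 1 );
my $index_dir = "$base/index";
mkdir $index_dir or die "Can't create $index_dir: $!";

my $count = with_index_lock( $index_dir, sub { return 42 } );
print "done: $count\n";
```

Each parallel process would wrap its indexing and commit for a given directory in with_index_lock, so only one of them touches that directory at a time.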

Hopefully one of the above works out.

Thanks,
Rajiv

-----Original Message-----
From: Nick Wellnhofer [mailto:wellnhofer@aevum.de] 
Sent: Friday, December 09, 2016 8:51 PM
To: user@lucy.apache.org
Subject: Re: [lucy-user] LUCY_Folder_Open_Out_IMP at core/Lucy/Store/Folder.c line 119

On 09/12/2016 15:01, Gupta, Rajiv wrote:
> I'm getting this error very frequently now :(
>
> BasicFlexGroup0_io_workload/pm_io/.lucyindex/1 :  input 47 too high 
> S_fibonacci at core/Lucy/Index/IndexManager.c line 129
>
> Is there any workaround?
>
> I'm using LightMergeManager; I'm not sure if that is the cause. Should I stop using it?
>
> Please help. Very frequently I'm getting it now.

I committed a fix to the 0.4, 0.5, and 0.6 branches. Your best option is to get one of these
branches with Git and recompile Lucy. If you can't do that, either stop using LightMergeManager,
or try the following untested workaround.

Modify LightMergeManager to not call SUPER::recycle:

     package LightMergeManager;
     use base qw( Lucy::Index::IndexManager );

     # Override recycle without calling SUPER::recycle.
     sub recycle {
         my ( $self, %args ) = @_;
         my $seg_readers = $args{reader}->get_seg_readers;
         # Only recycle (merge) segments with fewer than 10 docs.
         @$seg_readers = grep { $_->doc_max < 10 } @$seg_readers;
         return $seg_readers;
     }

Make BackgroundMerger always "optimize" the index before committing:

     $bg_merger->optimize;
     $bg_merger->commit;
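
To illustrate which segments the recycle override above would hand back for merging, here is a minimal sketch that runs without Lucy. MockSegReader and select_small_segments are hypothetical stand-ins invented for this illustration; only the doc_max < 10 rule is taken from the workaround itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a segment reader; only doc_max is modelled.
package MockSegReader;
sub new     { my ( $class, %args ) = @_; return bless {%args}, $class }
sub doc_max { return $_[0]{doc_max} }
sub name    { return $_[0]{name} }

package main;

# Same selection rule as the recycle override: keep segments under 10 docs.
sub select_small_segments {
    my ($seg_readers) = @_;
    return [ grep { $_->doc_max < 10 } @$seg_readers ];
}

# Made-up segment sizes, purely for demonstration.
my @segs = map { MockSegReader->new( name => $_->[0], doc_max => $_->[1] ) }
    ( [ seg_1 => 50_000 ], [ seg_2 => 3 ], [ seg_3 => 9 ], [ seg_4 => 10 ] );

my $to_merge = select_small_segments( \@segs );
print join( ", ", map { $_->name } @$to_merge ), "\n";
```

With these made-up sizes, only the two segments below 10 docs are selected; the larger segments are left for the background merger to consolidate.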

> However, the search is now slower (after adding PolyReader/IndexReader). I used PolyReader
> because one of the forums mentioned that PolyReader has protection against some mem leak
> issue.
>
> Any tips on how I can improve performance while using IndexReader?

Using PolyReader or IndexReader shouldn't make a difference performance-wise.
The performance drop is probably caused by supplying an IndexManager to IndexReader or PolyReader,
which results in additional overhead from read locks. You should move the index to a local
filesystem if you're concerned about performance.

> However, since I'm searching and indexing the files from the same process and same system,
> do they need to be unique? Should I append something like <hostname>_search, <hostname>_index,
> <hostname>_delete?

No, simply use the hostname without a suffix.
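
A minimal sketch of what that might look like, assuming the index already exists: one IndexManager built from the plain hostname and shared between the Indexer and the IndexReader. The helper names lock_host_id and open_indexer_and_searcher are hypothetical, and this has not been tested against a particular Lucy release.

```perl
use strict;
use warnings;
use Sys::Hostname qw( hostname );

# One host id for the whole process: the plain hostname, no suffix.
sub lock_host_id {
    my $host = hostname() or die "Can't get hostname";
    return $host;
}

# Share a single IndexManager between indexing and searching.
# Lucy is loaded here so the rest of the file compiles without it.
sub open_indexer_and_searcher {
    my ($index_path) = @_;    # hypothetical path to an existing index
    require Lucy;
    my $manager = Lucy::Index::IndexManager->new( host => lock_host_id() );
    my $indexer = Lucy::Index::Indexer->new(
        index   => $index_path,
        manager => $manager,
    );
    my $reader = Lucy::Index::IndexReader->open(
        index   => $index_path,
        manager => $manager,
    );
    my $searcher = Lucy::Search::IndexSearcher->new( index => $reader );
    return ( $indexer, $searcher );
}

print "lock host id: ", lock_host_id(), "\n";
```

Calling open_indexer_and_searcher('/path/to/index') would return both objects backed by the same manager, so the search, index, and delete roles all present the same host id to the lock files.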

Nick
