lucy-user mailing list archives

From "Gupta, Rajiv" <>
Subject RE: [lucy-user] LUCY_Folder_Open_Out_IMP at core/Lucy/Store/Folder.c line 119
Date Mon, 19 Dec 2016 03:21:39 GMT
Thanks, Nick, for your reply and for taking the time on this. One quick question before you
get lost in the email below: release 0.6.1 has the fix for the bug below, right?

> BasicFlexGroup0_io_workload/pm_io/.lucyindex/1 :  input 47 too high 
> S_fibonacci at core/Lucy/Index/IndexManager.c line 129

Rajiv g

-----Original Message-----
From: Nick Wellnhofer [] 
Sent: Saturday, December 17, 2016 2:52 AM
Subject: Re: [lucy-user] LUCY_Folder_Open_Out_IMP at core/Lucy/Store/Folder.c line 119

On 13/12/2016 18:05, Gupta, Rajiv wrote:
> After I create directory by myself I'm getting this error:

Which directory are you trying to create? I wouldn't try to make manual changes inside Lucy's
index directory. This will only make things worse.

        $indexer = Lucy::Index::Indexer->new(
                index    => $saveindexlocation,
                schema   => $schema,
                manager  => Lucy::Index::IndexManager->new(host=>$self->{_hostname}),
                create   => $dir_create_flag,
                truncate => 0,
        );

The "create" flag is initially set to 1 so that $saveindexlocation can be created. After I got
the error, I made sure the directory was created and set the create flag to always be 0.

> Can't open '/u/smoke/presub/logs/cit-fg-adr-neg-rtp.rajivg.1481473130.41339_cmode_1of1/.lucyindex/1/seg_fd/lexicon-7.ixix': Invalid argument
> 20161211 182109 [] *    LUCY_FSFolder_Local_Open_FileHandle_IMP at core/Lucy/Store/FSFolder.c line 118
> 20161211 182109 [] *    LUCY_Folder_Local_Open_In_IMP at core/Lucy/Store/Folder.c line
> 20161211 182109 [] *    LUCY_Folder_Open_In_IMP at core/Lucy/Store/Folder.c line 75
> There are two more failures; they also failed for similar reasons:
> rename from '/u/smoke/presub/logs/cit-fg-adr-ndo-rtp.rajivg.1481473039.49384_cmode_1of1/.lucyindex/1/schema.temp'
> to '/u/smoke/presub/logs/cit-fg-adr-ndo-rtp.rajivg.1481473039.49384_cmode_1of1/.lucyindex/1/schema_e4.json'
> failed: No such file or directory
> Can't delete 'lexicon-3.ix'
> I believe all three are related to a race condition during parallel indexing and should
> go away with retries. However, my retries started failing with a different error, which is
> strange to me: if the directory already exists, shouldn't it skip the create attempt?
> 20161211 182109 [] * FAIL: [FAILED]: Retrying to add doc at path: /u/smoke/presub/logs/cit-fg-adr-neg-rtp.rajivg.1481473130.41339_cmode_1of1/.lucyindex/1 :  Couldn't create directory '/u/smoke/presub/logs/cit-fg-adr-neg-rtp.rajivg.1481473130.41339_cmode_1of1/.lucyindex/1': No such file or directory
> 20161211 182109 [] *    LUCY_FSFolder_Initialize_IMP at core/Lucy/Store/FSFolder.c line
> So all my retry attempts also failed.

These errors still look like multiple processes are modifying the index at the same time.
Are you really sure that every indexer is created with an IndexManager and that every IndexManager
is created with a `host` argument that is unique to each machine?

Rajiv>>> All parallel processes are child processes of one parent process and run on
the same host. Do you think making the host name unique with a random number would help
with multiple processes?

All these errors mean that there's something fundamentally wrong with your code or that you
hit a bug in Lucy. The only type of error where it makes sense to retry is LockErr. All other
errors are mostly fatal and could result in index corruption. Retrying will only mask an underlying
problem in this case.
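
The retry policy described above (retry on a lock error, treat everything else as fatal)
could be sketched in Perl roughly as follows. This is only an illustration, not code from
this thread: `retry_on_lock_error`, `open_indexer`, and their parameters are hypothetical
names, and the sketch assumes that a failed lock acquisition surfaces as an exception object
of class Lucy::Store::LockErr.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: run $code and retry only when it dies with a
# Lucy::Store::LockErr. Any other exception is rethrown at once, since
# retrying would only mask the underlying problem.
sub retry_on_lock_error {
    my ( $code, $max_tries, $delay ) = @_;
    die "max_tries must be at least 1\n" unless $max_tries && $max_tries >= 1;

    for my $attempt ( 1 .. $max_tries ) {
        my $result;
        my $ok = eval { $result = $code->(); 1 };
        return $result if $ok;

        my $err = $@;
        my $is_lock_err = blessed($err) && $err->isa('Lucy::Store::LockErr');

        # Give up on non-lock errors, or when the attempts are used up.
        die $err unless $is_lock_err && $attempt < $max_tries;

        sleep $delay if $delay;
    }
}

# Hypothetical usage with Lucy (not called here; requires Lucy to be installed).
# The retry count and delay are arbitrary example values.
sub open_indexer {
    my ( $index_path, $schema, $hostname ) = @_;
    require Lucy;
    return retry_on_lock_error(
        sub {
            Lucy::Index::Indexer->new(
                index   => $index_path,
                schema  => $schema,
                manager => Lucy::Index::IndexManager->new( host => $hostname ),
            );
        },
        5,    # attempts
        1,    # seconds between attempts
    );
}
```

With this shape, errors such as "Can't open ..." or a failed rename propagate on the first
occurrence instead of being retried.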

Unfortunately, I'm unable to help unless you provide some kind of self-contained, reproducible
test case. I'm aware that this isn't easy, especially with multiple clients writing to a shared

As I already hinted at, you might want to reconsider your architecture and use some kind of
search server that uses an index on a local filesystem. There are ready-made platforms on
top of Lucy like Dezi, but it isn't too hard to roll your own solution. This should result
in better performance and make the behavior of your code more predictable.

Rajiv>>> Moving to a local file system is not possible in my case. This is a test framework
that generates a lot of logs. I index per test run, and all these logs need to be on a
shared volume for other triaging purposes. The next thing I'm going to try is to create
one watcher per directory and index all files under that directory serially. Currently I
create watchers on all the files, and sometimes multiple files in the same directory may get
indexed at the same time. As you stated, this might be the issue. I'm not sure how it
will perform within the current time limits. Creating the IndexManager is adding overhead to
the search process.
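
The per-directory plan above could be sketched like this: group the pending files by their
parent directory so that all files in one directory are handled by a single worker, one
after another. `group_by_directory` and the sample paths are hypothetical, and how each
group is handed to a watcher or an Indexer is left open.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);

# Hypothetical helper: map each parent directory to the list of files in it,
# so one worker can index a directory's files serially.
sub group_by_directory {
    my (@paths) = @_;
    my %by_dir;
    push @{ $by_dir{ dirname($_) } }, $_ for @paths;
    return \%by_dir;
}

# Example with made-up paths: one group per directory, processed in order.
my $groups = group_by_directory(
    '/logs/run1/a.log',
    '/logs/run1/b.log',
    '/logs/run2/c.log',
);
for my $dir ( sort keys %$groups ) {
    print "directory $dir:\n";
    print "  would index $_\n" for sort @{ $groups->{$dir} };
}
```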

