lucy-user mailing list archives

From Peter Karman <pe...@peknet.com>
Subject Re: [lucy-user] Chinese support?
Date Tue, 21 Feb 2017 05:03:53 GMT
Hao Wu wrote on 2/20/17 10:18 PM:
> Hi Peter,
>
> Thanks for reply.
>
> That could be a problem. But probably not in my case.
>
> I removed the old index.
>
> run the program with 'ChineseAnalyzer' and truncate => 0  twice. the second
> time, will give me the error.
>
> 'body' assigned conflicting FieldType
>         LUCY_Schema_Spec_Field_IMP at cfcore/Lucy/Plan/Schema.c line 124
>         at /home/hwu/perl5/lib/perl5/x86_64-linux-gnu-thread-multi/Lucy.pm line 118.
>         Lucy::Index::Indexer::new('Lucy::Index::Indexer', 'index',
> '/home/hwu/data/lucy/mitbbs.index', 'schema',
> 'Lucy::Plan::Schema=SCALAR(0x1c56798)', 'create', 1) called at mitbbs_index.pl
> line 26
>
> run the program with 'ChineseAnalyzer' and truncate => 1  twice, no error. but I
> want to update the index.
>
> run the program with 'StandardTokenizer', with  truncate 0 or 1, both work fine.
>
> So, this make me think I must miss something in the 'ChineseAnalyzer' I have.
>


I don't think this is your fault. It looks like a bug.

Here's a smaller gist demonstrating the problem:

https://gist.github.com/karpet/d8fe12085246b8419f9e4ab44930c1cc

With the 2 files in the gist, I get this result:

[karpet@pekmac:~/tmp/chinese-analyzer]$ perl indexer.pl test-index
Building prefix dict from the default dictionary ...
Loading model from cache /var/folders/r3/yk7hmbb9125fnsdf9bqs6lrm0000gp/T/jieba.cache
Loading model cost 0.553 seconds.
Prefix dict has been built succesfully.
Finished.

[karpet@pekmac:~/tmp/chinese-analyzer]$ perl indexer.pl test-index
'body' assigned conflicting FieldType
	LUCY_Schema_Spec_Field_IMP at cfcore/Lucy/Plan/Schema.c line 124
	at /usr/local/perl/5.24.0/lib/site_perl/5.24.0/darwin-2level/Lucy.pm line 118.
	Lucy::Index::Indexer::new("Lucy::Index::Indexer", "index", "test-index", "schema", Lucy::Plan::Schema=SCALAR(0x7f9b0b004a18), "create", 1) called at indexer.pl line 23
Segmentation fault: 11



I would expect the code to work as you wrote it, so maybe someone else can spot 
what's going wrong.

Here's what the schema_1.json file looks like after the initial index creation:

{
   "_class": "Lucy::Plan::Schema",
   "analyzers": [
     null,
     {
       "_class": "ChineseAnalyzer"
     }
   ],
   "fields": {
     "body": {
       "analyzer": "1",
       "type": "fulltext"
     }
   }
}
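
For anyone who wants to check which analyzer a field is bound to in that file, here is a minimal sketch in Python. It is not part of the gist. It assumes that the "analyzer" value in a field spec ("1" here) is a string index into the top-level "analyzers" array, which is my reading of the dump above rather than documented behavior. The helper name analyzer_class_by_field is made up for illustration, and the JSON is inlined so the snippet runs on its own.

```python
import json

# Contents of schema_1.json as shown above, inlined so the sketch is self-contained.
SCHEMA_JSON = """
{
  "_class": "Lucy::Plan::Schema",
  "analyzers": [
    null,
    {
      "_class": "ChineseAnalyzer"
    }
  ],
  "fields": {
    "body": {
      "analyzer": "1",
      "type": "fulltext"
    }
  }
}
"""


def analyzer_class_by_field(schema):
    """Map each field name to the _class of the analyzer it references, or None.

    Assumption: the field's "analyzer" value is a string index into the
    top-level "analyzers" array.
    """
    analyzers = schema.get("analyzers", [])
    result = {}
    for name, spec in schema.get("fields", {}).items():
        ref = spec.get("analyzer")
        if ref is None:
            # Field has no analyzer reference at all.
            result[name] = None
            continue
        entry = analyzers[int(ref)]  # the reference is stored as a string
        result[name] = entry["_class"] if entry else None
    return result


schema = json.loads(SCHEMA_JSON)
print(analyzer_class_by_field(schema))
```

Under that assumption, the stored schema records only the analyzer's class name for the 'body' field, with no other settings, so this is all the second run has to compare against when it reports the conflicting FieldType.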


-- 
Peter Karman  .  https://peknet.com/  .  https://keybase.io/peterkarman
