lucene-java-user mailing list archives

From Mandy Günther <mandy.guent...@googlemail.com>
Subject What's the best way to store metadata?
Date Wed, 11 Feb 2009 18:05:45 GMT
Hi all,
I want to use Lucene in my project but I have the following problem:
My goal is to store metadata in the index.

Here is an example of the data I need to index:

1; My; determiner;			
2; project; noun; 1
3; uses; verb; 1
4; Lucene; noun; 1
6; I; noun;		
7; need; verb; 6
8; help; noun; 6

It's a file containing rows with 4 columns.
The first column is a sequential number. The second column is the text
that needs to be indexed; in my original file it is more than one word.
The third column classifies the second column, and the fourth column
tells me where the sentence containing the text in the second column
starts.
I will have large data files (up to 30 GB) to index, so indexing and
searching performance are very important.
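
To make the row layout concrete, here is a minimal sketch of parsing one such line in plain Java, before any Lucene code is involved. The class name `MetadataRows`, the `Row` type, and its field names are hypothetical and chosen only for illustration. The sketch assumes that the text column never contains a semicolon and that an empty fourth column means "no sentence start given", which it represents as -1.

```java
final class MetadataRows {

    /** One parsed row. The field names are illustrative only. */
    static final class Row {
        final int number;         // column 1: sequential number
        final String text;        // column 2: text to be indexed
        final String category;    // column 3: classification of the text
        final int sentenceStart;  // column 4: sentence start, -1 if the column is empty

        Row(int number, String text, String category, int sentenceStart) {
            this.number = number;
            this.text = text;
            this.category = category;
            this.sentenceStart = sentenceStart;
        }
    }

    /** Parses a line such as "2; project; noun; 1". */
    static Row parse(String line) {
        // A limit of -1 keeps trailing empty columns instead of dropping them.
        String[] cols = line.split(";", -1);
        if (cols.length < 3) {
            throw new IllegalArgumentException("Expected at least 3 columns: " + line);
        }
        int number = Integer.parseInt(cols[0].trim());
        String text = cols[1].trim();
        String category = cols[2].trim();
        // trim() also removes the trailing tabs that appear after an empty column.
        String start = cols.length > 3 ? cols[3].trim() : "";
        int sentenceStart = start.isEmpty() ? -1 : Integer.parseInt(start);
        return new Row(number, text, category, sentenceStart);
    }

    public static void main(String[] args) {
        String[] sample = { "1; My; determiner;\t\t\t", "2; project; noun; 1" };
        for (String line : sample) {
            Row r = parse(line);
            System.out.println(r.number + " | " + r.text + " | "
                    + r.category + " | " + r.sentenceStart);
        }
    }
}
```

With files of up to 30 GB, the rows would presumably be read one at a time from a buffered reader rather than loaded into memory at once.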
I tried to store the metadata (columns 3 and 4) in the field names, e.g.:

doc.add(new Field("1|determiner", "My", ...));
doc.add(new Field("1|noun", "project", ...));
...
doc.add(new Field("2|noun", "I", ...));

But that's not an elegant solution!

I was thinking about using Payloads. But the API documentation says
that Payloads are experimental, and performance apparently gets worse
when payloads are used.
I am not even sure whether Payloads are meant to store that kind of data.
Is it possible to store that kind of metadata as a payload?
Is there another way to store the metadata? Should I use a database
to store the extra information?
Has anybody had the same problem, or does anybody have experience with
Payloads?
So what would be the best way to store metadata?
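
To illustrate what storing this metadata as a payload would involve: a payload is an opaque byte array attached to a token position, so columns 3 and 4 would have to be packed into bytes. The following is only a sketch of that packing step in plain Java and does not call any Lucene API. The class name `TokenMetadataCodec`, the `Category` enum (limited to the three categories in the sample above), and the 5-byte layout are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

final class TokenMetadataCodec {

    /** Hypothetical category set, taken from the sample rows only. */
    enum Category { DETERMINER, NOUN, VERB }

    /** Packs the metadata into 5 bytes: 1 byte category code, 4 bytes sentence start. */
    static byte[] encode(Category category, int sentenceStart) {
        return ByteBuffer.allocate(5)
                .put((byte) category.ordinal())
                .putInt(sentenceStart)
                .array();
    }

    /** Reads the category code back from the first byte. */
    static Category decodeCategory(byte[] payload) {
        return Category.values()[payload[0]];
    }

    /** Reads the sentence start back from bytes 1 to 4 (big-endian int). */
    static int decodeSentenceStart(byte[] payload) {
        return ByteBuffer.wrap(payload, 1, 4).getInt();
    }

    public static void main(String[] args) {
        byte[] payload = encode(Category.VERB, 6);
        System.out.println(decodeCategory(payload) + " " + decodeSentenceStart(payload));
    }
}
```

A fixed-width layout like this keeps decoding cheap, at the cost of 5 extra bytes per token; whether that overhead is acceptable for a 30 GB input is exactly the open question.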

Thanks for your advice.
Mandy
