lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ilias Flaounas" <ilia...@gmail.com>
Subject Re: Get term id from dictionary
Date Wed, 31 Oct 2007 12:10:51 GMT
I want to have IDs for the terms (words) not the documents!
Also, I need the same ID for a word if it appears in more than one documents.

Example:
Doc1: The sea is blue
Doc2: Sky is blue

For these two docs the dictionary would be [the]->1 [sea]->2 [is]->3
[blue]->4 [sky]->5

So I want to represent these docs by word-ids like this:
Doc1: 1 2 3 4
Doc2: 5 3 4

Is there a way to use Lucene for this? I mean Lucene stores an
internal dictionary. How can I access it?

Thank you,
Ilias


On 10/31/07, Mark Miller <markrmiller@gmail.com> wrote:
> The id does change. You need to index your own "id" field with the document.
>
>
> Ilias Flaounas wrote:
> > Dear experts,
> >
> > I need to store and index a string of text into Lucene, and later I
> > want to get the Id of each term inside this string. Is it possible?
> > How can I do that?
> >
> > I want a unique association, term (in my case a word) -> Id. I know,
> > that If I delete a document, the dictionary changes. Does the term id
> > change?
> >
> >
> > Thanks a lot
> > Ilias
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message