lucene-java-user mailing list archives

From Niels Ott <n...@sfs.uni-tuebingen.de>
Subject Re: Deleting from Index by URL field: is it safe?
Date Mon, 01 Dec 2008 10:14:48 GMT
Hi all,

German Kondolf wrote:
> It works exactly as it does when you search for that term.
> 
> Check how you create the index: if you store the field without analyzing
> it (Index.UN_TOKENIZED), it will only match a document whose URL is
> exactly the same.

Is that also true if I simply use the KeywordAnalyzer?

The reason why I want to do it this way is that I have a special 
Analyzer that encapsulates the "knowledge" on how to treat each field. 
In a way something like the PerFieldAnalyzerWrapper but more 
specialized. I want to use the very same Analyzer for querying as well, 
so it appears to me that it is good to have the "knowledge" about the 
treatment of fields in that single place.
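
The idea of one object that holds the per-field "knowledge" and is shared by indexing and querying can be sketched roughly as follows. This is a toy model in plain Java, not Lucene's Analyzer API: the class FieldAwareTokenizer, its methods, and the field names "url" and "body" are all hypothetical and only illustrate the shape of the design.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Toy model (NOT the Lucene API): a single place that decides how each
// field is turned into tokens, usable for both indexing and querying.
class FieldAwareTokenizer {
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();
    private final Function<String, List<String>> fallback;

    FieldAwareTokenizer(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    // Register a special rule for one field.
    void register(String field, Function<String, List<String>> rule) {
        perField.put(field, rule);
    }

    // Apply the rule registered for the field, or the fallback rule.
    List<String> tokenize(String field, String text) {
        return perField.getOrDefault(field, fallback).apply(text);
    }

    // Keyword-style rule: the whole value becomes exactly one token.
    static List<String> keyword(String text) {
        List<String> out = new ArrayList<>();
        out.add(text);
        return out;
    }

    // Simple rule: lowercase, then split on anything that is not a letter or digit.
    static List<String> simple(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Hypothetical setup: "url" is kept whole, every other field is split.
    static FieldAwareTokenizer example() {
        FieldAwareTokenizer t = new FieldAwareTokenizer(FieldAwareTokenizer::simple);
        t.register("url", FieldAwareTokenizer::keyword);
        return t;
    }
}
```

With this setup, tokenize("url", ...) returns the URL as one unchanged token, while tokenize("body", ...) returns lowercased words, so the indexing and querying code never need to know the per-field rules themselves.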

> It's possible that the URL is not unique enough in your domain. Is there no
> other unique identifier that you could use?

I think the URL is unique enough for my cases. The system is still a 
prototype so I can change that later, if it turns out that it doesn't do 
the job for me.

> I suggest you create a test and try it on a RAMDirectory and see exactly
> what happens and what you want!

This looks like a good idea to me. Thank you for the hint.
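
A small experiment along those lines might look like the sketch below. It is a toy in-memory inverted index in plain Java, standing in for an index on a RAMDirectory; ToyUrlIndex and its methods are hypothetical names, not Lucene classes. The one assumption carried over from Lucene is that a delete by term compares the term text exactly and does not analyze it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy in-memory inverted index for a single URL field (NOT Lucene).
class ToyUrlIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>(); // term -> doc ids
    private final Set<Integer> liveDocs = new HashSet<>();
    private final boolean tokenize;
    private int nextDocId = 0;

    // tokenize == false: the URL is indexed as one term (keyword-style).
    // tokenize == true:  the URL is split on non-alphanumeric characters.
    ToyUrlIndex(boolean tokenize) {
        this.tokenize = tokenize;
    }

    // Add one document and return its id.
    int add(String url) {
        int id = nextDocId++;
        liveDocs.add(id);
        List<String> terms = tokenize
                ? Arrays.asList(url.split("[^A-Za-z0-9]+"))
                : Collections.singletonList(url);
        for (String t : terms) {
            if (!t.isEmpty()) {
                postings.computeIfAbsent(t, k -> new HashSet<>()).add(id);
            }
        }
        return id;
    }

    // Delete every live document containing exactly this term.
    // The term is compared as-is; it is not tokenized or normalized.
    int deleteByTerm(String term) {
        int deleted = 0;
        for (int id : postings.getOrDefault(term, Collections.emptySet())) {
            if (liveDocs.remove(id)) {
                deleted++;
            }
        }
        return deleted;
    }

    int liveCount() {
        return liveDocs.size();
    }
}
```

In this model, with tokenize set to false, deleting by the full URL removes exactly the one matching document. With tokenize set to true, the full URL no longer exists as a term, so the same delete removes nothing, while deleting by a fragment such as "example" removes every document whose URL contains that token.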

Best,

    Niels

-- 
Niels Ott
Computational Linguist (B.A.)
http://www.drni.de/niels/
