lucene-solr-user mailing list archives

From "Ard Schrijvers" <a.schrijv...@hippo.nl>
Subject RE: storing the document URI in the index
Date Tue, 12 Jun 2007 13:02:25 GMT
Hello Erik,

Thanks for the fast answer (sorry that my mail does not indent the quoted text, but I must
use webmail :-( ). The problem I am facing is that I do not see Solr storing the location of
the documents it indexes. I need the location of each document in a field, but I do not see
where Solr would put it. Fetching the document will be done with the simple Cocoon generator,
so that is no problem, but of course I need the URL/URI to be in the index. I know I need it
as an UN_TOKENIZED, STORED field, but with Luke I can see that the location is not present in
the Lucene index when Solr "crawls" a directory of XML files.

Regards Ard Schrijvers


Yes.  Set the field to be stored and non-indexed; field type "string"  
is what I use.
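
As a rough illustration of the point above: Solr only indexes the fields it is sent, so the
client that posts the documents has to put the location into a field itself. The sketch below
builds a Solr XML <add> message with one <doc> per file. The field names "uri" and "id" and
the helper build_add_message are assumptions made for this example, not names from this
thread, and a matching stored field would still have to be declared in the schema.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath


def build_add_message(paths, uri_field="uri", id_field="id"):
    """Build a Solr XML <add> message with one <doc> per file path.

    uri_field and id_field are hypothetical field names; they must match
    fields declared in the Solr schema that is actually in use.
    """
    add = ET.Element("add")
    for path in paths:
        doc = ET.SubElement(add, "doc")
        # Assumption: the file name is good enough as a unique key here.
        ET.SubElement(doc, "field", name=id_field).text = PurePosixPath(path).name
        # The location is supplied by the client as an ordinary field value.
        ET.SubElement(doc, "field", name=uri_field).text = str(path)
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_message(["docs/a.xml", "docs/b.xml"]))
```

The resulting string is what would be posted to the update handler; the content fields of
each document would be added to the same <doc> element alongside the location.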

> Or is everybody used to storing the contents of a document in the  
> lucene index (doesn't this imply a much larger index though?), so  
> instead of retrieving the document's content through a separate  
> fetch over http/filesystem just show the result from the stored  
> content field?

This all depends on the needs of your project.  It's perfectly fine to  
store the text outside of the index, and that is the way it really  
has to be done for very large indexes, where as few fields as possible  
are "stored".

If you're also asking about Solr fetching the remote resource, that  
is a different story altogether, and no, it does not do that.  [With  
the streaming capability you can feed in a document entirely from a  
URL, though I haven't experimented with that feature yet myself.]

	Erik