lucene-solr-user mailing list archives

From Erick Erickson <>
Subject Re: Re: Schema model to store additional field metadata
Date Sat, 08 Sep 2012 20:27:16 GMT
You might be confusing indexing and storing. When you
specify indexed="true" in your field definition, the input
is tokenized, transformed, etc., and the result of this
(see the admin/analysis page) is what is searched.

But when you specify stored="true", a literal, verbatim
copy of the text is put in a distinct file, and when you
return data (e.g. fl=field1, field2...) then the verbatim
copy is returned.

If you specify both indexed="true" and stored="true", both
things happen, but they're entirely separate operations
even though they're on the same field.
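
To make the distinction concrete, here is a toy sketch in Python. It is
not how Lucene or Solr is implemented; the ToyIndex class and its method
names are invented for illustration, and the "analysis" is just
lowercasing and splitting on non-word characters.

```python
import re
from collections import defaultdict


class ToyIndex:
    """Toy model only: keeps the searchable side and the stored side apart."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> doc ids ("indexed" side)
        self.stored = {}                  # doc id -> verbatim text ("stored" side)

    def add(self, doc_id, text, indexed=True, stored=True):
        if indexed:
            # Stand-in for the analysis chain: lowercase, split into tokens.
            for token in re.findall(r"\w+", text.lower()):
                self.postings[token].add(doc_id)
        if stored:
            # The verbatim copy, untouched by analysis.
            self.stored[doc_id] = text

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))

    def fetch(self, doc_id):
        return self.stored.get(doc_id)


idx = ToyIndex()
idx.add(1, "Foo. Waiting for the bus.")                    # indexed and stored
idx.add(2, "2012/09/foo.png", indexed=False, stored=True)  # stored only
idx.add(3, "Foo and Bar at the beach", stored=False)       # indexed only

print(idx.search("foo"))  # docs 1 and 3 match; doc 2 was never indexed
print(idx.fetch(2))       # the verbatim path comes back
print(idx.fetch(3))       # None: searchable, but nothing to return
```

Document 2 can be returned but never matched by a search, and document 3
can be matched but has no text to return.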

So, let's assume you want to provide links to the images.
Having a field (multiValued?) with indexed="false" and stored="true"
would allow you to store all the img urls in a single field.

All that said, now it's up to your application layer that
constructs the pages for presentation to the user to
"do the right thing" with the returned fields to allow
images (or whatever) to be displayed.
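
As a rough sketch of what that application layer might do, assuming the
search response has already been parsed into a Python dict: the field
name image_src, the base URL /media/ and the function render_images are
all made up for this example and are not part of Solr.

```python
from html import escape


def render_images(doc, base_url="/media/"):
    """Turn the stored, multiValued image paths of one result into <img> tags."""
    paths = doc.get("image_src", [])  # hypothetical stored-only field
    return "\n".join(
        '<img src="{}">'.format(escape(base_url + path, quote=True))
        for path in paths
    )


# One returned document, reduced to the fields the page needs.
doc = {
    "title": "An article about Foo and Bar",
    "image_src": ["2012/09/foo.png", "2012/04/images/bar.png"],
}

html_snippet = render_images(doc)
print(html_snippet)
```

Solr only hands back the stored values; turning them into markup is
entirely up to the code that builds the page.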


On Fri, Sep 7, 2012 at 12:03 PM,  <> wrote:
>> Why would you store the actual images in SOLR?
> No, the images are files on the filesystem. Only the path to the image should be stored
> in Solr.
>> And you are most likely looking at dynamic fields as the solution
>> 1) Define *_Path, *_Size, *_Alt as a dynamic field with appropriate types
>> 2) During indexing, write those properties as Image_1_Path,
>> Image_1_Size, Image_1_Alt or some such
>> 3) Make sure that whatever search algorithm you have looks at those or
>> do a copyField to aggregate them into AllImage_Alt, etc.
> I was also thinking of a solution with dynamic fields, but I am very new to Solr and
> I am not sure if it is a good solution to solve this modelling issue. For example I thought
> about introducing two multiValued dynamic fields (image_src_*, image_alt_*) and store image
> data like file path on disc and alt-attribute like this:
> title:     An article about Foo and Bar
> content:   This is some text about Foo and Bar.
> published: 2012.09.07T19:23
> image_src_1: 2012/09/foo.png
> image_alt_1: Foo. Waiting for the bus.
> image_src_2: 2012/04/images/bar.png
> image_src_3: 2012/02/abc.png
> image_alt_3: Foo and Bar at the beach
> Of course an alt attribute for some images could be missing. I don't know if this is a
> good or better solution for this. It feels clumsy to me, like a workaround. But maybe this
> is the way to model this data, I don't know?
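
If the numbered dynamic fields from the quoted example were used, the
client would have to pair each image_src_N with its image_alt_N again.
A minimal sketch of that pairing, assuming the document arrives as a
flat dict; collect_images is a hypothetical helper, not a Solr API.

```python
import re

# Matches the numbered field names from the quoted example.
FIELD_RE = re.compile(r"^image_(src|alt)_(\d+)$")


def collect_images(doc):
    """Group image_src_N / image_alt_N fields into one record per image."""
    images = {}
    for name, value in doc.items():
        match = FIELD_RE.match(name)
        if match:
            kind, number = match.group(1), int(match.group(2))
            # A missing alt (or src) simply stays None.
            images.setdefault(number, {"src": None, "alt": None})[kind] = value
    return [images[n] for n in sorted(images)]


doc = {
    "title": "An article about Foo and Bar",
    "image_src_1": "2012/09/foo.png",
    "image_alt_1": "Foo. Waiting for the bus.",
    "image_src_2": "2012/04/images/bar.png",
    "image_src_3": "2012/02/abc.png",
    "image_alt_3": "Foo and Bar at the beach",
}

images = collect_images(doc)
for image in images:
    print(image["src"], "|", image["alt"])
```

The number in the field name is what keeps a missing alt text from
shifting the remaining alt texts onto the wrong image.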
