lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: metadata about result sets?
Date Fri, 10 Mar 2006 06:44:30 GMT

: I like the idea of the wiki page; I think I will attempt to set one
: up after this email, but I wanted to see if I could do a little bit
: better job of fleshing out how pulling metadata out might work (in my

I finally got a chance to look at your ideas.

first off: as far as i know, there aren't any special edit permissions
necessary to modify the TaskList ... if the edit link wasn't showing up
for you after you logged in, it might just be that the page was cached,
so try a force-reload.

Okay, on to the topic at hand..

: We add suggestable metadata as part of the product schema, so we
: could have something like

There's a difference between the index schema, and the "xml schema/dtd"
for adding documents.  You seem to be suggesting a change to the xml used
when adding documents to indicate whether a field should be suggestable or
not, but that syntax is tied directly to the underlying lucene API for
Documents/Fields -- where would the suggestable/preceding info be stored?

: Once we reindex, we do a search for 'legal' again and our book is in
: it. Based on our index,  we can scan the resultset and see that the
: results have three suggestable fields, two of which do not require a
: preceding field.

I'm not sure what you mean by "scan the result" to get the
suggestable fields (and their values) ... can you elaborate?


I'm not sure if you read the thread yonik mentioned earlier about how we
do this at CNET, but the way we store info about which fields we want to
have facets on (and what those facets should be in the case of range
queries and such) is to put "metadata documents" into the index.  for a
single user request, you pull out the metadata document, then use the info
contained in it to determine facets to search on and intersect with the
main result.
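
A rough sketch of that "intersect with the main result" step, in Python
with plain sets standing in for the index. Everything here is an
assumption made for illustration: the toy documents, the helper names,
and the use of Python predicates in place of real facet queries. It is
not how the CNET code works.

```python
# Toy index: doc id -> fields. All values are invented for illustration.
DOCS = {
    1: {"title": "legal thriller", "price": 15},
    2: {"title": "legal handbook", "price": 35},
    3: {"title": "cookbook", "price": 18},
    4: {"title": "legal dictionary", "price": 55},
    5: {"title": "legal primer", "price": 12},
}


def matching_ids(predicate):
    """Return the set of doc ids whose fields satisfy the predicate."""
    return {doc_id for doc_id, doc in DOCS.items() if predicate(doc)}


# Stand-in for what a metadata document would describe: label -> facet query.
PRICE_FACETS = {
    "Under $20": lambda d: 0 <= d["price"] <= 20,
    "$21 - $40": lambda d: 21 <= d["price"] <= 40,
    "$41 - $60": lambda d: 41 <= d["price"] <= 60,
}


def facet_counts(main_ids, facets):
    """Count, per facet, how many main-result docs also match the facet."""
    return {
        label: len(main_ids & matching_ids(pred))
        for label, pred in facets.items()
    }


# Main query: everything with 'legal' in the title.
main = matching_ids(lambda d: "legal" in d["title"])
counts = facet_counts(main, PRICE_FACETS)
print(counts)
```

The cookbook matches the "Under $20" facet but is not in the main
result, so it does not contribute to that facet's count.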

the format of the metadata docs we use is very custom, but perhaps a
similar, generalized approach could be implemented?

The plugin could dictate a specific XML format indicating the behavior to
drive the facets using either of the following mechanisms (more could be
added as needed)...
  * make group FF of all indexed values of field F
  * make group G using queries x, y, and z with labels a, b, and c
...users could index one or more metadata documents, containing the XML
info in any stored field they want defined in the schema -- when
configuring the plugin, they'd specify the field in the solrconfig.xml.
at query time, they specify two queries: one to restrict the main results,
and one to identify the metadata doc they want to use (if it's always the
same one, a default could be configured in solrconfig as well)
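
To make the two-query idea concrete, here is a minimal sketch in Python
of how a request might be split up. The setting names (metadata_field,
default_metadata_query) and the request parameter names (q, meta_q) are
hypothetical; no such options exist in Solr today, they only stand in
for whatever the plugin would read from solrconfig.xml.

```python
# Hypothetical plugin settings, standing in for values that would be read
# from solrconfig.xml. None of these key names are real Solr options.
PLUGIN_CONFIG = {
    "metadata_field": "facet_xml",
    "default_metadata_query": "id:facets_default",
}


def resolve_request(params, config=PLUGIN_CONFIG):
    """Split a request into the main query and the metadata-doc query.

    Falls back to the configured default when the request does not name
    a metadata document itself.
    """
    if "q" not in params:
        raise ValueError("a main query ('q') is required")
    metadata_query = params.get("meta_q") or config["default_metadata_query"]
    return {
        "main_query": params["q"],
        "metadata_query": metadata_query,
        "metadata_field": config["metadata_field"],
    }


print(resolve_request({"q": "legal"}))
print(resolve_request({"q": "legal", "meta_q": "id:facets_books"}))
```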

an example of what i mean about XML stored in a field of the metadata
doc...

   <facets>
     <group id="price" label="Price">
       <facet id="0-20"  label="Under $20">price:[0 TO 20]</facet>
       <facet id="21-40" label="$21 - $40">price:[21 TO 40]</facet>
       <facet id="41-60" label="$41 - $60">price:[41 TO 60]</facet>
     </group>
     <group id="initial" label="Author">
       <facet id="a" label="A">author:a*</facet>
       ...
     </group>
     <group id="name" label="Author" depends="initial">
       <facet use-terms-field="author" />
     </group>
     ...
   </facets>
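
For what it's worth, reading that format back out of the stored field
would be simple. A minimal sketch in Python using the standard
xml.etree.ElementTree module, run against a trimmed copy of the example
with the "..." parts left out. The dict layout and the "query"/"terms"
kinds are my own assumptions for illustration, not a proposed API.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example, with the elided "..." parts left out.
FACETS_XML = """
<facets>
  <group id="price" label="Price">
    <facet id="0-20"  label="Under $20">price:[0 TO 20]</facet>
    <facet id="21-40" label="$21 - $40">price:[21 TO 40]</facet>
  </group>
  <group id="initial" label="Author">
    <facet id="a" label="A">author:a*</facet>
  </group>
  <group id="name" label="Author" depends="initial">
    <facet use-terms-field="author" />
  </group>
</facets>
"""


def parse_facets(xml_text):
    """Return one dict per <group>; each facet is a query or a terms field."""
    groups = []
    for group in ET.fromstring(xml_text).findall("group"):
        facets = []
        for facet in group.findall("facet"):
            terms_field = facet.get("use-terms-field")
            if terms_field is not None:
                # "make a group of all indexed values of this field"
                facets.append({"kind": "terms", "field": terms_field})
            else:
                # explicit query with an id and a display label
                facets.append({
                    "kind": "query",
                    "id": facet.get("id"),
                    "label": facet.get("label"),
                    "query": (facet.text or "").strip(),
                })
        groups.append({
            "id": group.get("id"),
            "label": group.get("label"),
            "depends": group.get("depends"),  # None when there is no dependency
            "facets": facets,
        })
    return groups


groups = parse_facets(FACETS_XML)
for g in groups:
    print(g["id"], g["depends"], [f["kind"] for f in g["facets"]])
```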


-Hoss

