lucene-solr-user mailing list archives

From Jonathan Rochkind <>
Subject Re: OAI on SOLR already done?
Date Wed, 02 Feb 2011 20:38:48 GMT
The trick is that you can't just have a generic black box OAI-PMH 
provider on top of any Solr index. How would it know where to get the 
metadata elements it needs, such as title, last-updated date, etc.? 
Any given Solr index might not even have these in stored fields -- and a 
given app might want to look them up from somewhere other than stored 
fields.

If the Solr index does have them in stored fields, and you do want to 
get them from the stored fields, then it's, I think (famous last words), 
relatively straightforward code to write: a mapping from Solr stored 
fields to the metadata elements needed for OAI-PMH, and then simply 
outputting the XML template with those filled in.
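
A minimal sketch of that mapping-plus-template idea, in Python. The Solr 
field names (title_s, author_s, description_t, last_modified) and the 
sample document are made up for illustration; a real index would need 
its own mapping. It builds only a single <record> fragment with 
Dublin Core metadata, not a full OAI-PMH response, and it does not talk 
to Solr at all: it assumes you already have a document as a dict of 
stored fields.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

# Hypothetical mapping: Solr stored field name -> Dublin Core element.
# These field names are assumptions, not part of any standard Solr schema.
FIELD_TO_DC = {
    "title_s": "title",
    "author_s": "creator",
    "description_t": "description",
}


def solr_doc_to_oai_record(doc, id_field="id", date_field="last_modified"):
    """Build an OAI-PMH-style <record> element from one Solr document.

    `doc` is a plain dict of stored fields. The <record> is left without
    the OAI-PMH namespace because it is only a fragment here.
    """
    record = ET.Element("record")
    header = ET.SubElement(record, "header")
    ET.SubElement(header, "identifier").text = str(doc[id_field])
    ET.SubElement(header, "datestamp").text = str(doc[date_field])

    metadata = ET.SubElement(record, "metadata")
    dc = ET.SubElement(metadata, f"{{{OAI_DC_NS}}}dc")
    for solr_field, dc_name in FIELD_TO_DC.items():
        values = doc.get(solr_field)
        if values is None:
            continue  # field not stored for this document
        if not isinstance(values, list):
            values = [values]  # treat single-valued fields uniformly
        for value in values:
            ET.SubElement(dc, f"{{{DC_NS}}}{dc_name}").text = str(value)
    return record


# Sample document (invented), shaped like the stored fields of one Solr hit.
doc = {
    "id": "oai:example.org:42",
    "last_modified": "2011-02-02T20:38:48Z",
    "title_s": "An example title",
    "author_s": ["A. Author", "B. Author"],
}
record = solr_doc_to_oai_record(doc)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The mapping dict is the part that would have to be configurable per 
index; everything else is the same for any Solr schema, which is why the 
hard part is knowing where the metadata lives rather than writing the XML.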

I am not aware of anyone who has done this as a 
re-usable, configurable-for-your-Solr tool. You could possibly do it 
solely using the built-in Solr 
JSP/XSLT/other-templating-stuff-I-am-not-familiar-with stuff, rather 
than as an external Solr client app, or it could be an external Solr 
client app.

This is actually very similar to something someone else asked a few 
days ago: "Does anyone have an OpenSearch add-on for Solr?" It's the 
same kind of problem, just with a different XML template for output 
(usually RSS or Atom) instead of OAI-PMH.

On 2/2/2011 3:14 PM, Paul Libbrecht wrote:
> Peter,
> I'm afraid your service is a harvester, and I am looking for a PMH provider service.
> Your project appeared early in the Google matches.
> paul
> On 2 Feb 2011 at 20:46, Péter Király wrote:
>> Hi,
>> I don't know whether it fits your needs, but we are building a tool
>> based on Drupal (eXtensible Catalog Drupal Toolkit), which can harvest
>> with OAI-PMH and index the harvested records into Solr. The records are
>> harvested, processed, and stored in MySQL, then we index them into
>> Solr. We created some ways to manipulate the original values before
>> sending them to Solr. We built it in a modular way, so you can change
>> settings in an admin interface or write your own "hooks" (special
>> Drupal functions) to tailor the application to your needs. We support
>> only Dublin Core and our own FRBR-like schema (called XC schema), but
>> you can add more schemas. Since this forum is about Solr, and not
>> applications using Solr, if you are interested in this tool, please write me a
>> private message, or visit, or the
>> module's page at
>> Hope this helps,
>> Péter
>> eXtensible Catalog
>> 2011/2/2 Paul Libbrecht<>:
>>> Hello list,
>>> I've met a few Google matches that indicate that SOLR-based servers implement
>>> the Open Archives Initiative's Metadata Harvesting Protocol.
>>> Is there something made to be re-usable that would be an add-on to solr?
>>> thanks in advance
>>> paul
