forrest-dev mailing list archives

From Andrzej Bialecki
Subject Re: SolrForrest plugin
Date Thu, 15 Jan 2009 13:07:28 GMT
Thorsten Scherler wrote:
> On Wed, 14-01-2009 at 12:11 +0100, Andrzej Bialecki wrote:
>> Hi devs,
>> I found this Forrest plugin at the Forrest site. If you guys have a 
>> moment to spare I'd really appreciate your advice.
>> I'm a complete newbie to Forrest; the only thing I know how to do is 
>> fill in the blanks in the default site xdocs and generate static HTML. 
>> It's not much, I'm afraid.
> Should be enough. ;)
> Since the plugin is still in the whiteboard, you need to use the TRUNK of
> Forrest. The best way to get started with the plugin:
> cd $FORREST_HOME/whiteboard/plugins/org.apache.forrest.plugin.output.solr
> forrest run
> http://localhost:8888/index.html -> here you find some samples and basic
> instructions.

Ok, so far so good. I was able to complete these steps, and I can see 
the documentation.

>> Now, I need to index the content of a Forrest site in Solr, using a 
>> custom schema - e.g. the "id" in my case should be equivalent to the 
>> full URL of the page of the deployed site.
> You have seen and

Yes, but that documentation is not helpful for a newbie like me. It 
lists some configuration snippets without telling where to put them.

Basically, I need step-by-step instructions on how to generate _static_ 
Solr document output, exactly like the one here. That one is generated 
dynamically, i.e. it requires a running instance of Forrest, and I need 
to generate it statically.

>> First, I'm stuck conceptually - sitting in the top-level dir of the 
>> forrest site, what is it that I have to do to produce a file with the 
>> Solr <add> documents? 
> Actually, the plugin does that for you.
> ...
> <!-- Output xdocs as solr docs -->
> <map:match pattern="**.solr">
>  <map:generate src="cocoon://{1}.xml"/>
>  <map:transform src="{lm:solr.transform.xdocs.solrDoc}">
>   <map:parameter name="document-url" value="{1}.xml"/>
>   <map:parameter name="project" value="{}"/>
>  </map:transform>
>  <map:serialize/>
> </map:match>

I'm not sure what this means. Does it mean that I have to specify 
somewhere a list of document names with .xml replaced by .solr?

> You are talking about extending ./resources/stylesheets/xdocs-to-solrDoc.xsl with
> your custom attributes.
> First have a look at the plugin's XSL to get an idea of how we are doing things.
> Now copy the file into your project's stylesheet dir (default is
> src/documentation/resources/stylesheets/ = {project.stylesheets-dir}).
> Let Forrest know that you want to provide a custom location by adding the following to
> your project locationmap.xml after the "locator" element:
> <match pattern="solr.transform.xdocs.solrDoc">
>  <location src="{project.stylesheets-dir}/xdocs-to-solrDoc.xsl"/>
> </match>
> From here you need to implement your logic.
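
If I read this correctly, the override would end up in the project's 
locationmap.xml roughly as sketched below. Only the match element comes 
from your mail; the surrounding skeleton (namespace, matcher component) 
is my assumption based on the general locationmap layout, so please 
correct me if it is wrong.

```xml
<?xml version="1.0"?>
<!-- Sketch of a project locationmap.xml. Everything except the <match>
     element is assumed, not taken from the plugin documentation. -->
<locationmap xmlns="http://apache.org/forrest/locationmap/1.0">
  <components>
    <matchers default="lm">
      <matcher name="lm"
               src="org.apache.forrest.locationmap.WildcardLocationMapHintMatcher"/>
    </matchers>
  </components>
  <locator>
    <!-- Resolve the hint used by the **.solr pipeline to the project's
         own copy of the stylesheet instead of the plugin default. -->
    <match pattern="solr.transform.xdocs.solrDoc">
      <location src="{project.stylesheets-dir}/xdocs-to-solrDoc.xsl"/>
    </match>
  </locator>
</locationmap>
```
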
>> I already added the Solr output plugin to 
>> skinconf.xml. I discovered that I can get this via webapp, but I'd 
>> rather not actually run the webapp.
> Hmm, skinconf.xml has nothing to do with the plugin. Where did you get
> the impression that you need to edit this file? You need to add the
> plugin to "project.required.plugins".

Ok, I did that. Still, after running 'forrest site' I don't see the Solr 
documents anywhere.

>> Second, how can I modify the schema of the produced documents, so that 
>> e.g. the id is the full URL - a configurable root URL plus the page 
>> name, and so that I can add other metadata to the docs?
> You will need to create your own xsl to override the default one as
> described above.
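
To check that I understand the idea, here is a minimal sketch of what 
I imagine such a project-level xdocs-to-solrDoc.xsl could look like. It 
is not the plugin's stylesheet: the "root-url" parameter, its default 
value, and the "title" and "content" field names are all made up for 
illustration, and I am assuming the input has a "document" root element 
with "header/title" and "body" children. Only "document-url" comes from 
the sitemap snippet quoted earlier.

```xml
<?xml version="1.0"?>
<!-- Hypothetical project copy of xdocs-to-solrDoc.xsl (illustration only). -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Passed in by the **.solr sitemap matcher, e.g. "index.xml". -->
  <xsl:param name="document-url"/>
  <!-- Assumed parameter: root URL of the deployed site. The sitemap would
       have to pass it explicitly; otherwise this default is used. -->
  <xsl:param name="root-url" select="'http://www.example.org/'"/>

  <xsl:template match="/document">
    <add>
      <doc>
        <!-- id = root URL + page name, with ".xml" swapped for ".html" -->
        <field name="id">
          <xsl:value-of select="concat($root-url,
                                       substring-before($document-url, '.xml'),
                                       '.html')"/>
        </field>
        <!-- Assumed field names; they must exist in the Solr schema. -->
        <field name="title">
          <xsl:value-of select="header/title"/>
        </field>
        <field name="content">
          <xsl:value-of select="normalize-space(body)"/>
        </field>
      </doc>
    </add>
  </xsl:template>

</xsl:stylesheet>
```

With these assumptions, a document-url of "index.xml" would give the id 
"http://www.example.org/index.html".
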
>> Thanks in advance for any help that you can provide!
> Please keep asking if this is still not very clear.

Thanks for your help. I'm afraid this is still not very clear...

Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
Contact: info at sigram dot com
