cocoon-dev mailing list archives

From Jeremy Quinn <jer...@media.demon.co.uk>
Subject Re: Mods to CocoonLucene
Date Fri, 15 Nov 2002 14:52:52 GMT

On Friday, Nov 15, 2002, at 15:29 Europe/London, Vadim Gritsenko wrote:

> Jeremy Quinn wrote:
>
>>
>> On Friday, Nov 15, 2002, at 14:12 Europe/London, Vadim Gritsenko 
>> wrote:
>>
>>> Jeremy Quinn wrote:
>>>
>>>> Dear All,
>>>>
>>>> I am making some modifications to the way Lucene works with Cocoon, 
>>>> and am wondering if you lot think these modifications are suitable 
>>>> to add to the 2.1 CVS.
>>>>
>>>>
>>>> 1.) Allow the 'query' to be set on the SearchGenerator from the 
>>>> Sitemap. (Currently, it only reads the query directly from the 
>>>> request; setting it via the sitemap overrides this.)
>>>>
>>>>     WHY?: because you can do lots of funky things like modify the 
>>>> query on the fly by manipulating sitemap params directly or via an 
>>>> Action. This is useful for things like security (not allowing 
>>>> certain docs to become hits to certain users), partitioning 
>>>> (searches of only part of a site), search wizards (helping to build 
>>>> complex searches) etc.
>>>
>>>
>>>
>>> Ok
>>
>>
>> good
>
>
> And maybe the handling of some other parameters can be changed too.

What did you have in mind?
The only sitemap param the SearchGenerator takes, I believe, is the one 
I have just added.

The names of the request params it will use are configurable already I 
think.
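
A minimal sketch of the precedence described above, where a query set from the sitemap overrides the one read from the request. The names `QueryResolver` and `resolve`, and the plain-string arguments, are hypothetical stand-ins; this is not the actual SearchGenerator code:

```java
// Hypothetical helper for illustration; it does not reproduce the real SearchGenerator API.
final class QueryResolver {
    private QueryResolver() {}

    /**
     * Returns the sitemap-supplied query when one is set (non-null and not blank),
     * otherwise falls back to the query taken from the request (which may be null).
     */
    static String resolve(String sitemapQuery, String requestQuery) {
        if (sitemapQuery != null && !sitemapQuery.trim().isEmpty()) {
            return sitemapQuery;
        }
        return requestQuery;
    }

    public static void main(String[] args) {
        // Sitemap value present: it wins over the request value.
        System.out.println(resolve("+section:docs cocoon", "cocoon"));
        // No sitemap value: the old request-only behaviour is unchanged.
        System.out.println(resolve(null, "cocoon"));
    }
}
```

With no sitemap value supplied, the result is whatever the request carried, which is why existing setups would see no change.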


>>>> 2.) Modified LuceneIndexContentHandler to add (Stored, Untokenised, 
>>>> Unindexed) 'title' fields to the index, made from
>>>
>>>
>>>
>>> Configurable (and default is
>>>
>>>> <title/>
>>>
>>>
>>>
>>> )
>>>
>>
>> I would love to make this configurable, but do not know how.
>> Any suggestions would be gratefully accepted!
>
>
> Can't help right now... Come up with something ;-)

Ta ;)


>>>> tags in the content.
>>>>
>>>>     WHY? So that you get the title(s) of your documents with the 
>>>> search hits!
>>>>
>>>> Merits:
>>>>
>>>>     mod 1 has no impact on anyone using Lucene as it is today, as 
>>>> the old mode of operation is not affected.
>>>>
>>>>     mod 2 _does_ have an impact on the way Lucene works: if you 
>>>> have lots of <title> tags in your document, you will get all of 
>>>> them displayed in your hits. If there are too many for your taste, 
>>>> you would need to pre-filter your content, using a stylesheet in 
>>>> the 'content-view' to simplify your docs before they get indexed.
>>>>
>>>>     IMHO, this is a good thing to do anyway!!
>>>>
>>>>
>>>> thanks for any input ......
>>>>
>>>> The next modification I am planning is to allow the re-indexing of 
>>>> single documents .....
>>>
>>>
>>>
>>> Isn't it already possible by using "update index"?
>>
>>
>> While playing with Lucene, I find no time difference between 'create' 
>> and 'update'.
>>
>> I believe update index still uses the crawler to get the content, 
>> meaning that a well-linked site will end up getting totally 
>> reindexed.
>
>
> Are you referring to *sample* create-index.xsp page? But this is 
> *sample*! :)
> AFAIU, you can extend it to behave more like Cocoon CLI: option to 
> crawl or not to; not one input URL, but set of URLs, etc.

Yea, I am talking about the sample.
I was thinking of adding a method to 
o.a.c.components.search.SimpleLuceneCocoonIndexerImpl to allow single 
url indexing, bypassing the crawler, to make it easier to write custom 
indexing XSPs.
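
To illustrate the difference, here is a minimal sketch that models a site as an in-memory link graph. Everything in it (`IndexingSketch`, `crawlAndIndex`, `indexSingle`, the URLs) is hypothetical and does not reproduce SimpleLuceneCocoonIndexerImpl or the crawler; it only shows why a crawl started from one page of a well-linked site reaches every page, while a single-url method touches just one:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration; not the Cocoon indexer or crawler.
final class IndexingSketch {
    private IndexingSketch() {}

    /** Follows links breadth-first from start and returns every URL that would be indexed. */
    static Set<String> crawlAndIndex(Map<String, List<String>> links, String start) {
        Set<String> indexed = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String url = queue.poll();
            // Only expand a page the first time it is seen, so link cycles terminate.
            if (indexed.add(url)) {
                List<String> outgoing = links.get(url);
                if (outgoing != null) {
                    queue.addAll(outgoing);
                }
            }
        }
        return indexed;
    }

    /** Bypasses the crawl: only the given URL would be indexed. */
    static Set<String> indexSingle(String url) {
        return Collections.singleton(url);
    }

    public static void main(String[] args) {
        Map<String, List<String>> site = new HashMap<>();
        site.put("/index", Arrays.asList("/a", "/b"));
        site.put("/a", Arrays.asList("/index", "/b"));
        site.put("/b", Arrays.asList("/a"));

        System.out.println(crawlAndIndex(site, "/a")); // all three pages
        System.out.println(indexSingle("/a"));         // just the one page
    }
}
```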

regards Jeremy


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

