cocoon-dev mailing list archives

From Vadim Gritsenko <vadim.gritse...@verizon.net>
Subject Re: Mods to CocoonLucene
Date Fri, 15 Nov 2002 15:29:14 GMT
Jeremy Quinn wrote:

>
> On Friday, Nov 15, 2002, at 14:12 Europe/London, Vadim Gritsenko wrote:
>
>> Jeremy Quinn wrote:
>>
>>> Dear All,
>>>
>>> I am making some modifications to the way Lucene works with Cocoon, 
>>> and am wondering if you lot think these modifications are suitable 
>>> to add to the 2.1 CVS.
>>>
>>>
>>> 1.) Allow the 'query' to be set on the SearchGenerator from the 
>>> Sitemap. (Currently, it only reads the query directly from the 
>>> request; setting it via the sitemap overrides this.)
>>>
>>>     WHY?: because you can do lots of funky things like modify the 
>>> query on the fly by manipulating sitemap params directly or via an 
>>> Action. This is useful for things like security (not allowing 
>>> certain docs to become hits to certain users), partitioning 
>>> (searches of only part of a site), search wizards (helping to build 
>>> complex searches) etc.
>>
>>
>>
>> Ok
>
>
> good 


And maybe the handling of some other parameters can be changed too.
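
To make the precedence in point 1 concrete, here is a minimal sketch in plain Java. It is not the actual SearchGenerator code: the QueryResolver class, the two maps standing in for the sitemap parameters and the request, and the parameter names "query" and "queryString" are all assumptions for illustration.

```java
import java.util.Map;

// Hypothetical helper, not part of Cocoon: decides which query string to use.
final class QueryResolver {
    private QueryResolver() {}

    // A non-blank sitemap parameter wins; otherwise fall back to the request,
    // so the old request-only mode of operation keeps working unchanged.
    static String resolveQuery(Map<String, String> sitemapParams,
                               Map<String, String> requestParams) {
        String fromSitemap = sitemapParams.get("query");          // assumed name
        if (fromSitemap != null && !fromSitemap.trim().isEmpty()) {
            return fromSitemap;
        }
        return requestParams.get("queryString");                  // assumed name
    }
}
```

With an empty sitemap map this returns whatever the request carried, so existing sitemaps would behave as before. An Action could put a restricted query into the sitemap parameters to partition or secure a search.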



>>> 2.) Modified LuceneIndexContentHandler to add (Stored, Untokenised, 
>>> Unindexed) 'title' fields to the index, made from
>>
>>
>>
>> Configurable (and default is
>>
>>> <title/>
>>
>>
>>
>> )
>>
>
> I would love to make this configurable, but do not know how.
> Any suggestions would be gratefully accepted! 


Can't help right now... Come up with something ;-)



>>> tags in the content.
>>>
>>>     WHY? So that you get the title(s) of your documents with the 
>>> search hits!
>>>
>>> Merits:
>>>
>>>     mod 1 has no impact on anyone using Lucene as it is today, as 
>>> the old mode of operation is not affected.
>>>
>>>     mod 2 _does_ have an impact on the way Lucene works: if you have 
>>> lots of <title> tags in your document, you will get all of them 
>>> displayed in your hits. If there are too many for your taste, you 
>>> would need to pre-filter your content, using a stylesheet in the 
>>> 'content-view' to simplify your docs before they get indexed.
>>>
>>>     IMHO, this is a good thing to do anyway!!
>>>
>>>
>>> thanks for any input ......
>>>
>>> The next modification I am planning is to allow the re-indexing of 
>>> single documents .....
>>
>>
>>
>> Isn't it already possible by using "update index"?
>
>
> While playing with Lucene, I find no time difference between 'create' 
> and 'update'.
>
> I believe update index still uses the crawler to get the content, 
> meaning that a well-linked site will end up getting totally reindexed. 


Are you referring to the *sample* create-index.xsp page? But this is 
just a *sample*! :)
AFAIU, you can extend it to behave more like the Cocoon CLI: an option 
to crawl or not, a set of input URLs instead of a single one, etc.
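
Something along these lines, as a rough sketch only. It is plain Java with no Cocoon or Lucene classes: IndexTargets, collect, and the linksOf function (which stands in for whatever extracts links from a page) are hypothetical names, not existing API.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper: works out which URLs an index run should touch.
final class IndexTargets {
    private IndexTargets() {}

    // With crawl == false only the given URLs are returned, which is what
    // re-indexing a single document needs. With crawl == true, links are
    // followed breadth-first and each URL is visited at most once.
    static Set<String> collect(Collection<String> seeds, boolean crawl,
                               Function<String, List<String>> linksOf) {
        Set<String> targets = new LinkedHashSet<>(seeds);
        if (!crawl) {
            return targets;
        }
        Deque<String> pending = new ArrayDeque<>(targets);
        while (!pending.isEmpty()) {
            String url = pending.removeFirst();
            for (String link : linksOf.apply(url)) {
                if (targets.add(link)) {      // true only for unseen URLs
                    pending.addLast(link);
                }
            }
        }
        return targets;
    }
}
```

Calling it with one URL and crawl switched off gives just that URL back, so a well-linked site would not end up totally reindexed.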

Vadim






