cocoon-dev mailing list archives

From Vadim Gritsenko <vadim.gritse...@verizon.net>
Subject Re: Mods to CocoonLucene
Date Fri, 15 Nov 2002 16:01:39 GMT
Jeremy Quinn wrote:

>
> On Friday, Nov 15, 2002, at 15:29 Europe/London, Vadim Gritsenko wrote:
>
>> Jeremy Quinn wrote:
>>
>>>
>>> On Friday, Nov 15, 2002, at 14:12 Europe/London, Vadim Gritsenko wrote:
>>>
>>>> Jeremy Quinn wrote:
>>>>
>>>>> Dear All,
>>>>>
>>>>> I am making some modifications to the way Lucene works with 
>>>>> Cocoon, and am wondering if you lot think these modifications are 
>>>>> suitable to add to the 2.1 CVS.
>>>>>
>>>>>
>>>>> 1.) Allow the 'query' to be set on the SearchGenerator from the 
>>>>> Sitemap. (Currently, it only reads the query directly from the 
>>>>> request; setting it via the sitemap overrides this.)
>>>>>
>>>>>     WHY?: because you can do lots of funky things like modify the 
>>>>> query on the fly by manipulating sitemap params directly or via an 
>>>>> Action. This is useful for things like security (not allowing 
>>>>> certain docs to become hits to certain users), partitioning 
>>>>> (searches of only part of a site), search wizards (helping to 
>>>>> build complex searches) etc.
>>>>
>>>>
>>>>
>>>>
>>>> Ok
>>>
>>>
>>>
>>> good
>>
>>
>>
>> And maybe the handling of some other parameters can be changed too.
>
>
> What did you have in mind?
> The only sitemap param the SearchGenerator takes, I believe, is the one 
> I have just added. 


If all is OK, then never mind.
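
The precedence described for mod 1 (a 'query' set in the sitemap wins, otherwise fall back to the request) could be sketched roughly as below. This is only an illustration, not the SearchGenerator code: the class QueryResolver, the plain Maps standing in for sitemap and request parameters, and the request parameter name "queryString" are all assumptions made for the example.

```java
import java.util.Map;

/** Hypothetical helper, not part of Cocoon: picks the effective search query. */
final class QueryResolver {
    /** Request parameter name, assumed here purely for illustration. */
    static final String REQUEST_PARAM = "queryString";
    /** Sitemap parameter name, taken from the proposal ('query'). */
    static final String SITEMAP_PARAM = "query";

    private QueryResolver() {}

    /**
     * Returns the sitemap query if one is set and non-blank,
     * otherwise the request query, otherwise null.
     */
    static String resolve(Map<String, String> sitemapParams,
                          Map<String, String> requestParams) {
        String fromSitemap = sitemapParams.get(SITEMAP_PARAM);
        if (fromSitemap != null && !fromSitemap.trim().isEmpty()) {
            return fromSitemap.trim();
        }
        String fromRequest = requestParams.get(REQUEST_PARAM);
        return fromRequest == null ? null : fromRequest.trim();
    }
}
```

With no sitemap parameter the behaviour is the same as reading the request directly, which matches the claim that the old mode of operation is untouched.
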



> The names of the request params it will use are configurable already I 
> think.
>
>
>>>>> 2.) Modified LuceneIndexContentHandler to add (Stored, 
>>>>> Untokenised, Unindexed) 'title' fields to the index, made from
>>>>
>>>>
>>>>
>>>>
>>>> Configurable (and default is
>>>>
>>>>> <title/>
>>>>
>>>>
>>>>
>>>>
>>>> )
>>>>
>>>
>>> I would love to make this configurable, but do not know how.
>>> Any suggestions would be gratefully accepted!
>>
>>
>>
>> Can't help right now... Come up with something ;-)
>
>
> Ta ;)
>
>
>>>>> tags in the content.
>>>>>
>>>>>     WHY? So that you get the title(s) of your documents with the 
>>>>> search hits!
>>>>>
>>>>> Merits:
>>>>>
>>>>>     mod 1 has no impact on anyone using Lucene as it is today, as 
>>>>> the old mode of operation is not affected.
>>>>>
>>>>>     mod 2 _does_ have an impact on the way Lucene works: if you 
>>>>> have lots of <title> tags in your document, you will get all of 
>>>>> them displayed in your hits. If there are too many for your taste, 
>>>>> you would need to pre-filter your content, using a stylesheet in 
>>>>> the 'content-view' to simplify your docs before they get indexed.
>>>>>
>>>>>     IMHO, this is a good thing to do anyway!!
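
For illustration only, collecting the text of every <title> element during a SAX parse, with the element name configurable, might look something like the sketch below. TitleCollector is a hypothetical class for this example and not the actual LuceneIndexContentHandler; it only gathers the strings and leaves out the step of adding them to the index as stored fields. The default name "title" is an assumption taken from the discussion above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: gathers the text of each element with a given name. */
final class TitleCollector extends DefaultHandler {
    private final String elementName;                  // configurable element name
    private final List<String> titles = new ArrayList<>();
    private StringBuilder current;                     // non-null only inside a match

    TitleCollector(String elementName) { this.elementName = elementName; }

    /** Assumed default: collect <title> elements. */
    TitleCollector() { this("title"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (elementName.equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current != null && elementName.equals(qName)) {
            titles.add(current.toString().trim());
            current = null;
        }
    }

    List<String> titles() { return titles; }

    /** Parses an XML string and returns the text of every matching element, in document order. */
    static List<String> collect(String xml, String elementName) {
        TitleCollector handler = new TitleCollector(elementName);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return handler.titles();
    }
}
```

Every matching element yields one entry, so a document with many <title> tags produces many titles per hit, which is the side effect noted above.
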
>>>>>
>>>>>
>>>>> thanks for any input ......
>>>>>
>>>>> The next modification I am planning is to allow the re-indexing of 
>>>>> single documents .....
>>>>
>>>>
>>>>
>>>>
>>>> Isn't it already possible by using "update index"?
>>>
>>>
>>>
>>> While playing with Lucene, I find no time difference between 
>>> 'create' and 'update'.
>>>
>>> I believe update index still uses the crawler to get the content, 
>>> meaning that a well-linked site will end up getting totally 
>>> reindexed.
>>
>>
>>
>> Are you referring to *sample* create-index.xsp page? But this is 
>> *sample*! :)
>> AFAIU, you can extend it to behave more like the Cocoon CLI: an option 
>> to crawl or not; not one input URL but a set of URLs, etc.
>
>
> Yea, I am talking about the sample.
> I was thinking of adding a method to 
> o.a.c.components.search.SimpleLuceneCocoonIndexerImpl to allow single 
> url indexing, bypassing the crawler, to make it easier to write custom 
> indexing XSPs. 


+1
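
A very rough sketch of what single-URL indexing without the crawler amounts to is below. It is an assumption-laden toy: SingleUrlIndexer is a hypothetical class, the Map stands in for the Lucene index, and the fetcher function stands in for fetching a URL's content view. It is not the SimpleLuceneCocoonIndexerImpl API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical in-memory stand-in for an index keyed by document URL. */
final class SingleUrlIndexer {
    private final Map<String, String> contentByUrl = new LinkedHashMap<>();
    private final Function<String, String> fetcher;   // returns the content for a URL

    SingleUrlIndexer(Function<String, String> fetcher) {
        this.fetcher = fetcher;
    }

    /**
     * Re-indexes exactly one URL: drop the old entry, fetch the content,
     * store the new entry. No links in the content are followed.
     */
    void indexUrl(String url) {
        contentByUrl.remove(url);
        contentByUrl.put(url, fetcher.apply(url));
    }

    /** Indexed content for a URL, or null if the URL is not indexed. */
    String indexed(String url) { return contentByUrl.get(url); }

    /** Number of documents currently in the index. */
    int size() { return contentByUrl.size(); }
}
```

The point is that one call touches one document only, so updating a single page does not depend on how well linked the rest of the site is.
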

... and I have another approach implemented - LuceneTransformer ;)

Vadim



> regards Jeremy




