cocoon-dev mailing list archives

From Stefano Mazzocchi
Subject Re: Ant: Re: Adding XML searching with Lucene
Date Sat, 08 Dec 2001 23:43:52 GMT
Bernhard Huber wrote:

> >Which leads me to think that making crawling, indexing and searching as
> >Avalon components might be FS since we're not going to use any other
> >implementation of these....
> >
> I'm sorry, but what does FS mean?

Flexibility Syndrome: when you want to do more than necessary because it
appears cool but later on you find out it was a mistake to do it and now
you have to support it and others are taking this even further down the
wrong path and you can't stop them!

> >What do you think?
> >
> I try to switch to implementing again, as there are quite some nice
> ideas, but let's implement them....
> Let's continue to "work"....

> As a result of the last emails I would summarize:
> * Avalon components stays
> * your naming
> org.apache.cocoon.components.crawler, and are okay.
> It makes more sense. Can you put some skeleton into CVS, or send it to me?

Why don't you throw in your skeleton ideas here and we discuss them in
the open?

> * I will implement some paging for the search results, if there are too
> many search results to display on a single page.

Yep, this is a must do.
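
A minimal sketch of the paging, assuming the hits are already available as
a ranked list in memory. SearchPager, pageCount and page are hypothetical
names used only for illustration; they are not existing Cocoon or Lucene
classes:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Cocoon or Lucene): slices a ranked
// list of hits into fixed-size pages.
final class SearchPager {

    private SearchPager() {}

    /** Number of pages needed to show totalHits results, pageSize per page. */
    static int pageCount(int totalHits, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (totalHits + pageSize - 1) / pageSize; // ceiling division
    }

    /** Hits for the given zero-based page; empty list if the page is past the end. */
    static <T> List<T> page(List<T> hits, int pageIndex, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        if (pageIndex < 0) {
            throw new IllegalArgumentException("pageIndex must not be negative");
        }
        long from = (long) pageIndex * pageSize; // long avoids int overflow
        if (from >= hits.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(hits.size(), from + pageSize);
        // Copy so the caller does not hold a view onto the full hit list.
        return new ArrayList<>(hits.subList((int) from, to));
    }
}
```

With five hits and a page size of two, SearchPager.page(hits, 2, 2) would
return only the fifth hit, and the page index could come from a request
parameter.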

> * I will study the Main class for the internal crawling..


> * As I read some other mail of you; you are thinking about some more
> indexing calculation using matrix-calculus.

yes, more on this to follow soon, but for now, your indexing is good
enough to implement the rest around it, we'll tune the indexing later on
(also because semantic relevance rating requires some deep Lucene changes
that might not be easy to get approved, we'll see how that goes, but I'll
keep pushing)

> * I will try to write some docu, how to use the "XML Searching".


> * First XSP is okay, a generator for the searching should be written,
> let's call it:

I'd call it


that uses

this is coherent with the rest of the system.

> What about the xml output of the SearchGenerator?
> Search Generator XML DTD, if you want it to be RDF, i have to read the RDF
> documentation more closely, the next few days..... :)

Nah, forget RDF for now, let's keep it as simple as possible:

searching for 'cocoon' would result in something like:

  <search:hit rank="1" score="89%" uri="...">
    <search:highlight>Cocoon</search:highlight> now offers semantic    

As you can see, this also includes part of the "context" where the
textual information is found. This follows the Google model and I think
it would be a *great* feature to have.

But this requires more thinking, I'd say let's ignore it for now, so you
can come up with

 <search:results xmlns:search="...">
  <search:hit rank="1" score="89%" uri="..."/>
 </search:results>

which is good enough for now but could be easily improved later on.
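
The simple format could be produced along these lines. This is only a
sketch: SearchResultsXml and Hit are hypothetical names, the namespace URI
is passed in as a parameter because this message does not give one, and a
real Cocoon generator would emit SAX events into the pipeline rather than
build a string:

```java
import java.util.List;

// Hypothetical sketch (not an existing Cocoon class): renders ranked hits
// in the simple <search:results> format discussed in this thread.
final class SearchResultsXml {

    /** One search hit: the document URI and its score as a percentage. */
    static final class Hit {
        final String uri;
        final int scorePercent;

        Hit(String uri, int scorePercent) {
            this.uri = uri;
            this.scorePercent = scorePercent;
        }
    }

    private SearchResultsXml() {}

    /** Hits are expected in rank order; rank is assigned starting from 1. */
    static String render(String namespaceUri, List<Hit> hits) {
        StringBuilder sb = new StringBuilder();
        sb.append("<search:results xmlns:search=\"")
          .append(escape(namespaceUri)).append("\">\n");
        int rank = 1;
        for (Hit hit : hits) {
            sb.append("  <search:hit rank=\"").append(rank++)
              .append("\" score=\"").append(hit.scorePercent).append("%")
              .append("\" uri=\"").append(escape(hit.uri)).append("\"/>\n");
        }
        sb.append("</search:results>\n");
        return sb.toString();
    }

    /** Escapes the characters that are unsafe inside a double-quoted attribute. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }
}
```

Keeping rank, score and uri as plain attributes leaves room to add child
elements to each hit later without breaking existing stylesheets.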

> I hope you don't mind my making these suggestions, but after some emails i
> get a bit confused about what should be really done.

Don't worry, I love your 'hands on' attitude. It balances my 'think a
lot and never get something done' one :)

> As there is surely some need for the searching stuff in cocoon, let's do it.


> I must admit that i have some limited time only, which i can put into
> implementing...
> Nevertheless i like that email&idea exchange, learning a lot...

Learning is always the fun part :)

Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
                               Friedrich Nietzsche
