forrest-dev mailing list archives

From "Ramon Prades" <rpra...@porcelanosa.com>
Subject RE: about lucent and exist
Date Mon, 15 Sep 2003 09:33:49 GMT
Hi Juan Jose

Do you think we should drop Lucene and use Xindice instead?

This is what I think:

- Use Xindice.
- Populate the database using a crawler and Cocoon's XML views.
- Create a search page with options such as "search in content",
  "search in title", and so on.
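
As a rough illustration of the search options above, here is a minimal sketch that maps each option to a path expression. It uses Python's standard xml.etree.ElementTree rather than the Xindice API, and the sample documents, the SCOPES table, and the search() helper are all hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents standing in for crawled XML views.
DOCS = {
    "faq.xml": "<document><header><title>Forrest FAQ</title></header>"
               "<body><p>How do I search the site?</p></body></document>",
    "index.xml": "<document><header><title>Welcome</title></header>"
                 "<body><p>Forrest is a publishing framework.</p></body></document>",
}

# Assumption: each search option corresponds to one path expression.
SCOPES = {
    "title": ".//header/title",
    "content": ".//body/p",
}


def search(term, scope):
    """Return the sorted names of documents whose scoped text contains term."""
    hits = []
    for name, xml_text in DOCS.items():
        root = ET.fromstring(xml_text)
        for node in root.findall(SCOPES[scope]):
            if term.lower() in (node.text or "").lower():
                hits.append(name)
                break  # one match per document is enough
    return sorted(hits)
```

With this sketch, search("forrest", "title") and search("forrest", "content") return different documents, which is the distinction the search page would expose.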

Regards.

Ramón

> -----Original Message-----
> From: Juan Jose Pablos [mailto:cheche@che-che.com]
> Sent: Saturday, 13 September 2003 17:56
> To: forrest-dev@xml.apache.org
> Subject: Re: about lucent and exist
> 
> 
> Stefano Mazzocchi wrote:
> > 
> > Lucene is based on algorithms that don't allow the above.
> > 
> 
> Thanks for backing this up. That was my initial feeling.
> 
> > For that, you need what is called an "xml database", which could be,
> > in the most simple case, a collection of files in a file system and a
> > very slow incremental collector that opens all files, scans them and
> > collects the matching elements and returns the results as a new
> > document. In the best case, it's a semi-structured database with
> > multidimensional indexing features (exist and xindice are much closer
> > to that).
> > 
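
The "most simple case" described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's standard library, the collect() and build_demo_repo() helpers and the sample files are hypothetical, and the path expressions are ElementTree's limited XPath subset, not XQuery.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def collect(repo_dir, path_expr):
    """Open every .xml file under repo_dir, scan it for elements matching
    path_expr, and return the matches wrapped in a new <results> document.
    There is no index, so each query re-reads the whole repository."""
    results = ET.Element("results")
    for xml_file in sorted(Path(repo_dir).rglob("*.xml")):
        root = ET.parse(xml_file).getroot()
        for match in root.findall(path_expr):
            # Record which file the match came from so it can be retrieved.
            hit = ET.SubElement(results, "hit", source=xml_file.name)
            hit.append(match)
    return results


def build_demo_repo():
    """Write two hypothetical sample files to a temporary directory."""
    repo = Path(tempfile.mkdtemp())
    (repo / "faq.xml").write_text(
        "<faqs><faq><question>What is Forrest?</question></faq></faqs>",
        encoding="utf-8",
    )
    (repo / "index.xml").write_text(
        "<document><p>Welcome</p></document>", encoding="utf-8"
    )
    return repo
```

Calling collect(build_demo_repo(), ".//question") yields a new document whose hits carry the name of the file they came from. The cost grows with the size of the repository on every query, which is why an indexed store is the better case.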
> 
> I am happy to look at xindice.
> 
> > 
> > You are trying to create "virtual documents" out of XML-aware queries
> > over a repository of hierarchical content (not necessarily XML, but
> > XML-viewable).
> 
> Are you saying that because we are making the request against the
> document-v12 schema? I am not sure about this. I am not thinking about
> making the request against the document-v12 schema.
>
> In Forrest we are importing from another schema, and in that process we
> are losing information (e.g. <author/> becomes <p>). So I would like to
> run the search on the source and get results from which I can retrieve
> that document.
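
To make the information loss concrete, here is a small sketch. The import_to_document() function is a hypothetical stand-in for the import step, not Forrest's actual stylesheet, and the sample source document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document in some other schema.
SOURCE = "<article><author>Jane Doe</author><para>Body text.</para></article>"


def import_to_document(source_root):
    """Simplified stand-in for the import step: every child becomes <p>,
    so the element name that marked the author is discarded."""
    doc = ET.Element("document")
    for child in source_root:
        p = ET.SubElement(doc, "p")
        p.text = child.text
    return doc


def find_authors(root):
    """Return the text of every <author> element in the tree."""
    return [a.text for a in root.iter("author")]
```

Running find_authors() on the parsed SOURCE finds the author, while running it on the imported document finds nothing, because the text survives only as an anonymous paragraph. That is the case for querying the source rather than the imported form.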
> 
> > Eh, if it was that easy. You are implying that:
> >
> >  1) a tag is used to indicate the semantics of the nodes contained
> > therein. Although this is generally the case (and there is the ability
> > to have RDF/XML perform this way), this is not generalizable.
> 
> I would like to see an example on this.
> 
> > 
> >  2) without namespaces, there is a tremendous semantic collision. With
> > namespaces, you are assuming that the namespace refers to the 'meaning'
> > of the tag, again not generalizable.
> > 
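
The collision in point 2 can be illustrated with a short sketch. The two vocabularies and their namespace URIs are made up for this example, and the titles() helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabularies; the URIs are invented for illustration.
BOOK_NS = "http://example.org/book"
PERSON_NS = "http://example.org/person"

# Without namespaces, a book title and a personal title share one tag name.
PLAIN = "<record><title>Dune</title><title>Dr</title></record>"

# With namespaces, the two names are distinct.
NAMESPACED = (
    f'<record xmlns:b="{BOOK_NS}" xmlns:p="{PERSON_NS}">'
    "<b:title>Dune</b:title><p:title>Dr</p:title></record>"
)


def titles(xml_text, tag):
    """Return the text of every element with the given (qualified) tag."""
    return [e.text for e in ET.fromstring(xml_text).iter(tag)]
```

A query for "title" on PLAIN returns both values, while a query for the book-qualified name on NAMESPACED returns only the book title. Note that the namespace only keeps the names apart; as the point above says, it does not by itself tell a program what the tag means.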
> 
> OK, I have not mentioned anything about namespaces; the request that I
> put as an example only deals with the faq schema. I had not thought about
> multi-namespace documents or other types of XML input.
> 
> > This said, I agree that having the ability to run XQuery queries over a
> > content repository that exposes XML views would be a tremendous help.
> > Just don't call it "semantic searching", because that's not even close
> > (but very few are able to explain the difference and the reason why we
> > need the entire RDF stack in the first place, so don't worry).
> > 
> > -- 
> > Stefano.
> 
> OK, I will not use that name, and I will not worry either.
> 
> Cheers,
> Cheche
> 
> 
> 


