lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stefan Groschupf ...@media-style.com>
Subject Re: Document Clustering
Date Tue, 11 Nov 2003 20:47:41 GMT
really cool Stuff!!!

maurits van wijland wrote:

>Hi All and Marc,
>
>There is the carrot project :
>http://www.cs.put.poznan.pl/dweiss/carrot/
>
>The carrot system consists of webservices that can easily be fed by a lucene
>resultlist. You simply have to create a JSP that creates this XML file and
>create a custom process and input component. The input component
>for lucene could look like:
>
><?xml version="1.0" encoding="UTF-8"?>
><service xmlns      =
>"http://www.dawidweiss.com/projects/carrot/componentDescriptor" framework  =
>"Carrot2">
>    <component id               = "carrot2.input.lucene"
>               type             = "input"
>               serviceURL       = "http://localhost/weblucene/c2.jsp"
>               infoURL          = "http://localhost/weblucene/"
>    />
></service>
>
>The c2.jsp file simply has to translate a resultlist into an XLM file such
>as:
><searchresult>
>    <document id="1">
> <title>...</title>
> <weight>1.0</weight>
> <url>http://...</url>
> <summary>sum 1</summary>
> <snippet>snip 2</snippet>
>    </document>
>    <document id="2">
> <title>...</title>
> <weight>1.0</weight>
> <url>http://...</url>
> <summary>sum 2</summary>
> <snippet>snip 2</snippet>
>    </document>
></searchresult>
>
>Feed this into the carrot system, and you will get a nice clustered
>result list. The amazing part is of this clustering mechanism is that
>the cluster labels are incredible, their great!
>
>Then there is a open source project called Classifier4J that can
>be used for classification, the oposite of clustering. These other
>open source projects are a great addition to the Lucene system.
>
>I hope this helps...
>
>Marc, what are you building?? Maybe we can help!
>
>Kind regards,
>
>Maurits
>
>
>----- Original Message ----- 
>From: "marc" <marc@bioseeker.bioinfocg.com>
>To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>Sent: Tuesday, November 11, 2003 5:15 PM
>Subject: Document Clustering
>
>
>Hi,
>
>does anyone have any sample code/documentation available for doing document
>based clustering using lucene?
>
>Thanks,
>Marc
>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
>  
>

-- 
day time: www.media-style.com
spare time: www.text-mining.org | www.weta-group.net




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message