lucene-java-user mailing list archives

From Elshaimaa Ali <elshaimaa....@hotmail.com>
Subject RE: Document Similarity
Date Mon, 30 Jul 2012 17:26:52 GMT

Thank you so much for the prompt reply.
I need to retrieve a document from the index that is similar to an HTML document, and I need
to use cosine similarity or latent semantic analysis, which means that I need to generate a term
vector for the HTML document. The link you sent me doesn't contain any code.
Any help will be greatly appreciated.
Regards,
Shaimaa
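
For reference, the two steps asked about here (building a term vector for an HTML document, then scoring it by cosine similarity) can be sketched in plain Java without any Lucene dependency. This is only a minimal illustration under stated assumptions: the class and method names (CosineSketch, stripTags, termFrequencies, cosine) are hypothetical, tags are stripped with a crude regex rather than a real HTML parser, and the weights are raw term counts rather than TF-IDF. To compare against documents already in the index, the external document would need to be tokenized the same way the index's analyzer tokenized them.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class; not part of Lucene.
class CosineSketch {

    // Assumption: a crude regex is enough to drop tags for illustration.
    // A real HTML parser would handle scripts, entities, and comments.
    public static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", " ");
    }

    // Build a term-frequency vector (term -> raw count) from HTML text.
    public static Map<String, Integer> termFrequencies(String html) {
        Map<String, Integer> tf = new HashMap<>();
        String text = stripTags(html).toLowerCase(Locale.ROOT);
        // Split on anything that is not a letter or a digit.
        for (String token : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); 0.0 if either vector is empty.
    public static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            int va = e.getValue();
            normA += (double) va * va;
            Integer vb = b.get(e.getKey());
            if (vb != null) {
                dot += (double) va * vb;
            }
        }
        for (int vb : b.values()) {
            normB += (double) vb * vb;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        Map<String, Integer> query =
            termFrequencies("<html><body><p>Lucene term vectors</p></body></html>");
        Map<String, Integer> doc =
            termFrequencies("<p>Term vectors stored in a Lucene index</p>");
        System.out.println("cosine = " + cosine(query, doc));
    }
}
```

With 3 million indexed documents, scoring every document this way would be slow; a sketch like this is more plausibly used to re-rank a small candidate set returned by a normal query.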

> Date: Mon, 30 Jul 2012 07:32:49 -0700
> From: in.abdul@gmail.com
> To: java-user@lucene.apache.org
> Subject: Re: Document Similarity
> 
> Hi Elshaimaa,
>   I could not understand what you need. Can you please explain your
> use case?
> 
>   If your case is "I need to use Lucene to find the most similar documents
> from the generated index",
> then go for the MoreLikeThis[1] component.
> 
> Based on your use case, people can suggest some good ways.
> 
> 
> 
> [1] http://wiki.apache.org/solr/MoreLikeThis
> 
> 
> 
> 
>             Thanks and Regards,
>         S SYED ABDUL KATHER
> 
> 
> 
> On Mon, Jul 30, 2012 at 7:30 PM, Elshaimaa Ali [via Lucene] <
> ml-node+s472066n3998082h68@n3.nabble.com> wrote:
> 
> >
> > Hi All
> > I created a Lucene index for over 3 million documents, and I used term
> > vectors to create the index. Now, for an external document, I need to use
> > Lucene to find the most similar documents from the generated index. How can
> > I process the document to generate a term vector for it, and what
> > search technique can I use to map the document to one of the documents in
> > the index?
> > Regards,
> > Shaimaa
> >
> >
> 
> 
> 
> 
> -----
> THANKS AND REGARDS,
> SYED ABDUL KATHER