lucene-java-user mailing list archives

From ahmed algohary <algoharya...@gmail.com>
Subject Re: Calculate Term Co-occurrence Matrix
Date Sun, 22 Aug 2010 12:47:02 GMT
Thanks! It is exactly what I need. But isn't there a way to get the
matching score?

For example, that "damaged" co-occurs with "shipment" with a probability
of 0.4?
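
One way to read such a score is as the fraction of documents in the corpus
that contain both terms. The sketch below is a minimal illustration in plain
Java under that assumption. The class name CooccurrenceSketch, the method
pairProbabilities, the "a|b" key format, and the naive tokenization are all
hypothetical; none of them come from Lucene, Mahout, or the LUCENE-474 patch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of Lucene or LUCENE-474.
final class CooccurrenceSketch {

    // Returns, for every unordered term pair keyed as "a|b" (a sorts before b),
    // the fraction of documents that contain both terms at least once.
    static Map<String, Double> pairProbabilities(List<String> docs) {
        Map<String, Integer> pairCounts = new HashMap<>();
        for (String doc : docs) {
            // Assumption: naive tokenization (lowercase, split on non-letters).
            // A real setup would reuse the Analyzer applied at index time.
            TreeSet<String> terms =
                new TreeSet<>(Arrays.asList(doc.toLowerCase().split("[^a-z]+")));
            terms.remove("");
            String[] sorted = terms.toArray(new String[0]);
            // Count each distinct pair once per document.
            for (int i = 0; i < sorted.length; i++) {
                for (int j = i + 1; j < sorted.length; j++) {
                    pairCounts.merge(sorted[i] + "|" + sorted[j], 1, Integer::sum);
                }
            }
        }
        // Convert document counts into document-level probabilities.
        Map<String, Double> probs = new HashMap<>();
        for (Map.Entry<String, Integer> e : pairCounts.entrySet()) {
            probs.put(e.getKey(), e.getValue() / (double) docs.size());
        }
        return probs;
    }
}
```

Calling CooccurrenceSketch.pairProbabilities(docs) and looking up
"damaged|shipment" gives the joint document-level probability for that pair.
The pair loop is quadratic in the number of distinct terms per document, so
this only suits small corpora. A conditional probability such as
P(shipment | damaged) would divide the pair count by the number of documents
containing "damaged" instead of by the corpus size.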


On Sun, Aug 22, 2010 at 5:35 AM, Ivan Provalov <iprovalo@yahoo.com> wrote:

> Ahmed,
>
> FYI, I updated the term collocations package I mentioned earlier with a few
> fixes and changes which will make it work for Lucene 3.0.2.  This may help
> your task.
>
> See:
> https://issues.apache.org/jira/browse/LUCENE-474
>
> Thanks,
>
> Ivan Provalov
>
>
> --- On Sat, 8/21/10, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
>
> > From: Otis Gospodnetic <otis_gospodnetic@yahoo.com>
> > Subject: Re: Calculate Term Co-occurrence Matrix
> > To: java-user@lucene.apache.org
> > Date: Saturday, August 21, 2010, 8:05 AM
> > Ahmed,
> >
> > That's what that KPE (link in my previous email, below) will do for you.
> > It's not open source at this time, but that is exactly one of the things
> > it does. I think Mahout collocations stuff might work for you, too.
> >
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/
> >
> >
> >
> > ----- Original Message ----
> > > From: ahmed algohary <algoharyalex@gmail.com>
> > > To: java-user@lucene.apache.org
> > > Sent: Sat, August 21, 2010 7:20:03 AM
> > > Subject: Re: Calculate Term Co-occurrence Matrix
> > >
> > > Thanks for all your answers!
> > >
> > > It seems I did not make my question clear. I have a text corpus and I
> > > need to determine the pairs of words that occur together in many
> > > documents. I need that to be able to measure the semantic proximity
> > > between words. This method is expanded on here:
> > > <http://forums.searchenginewatch.com/showthread.php?t=48>.
> > > I hope to find some code that, given a text corpus, generates all the
> > > word pairs with their probability of occurring together.
> > >
> > >
> > > On Sat, Aug 21, 2010 at 1:46 AM, Otis Gospodnetic
> > > <otis_gospodnetic@yahoo.com> wrote:
> > >
> > > > There is also a non-Mahout Key Phrase Extractor for Collocations, SIPs,
> > > > and a few other things:
> > > > http://sematext.com/products/key-phrase-extractor/index.html
> > > >
> > > > One of the demos that uses news data is at
> > > > http://sematext.com/demo/kpe/index.html
> > > >
> > > > Otis
> > > > ----
> > > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > > > Lucene ecosystem search :: http://search-lucene.com/
> > > >
> > > >
> > > >
> > > > ----- Original Message ----
> > > > > From: Grant Ingersoll <gsingers@apache.org>
> > > > > To: java-user@lucene.apache.org
> > > > > Sent: Fri, August 20, 2010 8:52:17 AM
> > > > > Subject: Re: Calculate Term Co-occurrence Matrix
> > > > >
> > > > > You might also be interested in Mahout's collocations package:
> > > > > http://cwiki.apache.org/confluence/display/MAHOUT/Collocations
> > > > >
> > > > > -Grant
> > > > > On Aug 19, 2010, at 11:39 AM, ahmed algohary wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I need to know if there is a Lucene plug-in or a Lucene-based API for
> > > > > > calculating the term co-occurrence matrix for a given text corpus.
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > --
> > > > > > Ahmed
> > > > >
> > > > > --------------------------
> > > > > Grant Ingersoll
> > > > > http://www.lucidimagination.com/
> > > > >
> > > > > Search the Lucene ecosystem using Solr/Lucene:
> > > > > http://www.lucidimagination.com/search
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > > >