lucene-solr-user mailing list archives

From Alexandre Rafalovitch <arafa...@gmail.com>
Subject Re: find all two word phrases that appear in more than one document
Date Tue, 10 Sep 2013 06:10:42 GMT
I believe one of the admin pages (Solr 4+) shows all the terms and their
frequencies. You can use it even with the stock example. Try that; if it
makes sense, you can explore further.
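
For reference, the same kind of term-and-frequency listing can be requested
over HTTP through the TermsComponent mentioned in the quoted message below.
The following is only a sketch of how such a request might be built. The
host, the core name "collection1", the field name "shingles", and the
presence of a /terms request handler are all assumptions about your setup,
not something taken from this thread.

```python
from urllib.parse import urlencode


def terms_url(base, field, min_docs=2, limit=100):
    """Build a TermsComponent request URL (sketch; nothing is sent).

    base  -- core URL, e.g. http://localhost:8983/solr/collection1 (assumed)
    field -- name of the shingled field (hypothetical)
    """
    params = {
        "terms": "true",
        "terms.fl": field,           # field whose indexed terms are listed
        "terms.mincount": min_docs,  # keep terms found in at least this many docs
        "terms.limit": limit,        # maximum number of terms returned
        "wt": "json",
    }
    return base.rstrip("/") + "/terms?" + urlencode(params)


print(terms_url("http://localhost:8983/solr/collection1", "shingles"))
```

With terms.mincount=2, only terms that occur in two or more documents
should come back, which matches the "more than one document" requirement.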

As to other examples, there are a couple of books. I bet Jack's book covers
this.
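
To make the idea concrete outside Solr, here is a minimal sketch of what
shingling followed by a document-frequency cutoff computes. It is not how
Solr implements it: splitting on whitespace and lowercasing are simplifying
assumptions, and the function names and sample documents are made up for
illustration.

```python
from collections import Counter


def two_word_shingles(text):
    """Return the set of adjacent two-word phrases in text (lowercased)."""
    words = text.lower().split()
    return {" ".join(pair) for pair in zip(words, words[1:])}


def shared_phrases(docs, min_docs=2):
    """Map each two-word phrase to the number of documents containing it,
    keeping only phrases found in at least min_docs documents."""
    doc_freq = Counter()
    for text in docs:
        # A set per document, so a phrase counts once per document.
        doc_freq.update(two_word_shingles(text))
    return {phrase: n for phrase, n in doc_freq.items() if n >= min_docs}


docs = [
    "Solr Ninja wanted",
    "Ask a Solr Ninja today",
    "Lucene powers Solr",
]
print(shared_phrases(docs))  # {'solr ninja': 2}
```

In Solr terms, the shingle filter plays the role of two_word_shingles at
index time, and the per-term document count is what the terms listing
reports.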

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Tue, Sep 10, 2013 at 12:09 PM, Ali, Saqib <docbook.xml@gmail.com> wrote:

> Thanks Alexandre. I looked at the wiki page for the TermsComponent. But I
> am not sure if I follow. Do you have an example or some better document?
> Thanks! :)
>
>
> On Mon, Sep 9, 2013 at 8:17 PM, Alexandre Rafalovitch <arafalov@gmail.com> wrote:
>
> > The "phrases" are usually called n-grams or shingles.
> >
> > You can probably use ShingleFilterFactory to create your shingles (possibly
> > with outputUnigrams=false) and then use TermsComponent
> > (http://wiki.apache.org/solr/TermsComponent) to list the results.
> >
> > Regards,
> >    Alex.
> >
> >
> > On Tue, Sep 10, 2013 at 8:22 AM, Ali, Saqib <docbook.xml@gmail.com> wrote:
> >
> > > Dear Solr Ninjas,
> > >
> > > We would like to run a query that returns two-word phrases that appear in
> > > more than one document. For example, take the string "Solr Ninja". Since it
> > > appears in more than one document in our Solr instance, the query should
> > > return it. The query should find all such phrases across all the documents
> > > in our Solr instance by looking at every combination of two adjacent words
> > > (forming a phrase). These two-word combinations should come from the
> > > documents in the Solr index.
> > >
> > > Any ideas on how to write this query?
> > >
> > > Thanks.
> > >
> >
>
