lucene-general mailing list archives

From vermansi <verma...@gmail.com>
Subject Re: Cluster Retrieval in Lucene
Date Fri, 26 Nov 2010 19:22:06 GMT

This link should give some idea of the kind of implementation I'm
suggesting:

http://docs.google.com/viewer?a=v&q=cache:q7eiY--1ilUJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.83.4177%26rep%3Drep1%26type%3Dpdf+cluster+based+retrieval+using+language+models&hl=en&gl=in&pid=bl&srcid=ADGEEShbrM9OEIRc-2_p0V74V7sf339j213Qete8bq1OykMLAfQ7YefU_GQ4G5VRGYz5jeg17i4PGlZ5nil-17QgW5HBRfwQmMtHi4Jxy18Pdgf54wt31Ktj38XiJte6qdCxR4ZFXcY-&sig=AHIEtbTeokGxQwRwAdz_gFo8c2YwDOOv0w
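
A minimal sketch of what cluster-based scoring could look like, written in plain Python rather than against the Lucene API. It assumes each document already carries a cluster label (the "cluster" field idea quoted further down) and smooths each document's language model with its cluster's model and the collection model. The function names, the weights `lam` and `beta`, and the toy documents are all hypothetical, and whether this interpolation matches the linked paper exactly is an assumption.

```python
import math
from collections import Counter, defaultdict


def build_models(docs):
    """Build term counts per document, per cluster, and for the collection.

    docs: iterable of (doc_id, cluster_id, tokens) tuples (hypothetical layout).
    """
    doc_tf, doc_cluster = {}, {}
    cluster_tf = defaultdict(Counter)
    coll_tf = Counter()
    for doc_id, cluster_id, tokens in docs:
        tf = Counter(tokens)
        doc_tf[doc_id] = tf
        doc_cluster[doc_id] = cluster_id
        cluster_tf[cluster_id].update(tf)
        coll_tf.update(tf)
    return doc_tf, doc_cluster, cluster_tf, coll_tf


def _ml(tf, term):
    """Maximum-likelihood estimate of P(term) under a bag of term counts."""
    total = sum(tf.values())
    return tf[term] / total if total else 0.0


def score(query, doc_id, models, lam=0.5, beta=0.5):
    """Query log-likelihood under a document model smoothed by its cluster.

    lam and beta are illustrative weights, not tuned values.
    """
    doc_tf, doc_cluster, cluster_tf, coll_tf = models
    d = doc_tf[doc_id]
    c = cluster_tf[doc_cluster[doc_id]]
    logp = 0.0
    for term in query:
        background = beta * _ml(c, term) + (1 - beta) * _ml(coll_tf, term)
        p = lam * _ml(d, term) + (1 - lam) * background
        if p == 0.0:
            # Term unseen in the whole collection: the query cannot be generated.
            return float("-inf")
        logp += math.log(p)
    return logp


def rank(query, models):
    """Rank all documents in one pass; ties are broken by doc id."""
    scored = [(score(query, d, models), d) for d in models[0]]
    return [d for _, d in sorted(scored, key=lambda x: (-x[0], x[1]))]


docs = [
    ("d1", "A", ["lucene", "index", "search"]),
    ("d2", "A", ["lucene", "scoring"]),
    ("d3", "B", ["cooking", "recipes", "search"]),
]
models = build_models(docs)
print(rank(["lucene", "search"], models))
```

Because the cluster only enters through the smoothing term, all documents can be ranked in a single pass, without issuing one query per cluster and merging the results afterwards.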


On Fri, Nov 26, 2010 at 3:54 AM, Ted Dunning [via Lucene] <ml-node+1969914-190519764-277119@n3.nabble.com> wrote:

> Can you provide a citation?  Citeseer is down at the moment.
>
> On Thu, Nov 25, 2010 at 2:09 PM, vermansi <[hidden email]> wrote:
>
> >
> > I'm sorry if my post misled you in any way, but by cluster retrieval I mean
> > something like this:
> >
> >
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.83.4177&rep=rep1&type=pdf
> >
> >
> > On Fri, Nov 26, 2010 at 12:55 AM, Ted Dunning [via Lucene] <[hidden email]> wrote:
> >
> > > This is generally referred to as sharding.
> > >
> > > Solr can do this.
> > >
> > > Katta does this as well, with a somewhat more flexible approach. Solr Cloud is
> > > retrofitting a similar approach into Solr.
> > >
> > > On Thu, Nov 25, 2010 at 9:49 AM, vermansi <[hidden email]> wrote:
> > >
> > > >
> > > > Hello,
> > > > I wish to implement a cluster-based retrieval model in Lucene. I haven't gone
> > > > through the code fully and am unaware of any existing implementations of it
> > > > based on Lucene.
> > > > Could someone give me a heads-up on where to begin? There is too much
> > > > code to go through and I have very little time.
> > > >
> > > > My idea is as follows.
> > > >
> > > > The Lucene index should be organized into clusters, i.e. at indexing time
> > > > each Document (D) is assigned to a cluster.
> > > > On submission of a Query (Q), each cluster is searched for relevant documents,
> > > > and the documents from that cluster as well as from other clusters are ranked.
> > >
> > > >
> > > > A brute-force way of implementing it could be:
> > > > 1. Clusters are denoted by a field named "cluster" (C).
> > > > 2. The query words are searched in the cluster field.
> > > > 3. The scoring functions are changed to incorporate the math used in
> > > > cluster retrieval.
> > > > 4. Documents in each cluster are ranked separately and then merged.
> > > >
> > > > The problem with this approach is that many queries will have to be
> > > > created and the result processing will increase considerably.
> > > > If there are other ways of doing it, please let me know.
> > > >
> > > > Regards
> > > > Mansi
> > > >
> > > > --
> > > > View this message in context:
> > > > http://lucene.472066.n3.nabble.com/Cluster-Retrieval-in-Lucene-tp1968500p1968500.html
> > > > Sent from the Lucene - General mailing list archive at Nabble.com.
> > > >
> > >
> > >
> > > ------------------------------
> > >  View message @
> > >
> >
> http://lucene.472066.n3.nabble.com/Cluster-Retrieval-in-Lucene-tp1968500p1968974.html<http://lucene.472066.n3.nabble.com/Cluster-Retrieval-in-Lucene-tp1968500p1968974.html?by-user=t>
> > > To unsubscribe from Cluster Retrieval in Lucene, click here<
> >
> http://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=1968500&code=dmVybWFuc2lAZ21haWwuY29tfDE5Njg1MDB8NDAzMzYxNzU4<http://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=1968500&code=dmVybWFuc2lAZ21haWwuY29tfDE5Njg1MDB8NDAzMzYxNzU4&by-user=t>
>
> > >.
> > >
> > >
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Cluster-Retrieval-in-Lucene-tp1968500p1969844.html
> > Sent from the Lucene - General mailing list archive at Nabble.com.
> >
>
>

-- 
View this message in context: http://lucene.472066.n3.nabble.com/Cluster-Retrieval-in-Lucene-tp1968500p1974373.html
Sent from the Lucene - General mailing list archive at Nabble.com.
