lucene-solr-user mailing list archives

From Shalin Shekhar Mangar <>
Subject Re: Text classification with Solr
Date Mon, 26 Jan 2009 17:44:48 GMT
On Mon, Jan 26, 2009 at 10:59 PM, Neal Richter <> wrote:

> Hey all,
>  I'm in the process of implementing a system to do 'text
> classification' with Solr.  The basic idea is to take an
> ontology/taxonomy like dmoz of {label: "X", tags: "a,b,c,d,e"}, index
> it and then classify documents into the taxonomy by pushing parsed
> document into the Solr search API.  Why?  Lucene/Solr's ability to do
> weighted term boosting at both search and index time has lots of
> obvious uses here.
>  Has anyone worked on this or a similar project yet?  I've seen some
> talk on the list about this area but it's pretty thin... December
> thread "Taxonomy Support on Solr".  I'm assuming Grant Ingersoll is
> looking at similar things with his 'taming text' project.
> I store the 'documents' in another repository and they are far too
> dynamic (write intensive) for direct indexing in Solr... so the
> previously suggested procedure of 1) store document 2) execute
> more-like-this and 3) delete document would be too slow.
> If people are interested I could start a JIRA issue on this (I do not
> see anything there at the moment).
> Thanks - Neal Richter

Grant did some work at

Take a look and see if that helps.

Shalin Shekhar Mangar.
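
The approach Neal describes above (index the taxonomy entries, then classify a document by sending its parsed terms to the search API) could be sketched roughly as follows. This is only an illustration under assumptions, not anything posted in the thread: the field names `label` and `tags` come from Neal's example record, while the function names, the use of raw term frequency as the boost, and the `fl`/`rows` choices are hypothetical. It only builds the query string; sending it to a Solr instance is left out.

```python
import re
from collections import Counter
from urllib.parse import urlencode


def build_classification_query(text, field="tags", max_terms=10):
    """Turn a document into a boosted Lucene query against the taxonomy index.

    Each distinct term becomes a clause `term^boost`, where the boost is
    the term's frequency in the document (an assumed weighting scheme).
    """
    # Keep only lowercase alphanumeric tokens so no Lucene query-syntax
    # characters need escaping.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if not tokens:
        raise ValueError("document contains no usable terms")
    counts = Counter(tokens)
    clauses = [f"{term}^{freq}" for term, freq in counts.most_common(max_terms)]
    return f"{field}:({' '.join(clauses)})"


def build_select_params(text, rows=5):
    """URL-encode parameters for a Solr select request (hypothetical setup)."""
    return urlencode({
        "q": build_classification_query(text),
        "fl": "label,score",  # return the taxonomy label and relevance score
        "rows": rows,         # top-N candidate labels
    })


if __name__ == "__main__":
    print(build_classification_query("solr solr lucene"))
    print(build_select_params("solr solr lucene"))
```

The top-scoring taxonomy entries returned for such a query would be the candidate labels. Because the document is never indexed, this avoids the store / more-like-this / delete cycle Neal wants to get away from.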
