mahout-user mailing list archives

From Grant Ingersoll <>
Subject Re: Taste in production
Date Thu, 28 Aug 2008 20:25:38 GMT

On Aug 28, 2008, at 3:53 PM, Shalin Shekhar Mangar wrote:

> On Fri, Aug 29, 2008 at 1:06 AM, Otis Gospodnetic wrote:
>> I think I see what Grant is talking about - using Solr as, kind of, a
>> Job/TaskTracker, a work dispatcher component.  Is that right, Grant?
>> I'm not sure what advantages that would have over using a system that's
>> specifically built for distributed computation, like Hadoop, but we can
>> think about that when we see an actual need. :)  Interesting thinking.
> I'd love to see Solr use all the interesting stuff that Mahout is
> developing. Also bear in mind that unlike Hadoop, Solr is used in a lot of
> small web shops with not a large amount of data. If these ML techniques can
> be presented to solve relevant use-cases inside Solr -- it will be a huge
> boost to Solr as well as Mahout. Not everyone has a Hadoop cluster.
> Some integration points that I can think of would be a better/faster
> MoreLikeThis (clustering), "Did you mean this..." suggestions (query log
> analysis?), text summarization (instead of dumb highlighting) etc.

I will have more on this all sometime in the near future, as I will be  
working through it for my book, but yeah, Otis and Shalin are on the  
right track.  Solr really is a nice platform for doing lots of things  
beyond search.  It can be a vector server, a spelling server, as well as
a search server, and it doesn't take much to make the leap to other
things as well, like clusters.

Then the trick is to hook in the replication/distribution stuff
