lucene-dev mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Lucene & Solr a one way street?
Date Sun, 13 Mar 2011 19:18:32 GMT

On Mar 13, 2011, at 11:47 AM, Robert Muir wrote:

> On Sun, Mar 13, 2011 at 11:47 AM, Grant Ingersoll <gsingers@apache.org> wrote:
>> I guess the question people w/ Solr only hats on have (if there are such people),
>> is which way is that street going?  It seems like most people want to pull stuff out of Solr,
>> but they don't seem to want to put into it.  That's probably where some of the resistance
>> comes from.  If you want to modularize everything so that you can consume it outside of Solr,
>> it usually means you don't use Solr, which sometimes comes across that you don't care if the
>> modularization actually has a negative effect on Solr.  I'm all for modularization and enabling
>> everyone, but not at the cost of loss of performance in Solr.  As tightly coupled as Solr
>> is, it's pretty damn fast and resilient.  Show me that you keep that whole and I'll be +1
>> on everything.
> 
> Do you have any facts to back up these baseless accusations?

I apologize.  I didn't intend to accuse anyone, if it was read that way.  If you read earlier,
I actually thought the whole merge is going well and that there is some pretty good cross-fertilization
going on.  If I didn't properly convey it here, the accusations are actually against those
who have only Solr hats on.  Hint, I ain't one of them.  It is a concern I've heard from people
in the "don't poach Solr" camp.  I don't think it's the right attitude, but I do think it
is worth mentioning the concern.  I really see Lucene/Solr as a broad continuum of enabling
technologies and really there isn't one or the other in my mind.

> 
> Because I'll tell you how it "seems" to me: lucene committers are
> going well beyond what's required (fixing solr) to commit changes to
> lucene.

I totally agree.  The sum of the parts is really awesome now.

> 
> Take a look at the commits list, we are the ones doing Solr's dirty work:
> * Like Uwe Schindler fixing up tons of XML related bugs in Solr,
> fixing analysis.jsp and the related request handlers.
> * Like Simon Willnauer doing the necessary improvements to IndexReader
> such that SolrIndexReader need not exist, and trying to add good codec
> support to Solr so it can take advantage of flexible indexing.

Yep and he should commit those when he is ready.  

I heartily agree this is great work.

> 
> And I guess I didn't "put any effort into solr" when I spent a huge
> chunk of this weekend tracking down jre crashes and test bugs in a
> Solr cloud test?!

I never said you didn't.  I am totally in awe of the work you are doing.  I wish I had half
the energy and focus of some of the people who commit on a regular basis.

> 
> As far as modularization having a negative performance effect on Solr,
> how is this the case? Again do you have any concrete examples, or is
> this just more baseless accusations?

No, I don't.  I just said those are the concerns.  I tend to agree that they are unfounded.


> 
> Do you have specific benchmarks to where solr's analysis is now
> somehow slower due to the refactoring (since this is the only
> modularization that's happened from solr)?!
> Doesn't look slower to me:
> http://www.lucidimagination.com/search/document/46a8351089a98aec/protwords_txt_support_in_stemmers#46a8351089a98aec

Dude, I think the analysis modularization is awesome.  I'm about to begin porting it to OpenNLP
for instance.  I wish it was more decoupled so I wouldn't have to bring all of Lucene core
over and could just bring the analysis.  Likewise for Mahout.




