lucene-solr-user mailing list archives

From Jaeyoung Yoon <jaeyoungy...@gmail.com>
Subject Re: Shared Directory for two Solr Clouds(Writer and Reader)
Date Tue, 21 Oct 2014 04:34:50 GMT
In my case, the ingest rate is very high (above 300K docs/sec) and data is
inserted continuously, so CPU is already a bottleneck because of indexing.

Older-style master/slave replication over HTTP or scp takes a long time to
copy big files from master to slave.

That's why I set up two separate Solr Clouds: one for indexing and the other
for queries.

Thanks,
Jae

On Mon, Oct 20, 2014 at 6:22 PM, Erick Erickson <erickerickson@gmail.com>
wrote:

> I guess I'm not quite sure what the point is. So can you back up a bit
> and explain what problem this is trying to solve? Because all it
> really appears to be doing that's not already done with stock Solr
> is saving some disk space, and perhaps your "reader" SolrCloud
> is having some more cycles to devote to serving queries rather
> than indexing.
>
> So I'm curious why
> 1> standard SolrCloud with selective hard and soft commits doesn't
> satisfy the need
> and
> 2> If <1> is not reasonable, why older-style master/slave replication
> doesn't work.
>
> Unless there's a compelling use-case for this, it seems like there's
> a lot of complexity here for questionable value.
>
> Please note I'm not saying this is a bad idea. It would just be good
> to understand what problem it's trying to solve. I'm reluctant to
> introduce complexity without discussing the use-case. Perhaps
> the existing code could provide a "good enough" solution.
>
> Best,
> Erick
>
> On Mon, Oct 20, 2014 at 7:35 PM, Jaeyoung Yoon <jaeyoungyoon@gmail.com>
> wrote:
> > Hi Folks,
> >
> > Here are some of my ideas for using a shared file system with two separate
> > Solr Clouds (a Writer Solr Cloud and a Reader Solr Cloud).
> >
> > I would appreciate your feedback.
> >
> > For a prototype, I set up two separate Solr Clouds (one for the Writer and
> > the other for the Reader).
> >
> > The big picture of my prototype is as follows.
> >
> > 1. The Reader and Writer Solr Clouds share the same directory.
> > 2. The Writer Solr Cloud sends an "openSearcher" command to the Reader
> > Solr Cloud from inside its postCommit event handler. That is, when new
> > data is added to the Writer Solr Cloud, it sends its own openSearcher
> > command to the Reader Solr Cloud.
> > 3. The Reader opens a searcher only when it receives an "openSearcher"
> > command from the Writer Solr Cloud.
> > 4. The Writer has its own deletionPolicy to keep old commit points, which
> > might still be used by queries running on the Reader Solr Cloud when a new
> > searcher is opened there.
> > 5. The Reader performs no updates and no commits. Everything on the Reader
> > Solr Cloud is read-only. It also creates its searcher from the directory,
> > not from the indexer (nrtMode=false).
> >
> > That is,
> > in the Writer Solr Cloud, I added a postCommit event listener. Inside the
> > listener, the writer sends its own "openSearcher" command to the Reader
> > Solr Cloud's own handler. The Reader Solr Cloud then opens a searcher
> > directly, without a commit, and responds to the writer's request.
> >
> > With this approach, the Writer and Reader can use the same commit points
> > in the shared file system in a synchronous way.
> > When a Reader Solr Cloud starts, it doesn't open a searcher itself.
> > Instead, the Writer Solr Cloud watches the Reader Solr Cloud's ZooKeeper;
> > on any change in the Reader Solr Cloud, the writer sends an "openSearcher"
> > command to it.
> >
> > Does it make sense? Or am I missing something important?
> >
> > Any feedback would be very helpful to me.
> >
> > Thanks,
> > Jae
>
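
The commit / openSearcher / deletion-policy handshake described in the quoted
proposal can be illustrated with a small toy model. This is only a sketch in
plain Python, not Solr code: the class names (SharedIndex, Reader, Writer) and
every method on them are hypothetical, and commit points are modelled as
integer generations. It also assumes the writer can see which generations the
readers still use; the proposal does not say how its deletionPolicy decides
this, and a real setup might instead retain commits by count or age.

```python
from dataclasses import dataclass, field


@dataclass
class SharedIndex:
    """Toy stand-in for the shared directory: commit generations on disk."""
    commits: list = field(default_factory=list)  # oldest first


class Reader:
    """Read-only side: opens a searcher only when the writer asks it to."""

    def __init__(self, index):
        self.index = index
        self.current = None   # generation used by the newest searcher
        self.in_flight = {}   # generation -> number of running queries

    def open_searcher(self, generation):
        # Triggered by the writer after its commit; the reader never commits.
        if generation not in self.index.commits:
            raise ValueError("commit point is not in the shared directory")
        self.current = generation
        return generation     # acknowledgement returned to the writer

    def start_query(self):
        gen = self.current
        self.in_flight[gen] = self.in_flight.get(gen, 0) + 1
        return gen

    def finish_query(self, gen):
        self.in_flight[gen] -= 1
        if self.in_flight[gen] == 0:
            del self.in_flight[gen]

    def generations_in_use(self):
        used = set(self.in_flight)
        if self.current is not None:
            used.add(self.current)
        return used


class Writer:
    """Indexing side: commits, notifies readers, then prunes old commits."""

    def __init__(self, index, readers):
        self.index = index
        self.readers = readers
        self.next_gen = 1

    def commit(self):
        gen = self.next_gen
        self.next_gen += 1
        self.index.commits.append(gen)
        # "postCommit" step: ask every reader to open a searcher on gen.
        for reader in self.readers:
            reader.open_searcher(gen)
        self._prune()
        return gen

    def _prune(self):
        # Assumed deletion policy: keep the newest commit plus any commit
        # a reader still has open or is still querying.
        keep = {self.index.commits[-1]}
        for reader in self.readers:
            keep |= reader.generations_in_use()
        self.index.commits = [g for g in self.index.commits if g in keep]
```

In this model a query that started before a commit keeps its old generation
alive until it finishes, which is the situation step 4 of the proposal is
meant to cover.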
