incubator-blur-user mailing list archives

From Aaron McCurry <amccu...@gmail.com>
Subject Re: Help needed on SearchExecutor...
Date Thu, 07 Jul 2016 14:35:27 GMT
Yeah I think that would work.

On Thu, Jul 7, 2016 at 9:58 AM, Ravikumar Govindarajan <
ravikumar.govindarajan@gmail.com> wrote:

> I just now looked at IndexSearcherCloseableSecureBase.java
>
> Guess if we want to cap each search request with max-of "n" threads, we can
> plug the above logic into this class directly instead of
> BlurIndexSimpleWriter.java
>
> On Wed, Jun 29, 2016 at 6:04 PM, Ravikumar Govindarajan <
> ravikumar.govindarajan@gmail.com> wrote:
>
> > This is really nice Aaron. You've done the bulk of work already!!!
> >
> > I think parallelism can be provided too for searching a single shard....
> >
> > Just as a quick proposal, we can do a static initialization in
> > BlurIndexSimpleWriter
> >
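> > static LinkedBlockingQueue<ExecutorService> executorQueue =
> >     new LinkedBlockingQueue<>(128 / 4);
> >
> > static {
> >   // 32 pools of 4 threads each, 128 search threads in total
> >   for (int i = 0; i < 128 / 4; i++) {
> >     executorQueue.add(Executors.newFixedThreadPool(4));
> >   }
> > }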
> > ----
> >
> > Incoming search request per-shard...
> >
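> > public IndexSearcher getIndexSearcher() {
> >   .....
> >   // poll() does not block; it returns null when no pool is idle.
> >   ExecutorService current = executorQueue.poll();
> >
> >   if (current == null) {
> >     // All thread-pools are busy or user has explicitly switched off via
> >     // config. Search proceeds in single threaded fashion utilizing the
> >     // calling-thread itself.
> >   }
> >
> >   // Note: current has to be put back into executorQueue once the search
> >   // is done, otherwise the queue runs empty.
> >   return new IndexSearcherCloseable(indexReader, current);
> > }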
> > ---
> >
> > Btw, we can do this by over-riding a single method
> > IndexSearcher.slices(...) in lucene 5.x & above!!!
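
The proposal above borrows a pool per request but does not show how the pool gets back into the queue. A minimal sketch of that borrow/return cycle, using only java.util.concurrent. The names `SearchExecutorPool`, `withExecutor` and `idleCount` are hypothetical and not Blur classes; the 128/4 sizing is taken from the message above, and wiring this into an IndexSearcher is left out.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical helper (not part of Blur): a fixed set of small thread pools
// that search requests borrow and hand back.
final class SearchExecutorPool {
  private final BlockingQueue<ExecutorService> idle;

  SearchExecutorPool(int totalThreads, int threadsPerRequest) {
    int pools = totalThreads / threadsPerRequest;
    // Capacity must be positive even when pools == 0 (feature switched off).
    idle = new LinkedBlockingQueue<>(Math.max(pools, 1));
    for (int i = 0; i < pools; i++) {
      idle.add(Executors.newFixedThreadPool(threadsPerRequest));
    }
  }

  // Runs the search with a borrowed pool, or with null when no pool is idle.
  // On null the caller is expected to search on its own thread.
  <T> T withExecutor(Function<ExecutorService, T> search) {
    ExecutorService current = idle.poll(); // non-blocking, null if all busy
    try {
      return search.apply(current);
    } finally {
      if (current != null) {
        idle.add(current); // hand the pool back for the next request
      }
    }
  }

  int idleCount() {
    return idle.size();
  }

  // Shuts down the pools that are idle at the time of the call.
  void shutdown() {
    for (ExecutorService e : idle) {
      e.shutdown();
    }
  }
}
```

With `new SearchExecutorPool(128, 4)` this gives 32 pools of 4 threads; a 33rd concurrent request gets null and falls back to the calling thread.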
> >
> >
> > On Tue, Jun 28, 2016 at 8:01 PM, Aaron McCurry <amccurry@gmail.com> wrote:
> >
> >> Some time ago I created something similar, it's kinda a backport into
> >> Lucene 4.3:
> >>
> >> https://github.com/apache/incubator-blur/blob/65640200a8e7dd539c1dd4d920255c717102b9b2/blur-query/src/main/java/org/apache/blur/lucene/search/CloneableCollector.java#L25
> >>
> >> It handles the execution of searching the segments in parallel but
> >> doesn't provide any limitations on parallelism.
> >>
> >> Aaron
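
To illustrate the general idea of searching segments in parallel with one collector copy per segment, here is a rough sketch in plain Java. It is not the CloneableCollector code linked above and does not use Lucene; `ParallelSegmentSearch`, `CountingCollector`, `newInstance` and `merge` are hypothetical names, and a "segment" is simply an array of matching doc ids. As in the description above, nothing limits parallelism other than the size of the executor that is passed in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

final class ParallelSegmentSearch {

  // Hypothetical stand-in for a collector that can be copied per segment.
  static final class CountingCollector {
    private long hits;

    CountingCollector newInstance() {
      return new CountingCollector();
    }

    void collect(int docId) {
      hits++;
    }

    void merge(CountingCollector other) {
      hits += other.hits;
    }

    long hits() {
      return hits;
    }
  }

  // Submits one task per segment, then merges the per-segment collectors.
  static long search(List<int[]> segments, ExecutorService executor) {
    CountingCollector root = new CountingCollector();
    List<Future<CountingCollector>> futures = new ArrayList<>();
    for (int[] segment : segments) {
      CountingCollector perSegment = root.newInstance();
      futures.add(executor.submit(() -> {
        for (int doc : segment) {
          perSegment.collect(doc); // only this task touches perSegment
        }
        return perSegment;
      }));
    }
    for (Future<CountingCollector> f : futures) {
      try {
        root.merge(f.get()); // merging happens on the calling thread
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      } catch (ExecutionException e) {
        throw new IllegalStateException(e.getCause());
      }
    }
    return root.hits();
  }
}
```

Because each task writes only to its own collector and the merge runs after `Future.get()`, the collectors themselves need no locking.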
> >>
> >>
> >>
> >> On Tue, Jun 28, 2016 at 6:37 AM, Ravikumar Govindarajan <
> >> ravikumar.govindarajan@gmail.com> wrote:
> >>
> >> > Aaron,
> >> >
> >> > Just an update..
> >> >
> >> > https://issues.apache.org/jira/browse/LUCENE-5299
> >> >
> >> > You can now use any collector & get guaranteed parallel execution. They
> >> > have also provided a "parallelism" hint that will limit the number of
> >> > search threads at request level...
> >> >
> >> > i.e., we can fix blur executor thread-count at 128 & limit "parallelism"
> >> > at a max of 4 threads per request..
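
One way to picture a per-request cap on top of a shared pool, independent of how Lucene implements its hint: wrap the shared executor in a small per-request executor that never has more than `limit` tasks running at once. This is only a sketch under that assumption; `LimitedExecutor` and `LimitedExecutorDemo` are hypothetical names and not Lucene or Blur classes.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-request wrapper around a shared pool.
final class LimitedExecutor implements Executor {
  private final Executor shared;
  private final int limit;
  private final Queue<Runnable> pending = new ArrayDeque<>();
  private int running; // guarded by "this"

  LimitedExecutor(Executor shared, int limit) {
    this.shared = shared;
    this.limit = limit;
  }

  @Override
  public synchronized void execute(Runnable task) {
    pending.add(task);
    drain();
  }

  // Must be called while holding the lock on "this".
  private void drain() {
    while (running < limit && !pending.isEmpty()) {
      Runnable next = pending.poll();
      running++;
      shared.execute(() -> {
        try {
          next.run();
        } finally {
          synchronized (this) {
            running--;
            drain(); // start a queued task, if any
          }
        }
      });
    }
  }
}

final class LimitedExecutorDemo {
  // Runs `tasks` short jobs for one request and reports the highest number
  // of them observed running at the same time.
  static int peakConcurrency(int poolThreads, int limit, int tasks) {
    ExecutorService shared = Executors.newFixedThreadPool(poolThreads);
    Executor perRequest = new LimitedExecutor(shared, limit);
    AtomicInteger active = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    CountDownLatch done = new CountDownLatch(tasks);
    for (int i = 0; i < tasks; i++) {
      perRequest.execute(() -> {
        peak.accumulateAndGet(active.incrementAndGet(), Math::max);
        try {
          Thread.sleep(10);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          active.decrementAndGet();
          done.countDown();
        }
      });
    }
    try {
      done.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      shared.shutdown();
    }
    return peak.get();
  }
}
```

`LimitedExecutorDemo.peakConcurrency(128, 4, 20)` would correspond to the 128-thread pool with at most 4 threads per request mentioned above. The sketch does not handle the shared pool rejecting a task.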
> >> >
> >> > On Fri, Feb 6, 2015 at 5:25 PM, Ravikumar Govindarajan <
> >> > ravikumar.govindarajan@gmail.com> wrote:
> >> >
> >> > > Thanks for the clarifications.
> >> > >
> >> > > Another point I thought about is the disk efficiency of serving
> >> > > random I/O. Many parallel threads could end up hitting just one or two
> >> > > disks in the cluster…
> >> > >
> >> > > Think I can skip it safely for my work-loads.
> >> > >
> >> > > --
> >> > > Ravi
> >> > >
> >> > > On Fri, Feb 6, 2015 at 3:09 PM, Aaron McCurry <amccurry@gmail.com> wrote:
> >> > >
> >> > >> The ServiceExecutor (thread pool) put inside the IndexSearcher was an
> >> > >> attempt at making the segments search in parallel when available.
> >> > >> However there is a limitation in Lucene that does not allow segment
> >> > >> parallel searches when you are using Collectors.
> >> > >>
> >> > >> https://github.com/apache/lucene-solr/blob/lucene_solr_4_3_0/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L595
> >> > >>
> >> > >> We override this method to allow for Tracing:
> >> > >>
> >> > >> https://github.com/apache/incubator-blur/blob/master/blur-core/src/main/java/org/apache/blur/server/IndexSearcherCloseableBase.java#L46
> >> > >>
> >> > >> and here:
> >> > >>
> >> > >> https://github.com/apache/incubator-blur/blob/master/blur-core/src/main/java/org/apache/blur/server/IndexSearcherCloseableSecureBase.java#L51
> >> > >>
> >> > >> I agree that if you are already running a lot of shards per server,
> >> > >> enhancing Lucene to allow for parallel searching of segments could
> >> > >> become counterproductive.  I have seen underutilized systems that
> >> > >> could take advantage of the parallel segment search, so as with any
> >> > >> feature like this, it depends.  :-)
> >> > >>
> >> > >> Aaron
> >> > >>
> >> > >> On Fri, Feb 6, 2015 at 2:39 AM, Ravikumar Govindarajan <
> >> > >> ravikumar.govindarajan@gmail.com> wrote:
> >> > >>
> >> > >> > Blur by default uses a SearchExecutor for IndexSearcher. I believe
> >> > >> > lucene helps searching segments of a single shard in parallel.
> >> > >> >
> >> > >> > Our previous index was built on a lower version of lucene where such
> >> > >> > a feature was absent and we ran sequential search per shard only…
> >> > >> >
> >> > >> > What is the general recommendation for blur? Is it advisable to use
> >> > >> > the SearchExecutor? What will happen when there are many parallel
> >> > >> > queries for different shards? Will SearchExecutor become a bottleneck?
> >> > >> >
> >> > >> > Any help is much appreciated...
> >> > >> >
> >> > >> > --
> >> > >> > Ravi
> >> > >> >
> >> > >>
> >> > >
> >> > >
> >> >
> >>
> >
> >
>
