lucene-java-user mailing list archives

From Anshum <ansh...@gmail.com>
Subject Re: Binding lucene instance/threads to a particular processor(or core)
Date Tue, 22 Apr 2008 04:10:17 GMT
The paper seems pretty good, but I am still wondering whether there is a way to
achieve this through command-line parameters. I'm trying this to optimize the
code; if it works I will let everyone know, and either way I will keep the list
informed :)
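
On the command-line question, here is a minimal sketch of one possible route.
It comes from neither the paper nor Lucene: the idea is to wrap the JVM launch
in the operating system's own binding tool rather than look for a JVM flag.
The sketch only builds the command line. It assumes `taskset -c` on Linux and
`psrset -e` on Solaris (the tool Antony mentions below; the processor set must
already exist, and using it may need extra privileges). The class name, the
helper `boundCommand`, the heap setting and `search-server.jar` are all
hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AffinityLaunch {

    /**
     * Builds the argument list for starting a JVM under an OS-level binding tool.
     * cpuOrSet is a CPU number on Linux and a processor-set id on Solaris.
     * On any other OS no wrapper is added and the JVM starts unbound.
     */
    static List<String> boundCommand(String osName, int cpuOrSet, List<String> javaArgs) {
        List<String> cmd = new ArrayList<String>();
        String os = osName.toLowerCase();
        if (os.contains("linux")) {
            // taskset -c <cpu-list> <command...>
            cmd.addAll(Arrays.asList("taskset", "-c", String.valueOf(cpuOrSet)));
        } else if (os.contains("sunos")) {
            // psrset -e <set-id> <command...>
            cmd.addAll(Arrays.asList("psrset", "-e", String.valueOf(cpuOrSet)));
        }
        cmd.add("java");
        cmd.addAll(javaArgs);
        return cmd;
    }

    public static void main(String[] args) {
        // Hypothetical JVM arguments for a search server.
        List<String> javaArgs = Arrays.asList("-Xmx2g", "-jar", "search-server.jar");
        List<String> cmd = boundCommand(System.getProperty("os.name"), 0, javaArgs);

        // Print only; to actually launch, pass cmd to new ProcessBuilder(cmd).
        StringBuilder line = new StringBuilder();
        for (String part : cmd) {
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(part);
        }
        System.out.println(line);
    }
}
```

Printing the command and running it by hand first is an easy way to check the
binding before handing the list to ProcessBuilder.
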
Any other suggestions for handling a load of more than 7 search requests
per second against an index of over 15 GB containing over 13 million
records?
Also, could someone help me with obtaining an 'index size' - 'concurrency' -
'processor power' - 'memory' relationship formula (or something similar)?
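
As a starting point for the concurrency part only, Little's law relates the
average number of requests in flight to the arrival rate and the average time
a request spends in the system: in flight = rate x latency. It says nothing
about index size or memory, which would still have to be measured. In the
sketch below the 7 requests per second comes from the question above; the
0.5 second latency and the headroom factor of 2 are hypothetical placeholders,
and the class and method names are made up for illustration.

```java
public class SearchSizing {

    /** Little's law: mean requests in flight = arrival rate * mean time in system. */
    static double requestsInFlight(double arrivalsPerSecond, double meanLatencySeconds) {
        return arrivalsPerSecond * meanLatencySeconds;
    }

    /**
     * Rough worker-thread count: the in-flight load multiplied by a headroom
     * factor, rounded up. The headroom factor is a rule-of-thumb assumption,
     * not something derived from the law itself.
     */
    static int suggestedThreads(double arrivalsPerSecond, double meanLatencySeconds, double headroom) {
        return (int) Math.ceil(requestsInFlight(arrivalsPerSecond, meanLatencySeconds) * headroom);
    }

    public static void main(String[] args) {
        double rate = 7.0;     // search requests per second, from the question
        double latency = 0.5;  // hypothetical mean search time in seconds; measure the real one
        double headroom = 2.0; // hypothetical safety factor for bursts

        System.out.println("Mean requests in flight: " + requestsInFlight(rate, latency));
        System.out.println("Suggested worker threads: " + suggestedThreads(rate, latency, headroom));
    }
}
```

Measured latency under realistic load should replace the placeholder before
the result is used for anything.
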

--
Anshum


On Tue, Apr 22, 2008 at 3:55 AM, Antony Bowesman <adb@teamware.com> wrote:

> That paper from 1997 is pretty old, but mirrors our experiences in those
> days. Then, we used Solaris processor sets to really improve performance by
> binding one of our processes to a particular CPU while leaving the other
> CPUs to manage the thread-intensive work.
>
> You can bind processes/LWPs to a CPU on Solaris with psrset.
>
> The Solaris thread model in the late '90s was also a significant factor in
> performance of multi-threaded programs.  The default thread library in
> Solaris 8 implemented an MxN unbound thread model (threads/LWPs).  In those
> days we found that it did not perform well, so we used the bound thread model
> (i.e. 1:1) where a Solaris thread was bound permanently to an LWP.  That
> improved performance a lot.  In Solaris 8, Sun had what they called the
> 'alternate' thread library (T2) around 2000, which became the default
> library in Solaris 9, and implemented a 1:1 model of Solaris threads to
> LWPs.  That new library had dramatic performance improvements over the old.
>
> Some background info for Java and threading
>
> http://java.sun.com/j2se/1.5.0/docs/guide/vm/thread-priorities.html
>
> Antony
>
>
>
> Glen Newton wrote:
>
> > I realised that not everyone on this list might be able to access the
> > IEEE paper I pointed out, so I have included the abstract and some
> > paragraphs from the paper below.
> >
> > Also of interest (and should be available to all): Fedorova et al.
> > 2005. Performance of Multithreaded Chip Multiprocessors And
> > Implications For Operating System Design. Usenix 2005.
> > http://www.eecs.harvard.edu/margo/papers/usenix05/paper.pdf
> > "Abstract: We investigated how operating system design should be
> > adapted for multithreaded chip multiprocessors (CMT) – a new
> > generation of processors that exploit thread-level parallelism to mask
> > the memory latency in modern workloads. We
> > determined that the L2 cache is a critical shared resource on CMT and
> > that an insufficient amount of L2 cache can undermine the ability to
> > hide memory latency on these processors. To use the L2 cache as
> > efficiently as possible, we propose an L2-conscious scheduling
> > algorithm and quantify its performance potential. Using this algorithm
> > it is possible to reduce miss ratios in the L2 cache by 25-37% and
> > improve processor throughput by 27-45%."
> >
> >
> > From Lundberg, L. 1997:
> > Abstract: "The default scheduling algorithm in Solaris and other
> > operating systems may result in frequent relocation of threads at
> > run-time. Excessive thread relocation cause
> > poor memory performance. This can be avoided by binding threads to
> > processors. However, binding threads to processors may result in an
> > unbalanced load. By considering a previously obtained theoretical
> > result and by evaluating a set of multithreaded Solaris
> > programs using a multiprocessor with 8 processors, we are able to
> > bound the maximum performance loss due to binding threads. The
> > theoretical result is also recapitulated. By evaluating another set of
> > multithreaded programs, we show that the gain of binding threads to
> > processors may be substantial, particularly for programs with fine
> > grained parallelism."
> >
> > First paragraph: "The thread concept in Solaris [3][5] and other
> > operating systems makes it possible to write multithreaded programs,
> > which can be executed in parallel on a multiprocessor. Previous
> > experience from real world programs [4] show that, using the default
> > scheduling algorithm in Solaris, threads are frequently relocated from
> > one processor
> > to another at run-time. After each such relocation, the code and data
> > associated with the relocated thread is moved from the cache memory of
> > the old processor to the cache of the new processor. This reduces the
> > performance. Run-time relocation of threads to processors can also
> > result in unpredictable response times. This is a problem in systems
> > which operate in a real-time environment. In order to avoid poor
> > memory performance and unpredictable real-time behaviour due to
> > frequent thread relocation, threads can be bound to processors using
> > the processor-bind directive [3] [5]. The major problem with binding
> > threads is that one can end up with an unbalanced load, i.e. some
> > processors may be extremely busy during some time periods while other
> > processors are idle."
> >
> > -Glen
> >
> > On 21/04/2008, Glen Newton <glen.newton@gmail.com> wrote:
> >
> > > And this discussion on bound threads may also shed light on things:
> > >
> > > http://coding.derkeiler.com/Archive/Java/comp.lang.java.programmer/2007-11/msg02801.html
> > >
> > >
> > >  -Glen
> > >
> > >
> > >  On 21/04/2008, Glen Newton <glen.newton@gmail.com> wrote:
> > >  > Binding threads to processors - in many situations - improves
> > >  >  throughput by reducing memory overhead. When a thread is running on a
> > >  >  core, its state is local; if it is timeshared-out and either 1)
> > >  >  swapped back in on the same core, it is likely that there will be a
> > >  >  hit in the core's L1 cache; or 2) onto another core, there will not
> > >  >  be a cache hit and the state will have to be fetched from L2 or main
> > >  >  memory, incurring a performance hit, esp. in the latter. See Lundberg,
> > >  >  L. 1997. Evaluating the Performance Implications of Binding Threads to
> > >  >  Processors. 393.
> > >  >  http://ieeexplore.ieee.org/iel3/5020/13768/00634520.pdf
> > >  >  for more info.
> > >  >
> > >  >  If you are using JVM on Solaris on SPARC, you should take a look at
> > >  >  the following links for tuning (the Sun JVM on Solaris SPARC has many
> > >  >  more performance tuning parameters available), including threading:
> > >  >  - http://java.sun.com/docs/hotspot/threads/threads.html
> > >  >  - http://java.sun.com/j2se/1.5.0/docs/guide/vm/thread-priorities.html
> > >  >  - http://www-1.ibm.com/support/docview.wss?rs=180&context=SSEQTP&uid=swg21107291
> > >  >  - http://java.sun.com/javase/technologies/performance.jsp
> > >  >
> > >  >
> > >  >  -Glen
> > >  >
> > >  >
> > >  >
> > >  >
> > >  >
> > >  >
> > >  >
> > >  >  On 21/04/2008, Ulf Dittmer <udittmer@yahoo.com> wrote:
> > >  >  > This sounds odd. Why would restricting it to a single
> > >  >  >  core improve performance? The point of using multiple
> > >  >  >  cores (and multiple threads) is to improve performance
> > >  >  >  isn't it? I'd leave thread scheduling decisions to the
> > >  >  >  JVM. Plus, I don't think there is anything in Java to
> > >  >  >  facilitate this (short of using JNI).
> > >  >  >
> > >  >  >  Are you talking about indexing or searching? You may
> > >  >  >  be able to use multiple parallel threads to improve
> > >  >  >  indexing performance. I don't think Lucene uses
> > >  >  >  multi-threading for searching; not unless you have
> > >  >  >  multiple indices, anyway.
> > >  >  >
> > >  >  >  Ulf
> > >  >  >
> > >  >  >
> > >  >  >
> > >  >  >  --- Anshum <anshumg@gmail.com> wrote:
> > >  >  >
> > >  >  >  > Hi,
> > >  >  >  >
> > >  >  >  > I have been trying to bind my lucene instance (JVM -
> > >  >  >  > Sun Hotspot*) to a
> > >  >  >  > particular core so as to improve the performance. Is
> > >  >  >  > there a way to do so or
> > >  >  >  > is there support in lucene to explicitly control the
> > >  >  >  > thread - processor
> > >  >  >  > linkup?
> > >  >  >  >
> > >  >  >  > --
> > >  >  >  > --
> > >  >  >  > The facts expressed here belong to everybody, the
> > >  >  >  > opinions to me.
> > >  >  >  > The distinction is yours to draw............
> > >  >  >  >
> > >  >  >
> > >  >  >
> > >  >  >
> > >  >  >
> > >  >  >
> > >  >  >
> > >  >  >
> > >  >  >  ---------------------------------------------------------------------
> > >  >  >  To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >  >  >  For additional commands, e-mail: java-user-help@lucene.apache.org
> > >  >  >
> > >  >  >
> > >  >
> > >  >
> > >  >
> > >
> > >
> > >
> > >
> >
> >
>
>
>


-- 
--
The facts expressed here belong to everybody, the opinions to me.
The distinction is yours to draw............
