Return-Path:
Delivered-To: apmail-lucene-java-dev-archive@www.apache.org
Received: (qmail 78944 invoked from network); 27 Nov 2009 15:53:22 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3)
    by minotaur.apache.org with SMTP; 27 Nov 2009 15:53:22 -0000
Received: (qmail 98647 invoked by uid 500); 27 Nov 2009 15:53:21 -0000
Delivered-To: apmail-lucene-java-dev-archive@lucene.apache.org
Received: (qmail 98564 invoked by uid 500); 27 Nov 2009 15:53:21 -0000
Mailing-List: contact java-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: java-dev@lucene.apache.org
Delivered-To: mailing list java-dev@lucene.apache.org
Received: (qmail 98556 invoked by uid 99); 27 Nov 2009 15:53:21 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
    by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 27 Nov 2009 15:53:21 +0000
X-ASF-Spam-Status: No, hits=-2.6 required=5.0
    tests=AWL,BAYES_00,HTML_MESSAGE
X-Spam-Check-By: apache.org
Received-SPF: pass (athena.apache.org: domain of erickerickson@gmail.com designates 209.85.219.225 as permitted sender)
Received: from [209.85.219.225] (HELO mail-ew0-f225.google.com) (209.85.219.225)
    by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 27 Nov 2009 15:53:18 +0000
Received: by ewy25 with SMTP id 25so1862264ewy.5
    for ; Fri, 27 Nov 2009 07:52:56 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=gmail.com; s=gamma;
    h=domainkey-signature:mime-version:received:in-reply-to:references
    :date:message-id:subject:from:to:content-type;
    bh=g6od76WWlJWiDL1dr9YHBFE6A6WHXijQvnHkfd8beaQ=;
    b=FG4hFt0quzLm4WEZUqymj4u216zNj2EkM9YU7NXIjp4hbZ2IMzirBPCjxXs/XsXQJ5
    DWnlARrcsycRQz7/CONx+8FV02CKz+GC0mycklxwbZiz0NUkzUmoK04tGzwt/ZFVnyUg
    c+6h9GdnAoUp4VdchXh9udCoUm+tsue3vMC0c=
DomainKey-Signature: a=rsa-sha1; c=nofws;
    d=gmail.com; s=gamma;
    h=mime-version:in-reply-to:references:date:message-id:subject:from:to
    :content-type;
    b=QKGLNZfiegXUiHXzUazY7aXBkooqtlNEWg2O1sInLgrtyVD1j/nsxQdUOPiEgboopw
    f8OjID2Ovq6l31Dj9zhvCbj70vjgM4xqwqIGGCed6471v0wsmZYfvQBNBqgVh7euRqHI
    eHz/TLzOWN7L3i/egSoow0IEdkzvbDMpGrP50=
MIME-Version: 1.0
Received: by 10.216.90.1 with SMTP id d1mr382275wef.136.1259337176302; Fri, 27 Nov 2009 07:52:56 -0800 (PST)
In-Reply-To: <359a92830911260738p478f3729i36867470886c6cc8@mail.gmail.com>
References: <359a92830911251832o5f58cff4xeac1cb4b8f4bcbb4mkm@mail.gmail.com>
    <9ac0c6aa0911260224l125be689nb80356e3938d8e90@mail.gmail.com>
    <192011AF-E9DD-4B3B-995E-7BA8F05E5DF1@gmail.com>
    <359a92830911260738p478f3729i36867470886c6cc8@mail.gmail.com>
Date: Fri, 27 Nov 2009 10:52:56 -0500
Message-ID: <359a92830911270752m41b78055tf229ad97448740c3@mail.gmail.com>
Subject: Re: (LUCENE-1844) Speed up junit tests
From: Erick Erickson
To: java-dev@lucene.apache.org
Content-Type: multipart/alternative; boundary=0016e6d99f260bc49e04795c4984

--0016e6d99f260bc49e04795c4984
Content-Type: text/plain; charset=ISO-8859-1

But then I got to thinking..... I admit I've only scratched the
surface of the JUnit4 parallelization stuff. That said, it
seems like the real benefit comes from making use of
multiple cores, we don't get huge speedups just from
running multiple threads at once on a single core. Which
makes sense if you're not doing much in the way of I/O.

This notion was inspired by the "scary Python script"
comment.....

So what if we use Ant ForEach construct instead? Yet
again this is a fuzzy idea I'm throwing out without much
to back it up. Mostly I'm wondering if anyone's thought about
it before or can shoot it down before it takes wing. Or if
it is worth exploring.
Assuming we structure our test directories so there are only
directories at the root of the test area, could we persuade Ant
to fire off the tests N directories at a time in parallel?
N would default to 1 but could be passed in to the task, something
like -DmaxThreads=4. ForEach actually has a maxThreads
parameter.....
In fact, we wouldn't even need to have only directories
at the test root, but the individual test files at the root would probably
be inefficiently run.

I suspect that keeping the test directories in balance would be
much less work than trying to parallelize using JUnit4, and be
much less fraught with gremlins. This assumes we get
sufficient isolation by Ant running separate threads, about
which I have absolutely NO information. Like I said, mostly
I'm wondering if anybody's gone down this path before and
has wisdom to offer.

Which *still* doesn't mean we shouldn't do whatever we can
to speed up individual tests, but looking at the timings there's
no obvious low-hanging fruit....

I wonder if we could somehow run the various directories in
time order, longest-to-shortest in the hope that all the threads
would finish up "close enough" to the same time. I haven't
thought about *how* to make this happen yet though....

Anyway, I'll be happy to pursue this if y'all think it has merit,
let me know and I'll open a JIRA and take it on. For the
benefit of those aforementioned *real* people with *real*
machines, who I'll rely upon to help test this notion....

Is the poor man's version of this on a dual-core machine
just running "test-core" and "test-contrib" in two separate
windows?

Best
Erick

On Thu, Nov 26, 2009 at 10:38 AM, Erick Erickson wrote:

> Despite my long rambling, I agree that speeding things up is worthwhile.
> Just not a huge deal for some of us poor peons who are on dinky little
> 2-core machines and feel inadequate even *talking* to people who have
> *real* machines <G>...
>
> Time to go get ready to eat Turkey....
>
> Erick
>
>
> On Thu, Nov 26, 2009 at 9:02 AM, Mark Miller wrote:
>
>> right - as soon as you have to start running the tests often enough, any
>> decent savings turns into less waiting and more work. Waiting for tests to
>> run is time that could be better spent elsewhere. And many of us run the
>> tests *a lot* considering how long they take.
>> And we will only keep adding more and will continue to do so.
>>
>> Also, many of us *are* on multicore and should be able to benefit from it.
>> I don't dev on anything less than 4 cores these days. It's a life changer :)
>> and cheap currently. I'd like 8.
>>
>> - Mark
>>
>> http://www.lucidimagination.com (mobile)
>>
>>
>> On Nov 26, 2009, at 5:24 AM, Michael McCandless <lucene@mikemccandless.com> wrote:
>>
>>> I still think there's value to faster tests, even if they don't become
>>> so fast as to enable "fully interactive testing".
>>>
>>> Plus, this is an ongoing goal with time, not a one-time event. As we
>>> create tests we should generally try to maximize coverage and minimize
>>> CPU cost, as long as the effort is smallish.
>>>
>>> Mike
>>>
>>> On Wed, Nov 25, 2009 at 9:32 PM, Erick Erickson wrote:
>>>
>>>> I posted a rather long diatribe outlining why I think speed-ups
>>>> are a false goal for Lucene. Briefly, I'm convinced that as long
>>>> as the tests are run when Hudson builds Lucene, 99% of the
>>>> value of unit tests is realized. I suppose this implies that the
>>>> hard-core committers agree that as long as failed tests
>>>> are caught/corrected within a day things are fine.
>>>>
>>>> Although coming from a background where unit
>>>> tests are not always required, my viewpoint may be
>>>> suspect <G>.
>>>>
>>>> Erick@NotToBeConfusedWithHatcher.com....
>>>>
>>>> On Wed, Nov 25, 2009 at 8:43 PM, Michael McCandless (JIRA) <jira@apache.org> wrote:
>>>>
>>>>> [ https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782716#action_12782716 ]
>>>>>
>>>>> Michael McCandless commented on LUCENE-1844:
>>>>> --------------------------------------------
>>>>>
>>>>> Will we also speed up back-compat tests?
>>>>>
>>>>>> Speed up junit tests
>>>>>> --------------------
>>>>>>
>>>>>>                 Key: LUCENE-1844
>>>>>>                 URL: https://issues.apache.org/jira/browse/LUCENE-1844
>>>>>>             Project: Lucene - Java
>>>>>>          Issue Type: Improvement
>>>>>>            Reporter: Mark Miller
>>>>>>         Attachments: FastCnstScoreQTest.patch,
>>>>>> hi_junit_test_runtimes.png, LUCENE-1844.patch
>>>>>>
>>>>>>
>>>>>> As Lucene grows, so does the number of JUnit tests. This is obviously
>>>>>> a good thing, but it comes with longer and longer test times. Now that we
>>>>>> also run back compat tests in a standard test run, this problem is
>>>>>> essentially doubled.
>>>>>> There are some ways this may get better, including running parallel
>>>>>> tests. You will need the hardware to fully take advantage, but it should be
>>>>>> a nice gain. There is already an issue for this, and Junit 4.6, 4.7 have the
>>>>>> beginnings of something we might be able to count on soon. 4.6 was buggy,
>>>>>> and 4.7 still doesn't come with nice ant integration. Parallel tests will
>>>>>> come though.
>>>>>> Beyond parallel testing, I think we also need to concentrate on
>>>>>> keeping our tests lean. We don't want to sacrifice coverage or quality, but
>>>>>> I'm sure there is plenty of fat to skim.
>>>>>> I've started making a list of some of the longer tests - I think with
>>>>>> some work we can make our tests much faster - and then with parallelization,
>>>>>> I think we could see some really great gains.
>>>>>
>>>>> --
>>>>> This message is automatically generated by JIRA.
>>>>> -
>>>>> You can reply to this email to add a comment to the issue online.
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>>>
>>>>
>>>
>>
>
--0016e6d99f260bc49e04795c4984
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

But then I got to thinking..... I admit I've only scratched the
surface of the JUnit4 parallelization stuff. That said, it
seems like the real benefit comes from making use of
multiple cores, we don't get huge speedups just from
running multiple threads at once on a single core. Which
makes sense if you're not doing much in the way of I/O.

This notion was inspired by the "scary Python script"
comment.....

So what if we use Ant ForEach construct instead? Yet
again this is a fuzzy idea I'm throwing out without much
to back it up. Mostly I'm wondering if anyone's thought about
it before or can shoot it down before it takes wing. Or if
it is worth exploring.
Assuming we structure our test directories so there are only
directories at the root of the test area, could we persuade Ant
to fire off the tests N directories at a time in parallel?
N would default to 1 but could be passed in to the task, something
like -DmaxThreads=4. ForEach actually has a maxThreads
parameter..... In fact, we wouldn't even need to have only directories
at the test root, but the individual test files at the root would probably
be inefficiently run.
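
A minimal sketch of the "N directories at a time" idea, written in Python rather than Ant so that it can be run on its own. The names run_in_parallel and fake_test_run and the directory list are hypothetical. A real runner would launch the test command for each directory (for example as a separate process) instead of the stand-in used here, and whether that gives sufficient isolation is exactly the open question raised above.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(directories, run_one, max_threads=1):
    """Call run_one(directory) for every directory, at most max_threads at once.

    Returns a dict mapping each directory to whatever run_one returned.
    max_threads defaults to 1, i.e. the plain serial behaviour.
    """
    directories = list(directories)
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # pool.map yields results in the same order as the inputs.
        results = list(pool.map(run_one, directories))
    return dict(zip(directories, results))


def fake_test_run(directory):
    # Hypothetical stand-in for "run all JUnit tests under this directory".
    return f"ran tests in {directory}"


if __name__ == "__main__":
    dirs = ["index", "search", "store", "util"]  # hypothetical test directories
    outcomes = run_in_parallel(dirs, fake_test_run, max_threads=4)
    for name, outcome in outcomes.items():
        print(name, "->", outcome)
```

The max_threads argument plays the role of the -DmaxThreads=4 property mentioned above.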

I suspect that keeping the test directories in balance would be
much less work than trying to parallelize using JUnit4, and be
much less fraught with gremlins. This assumes we get
sufficient isolation by Ant running separate threads, about
which I have absolutely NO information. Like I said, mostly
I'm wondering if anybody's gone down this path before and
has wisdom to offer.

Which *still* doesn't mean we shouldn't do whatever we can
to speed up individual tests, but looking at the timings there's
no obvious low-hanging fruit....

I wonder if we could somehow run the various directories in
time order, longest-to-shortest in the hope that all the threads
would finish up "close enough" to the same time. I haven't
thought about *how* to make this happen yet though....
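
One way the longest-to-shortest ordering could work, sketched under the assumption that per-directory timings from an earlier run are available. The function name and the timing numbers are made up for illustration. The technique is the usual greedy longest-first heuristic: hand the next-longest directory to whichever thread is currently least loaded.

```python
import heapq


def longest_first_schedule(durations, workers):
    """Greedily assign directories to workers, longest-running first.

    durations: dict of directory name -> estimated seconds (assumed to be
    known from an earlier run). Returns (assignment, makespan), where
    assignment holds one list of directory names per worker and makespan
    is the busy time of the most heavily loaded worker.
    """
    assignment = [[] for _ in range(workers)]
    # Min-heap of (accumulated seconds, worker index), so the least-loaded
    # worker is always popped first.
    loads = [(0.0, i) for i in range(workers)]
    heapq.heapify(loads)
    for name in sorted(durations, key=durations.get, reverse=True):
        load, i = heapq.heappop(loads)
        assignment[i].append(name)
        heapq.heappush(loads, (load + durations[name], i))
    makespan = max(load for load, _ in loads)
    return assignment, makespan


if __name__ == "__main__":
    # Invented timings in seconds, for illustration only.
    timings = {"index": 300, "search": 200, "store": 120, "util": 80, "analysis": 60}
    plan, makespan = longest_first_schedule(timings, workers=2)
    print(plan, makespan)
```

With these invented timings, each of the two threads ends up busy for 380 seconds, against 760 seconds for a serial run.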

Anyway, I'll be happy to pursue this if y'all think it has merit,
let me know and I'll open a JIRA and take it on. For the
benefit of those aforementioned *real* people with *real*
machines, who I'll rely upon to help test this notion....

Is the poor man's version of this on a dual-core machine
just running "test-core" and "test-contrib" in two separate
windows?

Best
Erick

On Thu, Nov 26, 2009 at 10:38 AM, Erick Erickson <erickerickson@gmail.com> wrote:
Despite my long rambling, I agree that speeding things up is worthwhile. Just
not a huge deal for some of us poor peons who are on dinky little 2-core
machines and feel inadequate even *talking* to people who have *real*
machines <G>...

Time to go get ready to eat Turkey....

Erick


On Thu, Nov 26, 2009 at 9:02 AM, Mark Miller <markrmiller@gmail.com> wrote:
right - as soon as you have to start running the tests often enough, any
decent savings turns into less waiting and more work. Waiting for tests to
run is time that could be better spent elsewhere. And many of us run the
tests *a lot* considering how long they take. And we will only keep adding
more and will continue to do so.

Also, many of us *are* on multicore and should be able to benefit from it.
I don't dev on anything less than 4 cores these days. It's a life changer :)
and cheap currently. I'd like 8.

- Mark

http://www.lucidimagination.com (mobile)


On Nov 26, 2009, at 5:24 AM, Michael McCandless <lucene@mikemccandless.com> wrote:

I still think there's value to faster tests, even if they don't become
so fast as to enable "fully interactive testing".

Plus, this is an ongoing goal with time, not a one-time event. As we
create tests we should generally try to maximize coverage and minimize
CPU cost, as long as the effort is smallish.

Mike

On Wed, Nov 25, 2009 at 9:32 PM, Erick Erickson <erickerickson@gmail.com> wrote:
I posted a rather long diatribe outlining why I think speed-ups
are a false goal for Lucene. Briefly, I'm convinced that as long
as the tests are run when Hudson builds Lucene, 99% of the
value of unit tests is realized. I suppose this implies that the
hard-core committers agree that as long as failed tests
are caught/corrected within a day things are fine.

Although coming from a background where unit
tests are not always required, my viewpoint may be
suspect <G>.

Erick@NotToBeConfusedWithHatcher.com....

On Wed, Nov 25, 2009 at 8:43 PM, Michael McCandless (JIRA)
<jira@apache.org> wrote:


  [ https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782716#action_12782716 ]

Michael McCandless commented on LUCENE-1844:
--------------------------------------------

Will we also speed up back-compat tests?

Speed up junit tests
--------------------

               Key: LUCENE-1844
               URL: https://issues.apache.org/jira/browse/LUCENE-1844
           Project: Lucene - Java
        Issue Type: Improvement
          Reporter: Mark Miller
       Attachments: FastCnstScoreQTest.patch,
hi_junit_test_runtimes.png, LUCENE-1844.patch


As Lucene grows, so does the number of JUnit tests. This is obviously a
good thing, but it comes with longer and longer test times. Now that we also
run back compat tests in a standard test run, this problem is essentially doubled.
There are some ways this may get better, including running parallel
tests. You will need the hardware to fully take advantage, but it should be
a nice gain. There is already an issue for this, and Junit 4.6, 4.7 have the
beginnings of something we might be able to count on soon. 4.6 was buggy,
and 4.7 still doesn't come with nice ant integration. Parallel tests will
come though.
Beyond parallel testing, I think we also need to concentrate on keeping
our tests lean. We don't want to sacrifice coverage or quality, but I'm sure
there is plenty of fat to skim.
I've started making a list of some of the longer tests - I think with
some work we can make our tests much faster - and then with parallelization,
I think we could see some really great gains.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org









--0016e6d99f260bc49e04795c4984--