lucene-java-commits mailing list archives

From gsing...@apache.org
Subject svn commit: r495834 [2/3] - in /lucene/java/trunk/contrib/benchmark: ./ conf/ src/java/org/apache/lucene/benchmark/byTask/ src/java/org/apache/lucene/benchmark/byTask/feeds/ src/java/org/apache/lucene/benchmark/byTask/programmatic/ src/java/org/apache/...
Date Sat, 13 Jan 2007 04:08:25 GMT
Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/package.html
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/package.html?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/package.html (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/package.html Fri Jan 12 20:08:23 2007
@@ -0,0 +1,480 @@
+<HTML>
+<!--
+* Copyright 2005 The Apache Software Foundation
+*
+* Licensed under the Apache License, Version 2.0 (the "License");
+* you may not use this file except in compliance with the License.
+* You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*
+-->
+<HEAD>
+    <TITLE>Benchmarking Lucene By Tasks</TITLE>
+</HEAD>
+<BODY>
+<DIV>
+Benchmarking Lucene By Tasks.
+<p>
+This package provides "task based" performance benchmarking of Lucene.
+One can use the predefined benchmarks, or create new ones.
+</p>
+<p>
+Contained packages:
+</p>
+
+<table border=1 cellpadding=4>
+ <tr>
+   <td><b>Package</b></td>
+   <td><b>Description</b></td>
+ </tr>
+ <tr>
+   <td><a href="stats/package-summary.html">stats</a></td>
+   <td>Statistics maintained when running benchmark tasks.</td>
+ </tr>
+ <tr>
+   <td><a href="tasks/package-summary.html">tasks</a></td>
+   <td>Benchmark tasks.</td>
+ </tr>
+ <tr>
+   <td><a href="feeds/package-summary.html">feeds</a></td>
+   <td>Sources for benchmark inputs: documents and queries.</td>
+ </tr>
+ <tr>
+   <td><a href="utils/package-summary.html">utils</a></td>
+   <td>Utilities used for the benchmark, and for the reports.</td>
+ </tr>
+ <tr>
+   <td><a href="programmatic/package-summary.html">programmatic</a></td>
+   <td>Sample performance test written programmatically.</td>
+ </tr>
+</table>
+
+<h2>Table Of Contents</h2>
+<p>
+    <ol>
+        <li><a href="#concept">Benchmarking By Tasks</a></li>
+        <li><a href="#usage">How to use</a></li>
+        <li><a href="#algorithm">Benchmark "algorithm"</a></li>
+        <li><a href="#tasks">Supported tasks/commands</a></li>
+        <li><a href="#properties">Benchmark properties</a></li>
+        <li><a href="#example">Example input algorithm and the result benchmark report.</a></li>
+    </ol>
+</p>
+<a name="concept"></a>
+<h2>Benchmarking By Tasks</h2>
+<p>
+Benchmark Lucene using task primitives.
+</p>
+
+<p>
+A benchmark is composed of some predefined tasks, allowing for creating an index, adding documents,
+optimizing, searching, generating reports, and more. A benchmark run takes an "algorithm" file
+that contains a description of the sequence of tasks making up the run, and some properties defining a few
+additional characteristics of the benchmark run.
+</p>
+
+<a name="usage"></a>
+<h2>How to use</h2>
+<p>
+The easiest way to run a benchmark is to use the predefined ant task:
+<ul>
+ <li>ant run-task
+     <br>- would run the <code>micro-standard.alg</code> "algorithm".
+ </li>
+ <li>ant run-task -Dtask.alg=conf/compound-penalty.alg
+     <br>- would run the <code>compound-penalty.alg</code> "algorithm".
+ </li>
+ <li>ant run-task -Dtask.alg=[full-path-to-your-alg-file]
+     <br>- would run your own performance test "algorithm".
+ </li>
+ <li>java org.apache.lucene.benchmark.byTask.programmatic.Sample
+     <br>- would run a performance test programmatically - without using an alg file.
+     This is less readable and less convenient, but possible.
+ </li>
+</ul>
+</p>
+
+<p>
+You may find the existing tasks sufficient for defining the benchmark <i>you</i> need;
+otherwise, you can extend the framework to meet your needs, as explained herein.
+</p>
+
+<p>
+Each benchmark run has a DocMaker and a QueryMaker. These two should usually match, so
+that "meaningful" queries are used for a certain collection.
+Properties defined at the header of the alg file define which "makers" should be used.
+You can also specify your own makers by implementing the DocMaker and QueryMaker interfaces.
+</p>
+
+<p>
+The benchmark .alg file contains the benchmark "algorithm". The syntax is described below.
+Within the algorithm, you can specify groups of commands, assign them names,
+specify commands that should be repeated,
+do commands in serial or in parallel,
+and also control the speed of "firing" the commands.
+</p>
+
+<p>
+For instance, you can specify
+that an index should be opened for update,
+that documents should be added to it one by one, but no faster than 20 docs a minute,
+and that, in parallel with this,
+some N queries should be run against that index,
+again at no more than 2 queries a second.
+You can have the searches all share an index searcher,
+or have each of them open its own searcher and close it afterwards.
+</p>
+
+<p>
+If the commands available for use in the algorithm do not meet your needs,
+you can add commands by adding a new task under
+org.apache.lucene.benchmark.byTask.tasks -
+you should extend the PerfTask abstract class.
+Make sure that your new task class name is suffixed by Task.
+Assume you added the class "WonderfulTask" - doing so also enables the
+command "Wonderful" to be used in the algorithm.
+</p>
+
+<a name="algorithm"></a>
+<h2>Benchmark "algorithm"</h2>
+
+<p>
+The following is an informal description of the supported syntax.
+</p>
+
+<ol>
+ <li>
+ <b>Measuring</b>: When a command is executed, statistics for the elapsed execution time and memory consumption are collected.
+ At any time, those statistics can be printed, using one of the available ReportTasks.
+ </li>
+ <li>
+ <b>Comments</b> start with '<font color="#FF0066">#</font>'.
+ </li>
+ <li>
+ <b>Serial</b> sequences are enclosed within '<font color="#FF0066">{ }</font>'.
+ </li>
+ <li>
+ <b>Parallel</b> sequences are enclosed within '<font color="#FF0066">[ ]</font>'
+ </li>
+ <li>
+ <b>Sequence naming:</b> To name a sequence, put '<font color="#FF0066">"name"</font>' just after '<font color="#FF0066">{</font>' or '<font color="#FF0066">[</font>'.
+ <br>Example - <font color="#FF0066">{ "ManyAdds" AddDoc } : 1000000</font> - would
+ name the sequence of 1M add docs "ManyAdds", and this name would later appear in statistic reports.
+ If you don't specify a name for a sequence, it is given one: you can see it as the
+ algorithm is printed just before benchmark execution starts.
+ </li>
+ <li>
+ <b>Repeating</b>:
+ To repeat sequence tasks N times, add '<font color="#FF0066">: N</font>' just after the
+ sequence closing tag - '<font color="#FF0066">}</font>' or '<font color="#FF0066">]</font>' or '<font color="#FF0066">></font>'.
+ <br>Example -  <font color="#FF0066">[ AddDoc ] : 4</font>  - would do 4 addDoc in parallel, spawning 4 threads at once.
+ <br>Example -  <font color="#FF0066">[ AddDoc AddDoc ] : 4</font>  - would do 8 addDoc in parallel, spawning 8 threads at once.
+ <br>Example -  <font color="#FF0066">{ AddDoc } : 30</font> - would do addDoc 30 times in a row.
+ <br>Example -  <font color="#FF0066">{ AddDoc AddDoc } : 30</font> - would do addDoc 60 times in a row.
+ </li>
+ <li>
+ <b>Command parameter</b>: a command can take a single parameter.
+ If a command does not support a parameter, or if the parameter is of the wrong type,
+ reading the algorithm fails with an exception and the test does not start.
+ Currently only AddDoc supports a (numeric) parameter, which indicates the required size of added document.
+ If the DocMaker implementation used in the test does not support makeDoc(size), an exception would be thrown and the test would fail.
+ <br>Example - <font color="#FF0066">AddDoc(2000)</font> - would add a document of size 2000 (~bytes).
+ <br>See conf/task-sample.alg for how this can be used, for instance, to check which is faster, adding
+ many smaller documents, or few larger documents.
+ Next candidates for supporting a parameter may be the Search tasks, for controlling the query size.
+ </li>
+ <li>
+ <b>Statistic recording elimination</b>: a sequence can also end with '<font color="#FF0066">></font>',
+ in which case child tasks would not store their statistics.
+ This can be useful to avoid exploding stats data, for adding say 1M docs.
+ <br>Example - <font color="#FF0066">{ "ManyAdds" AddDoc > : 1000000</font> -
+ would add a million docs, measure that total, but not save stats for each addDoc.
+ <br>Notice that the granularity of System.currentTimeMillis() (which is used here) is system dependent,
+ and in some systems an operation that takes 5 ms to complete may show 0 ms latency time in performance measurements.
+ Therefore it is sometimes more accurate to look at the elapsed time of a larger sequence, as demonstrated here.
+ </li>
+ <li>
+ <b>Rate</b>:
+ To set a rate (ops/sec or ops/min) for a sequence, add '<font color="#FF0066">: N : R</font>' just after sequence closing tag.
+ This would specify repetition of N with rate of R operations/sec.
+ Use '<font color="#FF0066">R/sec</font>' or '<font color="#FF0066">R/min</font>'
+ to explicitly specify that the rate is per second or per minute.
+ The default is per second.
+ <br>Example -  <font color="#FF0066">[ AddDoc ] : 400 : 3</font> - would do 400 addDoc in parallel, starting up to 3 threads per second.
+ <br>Example -  <font color="#FF0066">{ AddDoc } : 100 : 200/min</font> - would do 100 addDoc serially,
+ waiting before starting next add, if otherwise rate would exceed 200 adds/min.
+ </li>
+ <li>
+ <b>Command names</b>: Each class "AnyNameTask" in the package org.apache.lucene.benchmark.byTask.tasks,
+ that extends PerfTask, is supported as command "AnyName" that can be
+ used in the benchmark "algorithm" description.
+ This allows new commands to be added just by adding such classes.
+ </li>
+</ol>
+
+
+<a name="tasks"></a>
+<h2>Supported tasks/commands</h2>
+
+<p>
+Existing tasks can be divided into a few groups:
+regular index/search work tasks, report tasks, and control tasks.
+</p>
+
+<ol>
+
+ <li>
+ <b>Report tasks</b>: There are a few Report commands for generating reports.
+ Only task runs that were completed are reported.
+ (The 'Report tasks' themselves are not measured and not reported.)
+ <ul>
+             <li>
+            <font color="#FF0066">RepAll</font> - all (completed) task runs.
+            </li>
+            <li>
+            <font color="#FF0066">RepSumByName</font> - all statistics, aggregated by name. So, if AddDoc was executed 2000 times,
+            only 1 report line would be created for it, aggregating all those 2000 statistic records.
+            </li>
+            <li>
+            <font color="#FF0066">RepSelectByPref &nbsp; prefixWord</font> - all records for tasks whose name starts with <font color="#FF0066">prefixWord</font>.
+            </li>
+            <li>
+            <font color="#FF0066">RepSumByPref &nbsp; prefixWord</font> - all records for tasks whose name starts with <font color="#FF0066">prefixWord</font>,
+            aggregated by their full task name.
+            </li>
+            <li>
+            <font color="#FF0066">RepSumByNameRound</font> - all statistics, aggregated by name and by <font color="#FF0066">Round</font>.
+            So, if AddDoc was executed 2000 times in each of 3 <font color="#FF0066">rounds</font>, 3 report lines would be created for it,
+            aggregating all those 2000 statistic records in each round. See more about rounds in the <font color="#FF0066">NewRound</font> command description below.
+            </li>
+            <li>
+            <font color="#FF0066">RepSumByPrefRound &nbsp; prefixWord</font> - similar to <font color="#FF0066">RepSumByNameRound</font>,
+            just that only tasks whose name starts with <font color="#FF0066">prefixWord</font> are included.
+            </li>
+ </ul>
+ If needed, additional reports can be added by extending the abstract class ReportTask, and by
+ manipulating the statistics data in Points and TaskStats.
+ </li>
+
+ <li><b>Control tasks</b>: A few tasks control the benchmark run as a whole:
+ <ul>
+     <li>
+     <font color="#FF0066">ClearStats</font> - clears the entire statistics.
+     Further reports would only include task runs that would start after this call.
+     </li>
+     <li>
+     <font color="#FF0066">NewRound</font> - virtually start a new round of performance test.
+     Although this command can be placed anywhere, it mostly makes sense at the end of an outermost sequence.
+     <br>This increments a global "round counter". All task runs that would start now would
+     record the new, updated round counter as their round number. This would appear in reports.
+     In particular, see <font color="#FF0066">RepSumByNameRound</font> above.
+     <br>An additional effect of NewRound is that numeric and boolean properties defined in the
+     header of the .alg file as a sequence of values, e.g. <font color="#FF0066">merge.factor=mrg:10:100:10:100</font>, would
+     advance (cyclically) to the next value.
+     Note: this would also be reflected in the reports, in this case under a column that would be named "mrg".
+     </li>
+     <li>
+     <font color="#FF0066">ResetInputs</font> - DocMaker and the various QueryMakers
+     would reset their counters to start.
+     The way these Maker interfaces work, each call for makeDocument()
+     or makeQuery() creates the next document or query
+     that it "knows" to create.
+     If that pool is "exhausted", the "maker" starts over again. The ResetInputs command
+     therefore makes the rounds comparable.
+     It is useful to invoke ResetInputs together with NewRound.
+     </li>
+     <li>
+     <font color="#FF0066">ResetSystemErase</font> - reset all index and input data and call gc.
+     Does NOT reset statistics. This contains ResetInputs.
+     All writers/readers are nullified, deleted, closed.
+     Index is erased.
+     Directory is erased.
+     You would have to call CreateIndex once this was called...
+     </li>
+     <li>
+     <font color="#FF0066">ResetSystemSoft</font> -  reset all index and input data and call gc.
+     Does NOT reset statistics. This contains ResetInputs.
+     All writers/readers are nullified, closed.
+     Index is NOT erased.
+     Directory is NOT erased.
+     This is useful for testing performance on an existing index, for instance if the construction of a large index
+     took a very long time and now you would like to test its search or update performance.
+     </li>
+ </ul>
+ </li>
+
+ <li>
+ Other existing tasks are quite straightforward and would just be briefly described here.
+ <ul>
+     <li>
+     <font color="#FF0066">CreateIndex</font> and <font color="#FF0066">OpenIndex</font> both leave the index open for later update operations.
+     <font color="#FF0066">CloseIndex</font> would close it.
+     </li>
+     <li>
+     <font color="#FF0066">OpenReader</font>, similarly, would leave an index reader open for later search operations.
+     But this has further semantics.
+     If a Read operation is performed, and an open reader exists, it would be used.
+     Otherwise, the read operation would open its own reader and close it when the read operation is done.
+     This allows testing various scenarios - sharing a reader, searching with "cold" reader, with "warmed" reader, etc.
+     The read operations affected by this are: <font color="#FF0066">Warm</font>,
+     <font color="#FF0066">Search</font>, <font color="#FF0066">SearchTrav</font> (search and traverse),
+     and <font color="#FF0066">SearchTravRet</font> (search and traverse and retrieve).
+     Notice that each of the 3 search task types maintains its own queryMaker instance.
+     </li>
+ </ul>
+ </li>
+ </ol>
+
+<a name="properties"></a>
+<h2>Benchmark properties</h2>
+
+<p>
+Properties are read from the header of the .alg file, and
+define several parameters of the performance test.
+As mentioned above for the <font color="#FF0066">NewRound</font> task,
+numeric and boolean properties that are defined as a sequence
+of values, e.g. <font color="#FF0066">merge.factor=mrg:10:100:10:100</font>
+would increment (cyclic) to the next value, when NewRound is called, and would also
+appear as a named column in the reports (column name would be "mrg" in this example).
+</p>
+
+<p>
+Some of the currently defined properties are:
+</p>
+
+<ol>
+    <li>
+    <font color="#FF0066">analyzer</font> - full class name for the analyzer to use.
+    Same analyzer would be used in the entire test.
+    </li>
+
+    <li>
+    <font color="#FF0066">directory</font> - valid values are FSDirectory and RAMDirectory.
+    This tells which directory to use for the performance test.
+    </li>
+
+    <li>
+    <b>Index work parameters</b>:
+    Multi-valued int/boolean properties are iterated with calls to NewRound.
+    They are also added as columns in the reports; the first string in the
+    sequence is the column name.
+    (Make sure it is no shorter than any value in the sequence.)
+    <ul>
+        <li><font color="#FF0066">max.buffered</font>
+        <br>Example: max.buffered=buf:10:10:100:100 -
+        this would define using maxBufferedDocs of 10 in iterations 0 and 1,
+        and 100 in iterations 2 and 3.
+        </li>
+        <li>
+        <font color="#FF0066">merge.factor</font> - which
+        merge factor to use.
+        </li>
+        <li>
+        <font color="#FF0066">compound</font> - whether the index is
+        using the compound format or not. Valid values are "true" and "false".
+        </li>
+    </ul>
+</ol>
+
+<p>
+For additional defined properties see the *.alg files under conf.
+</p>
+
+<a name="example"></a>
+<h2>Example input algorithm and the result benchmark report</h2>
+<p>
+The following example is in conf/sample.alg:
+<pre>
+<font color="#003333"># --------------------------------------------------------
+#
+# Sample: what is the effect of doc size on indexing time?
+#
+# There are two parts in this test:
+# - PopulateShort adds 2N documents of length  L
+# - PopulateLong  adds  N documents of length 2L
+# Which one would be faster?
+# The comparison is done twice.
+#
+# --------------------------------------------------------
+
+<font color="#990066"># -------------------------------------------------------------------------------------
+# multi val params are iterated by NewRound's, added to reports, start with column name.
+merge.factor=mrg:10:20
+max.buffered=buf:100:1000
+compound=true
+
+analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
+directory=FSDirectory
+
+doc.stored=true
+doc.tokenized=true
+doc.term.vector=false
+doc.add.log.step=500
+
+docs.dir=reuters-out
+
+doc.maker=org.apache.lucene.benchmark.byTask.feeds.SimpleDocMaker
+
+query.maker=org.apache.lucene.benchmark.byTask.feeds.SimpleQueryMaker
+
+# task at this depth or less would print when they start
+task.max.depth.log=2
+
+log.queries=false
+# -------------------------------------------------------------------------------------</font>
+<font color="#3300FF">{
+
+    { "PopulateShort"
+        CreateIndex
+        { AddDoc(4000) > : 20000
+        Optimize
+        CloseIndex
+    >
+
+    ResetSystemErase
+
+    { "PopulateLong"
+        CreateIndex
+        { AddDoc(8000) > : 10000
+        Optimize
+        CloseIndex
+    >
+
+    ResetSystemErase
+
+    NewRound
+
+} : 2
+
+RepSumByName
+RepSelectByPref Populate
+</font>
+</pre>
+</p>
+
+<p>
+The command line for running this sample:
+<br><code>ant run-task -Dtask.alg=conf/sample.alg</code>
+</p>
+
+<p>
+The output report from running this test contains the following:
+<pre>
+Operation     round mrg  buf   runCnt   recsPerRun        rec/s  elapsedSec    avgUsedMem    avgTotalMem
+PopulateShort     0  10  100        1        20003        119.6      167.26    12,959,120     14,241,792
+PopulateLong -  - 0  10  100 -  -   1 -  -   10003 -  -  - 74.3 -  - 134.57 -  17,085,208 -   20,635,648
+PopulateShort     1  20 1000        1        20003        143.5      139.39    63,982,040     94,756,864
+PopulateLong -  - 1  20 1000 -  -   1 -  -   10003 -  -  - 77.0 -  - 129.92 -  87,309,608 -  100,831,232
+</pre>
+</p>
+</DIV>
+<DIV>&nbsp;</DIV>
+</BODY>
+</HTML>
\ No newline at end of file
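
The per-round property values described in byTask/package.html (for example merge.factor=mrg:10:100:10:100) can be illustrated with a small stand-alone sketch. RoundValuesSketch and its two methods are hypothetical names used only for illustration; this is not the benchmark's Config class, whose parsing code is not shown in this part of the commit. The sketch assumes that round 0 takes the first value and that the values cycle once the list is exhausted, as the package documentation describes.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of the benchmark package. */
final class RoundValuesSketch {

  /**
   * Column name of a multi-valued property: the text before the first ':'.
   * Assumes the spec contains at least one ':' (e.g. "mrg:10:100:10:100").
   */
  static String columnName(String spec) {
    return spec.substring(0, spec.indexOf(':'));
  }

  /** Value used in the given round; wraps around when the values run out. */
  static String valueForRound(String spec, int round) {
    String[] parts = spec.split(":");
    // parts[0] is the column name, the remaining entries are the values
    String[] values = Arrays.copyOfRange(parts, 1, parts.length);
    return values[round % values.length];
  }

  public static void main(String[] args) {
    String spec = "mrg:10:100:10:100";
    for (int round = 0; round < 6; round++) {
      System.out.println("round " + round + ": "
          + columnName(spec) + "=" + valueForRound(spec, round));
    }
  }
}
```

With the sample.alg header values merge.factor=mrg:10:20 and max.buffered=buf:100:1000, this gives 10 and 100 for round 0 and 20 and 1000 for round 1, which agrees with the mrg and buf columns of the sample report.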

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/programmatic/Sample.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/programmatic/Sample.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/programmatic/Sample.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/programmatic/Sample.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,81 @@
+package org.apache.lucene.benchmark.byTask.programmatic;
+
+import java.io.IOException;
+import java.util.Properties;
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.benchmark.byTask.tasks.AddDocTask;
+import org.apache.lucene.benchmark.byTask.tasks.CloseIndexTask;
+import org.apache.lucene.benchmark.byTask.tasks.CreateIndexTask;
+import org.apache.lucene.benchmark.byTask.tasks.RepSumByNameTask;
+import org.apache.lucene.benchmark.byTask.tasks.TaskSequence;
+import org.apache.lucene.benchmark.byTask.utils.Config;
+
+/**
+ * Sample performance test written programmatically - no algorithm file is needed here.
+ */
+public class Sample {
+
+  /**
+   * @param args
+   * @throws Exception 
+   * @throws IOException 
+   */
+  public static void main(String[] args) throws Exception {
+    Properties p = initProps();
+    Config conf = new Config(p);
+    PerfRunData runData = new PerfRunData(conf);
+    
+    // 1. top sequence
+    TaskSequence top = new TaskSequence(runData,null,null,false); // top level, not parallel
+    
+    // 2. task to create the index
+    CreateIndexTask create = new CreateIndexTask(runData);
+    top.addTask(create);
+    
+    // 3. task seq to add 500 docs (order matters - top to bottom - add seq to top, only then add to seq)
+    TaskSequence seq1 = new TaskSequence(runData,"AddDocs",top,false);
+    seq1.setRepetitions(500);
+    seq1.setNoChildReport();
+    top.addTask(seq1);
+
+    // 4. task to add the doc
+    AddDocTask addDoc = new AddDocTask(runData);
+    //addDoc.setParams("1200"); // doc size limit if supported
+    seq1.addTask(addDoc); // order matters (see comment above)
+
+    // 5. task to close the index
+    CloseIndexTask close = new CloseIndexTask(runData);
+    top.addTask(close);
+
+    // task to report
+    RepSumByNameTask rep = new RepSumByNameTask(runData);
+    top.addTask(rep);
+    // execute
+    top.doLogic();
+  }
+
+  // Sample programmatic settings. Could also read from file.
+  private static Properties initProps() {
+    Properties p = new Properties();
+    p.setProperty ( "task.max.depth.log"  , "3" );
+    p.setProperty ( "max.buffered"        , "buf:10:10:100:100:10:10:100:100" );
+    p.setProperty ( "doc.maker"           , "org.apache.lucene.benchmark.byTask.feeds.ReutersDocMaker" );
+    p.setProperty ( "doc.add.log.step"    , "2000" );
+    p.setProperty ( "doc.delete.log.step" , "2000" );
+    p.setProperty ( "doc.delete.step"     , "8" );
+    p.setProperty ( "analyzer"            , "org.apache.lucene.analysis.standard.StandardAnalyzer" );
+    p.setProperty ( "doc.term.vector"     , "false" );
+    p.setProperty ( "directory"           , "FSDirectory" );
+    p.setProperty ( "query.maker"         , "org.apache.lucene.benchmark.byTask.feeds.ReutersQueryMaker" );
+    p.setProperty ( "doc.stored"          , "true" );
+    p.setProperty ( "docs.dir"            , "reuters-out" );
+    p.setProperty ( "compound"            , "cmpnd:true:true:true:true:false:false:false:false" );
+    p.setProperty ( "doc.tokenized"       , "true" );
+    p.setProperty ( "merge.factor"        , "mrg:10:100:10:100:10:100:10:100" );
+    return p;
+  }
+  
+  
+
+}
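
The task classes that Sample.java instantiates directly (CreateIndexTask, AddDocTask, CloseIndexTask, RepSumByNameTask) are the same classes that the .alg commands CreateIndex, AddDoc, CloseIndex and RepSumByName refer to, following the "AnyNameTask" naming rule from byTask/package.html. The sketch below only spells out that naming rule as string manipulation. CommandNameSketch is a hypothetical name, and how the benchmark itself resolves and loads task classes is not shown in this part of the commit.

```java
/** Hypothetical helper for illustration only; not part of the benchmark package. */
final class CommandNameSketch {

  /** Package that holds the task classes, as stated in byTask/package.html. */
  static final String TASKS_PACKAGE = "org.apache.lucene.benchmark.byTask.tasks";

  /** Fully qualified class name expected for an algorithm command. */
  static String taskClassName(String command) {
    return TASKS_PACKAGE + "." + command + "Task";
  }

  /** Algorithm command that a task class with this simple name would provide. */
  static String commandName(String simpleClassName) {
    if (!simpleClassName.endsWith("Task")) {
      throw new IllegalArgumentException("task class names must end with Task: " + simpleClassName);
    }
    return simpleClassName.substring(0, simpleClassName.length() - "Task".length());
  }

  public static void main(String[] args) {
    System.out.println(taskClassName("AddDoc"));
    System.out.println(commandName("WonderfulTask"));
  }
}
```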

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/programmatic/package.html
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/programmatic/package.html?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/programmatic/package.html (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/programmatic/package.html Fri Jan 12 20:08:23 2007
@@ -0,0 +1,5 @@
+<html>
+<body>
+Sample performance test written programmatically - no algorithm file is needed here.
+</body>
+</html>
\ No newline at end of file
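
The rec/s column of the sample report in byTask/package.html can be checked by hand. Points.taskReportLine in this commit computes it as count * 1000.0 / elapsed, with elapsed in milliseconds and a floor of 1 ms. The stand-alone sketch below repeats that arithmetic; RecsPerSecSketch is a hypothetical name, and the millisecond inputs are an assumption back-computed from the rounded elapsedSec values printed in the report (167.26 and 134.57).

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of the benchmark package. */
final class RecsPerSecSketch {

  /** Records per second; elapsed times of 0 ms (or less) are treated as 1 ms. */
  static double recsPerSec(long count, long elapsedMillis) {
    long elapsed = elapsedMillis > 0 ? elapsedMillis : 1; // assume at least 1 ms
    return count * 1000.0 / elapsed;
  }

  public static void main(String[] args) {
    // PopulateShort, round 0: 20003 records in about 167.26 seconds
    System.out.println(String.format(Locale.ROOT, "%.1f", recsPerSec(20003, 167260)));
    // PopulateLong, round 0: 10003 records in about 134.57 seconds
    System.out.println(String.format(Locale.ROOT, "%.1f", recsPerSec(10003, 134570)));
  }
}
```

The 1 ms floor matters for very short tasks: because of the timer granularity noted in the package documentation, a task can report 0 ms elapsed, and without the floor the rate would be a division by zero.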

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/Points.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/Points.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/Points.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/Points.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,343 @@
+package org.apache.lucene.benchmark.byTask.stats;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.LinkedHashMap;
+
+import org.apache.lucene.benchmark.byTask.tasks.PerfTask;
+import org.apache.lucene.benchmark.byTask.utils.Config;
+import org.apache.lucene.benchmark.byTask.utils.Format;
+
+
+/**
+ * Test run data points collected as the test proceeds.
+ */
+public class Points {
+
+  private Config config;
+  
+  private static final String newline = System.getProperty("line.separator");
+  
+  // stat points ordered by their start time. 
+  // for now we collect points as TaskStats objects.
+  // later might optimize to collect only native data.
+  private ArrayList points = new ArrayList();
+
+  private int nextTaskRunNum = 0;
+
+  /**
+   * Column titles used in the textual reports.
+   */
+  static final String OP =          "Operation  ";
+  static final String ROUND =       " round";
+  static final String RUNCNT =      "   runCnt";
+  static final String RECCNT =      "   recsPerRun";
+  static final String RECSEC =      "        rec/s";
+  static final String ELAPSED =     "  elapsedSec";
+  static final String USEDMEM =     "    avgUsedMem";
+  static final String TOTMEM =      "    avgTotalMem";
+  static final String COLS[] = {
+      RUNCNT,
+      RECCNT,
+      RECSEC,
+      ELAPSED,
+      USEDMEM,
+      TOTMEM
+  };
+
+  /**
+   * Create a Points statistics object. 
+   */
+  public Points (Config config) {
+    this.config = config;
+  }
+
+  private String tableTitle (String longestOp) {
+    StringBuffer sb = new StringBuffer();
+    sb.append(Format.format(OP,longestOp));
+    sb.append(ROUND);
+    sb.append(config.getColsNamesForValsByRound());
+    for (int i = 0; i < COLS.length; i++) {
+      sb.append(COLS[i]);
+    }
+    return sb.toString(); 
+  }
+  
+  /**
+   * Report detailed statistics as a string
+   * @return the report
+   */
+  public Report reportAll() {
+    String longestOp = longestOp(points);
+    boolean first = true;
+    StringBuffer sb = new StringBuffer();
+    sb.append(tableTitle(longestOp));
+    sb.append(newline);
+    int reported = 0;
+    for (Iterator it = points.iterator(); it.hasNext();) {
+      TaskStats stat = (TaskStats) it.next();
+      if (stat.getElapsed()>=0) { // consider only tasks that ended
+        if (!first) {
+          sb.append(newline);
+        }
+        first = false;
+        String line = taskReportLine(longestOp, stat);
+        reported++;
+        if (points.size() > 2 && reported % 2 == 0) { // mark every second reported line
+          line = line.replaceAll("   "," - ");
+        }
+        sb.append(line);
+      }
+    }
+    String reptxt = (reported==0 ? "No Matching Entries Were Found!" : sb.toString());
+    return new Report(reptxt,reported,reported,points.size());
+  }
+
+  /**
+   * Report statistics as a string, aggregate for tasks named the same.
+   * @return the report
+   */
+  public Report reportSumByName() {
+    // aggregate by task name
+    int reported = 0;
+    LinkedHashMap p2 = new LinkedHashMap();
+    for (Iterator it = points.iterator(); it.hasNext();) {
+      TaskStats stat1 = (TaskStats) it.next();
+      if (stat1.getElapsed()>=0) { // consider only tasks that ended
+        reported++;
+        String name = stat1.getTask().getName();
+        TaskStats stat2 = (TaskStats) p2.get(name);
+        if (stat2 == null) {
+          try {
+            stat2 = (TaskStats) stat1.clone();
+          } catch (CloneNotSupportedException e) {
+            throw new RuntimeException(e);
+          }
+          p2.put(name,stat2);
+        } else {
+          stat2.add(stat1);
+        }
+      }
+    }
+    // now generate report from secondary list p2    
+    return genReportFromList(reported, p2);
+  }
+
+  /**
+   * Report statistics as a string, aggregate for tasks named the same, and from the same round.
+   * @return the report
+   */
+  public Report reportSumByNameRound() {
+    // aggregate by task name and round
+    LinkedHashMap p2 = new LinkedHashMap();
+    int reported = 0;
+    for (Iterator it = points.iterator(); it.hasNext();) {
+      TaskStats stat1 = (TaskStats) it.next();
+      if (stat1.getElapsed()>=0) { // consider only tasks that ended
+        reported++;
+        String name = stat1.getTask().getName();
+        String rname = stat1.getRound()+"."+name; // group by round
+        TaskStats stat2 = (TaskStats) p2.get(rname);
+        if (stat2 == null) {
+          try {
+            stat2 = (TaskStats) stat1.clone();
+          } catch (CloneNotSupportedException e) {
+            throw new RuntimeException(e);
+          }
+          p2.put(rname,stat2);
+        } else {
+          stat2.add(stat1);
+        }
+      }
+    }
+    // now generate report from secondary list p2    
+    return genReportFromList(reported, p2);
+  }
+  
+  private String longestOp(Collection c) {
+    String longest = OP;
+    for (Iterator it = c.iterator(); it.hasNext();) {
+      TaskStats stat = (TaskStats) it.next();
+      if (stat.getElapsed()>=0) { // consider only tasks that ended
+        String name = stat.getTask().getName();
+        if (name.length() > longest.length()) {
+          longest = name;
+        }
+      }
+    }
+    return longest;
+  }
+
+  private String taskReportLine(String longestOp, TaskStats stat) {
+    PerfTask task = stat.getTask();
+    StringBuffer sb = new StringBuffer();
+    sb.append(Format.format(task.getName(), longestOp));
+    String round = (stat.getRound()>=0 ? ""+stat.getRound() : "-");
+    sb.append(Format.formatPaddLeft(round, ROUND));
+    sb.append(config.getColsValuesForValsByRound(stat.getRound()));
+    sb.append(Format.format(stat.getNumRuns(), RUNCNT)); 
+    sb.append(Format.format(stat.getCount() / stat.getNumRuns(), RECCNT));
+    long elapsed = (stat.getElapsed()>0 ? stat.getElapsed() : 1); // assume at least 1ms
+    sb.append(Format.format(1,(float) (stat.getCount() * 1000.0 / elapsed), RECSEC));
+    sb.append(Format.format(2, (float) stat.getElapsed() / 1000, ELAPSED));
+    sb.append(Format.format(0, (float) stat.getMaxUsedMem() / stat.getNumRuns(), USEDMEM)); 
+    sb.append(Format.format(0, (float) stat.getMaxTotMem() / stat.getNumRuns(), TOTMEM));
+    return sb.toString();
+  }
+
+  public Report reportSumByPrefix(String prefix) {
+    // aggregate by task name
+    int reported = 0;
+    LinkedHashMap p2 = new LinkedHashMap();
+    for (Iterator it = points.iterator(); it.hasNext();) {
+      TaskStats stat1 = (TaskStats) it.next();
+      if (stat1.getElapsed()>=0 && stat1.getTask().getName().startsWith(prefix)) { // only ended tasks with proper name
+        reported++;
+        String name = stat1.getTask().getName();
+        TaskStats stat2 = (TaskStats) p2.get(name);
+        if (stat2 == null) {
+          try {
+            stat2 = (TaskStats) stat1.clone();
+          } catch (CloneNotSupportedException e) {
+            throw new RuntimeException(e);
+          }
+          p2.put(name,stat2);
+        } else {
+          stat2.add(stat1);
+        }
+      }
+    }
+    // now generate report from secondary list p2    
+    return genReportFromList(reported, p2);
+  }
+  
+  public Report reportSumByPrefixRound(String prefix) {
+    // aggregate by task name and by round
+    int reported = 0;
+    LinkedHashMap p2 = new LinkedHashMap();
+    for (Iterator it = points.iterator(); it.hasNext();) {
+      TaskStats stat1 = (TaskStats) it.next();
+      if (stat1.getElapsed()>=0 && stat1.getTask().getName().startsWith(prefix)) { // only ended tasks with proper name
+        reported++;
+        String name = stat1.getTask().getName();
+        String rname = stat1.getRound()+"."+name; // group by round
+        TaskStats stat2 = (TaskStats) p2.get(rname);
+        if (stat2 == null) {
+          try {
+            stat2 = (TaskStats) stat1.clone();
+          } catch (CloneNotSupportedException e) {
+            throw new RuntimeException(e);
+          }
+          p2.put(rname,stat2);
+        } else {
+          stat2.add(stat1);
+        }
+      }
+    }
+    // now generate report from secondary list p2    
+    return genReportFromList(reported, p2);
+  }
+
+  private Report genReportFromList(int reported, LinkedHashMap p2) {
+    String longestOp = longestOp(p2.values());
+    boolean first = true;
+    StringBuffer sb = new StringBuffer();
+    sb.append(tableTitle(longestOp));
+    sb.append(newline);
+    int lineNum = 0;
+    for (Iterator it = p2.values().iterator(); it.hasNext();) {
+      TaskStats stat = (TaskStats) it.next();
+      if (!first) {
+        sb.append(newline);
+      }
+      first = false;
+      String line = taskReportLine(longestOp,stat);
+      lineNum++;
+      if (p2.size()>2 && lineNum%2==0) {
+        line = line.replaceAll("   "," - ");
+      }
+      sb.append(line);
+    }
+    String reptxt = (reported==0 ? "No Matching Entries Were Found!" : sb.toString());
+    return new Report(reptxt,p2.size(),reported,points.size());
+  }
+
+  public Report reportSelectByPrefix(String prefix) {
+    String longestOp = longestOp(points);
+    boolean first = true;
+    StringBuffer sb = new StringBuffer();
+    sb.append(tableTitle(longestOp));
+    sb.append(newline);
+    int reported = 0;
+    for (Iterator it = points.iterator(); it.hasNext();) {
+      TaskStats stat = (TaskStats) it.next();
+      if (stat.getElapsed()>=0 && stat.getTask().getName().startsWith(prefix)) { // only ended tasks with proper name
+        reported++;
+        if (!first) {
+          sb.append(newline);
+        }
+        first = false;
+        String line = taskReportLine(longestOp,stat);
+        if (points.size()>2&& reported%2==0) {
+          line = line.replaceAll("   "," - ");
+        }
+        sb.append(line);
+      }
+    }
+    String reptxt = (reported==0 ? "No Matching Entries Were Found!" : sb.toString());
+    return new Report(reptxt,reported,reported, points.size());
+  }
+
+  /**
+   * Mark that a task is starting: create a TaskStats for it and store it as a point.
+   * @param task the starting task.
+   * @param round the round in which the task starts.
+   * @return the new task stats created for the starting task.
+   */
+  public synchronized TaskStats markTaskStart (PerfTask task, int round) {
+    TaskStats stats = new TaskStats(task, nextTaskRunNum(), round);
+    points.add(stats);
+    return stats;
+  }
+  
+  // return next task num
+  private synchronized int nextTaskRunNum() {
+    return nextTaskRunNum++;
+  }
+  
+  /**
+   * mark the end of a task
+   */
+  public synchronized void markTaskEnd (TaskStats stats, int count) {
+    int numParallelTasks = nextTaskRunNum - 1 - stats.getTaskRunNum();
+    // note: if the stats were cleared, this stats object may no longer 
+    // be in points; that is fine.
+    stats.markEnd(numParallelTasks, count);
+  }
+
+  /**
+   * Clear all data, prepare for more tests.
+   */
+  public void clearData() {
+    points.clear();
+  }
+
+}
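Note on the diff above: the reportSumBy* methods in Points.java share one aggregation pattern. The first TaskStats seen for a key is cloned, and later ones are added into that clone. The sketch below restates the pattern in a self-contained form. `Stat` and `AggregateSketch` are hypothetical stand-ins invented for illustration, not classes from this commit, and generics are used for brevity where the committed code uses raw collections.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for TaskStats, holding only the fields needed here.
class Stat {
  final String name;
  int round;
  long elapsed;
  int count;
  int numRuns = 1;

  Stat(String name, int round, long elapsed, int count) {
    this.name = name;
    this.round = round;
    this.elapsed = elapsed;
    this.count = count;
  }

  Stat copy() {
    Stat s = new Stat(name, round, elapsed, count);
    s.numRuns = numRuns;
    return s;
  }

  // Mirrors the summing in TaskStats.add(): totals grow, round is reset on mismatch.
  void add(Stat other) {
    numRuns += other.numRuns;
    elapsed += other.elapsed;
    count += other.count;
    if (round != other.round) {
      round = -1;
    }
  }
}

class AggregateSketch {
  // Groups finished stats by name, or by "round.name" when byRound is true.
  static Map<String, Stat> aggregate(List<Stat> points, boolean byRound) {
    Map<String, Stat> grouped = new LinkedHashMap<String, Stat>();
    for (Stat s : points) {
      if (s.elapsed < 0) {
        continue; // task has not ended yet
      }
      String key = byRound ? s.round + "." + s.name : s.name;
      Stat sum = grouped.get(key);
      if (sum == null) {
        grouped.put(key, s.copy());
      } else {
        sum.add(s);
      }
    }
    return grouped;
  }
}
```

A LinkedHashMap keeps the groups in first-seen order, which is presumably why the committed code uses it: report lines then follow the order in which tasks first ran.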

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/Report.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/Report.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/Report.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/Report.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,64 @@
+package org.apache.lucene.benchmark.byTask.stats;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Textual report of current statistics.
+ */
+public class Report {
+
+  private String text;
+  private int size;
+  private int outOf;
+  private int reported;
+
+  Report (String text, int size, int reported, int outOf) {
+    this.text = text;
+    this.size = size;
+    this.reported = reported;
+    this.outOf = outOf;
+  }
+
+  /**
+   * Returns total number of stats points when this report was created.
+   */
+  public int getOutOf() {
+    return outOf;
+  }
+
+  /**
+   * Returns number of lines in the report.
+   */
+  public int getSize() {
+    return size;
+  }
+
+  /**
+   * Returns the report text.
+   */
+  public String getText() {
+    return text;
+  }
+
+  /**
+   * Returns number of stats points represented in this report.
+   */
+  public int getReported() {
+    return reported;
+  }
+}

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/TaskStats.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/TaskStats.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/TaskStats.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/TaskStats.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,192 @@
+package org.apache.lucene.benchmark.byTask.stats;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.lucene.benchmark.byTask.tasks.PerfTask;
+
+/**
+ * Statistics for a task run. 
+ * <br>The same task can run more than once, but, if that task records statistics, 
+ * each run would create its own TaskStats.
+ */
+public class TaskStats implements Cloneable {
+
+  /** task for which data was collected */
+  private PerfTask task; 
+
+  /** round in which task run started */
+  private int round;
+
+  /** task start time */
+  private long start;
+  
+  /** task elapsed time.  elapsed >= 0 indicates run completion! */
+  private long elapsed = -1;
+  
+  /** max tot mem during task */
+  private long maxTotMem;
+  
+  /** max used mem during task */
+  private long maxUsedMem;
+  
+  /** serial run number of this task run in the perf run */
+  private int taskRunNum;
+  
+  /** number of other tasks that started to run while this task was still running */ 
+  private int numParallelTasks;
+  
+  /** number of work items done by this task.
+   * For indexing that can be number of docs added.
+   * For warming that can be number of scanned items, etc. 
+   * For repeating tasks, this is a sum over repetitions.
+   */
+  private int count;
+
+  /** Number of similar tasks aggregated into this record.
+   * Used when summing up over several runs/instances of similar tasks.
+   */
+  private int numRuns = 1;
+  
+  /**
+   * Create a run data for a task that is starting now.
+   * To be called from Points.
+   */
+  TaskStats (PerfTask task, int taskRunNum, int round) {
+    this.task = task;
+    this.taskRunNum = taskRunNum;
+    this.round = round;
+    maxTotMem = Runtime.getRuntime().totalMemory();
+    maxUsedMem = maxTotMem - Runtime.getRuntime().freeMemory();
+    start = System.currentTimeMillis();
+  }
+  
+  /**
+   * mark the end of a task
+   */
+  void markEnd (int numParallelTasks, int count) {
+    elapsed = System.currentTimeMillis() - start;
+    long totMem = Runtime.getRuntime().totalMemory();
+    if (totMem > maxTotMem) {
+      maxTotMem = totMem;
+    }
+    long usedMem = totMem - Runtime.getRuntime().freeMemory();
+    if (usedMem > maxUsedMem) {
+      maxUsedMem = usedMem;
+    }
+    this.numParallelTasks = numParallelTasks;
+    this.count = count;
+  }
+
+  /**
+   * @return the taskRunNum.
+   */
+  public int getTaskRunNum() {
+    return taskRunNum;
+  }
+
+  /* (non-Javadoc)
+   * @see java.lang.Object#toString()
+   */
+  public String toString() {
+    StringBuffer res = new StringBuffer(task.getName());
+    res.append(" ");
+    res.append(count);
+    res.append(" ");
+    res.append(elapsed);
+    return res.toString();
+  }
+
+  /**
+   * @return Returns the count.
+   */
+  public int getCount() {
+    return count;
+  }
+
+  /**
+   * @return elapsed time.
+   */
+  public long getElapsed() {
+    return elapsed;
+  }
+
+  /**
+   * @return Returns the maxTotMem.
+   */
+  public long getMaxTotMem() {
+    return maxTotMem;
+  }
+
+  /**
+   * @return Returns the maxUsedMem.
+   */
+  public long getMaxUsedMem() {
+    return maxUsedMem;
+  }
+
+  /**
+   * @return Returns the numParallelTasks.
+   */
+  public int getNumParallelTasks() {
+    return numParallelTasks;
+  }
+
+  /**
+   * @return Returns the task.
+   */
+  public PerfTask getTask() {
+    return task;
+  }
+
+  /**
+   * @return Returns the numRuns.
+   */
+  public int getNumRuns() {
+    return numRuns;
+  }
+
+  /**
+   * Add data from another stat, for aggregation
+   * @param stat2 the added stat data.
+   */
+  public void add(TaskStats stat2) {
+    numRuns += stat2.getNumRuns();
+    elapsed += stat2.getElapsed();
+    maxTotMem += stat2.getMaxTotMem();
+    maxUsedMem += stat2.getMaxUsedMem();
+    count += stat2.getCount();
+    if (round != stat2.round) {
+      round = -1; // no meaning if aggregating tasks of different rounds.
+    }
+  }
+
+  /* (non-Javadoc)
+   * @see java.lang.Object#clone()
+   */
+  protected Object clone() throws CloneNotSupportedException {
+    return super.clone();
+  }
+
+  /**
+   * @return the round number.
+   */
+  int getRound() {
+    return round;
+  }
+  
+}
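Note on the diff above: TaskStats.add() sums elapsed time, counts, and the per-run memory peaks, so the report line in Points.taskReportLine() divides the memory sums by numRuns and derives a rate from count and elapsed. The sketch below isolates that arithmetic. `ReportMath` and its method names are hypothetical and not part of this commit.

```java
// Hypothetical helper; the names are illustrative and not part of this commit.
class ReportMath {
  // Records per second, treating a zero elapsed time as 1 ms as taskReportLine does.
  static float recsPerSec(int count, long elapsedMillis) {
    long elapsed = elapsedMillis > 0 ? elapsedMillis : 1;
    return (float) (count * 1000.0 / elapsed);
  }

  // add() sums the per-run memory peaks, so dividing by numRuns gives the average peak.
  static float avgMem(long summedMaxMem, int numRuns) {
    return (float) summedMaxMem / numRuns;
  }
}
```

For example, two aggregated runs totalling 20 records in 400 ms give a rate of 50 records per second. After aggregation the "max" memory columns are averages of per-run peaks, not a peak over all runs.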

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/package.html
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/package.html?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/package.html (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/stats/package.html Fri Jan 12 20:08:23 2007
@@ -0,0 +1,5 @@
+<html>
+<body>
+  Statistics maintained when running benchmark tasks.
+</body>
+</html>

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/AddDocTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/AddDocTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/AddDocTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/AddDocTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,88 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.benchmark.byTask.feeds.DocMaker;
+import org.apache.lucene.document.Document;
+
+
+/**
+ * Add a document, optionally of a certain size.
+ * Other side effects: none.
+ */
+public class AddDocTask extends PerfTask {
+
+  public AddDocTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  private static int logStep = -1;
+  private int docSize = 0;
+  
+  // volatile data passed between setup(), doLogic(), tearDown().
+  private Document doc = null;
+  
+  /*
+   *  (non-Javadoc)
+   * @see PerfTask#setup()
+   */
+  public void setup() throws Exception {
+    super.setup();
+    DocMaker docMaker = getRunData().getDocMaker();
+    if (docSize > 0) {
+      doc = docMaker.makeDocument(docSize);
+    } else {
+      doc = docMaker.makeDocument();
+    }
+  }
+
+  /* (non-Javadoc)
+   * @see PerfTask#tearDown()
+   */
+  public void tearDown() throws Exception {
+    DocMaker docMaker = getRunData().getDocMaker();
+    log(docMaker.getCount());
+    doc = null;
+    super.tearDown();
+  }
+
+  public int doLogic() throws Exception {
+    getRunData().getIndexWriter().addDocument(doc);
+    return 1;
+  }
+
+  private void log (int count) {
+    if (logStep<0) {
+      // intentionally unsynchronized; a race here only repeats the config lookup
+      logStep = getRunData().getConfig().get("doc.add.log.step",500);
+    }
+    if (logStep>0 && (count%logStep)==0) {
+      System.out.println("--> processed "+count+" docs");
+    }
+  }
+
+  /**
+   * Set the params (docSize only)
+   * @param params docSize, or 0 for no limit.
+   */
+  public void setParams(String params) {
+    super.setParams(params);
+    docSize = (int) Float.parseFloat(params); 
+  }
+}
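Note on the diff above: AddDocTask.setParams() converts its parameter with `(int) Float.parseFloat(params)`, so it accepts both "100" and "100.0" and truncates any fractional part. The sketch below shows that behaviour in isolation. `ParamParseSketch` is a hypothetical name used only for illustration.

```java
// Hypothetical illustration of the parameter parsing used by setParams().
class ParamParseSketch {
  // Parses via float, then truncates toward zero when narrowing to int.
  static int parseSize(String params) {
    return (int) Float.parseFloat(params);
  }
}
```

One consequence of going through float: integers above 2^24 (16,777,216) may not survive the round trip exactly, which is unlikely to matter for document sizes but is worth knowing if the pattern is reused.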

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ClearStatsTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ClearStatsTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ClearStatsTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ClearStatsTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,44 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Clear statistics data.
+ * Other side effects: None.
+ */
+public class ClearStatsTask extends PerfTask {
+
+  public ClearStatsTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  public int doLogic() throws Exception {
+    getRunData().getPoints().clearData();
+    return 0;
+  }
+
+  /* (non-Javadoc)
+   * @see PerfTask#shouldNotRecordStats()
+   */
+  protected boolean shouldNotRecordStats() {
+    return true;
+  }
+
+}

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CloseIndexTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CloseIndexTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CloseIndexTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CloseIndexTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,44 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.io.IOException;
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.index.IndexWriter;
+
+/**
+ * Close index writer.
+ * Other side effects: index writer object in perfRunData is nullified.
+ */
+public class CloseIndexTask extends PerfTask {
+
+  public CloseIndexTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  public int doLogic() throws IOException {
+    IndexWriter iw = getRunData().getIndexWriter();
+    if (iw!=null) {
+      iw.close();
+    }
+    getRunData().setIndexWriter(null);
+    return 1;
+  }
+
+}

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CloseReaderTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CloseReaderTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CloseReaderTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CloseReaderTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,45 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.io.IOException;
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.index.IndexReader;
+
+/**
+ * Close index reader.
+ * Other side effects: index reader in perfRunData is nullified.
+ * This would cause read-related tasks to reopen their own reader.
+ */
+public class CloseReaderTask extends PerfTask {
+
+  public CloseReaderTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  public int doLogic() throws IOException {
+    IndexReader reader= getRunData().getIndexReader();
+    if (reader!=null) {
+      reader.close();
+    }
+    getRunData().setIndexReader(null);
+    return 1;
+  }
+
+}

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CreateIndexTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CreateIndexTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CreateIndexTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/CreateIndexTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,59 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.io.IOException;
+
+import org.apache.lucene.analysis.Analyzer;
+import org.apache.lucene.index.IndexWriter;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.benchmark.byTask.utils.Config;
+
+
+/**
+ * Create an index.
+ * Other side effects: index writer object in perfRunData is set.
+ */
+public class CreateIndexTask extends PerfTask {
+
+  public CreateIndexTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  public int doLogic() throws IOException {
+    Directory dir = getRunData().getDirectory();
+    Analyzer analyzer = getRunData().getAnalyzer();
+    
+    IndexWriter iw = new IndexWriter(dir, analyzer, true);
+    
+    Config config = getRunData().getConfig();
+    
+    boolean cmpnd = config.get("compound",true);
+    int mrgf = config.get("merge.factor",10);
+    int mxbf = config.get("max.buffered",10);
+
+    iw.setUseCompoundFile(cmpnd);
+    iw.setMergeFactor(mrgf);
+    iw.setMaxBufferedDocs(mxbf);
+
+    getRunData().setIndexWriter(iw);
+    return 1;
+  }
+
+}
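Note on the diff above: CreateIndexTask reads three settings with defaults: "compound" (true), "merge.factor" (10), and "max.buffered" (10). The Config class itself is not shown in this part of the commit, so the sketch below only illustrates the lookup-with-default idea using java.util.Properties. `IndexSettingsSketch` and its methods are assumptions made for illustration, not the actual Config implementation.

```java
import java.util.Properties;

// Hypothetical lookup-with-default helper; not the Config class from this commit.
class IndexSettingsSketch {
  // Returns the int value of the named property, or dflt when it is absent.
  static int getInt(Properties props, String name, int dflt) {
    String v = props.getProperty(name);
    return v == null ? dflt : Integer.parseInt(v.trim());
  }

  // Returns the boolean value of the named property, or dflt when it is absent.
  static boolean getBool(Properties props, String name, boolean dflt) {
    String v = props.getProperty(name);
    return v == null ? dflt : Boolean.valueOf(v.trim()).booleanValue();
  }
}
```

With only `merge.factor=20` set, the merge factor would be 20 while the compound-file flag and the buffered-docs limit keep their defaults.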

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/DeleteDocTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/DeleteDocTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/DeleteDocTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/DeleteDocTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,86 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Delete a document by docid.
+ * Other side effects: none.
+ */
+public class DeleteDocTask extends PerfTask {
+
+  public DeleteDocTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  private static int logStep = -1;
+  private static int deleteStep = -1;
+  private static int numDeleted = 0;
+  private static int lastDeleted = -1;
+
+  private int docid = -1;
+  private boolean byStep = true;
+  
+  public int doLogic() throws Exception {
+    getRunData().getIndexReader().deleteDocument(docid);
+    lastDeleted = docid;
+    return 1; // one work item done here
+  }
+
+  /* (non-Javadoc)
+   * @see org.apache.lucene.benchmark.byTask.tasks.PerfTask#setup()
+   */
+  public void setup() throws Exception {
+    super.setup();
+    // one time static initializations
+    if (logStep<0) {
+      logStep = getRunData().getConfig().get("doc.delete.log.step",500);
+    }
+    if (deleteStep<0) {
+      deleteStep = getRunData().getConfig().get("doc.delete.step",8);
+    }
+    // set the docid to be deleted
+    docid = (byStep ? lastDeleted + deleteStep : docid);
+  }
+
+  /* (non-Javadoc)
+   * @see PerfTask#tearDown()
+   */
+  public void tearDown() throws Exception {
+    log(++numDeleted);
+    super.tearDown();
+  }
+
+  private void log (int count) {
+    if (logStep>0 && (count%logStep)==0) {
+      System.out.println("--> processed "+count+" docs, last deleted: "+lastDeleted);
+    }
+  }
+  
+  /**
+   * Set the params (docid only)
+   * @param params docid to delete, or a negative value to delete by the doc.delete.step setting.
+   */
+  public void setParams(String params) {
+    super.setParams(params);
+    docid = (int) Float.parseFloat(params);
+    byStep = (docid < 0);
+  }
+
+}
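Note on the diff above: when DeleteDocTask runs in by-step mode, setup() picks `lastDeleted + deleteStep` as the next docid and doLogic() records it in lastDeleted. The sketch below traces the resulting sequence. `DeleteStepSketch` is a hypothetical name, and the loop stands in for repeated task runs.

```java
// Hypothetical trace of the by-step docid selection in DeleteDocTask.
class DeleteStepSketch {
  // Returns the first n doc ids chosen when deleting by step.
  static int[] nextDocIds(int deleteStep, int n) {
    int lastDeleted = -1; // same initial value as the static field in DeleteDocTask
    int[] ids = new int[n];
    for (int i = 0; i < n; i++) {
      int docid = lastDeleted + deleteStep; // what setup() computes when byStep is true
      ids[i] = docid;
      lastDeleted = docid; // what doLogic() records after deleting
    }
    return ids;
  }
}
```

With the default step of 8, the first three deletions target docids 7, 15, and 23. Because lastDeleted is static in the committed class, the sequence continues across task instances rather than restarting for each one.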

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/NewRoundTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/NewRoundTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/NewRoundTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/NewRoundTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,44 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+/**
+ * Increment the counter for properties maintained by Round Number.
+ * Other side effects: if there are props by round number, log value change.
+ */
+public class NewRoundTask extends PerfTask {
+
+  public NewRoundTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  public int doLogic() throws Exception {
+    getRunData().getConfig().newRound();
+    return 0;
+  }
+
+  /* (non-Javadoc)
+   * @see PerfTask#shouldNotRecordStats()
+   */
+  protected boolean shouldNotRecordStats() {
+    return true;
+  }
+}

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OpenIndexTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OpenIndexTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OpenIndexTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OpenIndexTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,59 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.io.IOException;
+
+import org.apache.lucene.analysis.Analyzer;
+import org.apache.lucene.index.IndexWriter;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.benchmark.byTask.utils.Config;
+
+
+/**
+ * Open an index writer.
+ * Other side effects: index writer object in perfRunData is set.
+ */
+public class OpenIndexTask extends PerfTask {
+
+  public OpenIndexTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  public int doLogic() throws IOException {
+    Directory dir = getRunData().getDirectory();
+    Analyzer analyzer = getRunData().getAnalyzer();
+    IndexWriter writer = new IndexWriter(dir, analyzer, false);
+    
+    Config config = getRunData().getConfig();
+    
+    boolean cmpnd = config.get("compound",true);
+    int mrgf = config.get("merge.factor",10);
+    int mxbf = config.get("max.buffered",10);
+
+    // must update params for newly opened writer
+    writer.setMaxBufferedDocs(mxbf);
+    writer.setMergeFactor(mrgf);
+    writer.setUseCompoundFile(cmpnd); // this one redundant?
+    
+    getRunData().setIndexWriter(writer);
+    return 1;
+  }
+
+}

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OpenReaderTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OpenReaderTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OpenReaderTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OpenReaderTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,43 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.io.IOException;
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.index.IndexReader;
+import org.apache.lucene.store.Directory;
+
+/**
+ * Open an index reader.
+ * Other side effects: index reader object in perfRunData is set.
+ */
+public class OpenReaderTask extends PerfTask {
+
+  public OpenReaderTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  public int doLogic() throws IOException {
+    Directory dir = getRunData().getDirectory();
+    IndexReader reader = IndexReader.open(dir);
+    getRunData().setIndexReader(reader);
+    return 1;
+  }
+
+}

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OptimizeTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OptimizeTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OptimizeTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/OptimizeTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,40 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.index.IndexWriter;
+
+/**
+ * Optimize the index.
+ * Other side effects: none.
+ */
+public class OptimizeTask extends PerfTask {
+
+  public OptimizeTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  public int doLogic() throws Exception {
+    IndexWriter iw = getRunData().getIndexWriter();
+    iw.optimize();
+    //System.out.println("optimize called");
+    return 1;
+  }
+
+}

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/PerfTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/PerfTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/PerfTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/PerfTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,217 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.benchmark.byTask.stats.Points;
+import org.apache.lucene.benchmark.byTask.stats.TaskStats;
+import org.apache.lucene.benchmark.byTask.utils.Format;
+
+/**
+ * An (abstract) task to be tested for performance.
+ * <br>
+ * Every performance task extends this class, and provides its own doLogic() method,
+ * which performs the actual task.
+ * <br>
+ * Tasks performing some work that should not be measured for the task can override
+ * setup() and/or tearDown() and place that work there.
+ */
+public abstract class PerfTask implements Cloneable {
+
+  private PerfRunData runData;
+  
+  // properties that all tasks have
+  private String name;
+  private int depth = 0;
+  private int maxDepthLogStart = 0;
+  protected String params = null;
+  
+  protected static final String NEW_LINE = System.getProperty("line.separator");
+
+  /**
+   * Should not be used externally
+   */
+  private PerfTask() {
+    name =  Format.simpleName(getClass());
+    if (name.endsWith("Task")) {
+      name = name.substring(0,name.length()-4);
+    }
+  }
+
+  public PerfTask(PerfRunData runData) {
+    this();
+    this.runData = runData;
+    this.maxDepthLogStart = runData.getConfig().get("task.max.depth.log",0);
+  }
+  
+  /* (non-Javadoc)
+   * @see java.lang.Object#clone()
+   */
+  protected Object clone() throws CloneNotSupportedException {
+    // tasks having non-primitive data structures should override this,
+    // otherwise parallel running of a task sequence might not run correctly.
+    return super.clone();
+  }
+
+  /**
+   * Run the task, record statistics.
+   * @return number of work items done by this task.
+   */
+  public final int runAndMaybeStats(boolean reportStats) throws Exception {
+    if (reportStats && depth <= maxDepthLogStart && !shouldNeverLogAtStart()) {
+      System.out.println("------------> starting task: " + getName());
+    }
+    if (shouldNotRecordStats() || !reportStats) {
+      setup();
+      int count = doLogic();
+      tearDown();
+      return count;
+    }
+    setup();
+    Points pnts = runData.getPoints();
+    TaskStats ts = pnts.markTaskStart(this,runData.getConfig().getRoundNumber());
+    int count = doLogic();
+    pnts.markTaskEnd(ts, count);
+    tearDown();
+    return count;
+  }
+
+  /**
+   * Perform the task once (ignoring repetitions specification).
+   * Return number of work items done by this task.
+   * For indexing that can be number of docs added.
+   * For warming that can be number of scanned items, etc.
+   * @return number of work items done by this task.
+   */
+  public abstract int doLogic() throws Exception;
+  
+  /**
+   * @return Returns the name.
+   */
+  public String getName() {
+    if (params==null) {
+      return name;
+    } 
+    return new StringBuffer(name).append('(').append(params).append(')').toString();
+  }
+
+  /**
+   * @param name The name to set.
+   */
+  protected void setName(String name) {
+    this.name = name;
+  }
+
+  /**
+   * @return Returns the run data.
+   */
+  public PerfRunData getRunData() {
+    return runData;
+  }
+
+  /**
+   * @return Returns the depth.
+   */
+  public int getDepth() {
+    return depth;
+  }
+
+  /**
+   * @param depth The depth to set.
+   */
+  public void setDepth(int depth) {
+    this.depth = depth;
+  }
+  
+  // compute a blank string padding for printing this task indented by its depth  
+  String getPadding () {
+    char c[] = new char[4*getDepth()];
+    for (int i = 0; i < c.length; i++) c[i] = ' ';
+    return new String(c);
+  }
+  
+  /* (non-Javadoc)
+   * @see java.lang.Object#toString()
+   */
+  public String toString() {
+    String padd = getPadding();
+    StringBuffer sb = new StringBuffer(padd);
+    sb.append(getName());
+    return sb.toString();
+  }
+
+  /**
+   * @return Returns the maxDepthLogStart.
+   */
+  int getMaxDepthLogStart() {
+    return maxDepthLogStart;
+  }
+
+  /**
+   * Tasks that should never log at start can override this.
+   * @return true if this task should never log when it starts.
+   */
+  protected boolean shouldNeverLogAtStart () {
+    return false;
+  }
+  
+  /**
+   * Tasks that should not record statistics can override this.
+   * @return true if this task should never record its statistics.
+   */
+  protected boolean shouldNotRecordStats () {
+    return false;
+  }
+
+  /**
+   * Task setup work that should not be measured for that specific task.
+   * By default it does nothing, but tasks can implement this, moving work from 
+   * doLogic() to this method. Only the work done in doLogic() is measured for this task.
+   * Notice that higher level (sequence) tasks containing this task would then 
+   * measure larger time than the sum of their contained tasks.
+   * @throws Exception 
+   */
+  public void setup () throws Exception {
+  }
+  
+  /**
+   * Task tearDown work that should not be measured for that specific task.
+   * By default it does nothing, but tasks can implement this, moving work from 
+   * doLogic() to this method. Only the work done in doLogic() is measured for this task.
+   * Notice that higher level (sequence) tasks containing this task would then 
+   * measure larger time than the sum of their contained tasks.
+   */
+  public void tearDown () throws Exception {
+  }
+
+  /**
+   * Set the params of this task.
+   * Subclasses that support parameters may override this method for fetching/processing the params.
+   */
+  public void setParams(String params) {
+    this.params = params;
+  }
+
+  /**
+   * @return Returns the Params.
+   */
+  public String getParams() {
+    return params;
+  }
+
+}
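
Note on the PerfTask timing contract above: runAndMaybeStats() calls setup(), then marks the task start, runs doLogic(), marks the task end, and only then calls tearDown(), so only doLogic() is measured. The following is a minimal, self-contained sketch of that pattern. It is not part of this commit: SketchTask, AddDocsSketch, measuredNanos and trace are hypothetical names, and System.nanoTime() stands in for the Points/TaskStats bookkeeping.

```java
// Hypothetical, self-contained sketch; none of these classes are part of the commit.
public class TimedTaskSketch {

  /** Stand-in for PerfTask: run() times doLogic() only. */
  abstract static class SketchTask {
    long measuredNanos;                               // time spent inside doLogic()
    final StringBuilder trace = new StringBuilder();  // records call order, for illustration

    void setup() { }          // unmeasured; no-op by default
    abstract int doLogic();   // measured work; returns the number of work items
    void tearDown() { }       // unmeasured; no-op by default

    final int run() {
      setup();
      long start = System.nanoTime();
      int count = doLogic();
      measuredNanos = System.nanoTime() - start;
      tearDown();
      return count;
    }
  }

  /** Example task: preparing the input is setup work, processing it is the measured work. */
  static class AddDocsSketch extends SketchTask {
    private String[] docs;

    @Override
    void setup() {
      docs = new String[] {"a", "b", "c"};
      trace.append("setup,");
    }

    @Override
    int doLogic() {
      trace.append("doLogic,");
      return docs.length;
    }

    @Override
    void tearDown() {
      docs = null;
      trace.append("tearDown");
    }
  }

  public static void main(String[] args) {
    AddDocsSketch task = new AddDocsSketch();
    int count = task.run();
    System.out.println(count + " items, order: " + task.trace);
  }
}
```

As the setup() javadoc points out, an enclosing sequence task that times the whole run would report more time than the sum of its children, because the children's setup and tearDown work falls inside the parent's measured interval.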

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ReadTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ReadTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ReadTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ReadTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,124 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.benchmark.byTask.feeds.QueryMaker;
+import org.apache.lucene.document.Document;
+import org.apache.lucene.index.IndexReader;
+import org.apache.lucene.search.Hits;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.store.Directory;
+
+
+
+/**
+ * Read index (abstract) task.
+ * Sub classes implement withSearch(), withWarm(), withTraverse() and withRetrieve()
+ * methods to configure the actual action.
+ * Other side effects: none.
+ */
+public abstract class ReadTask extends PerfTask {
+
+  public ReadTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  public int doLogic() throws Exception {
+    int res = 0;
+    boolean closeReader = false;
+    
+    // open reader or use existing one
+    IndexReader ir = getRunData().getIndexReader();
+    if (ir == null) {
+      Directory dir = getRunData().getDirectory();
+      ir = IndexReader.open(dir);
+      closeReader = true;
+      //res++; //this is confusing, comment it out
+    }
+    
+    // optionally warm and add num docs traversed to count
+    if (withWarm()) {
+      Document doc = null;
+      for (int m = 0; m < ir.maxDoc(); m++) {
+        if (!ir.isDeleted(m)) {
+          doc = ir.document(m);
+          res += (doc==null ? 0 : 1);
+        }
+      }
+    }
+    
+    if (withSearch()) {
+      res++;
+      IndexSearcher searcher = new IndexSearcher(ir);
+      QueryMaker queryMaker = getQueryMaker();
+      Query q = queryMaker.makeQuery();
+      Hits hits = searcher.search(q);
+      //System.out.println("searched: "+q);
+      
+      if (withTraverse()) {
+        Document doc = null;
+        if (hits != null && hits.length() > 0) {
+          for (int m = 0; m < hits.length(); m++) {
+            int id = hits.id(m);
+            res++;
+
+            if (withRetrieve()) {
+              doc = ir.document(id);
+              res += (doc==null ? 0 : 1);
+            }
+          }
+        }
+      }
+      
+      searcher.close();
+    }
+    
+    if (closeReader) {
+      ir.close();
+    }
+    return res;
+  }
+
+  /**
+   * Return query maker used for this task.
+   */
+  public abstract QueryMaker getQueryMaker();
+
+  /**
+   * Return true if search should be performed.
+   */
+  public abstract boolean withSearch ();
+
+  /**
+   * Return true if warming should be performed.
+   */
+  public abstract boolean withWarm ();
+  
+  /**
+   * Return true if, with search, results should be traversed.
+   */
+  public abstract boolean withTraverse ();
+
+  /**
+   * Return true if, with search & results traversing, docs should be retrieved.
+   */
+  public abstract boolean withRetrieve ();
+
+}

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/RepAllTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/RepAllTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/RepAllTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/RepAllTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,43 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.benchmark.byTask.stats.Report;
+
+/**
+ * Report all statistics with no aggregations.
+ * Other side effects: None.
+ */
+public class RepAllTask extends ReportTask {
+
+  public RepAllTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  public int doLogic() throws Exception {
+    Report rp = getRunData().getPoints().reportAll();
+    
+    System.out.println();
+    System.out.println("------------> Report All ("+rp.getSize()+" out of "+rp.getOutOf()+")");
+    System.out.println(rp.getText());
+    System.out.println();
+    return 0;
+  }
+
+}

Added: lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/RepSelectByPrefTask.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/RepSelectByPrefTask.java?view=auto&rev=495834
==============================================================================
--- lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/RepSelectByPrefTask.java (added)
+++ lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/RepSelectByPrefTask.java Fri Jan 12 20:08:23 2007
@@ -0,0 +1,44 @@
+package org.apache.lucene.benchmark.byTask.tasks;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.lucene.benchmark.byTask.PerfRunData;
+import org.apache.lucene.benchmark.byTask.stats.Report;
+
+/**
+ * Report by-name-prefix statistics with no aggregations.
+ * Other side effects: None.
+ */
+public class RepSelectByPrefTask extends RepSumByPrefTask {
+
+  public RepSelectByPrefTask(PerfRunData runData) {
+    super(runData);
+  }
+
+  public int doLogic() throws Exception {
+    Report rp = getRunData().getPoints().reportSelectByPrefix(prefix);
+    
+    System.out.println();
+    System.out.println("------------> Report Select By Prefix ("+prefix+") ("+
+        rp.getSize()+" about "+rp.getReported()+" out of "+rp.getOutOf()+")");
+    System.out.println(rp.getText());
+    System.out.println();
+
+    return 0;
+  }
+}


