accumulo-user mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: Accumulo on MapR Continued
Date Fri, 30 Mar 2012 19:16:45 GMT
Another test that would be interesting to try is the continuous ingest
test under test/system/continuous.  There is a lot in the readme.  A
basic test would be to run ingest for a bit (a few hours or
overnight), then stop ingest and run the verification map reduce job.

Keith

On Fri, Mar 30, 2012 at 2:07 PM, Keys Botzum <kbotzum@maprtech.com> wrote:
> I've been building on the work of Todd Stavish to put Accumulo onto MapR:
>
> http://t.co/RJJ8Ht4B
> http://mail-archives.apache.org/mod_mbox/accumulo-user/201203.mbox/%3CCADKA3WiSRDrTw_vKcB9zrP0oyq01jQfZOBGQDRnujEjQ3kHd2w@mail.gmail.com%3E
>
> I've installed Accumulo on a 5 node MapR cluster (real hardware) basing my
> work on what Todd posted earlier. I'm using Accumulo 1.4.0 RC3 (downloaded
> from http://people.apache.org/~kturner/1.4.0rc3/) which contains the file
> system fix (https://issues.apache.org/jira/browse/ACCUMULO-476).
>
> The install went fine. I can start the Accumulo shell and create tables and
> such without difficulty.
>
> I then decided to run a more complete test using the built-in Accumulo tests
> found under ACCUMULO_HOME/test, and this is where I started encountering
> problems. It may be that I just don't understand how to configure and run
> the tests properly, or there may be a deeper problem. Any help would be
> greatly appreciated.
>
> I decided to run the tests under the auto directory as those seemed
> promising (was that a good idea?). As instructed in the README I can start
> tests in local mode (no MapReduce job) using ./run.py -t testname or leaving
> out the -t to run all tests. When I run the tests (all at once or
> individually), the majority succeed but a small subset consistently fails.
> The failing tests are:
>
> FAIL: runTest (simple.batchScanSplit.BatchScanSplitTest)
> FAIL: runTest (simple.bulkSplitOptimization.BulkSplitOptimizationTest)
> FAIL: runTest (simple.deleterows.DeleteRowsSplitTest)
> FAIL: runTest (simple.dynamic.DynamicClassloader)
> FAIL: runTest (simple.largeRow.LargeRowTest)
> FAIL: runTest (simple.merge.MergeTest)
> FAIL: runTest (simple.zooCacheTest.ZooCacheTest)
>
> Some fail because of timeouts; others fail with assertion errors. Rather than
> dumping every issue into the forum, I thought I'd start by focusing on a
> single test, simple.bulkSplitOptimization.BulkSplitOptimizationTest. When I
> run this test by itself, this is the output:
>
> ./run.py -t ulkSplitOptimizationTest -d -v 10
> …
> DEBUG:test.auto:out:        1,010 records written |
> DEBUG:test.auto:out:   72,142 records/sec |       92,920 bytes written |
> 6,637,142 bytes/sec |
> DEBUG:test.auto:out:  0.014 secs
> FAIL
> ======================================================================
> FAIL: runTest (simple.bulkSplitOptimization.BulkSplitOptimizationTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/opt/accumulo-1.4.0/test/system/auto/JavaTest.py", line 56, in
> runTest
>     self.waitForStop(handle, self.maxRuntime)
>   File "/opt/accumulo-1.4.0/test/system/auto/TestUtils.py", line 371, in
> waitForStop
>     self.fail("Process failed to finish in %s seconds" % secs)
> AssertionError: Process failed to finish in 200 seconds
>
>
> ======================================================================
> FAIL: runTest (simple.bulkSplitOptimization.BulkSplitOptimizationTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/opt/accumulo-1.4.0/test/system/auto/JavaTest.py", line 56, in
> runTest
>     self.waitForStop(handle, self.maxRuntime)
>   File "/opt/accumulo-1.4.0/test/system/auto/TestUtils.py", line 371, in
> waitForStop
>     self.fail("Process failed to finish in %s seconds" % secs)
> AssertionError: Process failed to finish in 200 seconds
>
> ----------------------------------------------------------------------
> Ran 1 test in 208.163s
>
> FAILED (failures=1)
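
The traceback above shows the harness giving up after a fixed runtime (200 seconds here) rather than the Java process reporting an error itself. As a rough illustration of that pattern only, here is a minimal Python sketch of a wait-with-deadline helper. It is not the code in TestUtils.py; the name wait_for_stop, the poll_interval parameter, and the sleeping child process are all assumptions made for the example.

```python
import subprocess
import sys
import time


def wait_for_stop(proc, secs, poll_interval=0.1):
    """Return True if proc exits within secs seconds, otherwise False.

    Hypothetical helper: it only illustrates the timeout pattern and is
    not the implementation used by the Accumulo test harness.
    """
    deadline = time.monotonic() + secs
    while time.monotonic() < deadline:
        if proc.poll() is not None:  # poll() returns the exit code once done
            return True
        time.sleep(poll_interval)
    # One last check in case the process exited during the final sleep.
    return proc.poll() is not None


if __name__ == "__main__":
    # A child that sleeps longer than the allowed runtime stands in for a
    # test process that never finishes.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(5)"])
    if not wait_for_stop(child, secs=1):
        child.kill()
        child.wait()
        print("Process failed to finish in 1 seconds")
```

A failure of this kind says only that the process was still running at the deadline, so the cause has to be found in the preserved logs.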
>
> Since I set the test to preserve the logs, I looked at the logs and found
> only one thing that jumps out at me as the problem. I see these messages
> repeated over and over while the test is running:
>
> 30 11:00:14,587 [tabletserver.TabletServer] DEBUG: ScanSess tid
> 10.250.99.204:45335 !0 2 entries in 0.00 secs, nbTimes = [1 1 1.00 1]
> 30 11:00:15,092 [tabletserver.TabletServer] DEBUG: ScanSess tid
> 10.250.99.204:45335 !0 2 entries in 0.00 secs, nbTimes = [0 0 0.00 1]
> 30 11:00:15,499 [file.FileUtil] DEBUG: Too many indexes (100) to open at
> once for null null, reducing in tmpDir =
> /user/mapr/accumulo-SE-test-04-17598/tmp/idxReduce_1785031962
> 30 11:00:15,552 [tabletserver.TabletServer] DEBUG: gc PS
> Scavenge=4.22(+0.00) secs PS MarkSweep=0.18(+0.00) secs
> freemem=74,816,456(+5,884,960) totalmem=110,100,480
> 30 11:00:15,596 [tabletserver.TabletServer] DEBUG: ScanSess tid
> 10.250.99.204:45335 !0 2 entries in 0.00 secs, nbTimes = [1 1 1.00 1]
> 30 11:00:15,643 [file.FileUtil] DEBUG: Finished reducing indexes for null
> null in   0.14 secs
> 30 11:00:15,654 [file.FileUtil] DEBUG: Found midPoint from indexes in   0.01
> secs.
>
> 30 11:00:15,778 [tabletserver.Tablet] ERROR: Failed to find lastkey Seeking
> beyond EOF, filelen 733, wantpos 733
>
>
> I suspect the last line is the cause, but I do not understand what it means
> or how to investigate it further. I can of course provide any additional
> information you need; I just didn't want to make this first post even
> longer.
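
One way to narrow the search is to separate the few ERROR entries from the repeated DEBUG output. The following is a minimal Python sketch under stated assumptions: the line layout ("day time [logger] LEVEL: message") is inferred only from the excerpt above and may not match every Accumulo log line, lines that do not start a new entry are treated as wrapped continuations, and the function names are hypothetical.

```python
import re

# Start of an entry as it appears in the excerpt, for example:
# 30 11:00:15,778 [tabletserver.Tablet] ERROR: Failed to find lastkey ...
ENTRY_START = re.compile(
    r"^(\d{2}) (\d{2}:\d{2}:\d{2},\d{3}) \[([^\]]+)\] (\w+): (.*)$"
)


def parse_entries(lines):
    """Group raw log lines into entries, joining wrapped continuation lines."""
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = ENTRY_START.match(line)
        if match:
            day, clock, logger, level, message = match.groups()
            entries.append({"day": day, "time": clock, "logger": logger,
                            "level": level, "message": message})
        elif entries and line.strip():
            # Not the start of an entry: append to the previous message.
            entries[-1]["message"] += " " + line.strip()
    return entries


def filter_levels(entries, levels=("WARN", "ERROR", "FATAL")):
    """Keep only the entries whose level is in levels."""
    return [entry for entry in entries if entry["level"] in levels]


if __name__ == "__main__":
    sample = """\
30 11:00:15,654 [file.FileUtil] DEBUG: Found midPoint from indexes in   0.01
secs.
30 11:00:15,778 [tabletserver.Tablet] ERROR: Failed to find lastkey Seeking
beyond EOF, filelen 733, wantpos 733
"""
    for entry in filter_levels(parse_entries(sample.splitlines())):
        print(entry["time"], entry["logger"], entry["level"], entry["message"])
```

Run against a preserved tablet server log, this would list each ERROR with its timestamp, which can then be compared with the DEBUG lines just before it (here, the FileUtil index reduction).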
>
> Help is appreciated.
>
> Thanks!
> Keys
>
> p.s. As a sanity test I installed Accumulo a 2nd time on a MapR M5 VM with
> one CPU and the same tests fail.
> ________________________________
> Keys Botzum
> Senior Principal Technologist
> WW Systems Engineering
> kbotzum@maprtech.com
> 443-718-0098
> MapR Technologies
> http://www.mapr.com
>
>
>
