hbase-issues mailing list archives

From "stack (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5754) data lost with gora continuous ingest test (goraci)
Date Fri, 13 Apr 2012 05:49:01 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253135#comment-13253135 ]

stack commented on HBASE-5754:
------------------------------

I ran w/ 10 generators and 10 slots for the verify step and got the output below, which prints
out only a REFERENCED count.

Running these recent tests, I let it do its natural splitting, so it grew from zero to 260-odd
regions. So maybe the issue you see, Eric, comes from manual splits triggered from the UI.  Let
me try that next.

Thanks lads.

{code}
12/04/13 05:16:23 INFO mapred.JobClient:  map 100% reduce 99%
12/04/13 05:16:54 INFO mapred.JobClient:  map 100% reduce 100%
12/04/13 05:16:59 INFO mapred.JobClient: Job complete: job_201204092039_0046
12/04/13 05:16:59 INFO mapred.JobClient: Counters: 30
12/04/13 05:16:59 INFO mapred.JobClient:   Job Counters
12/04/13 05:16:59 INFO mapred.JobClient:     Launched reduce tasks=10
12/04/13 05:16:59 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=30125694
12/04/13 05:16:59 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/04/13 05:16:59 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/04/13 05:16:59 INFO mapred.JobClient:     Rack-local map tasks=6
12/04/13 05:16:59 INFO mapred.JobClient:     Launched map tasks=256
12/04/13 05:16:59 INFO mapred.JobClient:     Data-local map tasks=250
12/04/13 05:16:59 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=5832198
12/04/13 05:16:59 INFO mapred.JobClient:   goraci.Verify$Counts
12/04/13 05:16:59 INFO mapred.JobClient:     REFERENCED=1000000000
12/04/13 05:16:59 INFO mapred.JobClient:   File Output Format Counters
12/04/13 05:16:59 INFO mapred.JobClient:     Bytes Written=0
12/04/13 05:16:59 INFO mapred.JobClient:   FileSystemCounters
12/04/13 05:16:59 INFO mapred.JobClient:     FILE_BYTES_READ=83022967343
12/04/13 05:16:59 INFO mapred.JobClient:     HDFS_BYTES_READ=156414
12/04/13 05:16:59 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=112881560332
12/04/13 05:16:59 INFO mapred.JobClient:   File Input Format Counters
12/04/13 05:16:59 INFO mapred.JobClient:     Bytes Read=0
12/04/13 05:16:59 INFO mapred.JobClient:   Map-Reduce Framework
12/04/13 05:16:59 INFO mapred.JobClient:     Map output materialized bytes=29992170602
12/04/13 05:16:59 INFO mapred.JobClient:     Map input records=1000000000
12/04/13 05:16:59 INFO mapred.JobClient:     Reduce shuffle bytes=29874879887
12/04/13 05:16:59 INFO mapred.JobClient:     Spilled Records=7527086436
12/04/13 05:16:59 INFO mapred.JobClient:     Map output bytes=25992155242
12/04/13 05:16:59 INFO mapred.JobClient:     CPU time spent (ms)=20182570
12/04/13 05:16:59 INFO mapred.JobClient:     Total committed heap usage (bytes)=99953082368
12/04/13 05:16:59 INFO mapred.JobClient:     Combine input records=0
12/04/13 05:16:59 INFO mapred.JobClient:     SPLIT_RAW_BYTES=156414
12/04/13 05:16:59 INFO mapred.JobClient:     Reduce input records=2000000000
12/04/13 05:16:59 INFO mapred.JobClient:     Reduce input groups=1000000000
12/04/13 05:16:59 INFO mapred.JobClient:     Combine output records=0
12/04/13 05:16:59 INFO mapred.JobClient:     Physical memory (bytes) snapshot=91762372608
12/04/13 05:16:59 INFO mapred.JobClient:     Reduce output records=0
12/04/13 05:16:59 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=391126540288
12/04/13 05:16:59 INFO mapred.JobClient:     Map output records=2000000000
{code}
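
For anyone reading the counters above without the goraci README to hand, here is a rough sketch of the kind of check a verify pass like this performs. It is an illustration only, not the goraci code: the function name `verify`, the dict-based input, and the UNREFERENCED and UNDEFINED counter names are assumptions for the example (only REFERENCED appears in the output above). The idea is that each node emits one "defined" marker for itself and one reference to the node it points at, which is consistent with Map output records being twice Map input records, and the reduce side then classifies each key.

```python
from collections import Counter, defaultdict


def verify(nodes):
    """Classify nodes of a linked structure.

    nodes: dict mapping a node key to the key of the node it points at
    (hypothetical in-memory stand-in for the table being verified).
    """
    # "Map" side: one definition marker per node, one reference per link.
    grouped = defaultdict(lambda: {"defined": False, "refs": 0})
    for key, prev in nodes.items():
        grouped[key]["defined"] = True
        if prev is not None:
            grouped[prev]["refs"] += 1

    # "Reduce" side: classify every key that was defined or referenced.
    counts = Counter()
    for info in grouped.values():
        if info["defined"] and info["refs"] > 0:
            counts["REFERENCED"] += 1
        elif info["defined"]:
            counts["UNREFERENCED"] += 1  # assumed counter name
        else:
            counts["UNDEFINED"] += 1  # assumed counter name: referenced but missing
    return counts


if __name__ == "__main__":
    ring = {0: 2, 1: 0, 2: 1}  # intact loop of three nodes
    print(verify(ring))
    lossy = {0: 2, 2: 1}  # node 1 has gone missing
    print(verify(lossy))
```

In this sketch an intact loop yields only a REFERENCED count, while a lost node shows up as a key that is referenced but never defined.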
                
> data lost with gora continuous ingest test (goraci)
> ---------------------------------------------------
>
>                 Key: HBASE-5754
>                 URL: https://issues.apache.org/jira/browse/HBASE-5754
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>         Environment: 10 node test cluster
>            Reporter: Eric Newton
>            Assignee: stack
>
> Keith Turner re-wrote the accumulo continuous ingest test using gora, which has both hbase and accumulo back-ends.
> I put a billion entries into HBase, and ran the Verify map/reduce job.  The verification failed because about 21K entries were missing.  The goraci [README|https://github.com/keith-turner/goraci] explains the test, and how it detects missing data.
> I re-ran the test with 100 million entries, and it verified successfully.  
> Both of the times I tested using a billion entries, the verification failed.
> If I run the verification step twice, the results are consistent, so the problem is
> probably not on the verify step.
> Here are the versions of the various packages:
> ||package||version||
> |hadoop|0.20.205.0|
> |hbase|0.92.1|
> |gora|http://svn.apache.org/repos/asf/gora/trunk r1311277|
> |goraci|https://github.com/ericnewton/goraci  tagged 2012-04-08|
> The change I made to goraci was to configure it for hbase and to allow it to build properly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
