hadoop-common-dev mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1662) [hbase] Make region splits faster
Date Wed, 08 Aug 2007 19:30:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518532 ]

Hadoop QA commented on HADOOP-1662:
-----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12363378/splits-v3.patch applied and successfully
tested against trunk revision r563649.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/529/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/529/console

> [hbase] Make region splits faster
> ---------------------------------
>
>                 Key: HADOOP-1662
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1662
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>         Attachments: fastsplits.patch, mapfile_split.patch, splits-2.patch, splits-v3.patch
>
>
> HADOOP-1644 '[hbase] Compactions should take no longer than period between memcache flushes'
> is about making compactions run faster.  This issue is about making splits faster.  Currently,
> a split reads a mapfile as input and, record by record, writes out two new mapfiles.
> This is too slow: ~30 seconds to split 120MB.  Google hints in Bigtable that splitting
> is very fast because they let the split children feed off the split parent.  Primitive testing
> shows that splitting mapfiles using raw streams runs 3 to 4 times faster than splitting on
> mapfile keys.
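
The two approaches contrasted in the description can be sketched roughly as follows. This is an illustrative model only, not the code in splits-v3.patch: a sorted in-memory list stands in for a mapfile, and the names `split_by_copy`, `split_by_reference`, and `HalfReference` are hypothetical. The first function rewrites every record into one of two children; the second only locates the split point and lets each child read its half from the parent.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple

# (row key, value): a stand-in for one entry of a sorted mapfile (assumption).
Record = Tuple[str, bytes]


def split_by_copy(parent: List[Record], mid_key: str) -> Tuple[List[Record], List[Record]]:
    """Per-record split: read every record and rewrite it into one of two children."""
    lower: List[Record] = []
    upper: List[Record] = []
    for key, value in parent:
        (lower if key < mid_key else upper).append((key, value))
    return lower, upper


@dataclass(frozen=True)
class HalfReference:
    """Hypothetical child that points at a slice of the parent instead of copying it."""
    parent: List[Record]
    start: int  # index of the first record belonging to this child
    end: int    # one past the last record belonging to this child

    def scan(self) -> Iterator[Record]:
        # Records are served straight from the parent; nothing was rewritten.
        for i in range(self.start, self.end):
            yield self.parent[i]


def split_by_reference(parent: List[Record], mid_key: str) -> Tuple[HalfReference, HalfReference]:
    """Find the split point by binary search; the children feed off the parent."""
    lo, hi = 0, len(parent)
    while lo < hi:
        mid = (lo + hi) // 2
        if parent[mid][0] < mid_key:
            lo = mid + 1
        else:
            hi = mid
    return HalfReference(parent, 0, lo), HalfReference(parent, lo, len(parent))
```

In this model the copying split does work proportional to the number of records, while the reference split only performs a logarithmic number of key comparisons and defers all reads to the children. A real implementation would also have to decide when the parent can be discarded, which the sketch does not address.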

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

