hbase-issues mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7747) Import tools should use a combiner to merge Puts
Date Tue, 26 Feb 2013 20:40:15 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587512#comment-13587512 ]

Nick Dimiduk commented on HBASE-7747:
-------------------------------------

The grammar of that comment is terrible! Someone might expect better of me. What I meant to
say is:

bq. TODO: There's nothing to say Puts (values) are keyed on rowkey. Thus the map of put.getRow()
to combined Put is necessary. Could use HeapSize to create an upper bound on the memory size
of the puts map and flush some portion of the content. This is acceptable because Combiner
is run an unspecified number of times and is for optimization only.
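
To make the idea in that TODO concrete, here is a minimal, self-contained sketch of a size-bounded puts map. It is not the code from the attached patch. `SimplePut`, its `heapSize()` estimate, and `PutCombinerSketch` are hypothetical stand-ins for HBase's Put, HeapSize, and a Hadoop Combiner, chosen so the example runs without any HBase or Hadoop dependency. For simplicity it flushes the whole map once the bound is exceeded rather than only a portion of it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for HBase's Put: a row key plus a list of cell values.
final class SimplePut {
    final String row;
    final List<String> cells = new ArrayList<>();

    SimplePut(String row) { this.row = row; }

    SimplePut add(String cell) { cells.add(cell); return this; }

    // Rough size estimate (character counts), standing in for HeapSize.
    long heapSize() {
        long size = row.length();
        for (String c : cells) size += c.length();
        return size;
    }
}

// Merges incoming puts by row key and flushes once the estimated size
// of the buffered puts exceeds maxBytes.
final class PutCombinerSketch {
    private final long maxBytes;
    private final Map<String, SimplePut> puts = new LinkedHashMap<>();
    private final List<SimplePut> emitted = new ArrayList<>();
    private long currentBytes = 0;

    PutCombinerSketch(long maxBytes) { this.maxBytes = maxBytes; }

    // Values are not assumed to be keyed on row, so look the row up in the map.
    void accept(SimplePut put) {
        SimplePut merged = puts.get(put.row);
        if (merged == null) {
            merged = new SimplePut(put.row);
            puts.put(put.row, merged);
            currentBytes += put.row.length();
        }
        for (String cell : put.cells) {
            merged.add(cell);
            currentBytes += cell.length();
        }
        if (currentBytes > maxBytes) flush();
    }

    // Emits everything buffered so far. A row seen again after a flush
    // produces a second put, which is fine for an optimization-only step.
    void flush() {
        emitted.addAll(puts.values());
        puts.clear();
        currentBytes = 0;
    }

    List<SimplePut> emitted() { return emitted; }
}
```

With a generous bound, two puts for the same row come out as one merged put. With a tight bound, an early flush can split a row across two puts, which is the trade-off the TODO accepts.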

We could further constrain the combiner by requiring that the keys be the rowkeys used
in the Puts. In that case, the puts map is unnecessary.
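
As a rough illustration of that constrained variant: if every value handed to the combiner for a given key is guaranteed to belong to that row, each call can build one merged Put directly, with no map. The names below (`KeyedPutCombinerSketch`, `RowPut`, `combine`) are hypothetical and use plain Java collections in place of the HBase and Hadoop types.

```java
import java.util.ArrayList;
import java.util.List;

final class KeyedPutCombinerSketch {
    // Hypothetical stand-in for a Put: the row key plus its merged cells.
    static final class RowPut {
        final String row;
        final List<String> cells = new ArrayList<>();

        RowPut(String row) { this.row = row; }
    }

    // Called once per key, as a combiner's reduce step would be.
    // Assumes every batch in cellBatches belongs to rowKey.
    static RowPut combine(String rowKey, Iterable<List<String>> cellBatches) {
        RowPut merged = new RowPut(rowKey);
        for (List<String> batch : cellBatches) {
            merged.cells.addAll(batch);
        }
        return merged;
    }
}
```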

The higher-level objective is for an MR job to create a single Put per row. This avoids the
row-level contention on write that you see when writing to tables with wide/sparse schemas.
                
> Import tools should use a combiner to merge Puts
> ------------------------------------------------
>
>                 Key: HBASE-7747
>                 URL: https://issues.apache.org/jira/browse/HBASE-7747
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce, Performance
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>            Priority: Minor
>             Fix For: 0.95.0
>
>         Attachments: 0001-HBASE-7747-Import-use-a-Put-combiner-where-possible.patch
>
>
> Multiple Puts to the same row should be combined into a single mutation object. This
> can be done with a Combiner. Import.Importer#writeResult appears to do this manually.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
