hbase-issues mailing list archives

From "Lars George (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6372) Add scanner batching to Export job
Date Fri, 03 Aug 2012 10:03:02 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427952#comment-13427952 ]

Lars George commented on HBASE-6372:
------------------------------------

A few comments on the patch:

{code}
+  final static String EXPORT_BATCHING = "hbase.mapreduce.export.batch";
...
         + "   -Dhbase.client.scanner.caching=100\n"
{code}

I see that you followed the config key naming already in use where you declared your new key.
But looking at the other keys being used, they are all over the place. The one for scanner
caching is the closest to what you are adding, so I suggest we follow the same pattern,
i.e. name it "hbase.export.scanner.batch".

{code}
+      try {
+        s.setBatch(batching);
+      } catch (Exception e) {
+        LOG.error("Batching is not set because : "+e.toString());
+      }
{code}

Why wrap the setBatch() call in a try/catch? None of the filters being used is of the kind that
triggers the runtime exception. We can add the try/catch later if it is ever needed.
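
A minimal sketch of the simpler shape, not the actual patch. To keep it self-contained, java.util.Properties stands in for the HBase Configuration and the hypothetical ScanStub class stands in for Scan; the `!= -1` guard and the helper names are assumptions made for illustration.

```java
import java.util.Properties;

// Self-contained sketch. Properties and ScanStub are stand-ins (assumptions),
// not the real HBase Configuration and Scan classes.
class ExportBatchingSketch {
  // Key as declared in the patch under review; the final name may differ.
  static final String EXPORT_BATCHING = "hbase.mapreduce.export.batch";

  /** Hypothetical stand-in for Scan that only tracks the batch size. */
  static class ScanStub {
    private int batch = -1; // -1 means batching is not set

    void setBatch(int batch) {
      this.batch = batch;
    }

    int getBatch() {
      return batch;
    }
  }

  /** Roughly analogous to conf.getInt(key, defaultValue) for the stand-in. */
  static int getInt(Properties conf, String key, int defaultValue) {
    String raw = conf.getProperty(key);
    return raw == null ? defaultValue : Integer.parseInt(raw.trim());
  }

  /** Applies the batch size only when the user supplied one; no try/catch. */
  static ScanStub configureScan(Properties conf) {
    ScanStub s = new ScanStub();
    int batching = getInt(conf, EXPORT_BATCHING, -1);
    if (batching != -1) {
      s.setBatch(batching);
    }
    return s;
  }
}
```

With the key unset the scan is left untouched, so the job behaves exactly as before unless the user opts in.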

{code}
+    int batching = conf.getInt(EXPORT_BATCHING,-1);
{code}

Minor nit, there should be a space between the comma and the value.

{code}
+        LOG.error("Batching is not set because : "+e.toString());
{code}

Same minor nit, there are no spaces around the string concatenation operator.

{code}
+    System.err.println("For very wide rows consider set scan batching properties as below:\n"
{code}

Maybe rephrase a bit? For example: "For tables with very wide rows consider setting the batch
size as below:".

                
> Add scanner batching to Export job
> ----------------------------------
>
>                 Key: HBASE-6372
>                 URL: https://issues.apache.org/jira/browse/HBASE-6372
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>    Affects Versions: 0.96.0, 0.94.2
>            Reporter: Lars George
>            Assignee: Shengsheng Huang
>            Priority: Minor
>              Labels: newbie
>         Attachments: HBASE-6372.2.patch, HBASE-6372.3.patch, HBASE-6372.patch
>
>
> When a single row is too large for the RS heap, an OOME can take out the entire RS.
> Setting scanner batching in custom scans helps avoid this scenario, but the supplied
> Export job does not set it.
> Similar to HBASE-3421, we can set the batching to a low number or, if needed, make it
> a command line option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
