hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Doug Meil (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8571) CopyTable and RowCounter don't seem to use setCaching setting
Date Fri, 17 May 2013 17:57:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660908#comment-13660908
] 

Doug Meil commented on HBASE-8571:
----------------------------------

Make that "every time" I look at it.

                
> CopyTable and RowCounter don't seem to use setCaching setting
> -------------------------------------------------------------
>
>                 Key: HBASE-8571
>                 URL: https://issues.apache.org/jira/browse/HBASE-8571
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Doug Meil
>
> Maybe it's just me, but I've been looking on trunk and I don't see where either RowCounter
or CopyTable MapReduce can adjust the setCaching setting on the Scan instance.
> Example from RowCounter...
> {code}
>    Job job = new Job(conf, NAME + "_" + tableName);
>     job.setJarByClass(RowCounter.class);
>     Scan scan = new Scan();
>     scan.setCacheBlocks(false);
>     Set<byte []> qualifiers = new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);
>     if (startKey != null && !startKey.equals("")) {
>       scan.setStartRow(Bytes.toBytes(startKey));
>     }
>     if (endKey != null && !endKey.equals("")) {
>       scan.setStopRow(Bytes.toBytes(endKey));
>     }
>     scan.setFilter(new FirstKeyOnlyFilter());
>     if (sb.length() > 0) {
>       for (String columnName : sb.toString().trim().split(" ")) {
>         String [] fields = columnName.split(":");
>         if(fields.length == 1) {
>           scan.addFamily(Bytes.toBytes(fields[0]));
>         } else {
>           byte[] qualifier = Bytes.toBytes(fields[1]);
>           qualifiers.add(qualifier);
>           scan.addColumn(Bytes.toBytes(fields[0]), qualifier);
>         }
>       }
>     }
>     // specified column may or may not be part of first key value for the row.
>     // Hence do not use FirstKeyOnlyFilter if scan has columns, instead use
>     // FirstKeyValueMatchingQualifiersFilter.
>     if (qualifiers.size() == 0) {
>       scan.setFilter(new FirstKeyOnlyFilter());
>     } else {
>       scan.setFilter(new FirstKeyValueMatchingQualifiersFilter(qualifiers));
>     }
>     job.setOutputFormatClass(NullOutputFormat.class);
>     TableMapReduceUtil.initTableMapperJob(tableName, scan,
>       RowCounterMapper.class, ImmutableBytesWritable.class, Result.class, job);
>     job.setNumReduceTasks(0);
>     return job;
> {code}
> TableMapReduceUtil only serializes the Scan into the job, it doesn't adjust any of the
settings.
> Maybe I'm missing something, but this seems like a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message