[ https://issues.apache.org/jira/browse/ACCUMULO-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709810#comment-14709810 ]
ASF GitHub Bot commented on ACCUMULO-3959:
------------------------------------------
Github user dhutchis commented on a diff in the pull request:
https://github.com/apache/accumulo/pull/45#discussion_r37785744
--- Diff: core/src/main/java/org/apache/accumulo/core/client/BatchScanner.java ---
@@ -16,19 +16,20 @@
*/
package org.apache.accumulo.core.client;
+import org.apache.accumulo.core.data.Range;
+
import java.util.Collection;
import java.util.concurrent.TimeUnit;
-import org.apache.accumulo.core.data.Range;
-
/**
* Implementations of BatchScanner support efficient lookups of many ranges in accumulo.
+ * BatchScanners are also appropriate for large, single ranges,
+ * as a BatchScanner will break those ranges up into separate RPCs
+ * provided the range spans more than one tablet
+ * and there are sufficiently many scan threads available.
*
- * Use this when looking up lots of ranges and you expect each range to contain a small amount of data. Also only use this when you do not care about the
- * returned data being in sorted order.
- *
- * If you want to lookup a few ranges and expect those ranges to contain a lot of data, then use the Scanner instead. Also, the Scanner will return data in
- * sorted order, this will not.
+ * Only use this when you do not care about returned data being in sorted order.
--- End diff --
Correct, I see that the <p> tag is necessary from the online javadoc at
http://accumulo.apache.org/1.7/apidocs/org/apache/accumulo/core/client/BatchScanner.html
Will fix tonight when I return to my laptop. I don't think my editor
(IntelliJ with the Eclipse code formatter plugin) adds the HTML tags
automatically.
On Mon, Aug 24, 2015 at 2:25 PM, Keith Turner <notifications@github.com>
wrote:
> In core/src/main/java/org/apache/accumulo/core/client/BatchScanner.java
> <https://github.com/apache/accumulo/pull/45#discussion_r37784571>:
>
> > *
> > - * Use this when looking up lots of ranges and you expect each range to contain a small amount of data. Also only use this when you do not care about the
> > - * returned data being in sorted order.
> > - *
> > - * If you want to lookup a few ranges and expect those ranges to contain a lot of data, then use the Scanner instead. Also, the Scanner will return data in
> > - * sorted order, this will not.
> > + * Only use this when you do not care about returned data being in sorted order.
>
> This was already broken before your patch, but I think javadoc needs <p>
> markup for paragraphs. Not sure it will render as intended w/o it.
>
> Did you format these changes?
>
> Confusing wording on BatchScanner javadoc
> -----------------------------------------
>
> Key: ACCUMULO-3959
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3959
> Project: Accumulo
> Issue Type: Improvement
> Components: docs
> Affects Versions: 1.6.3, 1.7.0
> Reporter: Dylan Hutchison
> Assignee: Dylan Hutchison
> Priority: Minor
> Labels: docuentation
> Fix For: 1.6.4, 1.7.1
>
>
> The following sentence in the [BatchScanner Javadoc|https://accumulo.apache.org/1.7/apidocs/org/apache/accumulo/core/client/BatchScanner.html] has confused my colleagues into using Scanners and wondering why performance doesn't scale.
> bq. If you want to lookup a few ranges and expect those ranges to contain a lot of data, then use the Scanner instead.
> Also regarding this next sentence, from what I see of the BatchScanner it will break up "large Range objects" that span multiple extents (tablets) into multiple ranges, possibly one for each tablet.
> bq. Use this when looking up lots of ranges and you expect each range to contain a small amount of data.
> If the client is okay with unsorted order and it is okay with using multiple threads, then isn't it always a better decision to use a BatchScanner than regular Scanner? In the worst case, one Range over a single row, the BatchScanner will perform the same as a regular Scanner, ya?
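To make the range-splitting behavior described above concrete, here is a minimal toy model in plain Java. It is only a sketch under stated assumptions, not Accumulo's implementation: RangeSplitSketch, RowRange and splitByTablets are hypothetical names, rows are plain strings, and both ranges and tablets are simplified to half-open intervals [start, end), which is not how Accumulo models tablet extents. The idea it illustrates is that one large range is cut at every tablet boundary it crosses, so each piece could be fetched by a separate thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class RangeSplitSketch {

    /** Half-open row range [start, end). Hypothetical stand-in, NOT Accumulo's Range class. */
    static final class RowRange {
        final String start;
        final String end;

        RowRange(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /**
     * Cuts one range at every tablet split point that falls strictly inside it.
     * A range that sits inside a single tablet comes back unchanged as one piece.
     */
    static List<RowRange> splitByTablets(RowRange range, TreeSet<String> splitPoints) {
        List<RowRange> pieces = new ArrayList<>();
        String cursor = range.start;
        // Only boundaries strictly between start and end produce a cut.
        for (String split : splitPoints.subSet(range.start, false, range.end, false)) {
            pieces.add(new RowRange(cursor, split));
            cursor = split;
        }
        pieces.add(new RowRange(cursor, range.end));
        return pieces;
    }

    public static void main(String[] args) {
        // Assumed table layout: split points g, n, t give four tablets.
        TreeSet<String> splits = new TreeSet<>(Arrays.asList("g", "n", "t"));

        // A large range crossing every boundary is cut into one piece per tablet.
        System.out.println(splitByTablets(new RowRange("a", "z"), splits));

        // A small range inside one tablet stays a single piece.
        System.out.println(splitByTablets(new RowRange("h", "k"), splits));
    }
}
```

In this model the first call yields four pieces and the second yields one, which mirrors the worst case raised in the issue: with nothing to split, the work is the same as a single sequential scan.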
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)