accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3931) Repeated invocations of BatchScanner iterator().hasNext() causes MAC to become unresponsive
Date Wed, 29 Jul 2015 17:10:05 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646431#comment-14646431 ]

Josh Elser commented on ACCUMULO-3931:
--------------------------------------

bq. a BatchScanner is expensive to create (relative to a small scan)

I wouldn't think so, but depending on the environment I could be downplaying it. You have
the object itself, references to some other objects already made (threads, auths, table id),
and it creates the thread pool.

bq. I'm worried that repeated calls to getSlice are going to exhaust the thread pool because
I'm not closing the scanner. The alternative is to set up and tear down the scanner on every
call to getSlice, but that seems like it could be a lot of overhead.

I think this is your only option at the moment (given released software, anyways). My hunch
was that if you're actually making the RPCs, sending data over the wire, deserializing it,
etc -- you would _probably_ be spending more time doing that than creating a new BatchScanner.
I'm not learned enough on how Titan would be using the AccumuloKCVS to make a definite assertion
though. Experimentation would be good.
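
As a rough illustration of the set-up-and-tear-down-per-call approach, here is a minimal sketch that uses only the JDK. {{FakeBatchScanner}} and this {{getSlice}} are hypothetical stand-ins, not the Accumulo or Titan classes; the only thing modeled is that each scanner owns a thread pool which is released when the scanner is closed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class PerCallScannerSketch {

    /** Hypothetical stand-in for a BatchScanner: owns a thread pool until closed. */
    static final class FakeBatchScanner implements AutoCloseable {
        /** Number of stand-in scanners that are currently open. */
        static final AtomicInteger OPEN = new AtomicInteger();
        private final ExecutorService pool;

        FakeBatchScanner(int numThreads) {
            pool = Executors.newFixedThreadPool(numThreads);
            OPEN.incrementAndGet();
        }

        /** Pretends to look up each row on a pool thread (assumption: no real I/O). */
        List<String> fetch(List<String> rows) throws Exception {
            List<Future<String>> futures = new ArrayList<>();
            for (String row : rows) {
                futures.add(pool.submit(() -> row + "=value"));
            }
            List<String> out = new ArrayList<>();
            for (Future<String> f : futures) {
                out.add(f.get());
            }
            return out;
        }

        @Override
        public void close() {
            pool.shutdown();
            OPEN.decrementAndGet();
        }
    }

    /** Creates a scanner, reads, and always closes it, even if fetch throws. */
    static List<String> getSlice(List<String> rows) throws Exception {
        try (FakeBatchScanner scanner = new FakeBatchScanner(4)) {
            return scanner.fetch(rows);
        }
    }

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 100; i++) {
            getSlice(Arrays.asList("row" + i));
        }
        // Every scanner was closed, so no pools are left open.
        System.out.println("open scanners: " + FakeBatchScanner.OPEN.get());
    }
}
```

Timing a loop like the one in {{main}} against the real client would be one way to run the experiment suggested above; the stand-in says nothing about the actual cost of creating a BatchScanner.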

That being said, I do think that there is merit behind rethinking the resource mgmt behind
BatchScanner (or introducing a class which manages things differently). We've had goofy cases
in the past with JVM GC where the BatchScanner gets GC'ed before the Iterator created from it, which
caused the thread pool to be closed (via a {{finalize()}}). On one side, it's nice to be able
to know you have fixed resources (how the BatchScanner operates now), but being able to treat
a BatchScanner like a factory also has its merits IMO.
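
To make the factory idea a little more concrete, here is one possible shape, again using only the JDK. {{ScannerFactorySketch}} is a hypothetical class, not an existing or proposed Accumulo API: the factory owns a fixed thread pool, each scan borrows it, and the pool is released by an explicit {{close()}} rather than by {{finalize()}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical factory: one long-lived, fixed-size pool shared by many short scans. */
public class ScannerFactorySketch implements AutoCloseable {
    private final ExecutorService sharedPool;

    public ScannerFactorySketch(int numThreads) {
        this.sharedPool = Executors.newFixedThreadPool(numThreads);
    }

    /** One scan: runs on the shared pool and holds no threads of its own. */
    public List<String> scan(List<String> rows) throws Exception {
        List<Callable<String>> tasks = new ArrayList<>();
        for (String row : rows) {
            // Assumption: a real lookup would go here; this just echoes the row.
            tasks.add(() -> row + "=value");
        }
        List<String> results = new ArrayList<>();
        // invokeAll returns futures in the same order as the submitted tasks.
        for (Future<String> f : sharedPool.invokeAll(tasks)) {
            results.add(f.get());
        }
        return results;
    }

    public boolean isClosed() {
        return sharedPool.isShutdown();
    }

    /** Releases the pool explicitly instead of relying on garbage collection. */
    @Override
    public void close() {
        sharedPool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        try (ScannerFactorySketch factory = new ScannerFactorySketch(4)) {
            for (int i = 0; i < 3; i++) {
                System.out.println(factory.scan(Arrays.asList("a" + i, "b" + i)));
            }
        }
    }
}
```

In this shape the thread count stays fixed for the life of the factory, and repeated scans do not create or leak pools.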

> Repeated invocations of BatchScanner iterator().hasNext() causes MAC to become unresponsive
> -------------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-3931
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3931
>             Project: Accumulo
>          Issue Type: Bug
>          Components: mini
>    Affects Versions: 1.6.3, 1.7.0
>            Reporter: Russ Weeks
>            Assignee: Eric Newton
>         Attachments: tablet_server_0.txt, tablet_server_1.txt, test_runner_stack.txt
>
>
> Steps to reproduce:
> # Instantiate a MiniAccumuloCluster with at least two tablet servers
> # Create a table; add splits to the table.
> # Add some mutations to the table distributed across the splits; flush the mutations.
> # Create a BatchScanner across the full range of the table.
> # Assert that the batch scanner has at least one KV pair by calling {{scanner.iterator().hasNext()}}
> # Repeat.
> It doesn't seem to matter if you close the scanner and create a new one in between calls
> to hasNext, or if you re-seek the same scanner, or if the scanner is created in a static context
> and re-used by multiple tests or created by each test. Eventually you will see that the
> {{TabletServerBatchReaderIterator}} gets stuck polling its resultsQueue, preventing further
> tests from running.
> This happens roughly 20% of the time in 1.6 when I run
> {{mvn clean test -Dtest=org.apache.accumulo.minicluster.MultipleHasNextTest --projects minicluster}},
> maybe less often in 1.7, and 100% of the time when I try to use the MAC in my company's
> product build environment, which uses gradle.
> (I'll update with a link to a failing unit test as soon as I get an issue ID)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
