From: Oliver Swoboda
Date: Mon, 7 Nov 2016 15:34:31 +0100
Subject: Re: Using Flink with Accumulo
To: user@flink.apache.org
Cc: user@accumulo.apache.org

Hi Josh,

thank you for your quick answer!

2016-11-03 17:03 GMT+01:00 Josh Elser:

> Hi Oliver,
>
> Cool stuff. I wish I knew more about Flink to make some better
> suggestions. Some points inline, and sorry in advance if I suggest
> something outright wrong. Hopefully someone from the Flink side can help
> give context where necessary :)
>
> Oliver Swoboda wrote:
>
>> Hello,
>>
>> I'm using Flink with Accumulo and wanted to read data from the database
>> by using the createHadoopInput function. For that I configure an
>> AccumuloInputFormat. You can find the source code here:
>> https://github.com/OSwoboda/masterthesis/blob/master/aggregation.flink/src/main/java/de/oswoboda/aggregation/Main.java
>>
>> I'm using a 5-node cluster (1 master, 4 workers).
>> Accumulo is installed with Ambari and has 1 master server on the master
>> node and 4 tablet servers (one on each worker).
>> Flink is installed standalone with the JobManager on the master node and
>> 4 TaskManagers (one on each worker). Every TaskManager can run 4 tasks,
>> so there are 32 in total.
>>
>> First problem I have:
>> If I start several Flink jobs, the client count for ZooKeeper in the
>> Accumulo overview increases constantly. I assume the scanner that is
>> used isn't closed correctly. The client count only drops back to
>> normal values when I restart Flink.
>
> Hrm, this does seem rather bad. Eventually, you'll saturate the
> connections to ZK and ZK itself will start limiting new connections (per
> the maxClientCnxns property).
>
> This sounds somewhat familiar to
> https://issues.apache.org/jira/browse/ACCUMULO-2113.
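[For reference, the connection cap Josh mentions is the maxClientCnxns setting in ZooKeeper's zoo.cfg; a sketch of the relevant line (the value shown is ZooKeeper's default, your deployment may differ):

```
# zoo.cfg: maximum number of concurrent connections ZooKeeper accepts
# from a single client IP. 60 is the default; 0 disables the limit.
maxClientCnxns=60
```

Once a leaking client reaches this limit, further connection attempts from that host are refused.]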
> The lack of a proper "close()" method on the
> Instance interface is a known deficiency. I'm not sure how Flink execution
> happens, so I am kind of just guessing.
>
> You might be able to try the CleanUp[1] utility to close out the
> thread pools/connections when your Flink "task" is done.

Unfortunately that didn't work. I guess that is because Flink starts the
tasks containing the scanners through a TaskManager, and I can't access
those tasks from my program. So after a task is done I can't close its
connections with the utility, because the thread where I would call it
isn't the one that started the scanners.

>> Second problem I have:
>> I want to compare aggregations on time series data with Accumulo (with
>> iterators) and with Flink. Unfortunately, the results vary inexplicably
>> when I'm using Flink. I wanted to compare the results for a full table
>> scan (called "baseline" in the code), but sometimes it takes 17-18 minutes
>> and sometimes it's between 30 and 60 minutes. In the longer case I can
>> see in the Accumulo overview that after some time only one worker is
>> left with running scans, and only a few entries/s are scanned (4
>> million at the beginning when all workers are running, down to 200k when
>> only one worker is left). Because there are 2.5 billion records to scan,
>> with almost 500 million still left at that point, it takes really long.
>> This problem doesn't occur with Accumulo using iterators and a batch
>> scanner on the master node: each scan has an almost identical duration,
>> and the graphs in the Accumulo overview for entries/s, MB/s scanned and
>> seeks are the same for each scan.
>
> It sounds like maybe your partitioning was sub-optimal and caused one task
> to get a majority of the data? Having autoAdjustRanges=true (as you do
> by default) should help get many batches of work based on the tablet
> boundaries in Accumulo. I'm not sure how Flink actually executes them
> though.

The problem was that half of the data was on one node after a restart of
Accumulo.
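[As an aside, a toy sketch of what splitting work on tablet boundaries means conceptually; this is an illustration only, not Accumulo's actual implementation. One large scan range is cut at the known split points, so each tablet contributes its own independent batch of work:

```java
import java.util.ArrayList;
import java.util.List;

public class RangeSplitSketch {

    // Cut the half-open range [start, end) at every tablet split point
    // that falls strictly inside it, yielding one batch per tablet.
    static List<String[]> splitOnTablets(String start, String end, List<String> splitPoints) {
        List<String[]> batches = new ArrayList<>();
        String current = start;
        for (String split : splitPoints) {
            if (split.compareTo(current) > 0 && split.compareTo(end) < 0) {
                batches.add(new String[] { current, split });
                current = split;
            }
        }
        batches.add(new String[] { current, end });
        return batches;
    }

    public static void main(String[] args) {
        // Tablet boundaries "m" and "t" turn one full-table scan
        // over [a, z) into three independent batches.
        for (String[] batch : splitOnTablets("a", "z", List.of("m", "t"))) {
            System.out.println(batch[0] + " .. " + batch[1]);
        }
    }
}
```

If one tablet server ends up hosting most of the tablets, most of those batches land on the same node, which matches the one-slow-worker behaviour described above.]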
It seems to have something to do with the problem described here:
https://issues.apache.org/jira/browse/ACCUMULO-4353. I stopped and then
started Accumulo instead of doing a restart, and now the data is
distributed evenly across all nodes. For my tests I keep Accumulo running,
because after each restart the data distribution changes, and I don't want
to upgrade to 1.8.

Yours faithfully,
Oliver Swoboda

> [1] https://github.com/apache/accumulo/blob/e900e67425d950bd4c0c5288a6270d7b362ac458/core/src/main/java/org/apache/accumulo/core/util/CleanUp.java#L36
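[For anyone hitting the same connection leak: the CleanUp utility Josh links as [1] is a single static call. A sketch of how it is meant to be used; it needs accumulo-core on the classpath, and (as discussed above) it only helps when called in the same JVM and classloader that actually opened the scanners:

```java
import org.apache.accumulo.core.util.CleanUp;

public class ScannerCleanup {
    public static void main(String[] args) {
        // ... run scans, close scanners/batch writers ...

        // Tear down the shared ZooKeeper sessions and thread pools the
        // Accumulo client keeps alive for the lifetime of the JVM.
        CleanUp.shutdownNow();
    }
}
```

In a Flink standalone cluster the scanners live inside long-running TaskManager JVMs, not the client program, which is why this hook could not be applied from the job submitter.]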