drill-dev mailing list archives

From Sudheesh Katkam <skat...@maprtech.com>
Subject Re: [VOTE] Release Apache Drill 1.3.0 (rc0)
Date Sat, 07 Nov 2015 02:56:15 GMT
But the status thread is a daemon. So the Drillbit doesn't have to stop it, right?

- Sudheesh
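
For context on the question above: the daemon flag only means a thread will not keep the JVM alive at exit. It does not end the thread when the object that started it is closed, so in a single test JVM that starts and stops many servers, such threads can still accumulate and count against OS thread limits. Below is a minimal Java sketch of that behavior. The `Service` class and its `close(boolean)` method are hypothetical stand-ins for illustration, not Drill's actual WorkManager or Drillbit code.

```java
import java.util.ArrayList;
import java.util.List;

public class DaemonThreadLeakSketch {

    /** Hypothetical stand-in for a server that owns a background status thread. */
    static final class Service {
        private final Thread statusThread;

        Service() {
            statusThread = new Thread(() -> {
                try {
                    while (true) {
                        Thread.sleep(1000); // pretend to report status periodically
                    }
                } catch (InterruptedException e) {
                    // interrupted: fall through so the thread terminates
                }
            }, "status-thread");
            statusThread.setDaemon(true); // daemon: never blocks JVM exit
            statusThread.start();
        }

        /** Closes the service; only stops the status thread if asked to. */
        void close(boolean stopStatusThread) {
            if (!stopStatusThread) {
                return; // relies on the daemon flag alone
            }
            statusThread.interrupt();
            try {
                statusThread.join(); // wait until the thread has really ended
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while stopping", e);
            }
        }

        boolean statusThreadAlive() {
            return statusThread.isAlive();
        }
    }

    /** Starts and closes n services; returns how many status threads survive. */
    static int survivingThreads(int n, boolean stopStatusThread) {
        List<Service> services = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Service s = new Service();
            s.close(stopStatusThread);
            services.add(s);
        }
        int alive = 0;
        for (Service s : services) {
            if (s.statusThreadAlive()) {
                alive++;
            }
        }
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("daemon flag only: " + survivingThreads(5, false));
        System.out.println("explicit stop:    " + survivingThreads(5, true));
    }
}
```

In this sketch, closing five services without stopping their threads leaves all five daemon threads alive, while interrupting and joining them leaves none. Whether this is what exhausts threads in the Drill test run is not established by the sketch; it only shows why a daemon thread may still need an explicit stop.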

> On Nov 6, 2015, at 6:44 PM, Jacques Nadeau <jacques@dremio.com> wrote:
> 
> I see that we're bleeding WorkManager status threads that aren't shut down
> when the Drillbit is shut down.
> 
> I'll get a patch together.
> 
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
> 
>> On Fri, Nov 6, 2015 at 4:31 PM, Hanifi Gunes <hgunes@maprtech.com> wrote:
>> 
>> Looks like we are possibly leaking some threads. Investigating.
>> 
>>> On Fri, Nov 6, 2015 at 4:25 PM, Jacques Nadeau <jacques@dremio.com> wrote:
>>> 
>>> Hmm.. that is quite strange. I wonder if we need to look at thread counts
>>> on the daemon.
>>> 
>>> We haven't changed how we create but there were changes to shutdown
>>> (although I can't imagine why that would be a problem).
>>> 
>>> --
>>> Jacques Nadeau
>>> CTO and Co-Founder, Dremio
>>> 
>>>> On Fri, Nov 6, 2015 at 4:11 PM, Hanifi Gunes <hgunes@maprtech.com> wrote:
>>> 
>>>> Not the testAggregateWithEmptyRequiredInput but I got the following on
>>>> my branch rebased on top of master -- @CentOS.
>>>> 
>>>> Tests in error:
>>>>   TestImpersonationQueries.sequenceFileChainedImpersonationWithView » UserRemote
>>>>   TestImpersonationQueries.testMultiLevelImpersonationJoinEachSideReachesMaxUserHops:233->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » Rpc
>>>>   TestImpersonationQueries.testMultiLevelImpersonationExceedsMaxUserHops:219->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » IllegalState
>>>>   TestImpersonationQueries.avroChainedImpersonationWithView:280->BaseTestImpersonation.createView:186->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » IllegalState
>>>>   TestImpersonationQueries.testDirectImpersonation_HasGroupReadPermissions:186->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » IllegalState
>>>>   TestImpersonationQueries.testDirectImpersonation_NoReadPermissions:196->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » IllegalState
>>>>   TestImpersonationQueries.testMultiLevelImpersonationEqualToMaxUserHops:210->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » IllegalState
>>>> 
>>>> exception details --->
>>>> testMultiLevelImpersonationExceedsMaxUserHops(org.apache.drill.exec.impersonation.TestImpersonationQueries)
>>>> Time elapsed: 0.008 sec  <<<   ERROR!
>>>> java.lang.IllegalStateException: failed to create a child event loop
>>>>  at sun.nio.ch.IOUtil.makePipe(Native Method)
>>>>  at io.netty.channel.nio.NioEventLoop.openSelector(NioEventLoop.java:126)
>>>>  at io.netty.channel.nio.NioEventLoop.<init>(NioEventLoop.java:120)
>>>>  at io.netty.channel.nio.NioEventLoopGroup.newChild(NioEventLoopGroup.java:87)
>>>>  at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:64)
>>>>  at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:49)
>>>>  at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:61)
>>>>  at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:52)
>>>>  at org.apache.drill.exec.rpc.TransportCheck.createEventLoopGroup(TransportCheck.java:74)
>>>>  at org.apache.drill.exec.client.DrillClient.createEventLoop(DrillClient.java:239)
>>>>  at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:220)
>>>>  at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:178)
>>>>  at org.apache.drill.QueryTestUtil.createClient(QueryTestUtil.java:67)
>>>>  at org.apache.drill.BaseTestQuery.updateClient(BaseTestQuery.java:213)
>>>>  at org.apache.drill.BaseTestQuery.updateClient(BaseTestQuery.java:236)
>>>> 
>>>> 
>>>> My gut's telling me that we are creating too many NioEventLoopGroups.
>>>> Did we make any recent changes around RPC causing this?
>>>> 
>>>> -Hanifi
>>>> 
>>>> 
>>>>> On Fri, Nov 6, 2015 at 3:58 PM, Jacques Nadeau <jacques@dremio.com> wrote:
>>>> 
>>>>> Do you have that other output/stack trace I asked about? If we can also
>>>>> see the illegal reference count on something other than the JDBC client
>>>>> close method, that would be helpful.
>>>>> 
>>>>> --
>>>>> Jacques Nadeau
>>>>> CTO and Co-Founder, Dremio
>>>>> 
>>>>>> On Fri, Nov 6, 2015 at 2:48 PM, Jinfeng Ni <jinfengni99@gmail.com> wrote:
>>>>>>
>>>>>> I just re-ran, and the previous 4 failures are gone. But it failed
>>>>>> with two new ones:
>>>>>>
>>>>>> Tests in error:
>>>>>>   TestSqlStdBasedAuthorization.org.apache.drill.exec.impersonation.hive.TestSqlStdBasedAuthorization » UserRemote
>>>>>>   TestStorageBasedHiveAuthorization.org.apache.drill.exec.impersonation.hive.TestStorageBasedHiveAuthorization » UserRemote
>>>>>>
>>>>>> I restarted the machine, and there are not too many applications
>>>>>> running and the memory should be enough. At least some days back, I
>>>>>> got a clean run on the same machine.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Fri, Nov 6, 2015 at 2:39 PM, Jacques Nadeau <jacques@dremio.com> wrote:
>>>>>>> Can you provide the complete output for this failure:
>>>>>>> 
>>>>>>> TestAggregateFunctions.testAggregateWithEmptyRequiredInput:237 » IllegalReferenceCount
>>>>>>> 
>>>>>>> I haven't seen the other issues. The last one looks like the system was
>>>>>>> having an issue since thread creation failure is usually an OS problem.
>>>>>>> Was your system under-resourced?
>>>>>>> 
>>>>>>> --
>>>>>>> Jacques Nadeau
>>>>>>> CTO and Co-Founder, Dremio
>>>>>>> 
>>>>>>> On Fri, Nov 6, 2015 at 12:55 PM, Jinfeng Ni <jinfengni99@gmail.com> wrote:
>>>>>>>
>>>>>>>> I'm seeing unit test failures when running "mvn clean install" on the
>>>>>>>> Drill master branch, on Mac.
>>>>>>>>
>>>>>>>> The first one seems to be issue #3 in Jacques's list. The last
>>>>>>>> three seem to be different from the 4 issues. Has anyone seen these
>>>>>>>> failures before, or did it just happen on my Mac? Thanks.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> =================================================
>>>>>>>> git log
>>>>>>>> commit 1a24233475ca46aaf2a49a5624b4042f088382f4
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Tests in error:
>>>>>>>>   TestAggregateFunctions.testAggregateWithEmptyRequiredInput:237 » IllegalReferenceCount
>>>>>>>>   TestImpersonationQueries.testMultiLevelImpersonationEqualToMaxUserHops » UserRemote
>>>>>>>>   TestImpersonationQueries.removeMiniDfsBasedStorage:294->BaseTestImpersonation.stopMiniDfsCluster:151 » OutOfMemory
>>>>>>>>   TestImpersonationQueries>BaseTestQuery.closeClient:260 » OutOfMemory unable to...
>>>>>>>> 
>>>>>>>> Tests run: 1483, Failures: 0, Errors: 4, Skipped: 118
>>>>>>>> 
>>>>>>>> [INFO] ------------------------------------------------------------------------
>>>>>>>> [INFO] Reactor Summary:
>>>>>>>> [INFO]
>>>>>>>> [INFO] Apache Drill Root POM .............................. SUCCESS [ 8.440 s]
>>>>>>>> [INFO] tools/Parent Pom ................................... SUCCESS [ 0.631 s]
>>>>>>>> [INFO] tools/freemarker codegen tooling ................... SUCCESS [ 5.236 s]
>>>>>>>> [INFO] Drill Protocol ..................................... SUCCESS [ 5.839 s]
>>>>>>>> [INFO] Common (Logical Plan, Base expressions) ............ SUCCESS [ 10.831 s]
>>>>>>>> [INFO] contrib/Parent Pom ................................. SUCCESS [ 0.815 s]
>>>>>>>> [INFO] contrib/data/Parent Pom ............................ SUCCESS [ 0.331 s]
>>>>>>>> [INFO] contrib/data/tpch-sample-data ...................... SUCCESS [ 2.838 s]
>>>>>>>> [INFO] exec/Parent Pom .................................... SUCCESS [ 0.635 s]
>>>>>>>> [INFO] exec/Java Execution Engine ......................... FAILURE [12:05 min]
>>>>>>>> [INFO] exec/JDBC Driver using dependencies ................ SKIPPED
>>>>>>>> [INFO] JDBC JAR with all dependencies ..................... SKIPPED
>>>>>>>> [INFO] contrib/mongo-storage-plugin ....................... SKIPPED
>>>>>>>> 
>>>>>>>> Tests run: 11, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 17.042 sec <<< FAILURE! - in org.apache.drill.exec.impersonation.TestImpersonationQueries
>>>>>>>> testMultiLevelImpersonationEqualToMaxUserHops(org.apache.drill.exec.impersonation.TestImpersonationQueries) Time elapsed: 0.099 sec  <<< ERROR!
>>>>>>>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: OutOfMemoryError: unable to create new native thread
>>>>>>>>
>>>>>>>>
>>>>>>>> [Error Id: a826ac5d-e278-49bc-8f92-fdf241d0e634 on 10.250.50.52:31010]
>>>>>>>>  at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
>>>>>>>>  at org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:112)
>>>>>>>>  at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
>>>>>>>>  at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
>>>>>>>>  at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:68)
>>>>>>>>  at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:390)
>>>>>>>>  at org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:105)
>>>>>>>>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>  at java.lang.Thread.run(Thread.java:744)
>>>>>>>> 
>>>>>>>> On Fri, Nov 6, 2015 at 9:42 AM, Jacques Nadeau <jacques@dremio.com> wrote:
>>>>>>>>> It seems like we have four potentially show-stopping issues at the
>>>>>>>>> moment:
>>>>>>>>>
>>>>>>>>> DRILL-4042: Windows build doesn't include right version of Hadoop
>>>>>>>>> dependencies
>>>>>>>>> DRILL-3480: Random message propagation timeouts
>>>>>>>>> DRILL-4041: Reference count issue
>>>>>>>>> DRILL-4046: Performance regression for some TPCH queries
>>>>>>>>>
>>>>>>>>> Proposed next steps:
>>>>>>>>>
>>>>>>>>> DRILL-4042 has a clear fix and reproduction. Patrick, do you think you
>>>>>>>>> can have a fix up for this shortly?
>>>>>>>>>
>>>>>>>>> For 3480 & 4041, consistent reproductions are missing. It would be
>>>>>>>>> great if everybody could try to help find reproductions for these
>>>>>>>>> issues. I think we should take stock again at the end of the day to
>>>>>>>>> decide next steps and whether we want to hold the release for these.
>>>>>>>>>
>>>>>>>>> For 4046: I've heard that there are some performance regressions
>>>>>>>>> around a couple of queries but the current symptoms don't make a lot
>>>>>>>>> of sense. I'd like to collect some more data here and then decide
>>>>>>>>> next steps.
>>>>>>>>>
>>>>>>>>> Let's see if we can get repros for each of the inconsistent issues and
>>>>>>>>> check in again EOD.
>>>>>>>>> 
>>>>>>>>> thanks,
>>>>>>>>> Jacques
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Jacques Nadeau
>>>>>>>>> CTO and Co-Founder, Dremio
>>>>>>>>> 
>>>>>>>>> On Thu, Nov 5, 2015 at 3:36 PM, Aditya <adityakishore@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> Ran into another one - DRILL-4042
>>>>>>>>>> <https://issues.apache.org/jira/browse/DRILL-4042>.
>>>>>>>>>> 
>>>>>>>>>> On Thu, Nov 5, 2015 at 1:48 PM, Jacques Nadeau <jacques@dremio.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yeah, I think that sinks it. Weird how Rat complains only on
>>>>>>>>>>> windows...
>>>>>>>>>>>
>>>>>>>>>>> Let's take the rest of the business day to test the current
>>>>>>>>>>> candidate to make sure that we don't spin extra builds
>>>>>>>>>>> unnecessarily.
>>>>>>>>>>> 
>>>>>>>>>>> thanks,
>>>>>>>>>>> Jacques
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> Jacques Nadeau
>>>>>>>>>>> CTO and Co-Founder, Dremio
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Nov 5, 2015 at 1:24 PM, Aditya <adityakishore@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Oh, I thought only the master/trunk branch was protected, but now
>>>>>>>>>>>> I see the mail from David Nalley.
>>>>>>>>>>>>
>>>>>>>>>>>> In such a case, I propose that the release manager could push the
>>>>>>>>>>>> branch to his/her private fork and put the URL/hash in the vote
>>>>>>>>>>>> starter thread.
>>>>>>>>>>>>
>>>>>>>>>>>> The reason I was looking at the commit history was to determine
>>>>>>>>>>>> whether the candidate suffers from DRILL-4040, which evidently it
>>>>>>>>>>>> does.
>>>>>>>>>>>> 
>>>>>>>>>>>> -1 as the build from source is failing.
>>>>>>>>>>>> 
>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/DRILL-4040
>>>>>>>>>>>> 
>>>>>>>>>>>> On Thu, Nov 5, 2015 at 1:12 PM, Jacques Nadeau <jacques@dremio.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not sure what to do here. INFRA just changed the Git behavior
>>>>>>>>>>>>> so it is no longer possible to delete branches. I generally don't
>>>>>>>>>>>>> like to have failed branches in a release history (otherwise you
>>>>>>>>>>>>> get a release branch with all these maven forward/backwards
>>>>>>>>>>>>> commits). As such, I would overwrite candidate branches
>>>>>>>>>>>>> historically (dropping the failed release commits).
>>>>>>>>>>>>>
>>>>>>>>>>>>> The commit is here right now:
>>>>>>>>>>>>> https://github.com/jacques-n/drill/tree/drill-1.3.0-rc0
>>>>>>>>>>>>>
>>>>>>>>>>>>> The parent of 4822068a006aeb251b686d2b51871573c4337e60
>>>>>>>>>>>>> is
>>>>>>>>>>>>> 3dedc158f3af8ec8320a9cd336b2798b09cc9a8d (the tip of master)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Jacques Nadeau
>>>>>>>>>>>>> CTO and Co-Founder, Dremio
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Thu, Nov 5, 2015 at 1:01 PM, Aditya <adityakishore@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am having trouble determining the git commit this release is
>>>>>>>>>>>>>> based on as I could not find the
>>>>>>>>>>>>>> id (4822068a006aeb251b686d2b51871573c4337e60) captured in the
>>>>>>>>>>>>>> git.properties bundled in the tarballs in the Drill Git
>>>>>>>>>>>>>> repository.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Most likely the last commit is only in your local branch and
>>>>>>>>>>>>>> since git.properties captures only the last commit, it is
>>>>>>>>>>>>>> impossible to find the parent commit.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Would it make sense to push the release branch?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> aditya...
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Wed, Nov 4, 2015 at 11:08 PM, Jacques Nadeau <jacques@dremio.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hey Everybody,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm happy to propose a new release of Apache Drill, version
>>>>>>>>>>>>>>> 1.3.0. This is the first release candidate (rc0). It covers a
>>>>>>>>>>>>>>> total of ~50 closed JIRAs [1].
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The tarball artifacts are hosted at [2] and the maven artifacts
>>>>>>>>>>>>>>> are hosted at [3].
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The vote will be open for 72 hours ending at 11PM Pacific,
>>>>>>>>>>>>>>> November 7, 2015.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [ ] +1
>>>>>>>>>>>>>>> [ ] +0
>>>>>>>>>>>>>>> [ ] -1
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>> Jacques
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12332946
>>>>>>>>>>>>>>> [2] http://people.apache.org/~jacques/apache-drill-1.3.0.rc0/
>>>>>>>>>>>>>>> [3] https://repository.apache.org/content/repositories/orgapachedrill-1013/
>> 
