nifi-issues mailing list archives

From "Peter Wicks (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-6175) Spark Livy - Improving Livy
Date Mon, 22 Jul 2019 16:27:00 GMT

    [ https://issues.apache.org/jira/browse/NIFI-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890291#comment-16890291 ]

Peter Wicks commented on NIFI-6175:
-----------------------------------

There does not appear to be a lot of appetite for testing/merging Livy changes. I've been
using Livy extensively on a project, and this PR has blossomed into a very large number of
changes as time has gone on.

Amazingly, I don't think there are any breaking changes, but to be honest, the built-in Livy
support is so basic, it's hard to imagine people using it very usefully. I will keep updating
this until my project has stabilized, and then hopefully we can find someone to merge it :). 

> Spark Livy - Improving Livy
> ---------------------------
>
>                 Key: NIFI-6175
>                 URL: https://issues.apache.org/jira/browse/NIFI-6175
>             Project: Apache NiFi
>          Issue Type: Improvement
>    Affects Versions: 1.9.2
>            Reporter: Peter Wicks
>            Assignee: Peter Wicks
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The Livy Session Controller is missing many of the available options, and I feel many of
> them are critical for this service to be useful (queue? conf? number of executors?).
>  * Would like to see all available options there, with a blanket "conf" option for users
>    to provide custom configuration.
>  * When the controller service shuts down, sessions are left running, with no option
>    to shut them down. Add functionality to shut down open sessions.
>  * If the controller service finds no idle Livy sessions, it will create a new session...
>    until the queue runs out of resources :). Need Min/Max session settings and an
>    "elastic or strict" option.
>  * When Livy starts up, it searches for existing sessions, but does not verify that those
>    sessions belong to it.
>  ** The Kerberos identity should be used to verify the identity on the session matches
>     the identity on the controller service.
>  ** Also, if a Proxy user has been specified, that should also be verified. If no proxy
>     user was specified, then the Proxy user on the Livy session should match the Kerberos
>     identity.
>  * The initialization of the SSL Context is not implemented in a thread-safe way. This
>    leads to exceptions when multiple threads are running against the same Controller Service.
>  ** SSL Context init should be made thread-safe.
>  * There is a bug in Livy that causes running sessions to be killed if they run longer
>    than the timeout value: https://issues.apache.org/jira/browse/LIVY-547.
>  ** The processor should support the workaround described in the discussion, by pinging
>     the session to record activity on sessions to keep them alive.
>     [https://github.com/apache/incubator-livy/pull/138#issuecomment-455352091]
> Livy should also support Batch mode.
>  * Include a controller service to re-use configs, but the controller service is basically
>    just a config holder.
>  * Processor named `ExecuteSparkBatch`. This is harder than Session because Batch mode
>    only supports code submission through a file path. So users will need to upload to HDFS
>    first.

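The request in the issue description for a cap on session creation (so that the controller service stops creating sessions before the queue runs out of resources) could be sketched roughly as below. This is only an illustration: `SessionLimiter`, `tryReserve`, and `release` are hypothetical names, not NiFi or Livy APIs, and only a maximum is modeled here; the Min setting and the elastic/strict choice mentioned in the issue are left out.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not part of NiFi or Livy): caps the number of open sessions. */
final class SessionLimiter {
    private final int maxSessions;
    private final AtomicInteger open = new AtomicInteger();

    SessionLimiter(int maxSessions) {
        if (maxSessions < 1) {
            throw new IllegalArgumentException("maxSessions must be at least 1");
        }
        this.maxSessions = maxSessions;
    }

    /** Reserves a slot for a new session; returns false when the cap is reached. */
    boolean tryReserve() {
        while (true) {
            int current = open.get();
            if (current >= maxSessions) {
                return false;
            }
            // Retry if another thread changed the count between get() and here.
            if (open.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Frees a slot once a session has been shut down; never drops below zero. */
    void release() {
        open.updateAndGet(n -> n > 0 ? n - 1 : 0);
    }

    int openCount() {
        return open.get();
    }
}
```

A caller would check `tryReserve()` before asking Livy for a new session and call `release()` after the session is deleted, so that concurrent threads cannot overshoot the cap.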

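The ownership rules in the issue description (the session identity must match the Kerberos identity, and the proxy user must match the configured proxy user or, if none is configured, the Kerberos identity) can be expressed as a small check. This is a sketch under assumptions: `LivySessionOwnership` is a hypothetical helper, the owner and proxy user values are assumed to come from the session information Livy returns, and the identities are assumed to be directly comparable strings (a full Kerberos principal would need to be mapped to a short name first).

```java
/** Hypothetical helper (not part of NiFi or Livy): decides whether a session is "ours". */
final class LivySessionOwnership {

    private LivySessionOwnership() {
    }

    /**
     * @param sessionOwner     owner reported for the Livy session
     * @param sessionProxyUser proxy user reported for the Livy session (may be null)
     * @param kerberosIdentity identity the controller service authenticates as
     * @param configuredProxy  proxy user set on the controller service, or null if unset
     */
    static boolean belongsTo(String sessionOwner, String sessionProxyUser,
                             String kerberosIdentity, String configuredProxy) {
        // Rule 1: the identity on the session must match the controller service identity.
        if (kerberosIdentity == null || !kerberosIdentity.equals(sessionOwner)) {
            return false;
        }
        // Rule 2: with no proxy user configured, the session's proxy user
        // is expected to be the Kerberos identity itself.
        String expectedProxy = (configuredProxy == null || configuredProxy.isEmpty())
                ? kerberosIdentity
                : configuredProxy;
        return expectedProxy.equals(sessionProxyUser);
    }
}
```

Sessions that fail this check would simply be ignored when the controller service scans for existing sessions at startup.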

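For the thread-safety point in the issue description, one common way to make a lazily created SSL context safe under concurrent use is double-checked locking on a `volatile` field. The sketch below is not the code from the PR: `SslContextHolder` is a hypothetical class, and `SSLContext.getDefault()` stands in for however the controller service actually builds its context.

```java
import java.security.NoSuchAlgorithmException;
import java.util.function.Supplier;
import javax.net.ssl.SSLContext;

/** Hypothetical holder (not NiFi code): creates the SSL context once, even under contention. */
final class SslContextHolder {
    private final Supplier<SSLContext> factory;
    private volatile SSLContext context;

    SslContextHolder(Supplier<SSLContext> factory) {
        this.factory = factory;
    }

    SSLContext get() {
        SSLContext local = context;          // single volatile read on the fast path
        if (local == null) {
            synchronized (this) {
                local = context;             // re-check: another thread may have won the race
                if (local == null) {
                    local = factory.get();
                    context = local;         // publish only the fully built context
                }
            }
        }
        return local;
    }

    /** Stand-in factory for this sketch; a real service would build its own context. */
    static SSLContext defaultContext() {
        try {
            return SSLContext.getDefault();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }
}
```

Threads sharing one holder, for example `new SslContextHolder(SslContextHolder::defaultContext)`, all receive the same instance, and the factory runs at most once as long as it returns a non-null context.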
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)