livy-user mailing list archives

From Marcelo Vanzin <van...@cloudera.com>
Subject Re: How to cancel the running streaming job using livy?
Date Wed, 24 Jan 2018 22:46:39 GMT
Then that has nothing to do with Livy.

You need to store a reference to your StreamingQuery (returned by start())
somewhere, and if you want to stop it, call its "stop()" method by
submitting a new Livy job that does it.
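
As a rough illustration of that pattern (not Livy or Spark code), the sketch
below keeps handles in a static map so that a second, short-lived job running
in the same driver JVM can stop one query by name. StoppableQuery,
QueryRegistry, and the names "query1"/"query2" are hypothetical stand-ins; in
a real job the stored object would be the StreamingQuery returned by start().
It also assumes both jobs see the same loaded class, which is worth verifying
in your own setup.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for Spark's StreamingQuery; only stop() is modelled. */
interface StoppableQuery {
    void stop();
}

/** Hypothetical registry; static so that later jobs in the same JVM can reach it. */
final class QueryRegistry {
    private static final Map<String, StoppableQuery> QUERIES = new ConcurrentHashMap<>();

    private QueryRegistry() {}

    /** The job that starts a query stores the handle returned by start(). */
    static void register(String name, StoppableQuery query) {
        QUERIES.put(name, query);
    }

    /** A later job stops exactly one query and leaves the others untouched. */
    static boolean stop(String name) {
        StoppableQuery query = QUERIES.remove(name);
        if (query == null) {
            return false; // unknown name, or already stopped
        }
        query.stop();
        return true;
    }

    static boolean isRegistered(String name) {
        return QUERIES.containsKey(name);
    }
}

final class StopOneQueryDemo {
    public static void main(String[] args) {
        // In a real job these would be the handles returned by start().
        QueryRegistry.register("query1", () -> System.out.println("query1 stop() called"));
        QueryRegistry.register("query2", () -> System.out.println("query2 stop() called"));

        // What the second job's call() body would do.
        boolean stopped = QueryRegistry.stop("query1");
        System.out.println("query1 stopped: " + stopped);
        System.out.println("query2 still registered: " + QueryRegistry.isRegistered("query2"));
    }
}
```

Spark also tracks running queries itself: sparkSession.streams().active()
lists them and sparkSession.streams().get(id) looks one up by id, so a stop
job could use those instead of a hand-rolled registry.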

On Wed, Jan 24, 2018 at 2:42 PM, kant kodali <kanth909@gmail.com> wrote:

> Ok Let me paste some code to try and avoid the confusion. In the below
> code I am running two streaming queries. Now here are my two simple
> questions.
>
> 1) Does each Streaming Query below spawn one job or multiple jobs?
> 2) What should I do if I need to kill everything related to streaming
> query1 but not streaming query2?
>
>
> public Void call(JobContext ctx) throws Exception {
>    SparkSession sparkSession = ctx.sparkSession();
>    Dataset<Row> df = sparkSession.readStream().format("kafka").load();
>    df.createOrReplaceTempView("table");
>
>    Dataset<Row> resultSet1 = sparkSession.sql("select * from table");
>
>    resultSet1.writeStream().format("console").start();  //Streaming query1 started
>
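>    Dataset<Row> resultSet2 = sparkSession.sql("select count(*) from table");
>    // count(*) is an aggregation, so the console sink needs a non-append output mode
>    resultSet2.writeStream().outputMode("complete").format("console").start();  //Streaming query2 started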
>
>    sparkSession.streams().awaitAnyTermination();
>
>    return null;
>
> }
>
>
> Thanks!
>
>
> On Wed, Jan 24, 2018 at 1:47 PM, Marcelo Vanzin <vanzin@cloudera.com>
> wrote:
>
>> I'm a little confused about what is meant as a job here, after all this
>> discussion...
>>
>> For "interactive sessions", stopping a session means stopping the
>> SparkContext. So the final state of any running jobs in that session should
>> be the same as if you stopped the SparkContext without explicitly stopping
>> the jobs in a normal, non-Livy application.
>>
>> For batches, stopping a batch means killing the Spark application, so all
>> bets are off as to what happens there.
>>
>>
>> On Wed, Jan 24, 2018 at 1:08 PM, Alex Bozarth <ajbozart@us.ibm.com>
>> wrote:
>>
>>> You are correct that you are using the term Job incorrectly (at least
>>> according to how Spark/Livy uses it). Each spark-submit is a single Spark
>>> Application and can include many jobs (which are themselves broken down
>>> into stages and tasks). In Livy, using sessions would be like using
>>> spark-shell rather than spark-submit. You probably want to use batches
>>> instead (which utilize spark-submit); then you would use the delete
>>> command mentioned earlier. As for the result being listed as FAILED and
>>> not CANCELLED, that is as intended. When a Livy Session is stopped
>>> (deleted) it sends a command to all the running jobs (in your case each of
>>> your apps only has one "Job") to set them as failed.
>>>
>>> @Marcelo you wrote the code that does this, do you remember why you had
>>> Jobs killed instead of cancelled when a Livy session is stopped? Otherwise
>>> we may be able to open a JIRA and change this, but I am unsure of any
>>> potential consequences.
>>>
>>>
>>> *Alex Bozarth*
>>> Software Engineer
>>> Spark Technology Center
>>> ------------------------------
>>> *E-mail:* *ajbozart@us.ibm.com* <ajbozart@us.ibm.com>
>>> *GitHub: **github.com/ajbozarth* <https://github.com/ajbozarth>
>>>
>>>
>>> 505 Howard Street
>>> <https://maps.google.com/?q=505+Howard+Street+San+Francisco,+CA+94105+United+States&entry=gmail&source=g>
>>> San Francisco, CA 94105
>>> <https://maps.google.com/?q=505+Howard+Street+San+Francisco,+CA+94105+United+States&entry=gmail&source=g>
>>> United States
>>> <https://maps.google.com/?q=505+Howard+Street+San+Francisco,+CA+94105+United+States&entry=gmail&source=g>
>>>
>>>
>>>
>>>
>>> From: kant kodali <kanth909@gmail.com>
>>> To: user@livy.incubator.apache.org
>>> Date: 01/23/2018 11:44 PM
>>>
>>> Subject: Re: How to cancel the running streaming job using livy?
>>> ------------------------------
>>>
>>>
>>>
>>> I tried a POST to sessions/{session id}/jobs/{job id}/cancel and that
>>> doesn't seem to cancel either. I think first of all the word "job" is used
>>> in so many contexts that it might be misleading.
>>>
>>> Imagine for a second I don't have Livy and I just use the spark-submit
>>> command line to spawn jobs. Say I do the following:
>>>
>>> spark-submit hello1.jar // streaming job1 (runs forever)
>>> spark-submit hello2.jar // streaming job2 (runs forever)
>>>
>>> The number of jobs I spawned is two, and now I want to be able to cancel
>>> one of them. These jobs read data from Kafka and will be split into stages
>>> and tasks. Sometimes these tasks are also called jobs according to the Spark
>>> UI for some reason, and it looks like Livy may be cancelling those with the
>>> above endpoint.
>>>
>>> It would be great help if someone could try from their end and see if
>>> they are able to cancel the jobs?
>>>
>>> Thanks!
>>>
>>> On Fri, Jan 19, 2018 at 4:03 PM, Alex Bozarth <ajbozart@us.ibm.com> wrote:
>>>
>>>    Ah, that's why I couldn't find cancel in JobHandle, but it was
>>>    implemented in all its implementations, which all implement it as would be
>>>    expected.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>    From: Marcelo Vanzin <vanzin@cloudera.com>
>>>    To: user@livy.incubator.apache.org
>>>    Date: 01/19/2018 03:55 PM
>>>
>>>    Subject: Re: How to cancel the running streaming job using livy?
>>>    ------------------------------
>>>
>>>
>>>
>>>    A JobHandle (which you get by submitting a Job) is a Future, and
>>>    Futures have a "cancel()" method.
>>>
>>>    I don't remember the details about how "cancel()" is implemented in
>>>    Livy, though.
>>>
>>>    On Fri, Jan 19, 2018 at 3:52 PM, Alex Bozarth <ajbozart@us.ibm.com> wrote:
>>>       Ok, so I looked into this a bit more. I misunderstood you a bit
>>>       before: the delete call is for ending Livy sessions using the REST
>>>       API, not jobs, and not via the Java API. As for the Job state, that
>>>       makes sense: if you end the session, the session kills all currently
>>>       running jobs. What you want is to send cancel requests to the jobs
>>>       the session is running. From my research I found that there is a way
>>>       to do this via the REST API, but it isn't documented for some reason.
>>>       Doing a POST to /{session id}/jobs/{job id}/cancel will cancel a job.
>>>       As for the Java API, the feature isn't part of the Java interface,
>>>       but most implementations of it add it, such as the Scala API, which
>>>       returns a ScalaJobHandle on submit, which has a cancel function.
>>>       I'm not sure how you're submitting your jobs, but there should be a
>>>       cancel function available to you somewhere depending on the client
>>>       you're using. From this discussion I've realized our current
>>>       documentation is even more lacking than I had thought.
>>>
>>>
>>>
>>>
>>>
>>>          From: kant kodali <kanth909@gmail.com>
>>>          To: user@livy.incubator.apache.org
>>>          Date: 01/18/2018 06:09 PM
>>>          Subject: Re: How to cancel the running streaming job using livy?
>>>
>>>          ------------------------------
>>>
>>>
>>>
>>>          Also just tried the below and got the state. It ended up in the
>>>          "FAILED" state when I expected it to be in the "CANCELLED" state.
>>>          Also, from the docs it is not clear whether it kills the session or
>>>          the job. If it kills the session, I can't spawn any other job. Sorry,
>>>          cancelling a job has been a bit confusing for me.
>>>          DELETE /sessions/0
>>>
>>>
>>>
>>>          On Thu, Jan 18, 2018 at 5:55 PM, kant kodali <kanth909@gmail.com> wrote:
>>>             Oh, this raises a couple of questions.
>>>
>>>                      1) Is there a programmatic way to cancel a job?
>>>
>>>                      2) Is there any programmatic way to get the session
>>>                      id? If not, how do I get a sessionId when I spawn
>>>                      multiple jobs or multiple sessions?
>>>
>>>
>>>                      On Thu, Jan 18, 2018 at 5:39 PM, Alex Bozarth
>>>                      <ajbozart@us.ibm.com> wrote:
>>>                      You make a DELETE call as detailed here:
>>>                      http://livy.apache.org/docs/latest/rest-api.html#response
>>>
>>>
>>>
>>>
>>>
>>>                      From: kant kodali <kanth909@gmail.com>
>>>                      To: user@livy.incubator.apache.org
>>>                      Date: 01/18/2018 05:34 PM
>>>                      Subject: How to cancel the running streaming job
>>>                      using livy?
>>>                      ------------------------------
>>>
>>>
>>>
>>>                      Hi All,
>>>
>>>                      I was able to submit a streaming job to Livy;
>>>                      however, I wasn't able to find any way to cancel the
>>>                      running job. Please let me know.
>>>
>>>                      Thanks!
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>    --
>>>    Marcelo
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Marcelo
>>
>
>


-- 
Marcelo
