griffin-dev mailing list archives

From "Nevena Veljkovic (Jira)" <>
Subject [jira] [Created] (GRIFFIN-293) [Service] livy.need.queue=true
Date Mon, 30 Sep 2019 11:18:00 GMT
Nevena Veljkovic created GRIFFIN-293:

             Summary: [Service] livy.need.queue=true
                 Key: GRIFFIN-293
             Project: Griffin
          Issue Type: Bug
    Affects Versions: 0.6.0
            Reporter: Nevena Veljkovic
             Fix For: 0.6.0

While using Griffin in several production environments, with 10 jobs scheduled to start at
the same hour, minute, and second, we found that 2 (or more) concurrent Griffin jobs were not
each submitted and run to completion: the last job was submitted multiple times, the rest never.

Example: 2 jobs, "beta_node_metrics_fact" and "beta_node_master_dimension_device", were
triggered 1 millisecond apart:
2019-09-28 14:00:37.090 INFO 2732 --- [ryBean_Worker-4] o.a.g.c.j.SparkSubmitJob [203] : {
 "measure.type" : "griffin",
 "id" : 60560,
 "name" : "beta_node_metrics_fact",

2019-09-28 14:00:37.091 INFO 2732 --- [ryBean_Worker-5] o.a.g.c.j.SparkSubmitJob [203] : {
 "measure.type" : "griffin",
 "id" : 63751,
 "name" : "beta_node_master_dimension_device",
Livy submitted 2 jobs/tasks, and both contained "beta_node_master_dimension_device".

That is why we decided to use the setting "livy.need.queue=true".
During testing we found that queueing did not work at all, because the sparkSubmitJob member
of LivyTaskSubmitHelper was never instantiated.

We fixed this and continued testing.

During testing we found that curConcurrentTaskNum is not decremented for finished tasks (state

We fixed this as well.
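
For illustration only, here is a minimal sketch of the counter behaviour described above: a running-task counter that is decremented when a task finishes, so that waiting tasks can start. This is not the actual Griffin code or the patch; the class BoundedSubmitter and its methods are hypothetical, and only the name curConcurrentTaskNum is taken from this report.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch; this is NOT Griffin's LivyTaskSubmitHelper. */
final class BoundedSubmitter {
    private final int maxConcurrent;
    // Each payload is stored per task, not in a field shared by all jobs.
    private final Queue<String> pending = new ArrayDeque<>();
    private int curConcurrentTaskNum = 0;

    BoundedSubmitter(int maxConcurrent) {
        if (maxConcurrent < 1) {
            throw new IllegalArgumentException("maxConcurrent must be >= 1");
        }
        this.maxConcurrent = maxConcurrent;
    }

    /** Queues a payload; returns the payload started now, or null if it must wait. */
    synchronized String enqueue(String payload) {
        pending.add(payload);
        return startNextIfPossible();
    }

    /** Call when a running task finishes; returns the next payload started, or null. */
    synchronized String onTaskFinished() {
        if (curConcurrentTaskNum > 0) {
            curConcurrentTaskNum--; // without this, the limit is never released
        }
        return startNextIfPossible();
    }

    synchronized int running() {
        return curConcurrentTaskNum;
    }

    synchronized int waiting() {
        return pending.size();
    }

    // Caller must hold the lock. Starts the oldest waiting payload if a slot is free.
    private String startNextIfPossible() {
        if (curConcurrentTaskNum >= maxConcurrent) {
            return null;
        }
        String next = pending.poll();
        if (next != null) {
            curConcurrentTaskNum++;
        }
        return next;
    }
}
```

With a limit of 1, enqueueing "beta_node_metrics_fact" and then "beta_node_master_dimension_device" starts only the first; the second starts once onTaskFinished() is called for the first. If the decrement is missing, the second job would wait forever.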

