yunikorn-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-yunikorn-core] jameschen1519 opened a new issue #90: Completed (and sometimes deleted) pods are still marked as "Running" and consume resources
Date Fri, 21 Feb 2020 21:09:17 GMT
URL: https://github.com/apache/incubator-yunikorn-core/issues/90
 
 
   When Spark jobs are submitted to Kubernetes via spark-submit with the Yunikorn scheduler
applied to the driver and executor pods, Yunikorn does not mark the pods as complete when they
terminate; they remain marked as "Running". They keep consuming resources in the queues to
which the driver and executors are assigned, which eventually causes resource starvation
until the drivers are manually deleted. Deleting the pods does not always free the resources
either: after enough cycles of starting and stopping Yunikorn-scheduled Spark pods, all
resources are consumed even when no Yunikorn-scheduled Spark pods remain.
   
   (It is also worth noting that the driver and executor pods enter the same queue regardless
of what the executor podTemplateFile specifies. We are unsure whether this is a feature or a
bug.)
   
   Listed below are the reproduction steps. Please let me know if any clarification is needed;
thanks.
   
   ~~~~~~~~~~~~~~~~~~~~~~~~
   
   Environment setup:
   Setting up the Yunikorn pods (via the Helm chart; a manual setup was also tried):
   `helm install ./yunikorn --namespace test --generate-name`
   
   queues.yaml snippet, added to the Yunikorn configmap with `kubectl -n test edit configmap yunikorn-scheduler`:
   ```
     queues.yaml: |
       partitions:
         - name: default
           placementrules:
             - name: provided
               create: false
           queues:
             - name: root
               submitacl: '*'
               queues:
                 - name: driver
                   resources:
                     guaranteed:
                       memory: 10000
                       vcore: 1000
                     max:
                       memory: 40000
                       vcore: 9000
                 - name: executors
                   resources:
                     guaranteed:
                       memory: 1000
                       vcore: 1000
                     max:
                       memory: 15000
                       vcore: 6000
   
   ```
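
To illustrate why allocations that are never released would eventually starve a queue with the limits above, here is a minimal Python sketch. It is not Yunikorn code: the `Queue` class, the `runs_until_starved` helper, and the per-driver request of 1024 memory units and 1000 vcore units are all assumptions made for illustration. Only the `max` values of the `root.driver` queue come from the configuration.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    """Hypothetical model of a queue that enforces only its max limits."""
    name: str
    max_memory: int
    max_vcore: int
    used_memory: int = 0
    used_vcore: int = 0

    def try_allocate(self, memory: int, vcore: int) -> bool:
        # Reject the request if it would push usage past either limit.
        if (self.used_memory + memory > self.max_memory
                or self.used_vcore + vcore > self.max_vcore):
            return False
        self.used_memory += memory
        self.used_vcore += vcore
        return True

    def release(self, memory: int, vcore: int) -> None:
        self.used_memory -= memory
        self.used_vcore -= vcore


def runs_until_starved(queue: Queue, memory: int, vcore: int,
                       release_on_completion: bool, max_runs: int = 100) -> int:
    """Count how many sequential jobs can be admitted (capped at max_runs)."""
    runs = 0
    while runs < max_runs and queue.try_allocate(memory, vcore):
        runs += 1
        if release_on_completion:
            # Expected behavior: a finished pod gives its resources back.
            queue.release(memory, vcore)
    return runs


if __name__ == "__main__":
    # Assumed driver request: 1024 memory units, 1000 vcore units.
    leaky = runs_until_starved(Queue("root.driver", 40000, 9000), 1024, 1000,
                               release_on_completion=False)
    healthy = runs_until_starved(Queue("root.driver", 40000, 9000), 1024, 1000,
                                 release_on_completion=True)
    print(leaky, healthy)  # 9 100
```

Under these assumed request sizes the vcore limit is reached first: the tenth sequential driver is rejected even though no driver is actually running, which matches the starvation described above.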
   
   Command used:
   
   ```
   spark-submit     \
   --master k8s://https://<YOUR K8S IP>:6443     \
   --deploy-mode cluster     \
   --name spark-pi     \
   --class org.apache.spark.examples.SparkPi     \
   --conf spark.kubernetes.container.image=<YOUR SPARK IMAGE>     \
   --conf spark.kubernetes.namespace=test     \
   --conf spark.driver.extraClassPath=/opt/hadoop/etc/hadoop     \
   --conf spark.ssl.enabled=false     \
   --conf spark.authenticate=false     \
   --conf spark.kubernetes.driver.podTemplateFile=/driver.yaml \
   --conf spark.kubernetes.executor.podTemplateFile=/executor.yaml \
   --conf spark.network.crypto.enabled=false \
   <YOUR SPARK JAR FILE; E.G. hdfs://<YOUR HDFS URL>/sparkexample.jar> &
   ```
   
   Under driver.yaml:
   ```
   apiVersion: v1
   kind: Pod
   metadata:
     labels:
       spark-app-id: spark-00001
       queue: root.driver
   spec:
     schedulerName: yunikorn
   
   ```
   
   Under executor.yaml:
   ```
   apiVersion: v1
   kind: Pod
   metadata:
     labels:
       spark-app-id: spark-00001
       queue: root.executors
   spec:
     schedulerName: yunikorn
   ```

