Run a SparkApplication
Note
SparkApplicationIntegration is currently an alpha feature and is disabled by default.
You can enable it by editing the SparkApplicationIntegration feature gate. Check the Installation guide for details on feature gate configuration.
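As a sketch of what enabling the gate can look like (the container name and argument layout are assumptions and may differ in your installation), you can add the `--feature-gates` argument to the kueue-controller-manager Deployment:

```yaml
# Hypothetical sketch: patch for the kueue-controller-manager Deployment
# in the kueue-system namespace. Verify the container name and existing
# args in your installation before applying.
spec:
  template:
    spec:
      containers:
      - name: manager
        args:
        - --feature-gates=SparkApplicationIntegration=true
```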
This page shows how to leverage Kueue’s scheduling and resource management capabilities when running the Spark Operator’s SparkApplication resources.
This guide is for batch users who have a basic understanding of Kueue. For more information, see Kueue’s overview.
Before you begin
Check administer cluster quotas for details on the initial cluster setup.
Check the Spark Operator installation guide.
You can modify the Kueue configuration of an installed release to include SparkApplication in the list of enabled integrations.
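As an illustration, the integrations section of the Kueue Configuration could look like the following. The exact framework name for SparkApplication is an assumption here; verify it against the defaults shipped with your Kueue release:

```yaml
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
integrations:
  frameworks:
  - "batch/job"
  # Assumed framework name for the SparkApplication integration;
  # check your Kueue version's documentation.
  - "sparkoperator.k8s.io/sparkapplication"
```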
Note
In order to use SparkApplication integration, you must install Spark Operator v2.4.0 or above.
Please also note the following:
- You must enable the namespaces in which you will deploy SparkApplications, as described in the official installation docs.
- You must create the spark service account and bind it to a proper role beforehand. Refer to Spark Operator’s Getting Started Guide for details.
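A minimal sketch of that service account and role binding is shown below. The resource names, namespace, and rules here are assumptions for illustration; the spark-application-rbac.yaml file in the Spark Operator repository is the authoritative reference:

```yaml
# Hypothetical minimal RBAC for the Spark driver; adjust the namespace
# to wherever you deploy SparkApplications.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spark-role
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps", "persistentvolumeclaims"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-role-binding
  namespace: default
subjects:
- kind: ServiceAccount
  name: spark
  namespace: default
roleRef:
  kind: Role
  name: spark-role
  apiGroup: rbac.authorization.k8s.io
```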
Note
In order to use SparkApplication, prior to v0.8.1, you need to restart Kueue after the installation. You can do it by running: `kubectl delete pods -l control-plane=controller-manager -n kueue-system`.
Spark Operator definition
a. Queue selection
The target local queue should be specified in the metadata.labels section of the SparkApplication configuration.
```yaml
metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue
```
Note
SparkApplication integration does not support dynamic allocation. If you set `spec.dynamicAllocation.enabled=true` in a SparkApplication, Kueue will reject such resources in the webhook.
b. Optionally set the suspend field in the SparkApplication
```yaml
spec:
  suspend: true
```
By default, Kueue will set suspend to true via webhook and unsuspend it when the SparkApplication is admitted.
Sample SparkApplication
```yaml
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  type: Scala
  mode: cluster # spark-operator supports "cluster" mode only
  sparkVersion: 4.0.0
  image: spark:4.0.0
  imagePullPolicy: IfNotPresent
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples.jar
  arguments:
  - "50000"
  memoryOverheadFactor: "0" # spark adds extra memory on memory limits
                            # for non-JVM tasks; 0 avoids it
  driver:
    coreRequest: "1"
    memory: 1g # in Java format (e.g. 512m, 2g)
    serviceAccount: spark # you need to create this service account beforehand,
                          # and it should have a proper role
                          # ref: https://github.com/kubeflow/spark-operator/blob/master/config/rbac/spark-application-rbac.yaml
  executor:
    instances: 2
    coreRequest: "1"
    memory: 1g # in Java format (e.g. 512m, 2g)
    deleteOnTermination: false # keep terminated executor pods for demo purposes
```