start_job_run

start_job_run(**kwargs)

Starts a job run. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

See also: AWS API Documentation

Request Syntax

response = client.start_job_run(
    name='string',
    virtualClusterId='string',
    clientToken='string',
    executionRoleArn='string',
    releaseLabel='string',
    jobDriver={
        'sparkSubmitJobDriver': {
            'entryPoint': 'string',
            'entryPointArguments': [
                'string',
            ],
            'sparkSubmitParameters': 'string'
        },
        'sparkSqlJobDriver': {
            'entryPoint': 'string',
            'sparkSqlParameters': 'string'
        }
    },
    configurationOverrides={
        'applicationConfiguration': [
            {
                'classification': 'string',
                'properties': {
                    'string': 'string'
                },
                'configurations': {'... recursive ...'}
            },
        ],
        'monitoringConfiguration': {
            'persistentAppUI': 'ENABLED'|'DISABLED',
            'cloudWatchMonitoringConfiguration': {
                'logGroupName': 'string',
                'logStreamNamePrefix': 'string'
            },
            's3MonitoringConfiguration': {
                'logUri': 'string'
            }
        }
    },
    tags={
        'string': 'string'
    },
    jobTemplateId='string',
    jobTemplateParameters={
        'string': 'string'
    },
    retryPolicyConfiguration={
        'maxAttempts': 123
    }
)

Parameters

  • name (string) -- The name of the job run.
  • virtualClusterId (string) --

    [REQUIRED]

    The virtual cluster ID for which the job run request is submitted.

  • clientToken (string) --

    [REQUIRED]

    The client idempotency token of the job run request.

    This field is autopopulated if not provided.

  • executionRoleArn (string) -- The execution role ARN for the job run.
  • releaseLabel (string) -- The Amazon EMR release version to use for the job run.
  • jobDriver (dict) --

    The job driver for the job run.

    • sparkSubmitJobDriver (dict) --

      The job driver parameters specified for Spark submit.

      • entryPoint (string) -- [REQUIRED]

        The entry point of the job application.

      • entryPointArguments (list) --

        The arguments for the job application.

        • (string) --
      • sparkSubmitParameters (string) --

        The Spark submit parameters that are used for job runs.

    • sparkSqlJobDriver (dict) --

      The job driver parameters specified for Spark SQL.

      • entryPoint (string) --

        The SQL file to be executed.

      • sparkSqlParameters (string) --

        The Spark parameters to be included in the Spark SQL command.

  • configurationOverrides (dict) --

    The configuration overrides for the job run.

    • applicationConfiguration (list) --

      The configurations for the application run by the job run.

      • (dict) --

        A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

        • classification (string) -- [REQUIRED]

          The classification within a configuration.

        • properties (dict) --

          A set of properties specified within a configuration classification.

          • (string) --
            • (string) --
        • configurations (list) --

          A list of additional configurations to apply within a configuration object.

    • monitoringConfiguration (dict) --

      The configurations for monitoring.

      • persistentAppUI (string) --

        Monitoring configurations for the persistent application UI.

      • cloudWatchMonitoringConfiguration (dict) --

        Monitoring configurations for CloudWatch.

        • logGroupName (string) -- [REQUIRED]

          The name of the log group for log publishing.

        • logStreamNamePrefix (string) --

          The specified name prefix for log streams.

      • s3MonitoringConfiguration (dict) --

        Amazon S3 configuration for monitoring log publishing.

        • logUri (string) -- [REQUIRED]

          Amazon S3 destination URI for log publishing.

  • tags (dict) --

    The tags assigned to job runs.

    • (string) --
      • (string) --
  • jobTemplateId (string) -- The job template ID to be used to start the job run.
  • jobTemplateParameters (dict) --

    The values of job template parameters to start a job run.

    • (string) --
      • (string) --
  • retryPolicyConfiguration (dict) --

    The retry policy configuration for the job run.

    • maxAttempts (integer) -- [REQUIRED]

      The maximum number of attempts on the job's driver.
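
The parameters above nest several levels deep, so it can help to assemble the request in a small helper before handing it to the client. The following is a minimal sketch rather than part of the official reference: build_spark_submit_request and its argument names are hypothetical, and it covers only the Spark submit driver, S3 log publishing, and an optional spark-defaults classification.

```python
import uuid


def build_spark_submit_request(
    virtual_cluster_id,
    execution_role_arn,
    release_label,
    entry_point,
    log_uri,
    entry_point_arguments=None,
    spark_submit_parameters=None,
    spark_defaults=None,
    name=None,
):
    """Return keyword arguments for start_job_run (hypothetical helper)."""
    # entryPoint is the only required key inside sparkSubmitJobDriver.
    driver = {"entryPoint": entry_point}
    if entry_point_arguments:
        driver["entryPointArguments"] = list(entry_point_arguments)
    if spark_submit_parameters:
        driver["sparkSubmitParameters"] = spark_submit_parameters

    overrides = {
        "monitoringConfiguration": {
            # logUri is required once s3MonitoringConfiguration is present.
            "s3MonitoringConfiguration": {"logUri": log_uri}
        }
    }
    if spark_defaults:
        # Property keys and values are both strings in the request.
        overrides["applicationConfiguration"] = [
            {
                "classification": "spark-defaults",
                "properties": {str(k): str(v) for k, v in spark_defaults.items()},
            }
        ]

    request = {
        "virtualClusterId": virtual_cluster_id,
        # Explicit idempotency token; it is autopopulated if left out.
        "clientToken": str(uuid.uuid4()),
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": {"sparkSubmitJobDriver": driver},
        "configurationOverrides": overrides,
    }
    if name:
        request["name"] = name
    return request
```

Assuming a client created with boto3.client('emr-containers'), the returned dict would be passed as client.start_job_run(**request). Any virtual cluster ID, role ARN, release label, or S3 URI supplied to the helper must come from your own account.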

Return type

dict

Returns

Response Syntax

{
    'id': 'string',
    'name': 'string',
    'arn': 'string',
    'virtualClusterId': 'string'
}

Response Structure

  • (dict) --

    • id (string) --

      This output displays the started job run ID.

    • name (string) --

      This output displays the name of the started job run.

    • arn (string) --

      This output lists the ARN of the job run.

    • virtualClusterId (string) --

      This output displays the virtual cluster ID for which the job run was submitted.
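
The id and virtualClusterId fields together identify the job run in later calls, so a caller will often keep just those two. The sketch below is illustrative only: job_run_reference is a hypothetical helper, and the sample response uses made-up placeholder values in the documented shape.

```python
def job_run_reference(response):
    """Pick the identifiers that follow-up calls need (hypothetical helper)."""
    return {
        "id": response["id"],
        "virtualClusterId": response["virtualClusterId"],
    }


# Placeholder response in the shape shown under Response Syntax.
sample_response = {
    "id": "example-job-run-id",
    "name": "example-job",
    "arn": "arn:aws:emr-containers:us-east-1:123456789012:/virtualclusters/example-vc/jobruns/example-job-run-id",
    "virtualClusterId": "example-vc",
}

reference = job_run_reference(sample_response)
```

The resulting dict can be unpacked into calls such as describe_job_run or cancel_job_run on the same client, which take id and virtualClusterId as keyword arguments.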

Exceptions

  • EMRContainers.Client.exceptions.ValidationException
  • EMRContainers.Client.exceptions.ResourceNotFoundException
  • EMRContainers.Client.exceptions.InternalServerException
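
The exception classes listed above are reachable through client.exceptions on a client instance, so they can be caught without extra imports. The following sketch shows one possible handling policy, which is an assumption and not prescribed by the reference: validation and not-found errors are re-raised as built-in exceptions, and InternalServerException is retried with the same clientToken so that repeated attempts carry the same idempotency token. The function name, max_tries, and delay_seconds are hypothetical.

```python
import time


def start_job_run_with_retry(client, request, max_tries=3, delay_seconds=2.0):
    """Call start_job_run, retrying only on InternalServerException."""
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    if "clientToken" not in request:
        # A fixed token lets every attempt reuse the same idempotency key.
        raise ValueError("set clientToken so that retries reuse the same token")

    for attempt in range(1, max_tries + 1):
        try:
            return client.start_job_run(**request)
        except client.exceptions.ValidationException as err:
            # The request itself is malformed; retrying would not help.
            raise ValueError(f"request rejected as invalid: {err}") from err
        except client.exceptions.ResourceNotFoundException as err:
            # A referenced resource (for example the virtual cluster) is missing.
            raise LookupError(f"referenced resource not found: {err}") from err
        except client.exceptions.InternalServerException:
            if attempt == max_tries:
                raise
            time.sleep(delay_seconds)
```

Whether a retry is appropriate depends on the workload; a caller that cannot tolerate the added delay might prefer to surface InternalServerException immediately.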