create_application

EMRServerless.Client.create_application(**kwargs)

Creates an application.

See also: AWS API Documentation

Request Syntax

response = client.create_application(
    name='string',
    releaseLabel='string',
    type='string',
    clientToken='string',
    initialCapacity={
        'string': {
            'workerCount': 123,
            'workerConfiguration': {
                'cpu': 'string',
                'memory': 'string',
                'disk': 'string',
                'diskType': 'string'
            }
        }
    },
    maximumCapacity={
        'cpu': 'string',
        'memory': 'string',
        'disk': 'string'
    },
    tags={
        'string': 'string'
    },
    autoStartConfiguration={
        'enabled': True|False
    },
    autoStopConfiguration={
        'enabled': True|False,
        'idleTimeoutMinutes': 123
    },
    networkConfiguration={
        'subnetIds': [
            'string',
        ],
        'securityGroupIds': [
            'string',
        ]
    },
    architecture='ARM64'|'X86_64',
    imageConfiguration={
        'imageUri': 'string'
    },
    workerTypeSpecifications={
        'string': {
            'imageConfiguration': {
                'imageUri': 'string'
            }
        }
    },
    runtimeConfiguration=[
        {
            'classification': 'string',
            'properties': {
                'string': 'string'
            },
            'configurations': {'... recursive ...'}
        },
    ],
    monitoringConfiguration={
        's3MonitoringConfiguration': {
            'logUri': 'string',
            'encryptionKeyArn': 'string'
        },
        'managedPersistenceMonitoringConfiguration': {
            'enabled': True|False,
            'encryptionKeyArn': 'string'
        },
        'cloudWatchLoggingConfiguration': {
            'enabled': True|False,
            'logGroupName': 'string',
            'logStreamNamePrefix': 'string',
            'encryptionKeyArn': 'string',
            'logTypes': {
                'string': [
                    'string',
                ]
            }
        },
        'prometheusMonitoringConfiguration': {
            'remoteWriteUrl': 'string'
        }
    },
    interactiveConfiguration={
        'studioEnabled': True|False,
        'livyEndpointEnabled': True|False
    },
    schedulerConfiguration={
        'queueTimeoutMinutes': 123,
        'maxConcurrentRuns': 123
    }
)
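
Most keys in the request syntax above are optional. The following is a minimal sketch of a request that sets only the core fields. The helper names `build_minimal_request` and `create`, the application name, and the `"SPARK"` and `"emr-7.0.0"` values are illustrative assumptions, and `client` is assumed to come from `boto3.client("emr-serverless")` with credentials already configured.

```python
import uuid


def build_minimal_request(name, release_label, app_type):
    """Return keyword arguments for create_application with only the core fields."""
    return {
        "name": name,                   # optional display name
        "releaseLabel": release_label,  # required
        "type": app_type,               # required
        # Set explicitly here; boto3 autopopulates clientToken when it is omitted.
        "clientToken": str(uuid.uuid4()),
    }


def create(client, request):
    """Send the request; `client` is assumed to be boto3.client("emr-serverless")."""
    return client.create_application(**request)


# Illustrative values only.
request = build_minimal_request("example-app", "emr-7.0.0", "SPARK")
```

Building the arguments separately from the call keeps the request easy to inspect or log before it is sent.
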
Parameters:
  • name (string) – The name of the application.

  • releaseLabel (string) –

    [REQUIRED]

    The Amazon EMR release associated with the application.

  • type (string) –

    [REQUIRED]

    The type of application you want to start, such as Spark or Hive.

  • clientToken (string) –

    [REQUIRED]

    The client idempotency token of the application to create. Its value must be unique for each request.

    This field is autopopulated if not provided.

  • initialCapacity (dict) –

    The capacity to initialize when the application is created.

    • (string) –

      Worker type for an analytics framework.

      • (dict) –

        The initial capacity configuration per worker.

        • workerCount (integer) – [REQUIRED]

          The number of workers in the initial capacity configuration.

        • workerConfiguration (dict) –

          The resource configuration of the initial capacity configuration.

          • cpu (string) – [REQUIRED]

            The CPU requirements for every worker instance of the worker type.

          • memory (string) – [REQUIRED]

            The memory requirements for every worker instance of the worker type.

          • disk (string) –

            The disk requirements for every worker instance of the worker type.

          • diskType (string) –

            The disk type for every worker instance of the worker type. Shuffle-optimized disks have higher performance characteristics and are better for shuffle-heavy workloads. Default is STANDARD.

  • maximumCapacity (dict) –

    The maximum capacity to allocate when the application is created. This is cumulative across all workers at any given point in time, not just when an application is created. No new resources will be created once any one of the defined limits is hit.

    • cpu (string) – [REQUIRED]

      The maximum allowed CPU for an application.

    • memory (string) – [REQUIRED]

      The maximum allowed memory for an application.

    • disk (string) –

      The maximum allowed disk for an application.

  • tags (dict) –

    The tags assigned to the application.

    • (string) –

      • (string) –

  • autoStartConfiguration (dict) –

    The configuration for an application to automatically start on job submission.

    • enabled (boolean) –

      Enables the application to automatically start on job submission. Defaults to true.

  • autoStopConfiguration (dict) –

    The configuration for an application to automatically stop after a certain amount of time being idle.

    • enabled (boolean) –

      Enables the application to automatically stop after a certain amount of time being idle. Defaults to true.

    • idleTimeoutMinutes (integer) –

      The amount of idle time in minutes after which your application will automatically stop. Defaults to 15 minutes.

  • networkConfiguration (dict) –

    The network configuration for customer VPC connectivity.

    • subnetIds (list) –

      The array of subnet IDs for customer VPC connectivity.

      • (string) –

    • securityGroupIds (list) –

      The array of security group IDs for customer VPC connectivity.

      • (string) –

  • architecture (string) – The CPU architecture of an application.

  • imageConfiguration (dict) –

    The image configuration for all worker types. You can either set this parameter or imageConfiguration for each worker type in workerTypeSpecifications.

    • imageUri (string) –

      The URI of an image in the Amazon ECR registry. This field is required when you create a new application. If you leave this field blank in an update, Amazon EMR will remove the image configuration.

  • workerTypeSpecifications (dict) –

    The key-value pairs that specify worker type to WorkerTypeSpecificationInput. This parameter must contain all valid worker types for a Spark or Hive application. Valid worker types include Driver and Executor for Spark applications and HiveDriver and TezTask for Hive applications. You can either set image details in this parameter for each worker type, or in imageConfiguration for all worker types.

    • (string) –

      Worker type for an analytics framework.

      • (dict) –

        The specifications for a worker type.

        • imageConfiguration (dict) –

          The image configuration for a worker type.

          • imageUri (string) –

            The URI of an image in the Amazon ECR registry. This field is required when you create a new application. If you leave this field blank in an update, Amazon EMR will remove the image configuration.

  • runtimeConfiguration (list) –

    The configuration specifications to use when creating an application. Each configuration consists of a classification and properties. This configuration is applied to all the job runs submitted under the application.

    • (dict) –

      A configuration specification to be used when provisioning an application. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

      • classification (string) – [REQUIRED]

        The classification within a configuration.

      • properties (dict) –

        A set of properties specified within a configuration classification.

        • (string) –

          • (string) –

      • configurations (list) –

        A list of additional configurations to apply within a configuration object.

  • monitoringConfiguration (dict) –

    The configuration setting for monitoring.

    • s3MonitoringConfiguration (dict) –

      The Amazon S3 configuration for monitoring log publishing.

      • logUri (string) –

        The Amazon S3 destination URI for log publishing.

      • encryptionKeyArn (string) –

        The KMS key ARN to encrypt the logs published to the given Amazon S3 destination.

    • managedPersistenceMonitoringConfiguration (dict) –

      The managed log persistence configuration for a job run.

      • enabled (boolean) –

        Enables managed logging and defaults to true. If set to false, managed logging will be turned off.

      • encryptionKeyArn (string) –

        The KMS key ARN to encrypt the logs stored in managed log persistence.

    • cloudWatchLoggingConfiguration (dict) –

      The Amazon CloudWatch configuration for monitoring logs. You can configure your jobs to send log information to CloudWatch.

      • enabled (boolean) – [REQUIRED]

        Enables CloudWatch logging.

      • logGroupName (string) –

        The name of the log group in Amazon CloudWatch Logs where you want to publish your logs.

      • logStreamNamePrefix (string) –

        Prefix for the CloudWatch log stream name.

      • encryptionKeyArn (string) –

        The Key Management Service (KMS) key ARN to encrypt the logs that you store in CloudWatch Logs.

      • logTypes (dict) –

        The types of logs that you want to publish to CloudWatch. If you don’t specify any log types, driver STDOUT and STDERR logs will be published to CloudWatch Logs by default. For more information, including the supported worker types for Hive and Spark, see Logging for EMR Serverless with CloudWatch.

        • Key Valid Values: SPARK_DRIVER, SPARK_EXECUTOR, HIVE_DRIVER, TEZ_TASK

        • Array Members Valid Values: STDOUT, STDERR, HIVE_LOG, TEZ_AM, SYSTEM_LOGS

        • (string) –

          Worker type for an analytics framework.

          • (list) –

            • (string) –

              Log type for a Spark or Hive job run.

    • prometheusMonitoringConfiguration (dict) –

      The monitoring configuration object you can configure to send metrics to Amazon Managed Service for Prometheus for a job run.

      • remoteWriteUrl (string) –

        The remote write URL in the Amazon Managed Service for Prometheus workspace to send metrics to.

  • interactiveConfiguration (dict) –

    The interactive configuration object that enables the interactive use cases to use when running an application.

    • studioEnabled (boolean) –

      Enables you to connect an application to Amazon EMR Studio to run interactive workloads in a notebook.

    • livyEndpointEnabled (boolean) –

      Enables an Apache Livy endpoint that you can connect to and run interactive jobs.

  • schedulerConfiguration (dict) –

    The scheduler configuration for batch and streaming jobs running on this application. Supported with release labels emr-7.0.0 and above.

    • queueTimeoutMinutes (integer) –

      The maximum duration in minutes that a job can remain in the QUEUED state. If scheduler configuration is enabled on your application, the default value is 360 minutes (6 hours). The valid range is from 15 to 720.

    • maxConcurrentRuns (integer) –

      The maximum concurrent job runs on this application. If scheduler configuration is enabled on your application, the default value is 15. The valid range is 1 to 1000.
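
Several of the optional parameters above have documented ranges or fixed sets of valid values. The following sketch checks them locally before a request is built. The helper names, the 30-minute idle timeout, and the log group name are hypothetical; the ranges and valid values are taken from the parameter descriptions above. The service still performs its own validation, so these checks only surface mistakes earlier.

```python
# Allowed values as listed in the logTypes description above.
LOG_TYPE_KEYS = {"SPARK_DRIVER", "SPARK_EXECUTOR", "HIVE_DRIVER", "TEZ_TASK"}
LOG_TYPE_VALUES = {"STDOUT", "STDERR", "HIVE_LOG", "TEZ_AM", "SYSTEM_LOGS"}


def scheduler_configuration(queue_timeout_minutes=360, max_concurrent_runs=15):
    """Build schedulerConfiguration, rejecting values outside the documented ranges."""
    if not 15 <= queue_timeout_minutes <= 720:
        raise ValueError("queueTimeoutMinutes must be between 15 and 720")
    if not 1 <= max_concurrent_runs <= 1000:
        raise ValueError("maxConcurrentRuns must be between 1 and 1000")
    return {
        "queueTimeoutMinutes": queue_timeout_minutes,
        "maxConcurrentRuns": max_concurrent_runs,
    }


def cloudwatch_logging(log_group_name, log_types):
    """Build cloudWatchLoggingConfiguration after checking logTypes keys and values."""
    for worker, types in log_types.items():
        if worker not in LOG_TYPE_KEYS:
            raise ValueError(f"unknown worker type: {worker}")
        unknown = set(types) - LOG_TYPE_VALUES
        if unknown:
            raise ValueError(f"unknown log types: {sorted(unknown)}")
    return {
        "enabled": True,  # the only required key in this structure
        "logGroupName": log_group_name,
        "logTypes": {worker: list(types) for worker, types in log_types.items()},
    }


optional_settings = {
    # Illustrative override of the 15-minute default idle timeout.
    "autoStopConfiguration": {"enabled": True, "idleTimeoutMinutes": 30},
    "schedulerConfiguration": scheduler_configuration(queue_timeout_minutes=60),
    "monitoringConfiguration": {
        "cloudWatchLoggingConfiguration": cloudwatch_logging(
            "/example/emr-serverless",  # hypothetical log group name
            {"SPARK_DRIVER": ["STDOUT", "STDERR"]},
        )
    },
}
```

The resulting dictionary can be unpacked alongside the required arguments, for example `client.create_application(releaseLabel=release_label, type=app_type, **optional_settings)`.
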

Return type: dict

Returns:

Response Syntax

{
    'applicationId': 'string',
    'name': 'string',
    'arn': 'string'
}

Response Structure

  • (dict) –

    • applicationId (string) –

      The output contains the application ID.

    • name (string) –

      The output contains the name of the application.

    • arn (string) –

      The output contains the ARN of the application.
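
As a small sketch of reading the response structure above: the helper name `summarize_created_application` and the sample values are made up for illustration, and the sample is a plain dictionary rather than the result of a real call.

```python
def summarize_created_application(response):
    """Pick the documented fields out of a create_application response."""
    return {
        "application_id": response["applicationId"],
        "arn": response["arn"],
        # name is optional in the request, so this sketch reads it defensively.
        "name": response.get("name"),
    }


# Placeholder response in the documented shape; every value is invented.
sample_response = {
    "applicationId": "example-application-id",
    "name": "example-app",
    "arn": "example-arn",
}
summary = summarize_created_application(sample_response)
```

The `applicationId` value is the one to keep, since later calls that operate on the application need an identifier for it.
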

Exceptions

  • EMRServerless.Client.exceptions.ValidationException

  • EMRServerless.Client.exceptions.ResourceNotFoundException

  • EMRServerless.Client.exceptions.InternalServerException

  • EMRServerless.Client.exceptions.ConflictException
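
The exceptions listed above are exposed on the client object under `client.exceptions`, so they can be caught by name. The sketch below maps each one to a short outcome label; the function name `try_create_application` and the labels are hypothetical, and how to react to each case (retry, fix the request, give up) is left to the caller.

```python
def try_create_application(client, **request):
    """Call create_application and map the documented exceptions to outcome labels."""
    try:
        response = client.create_application(**request)
    except client.exceptions.ValidationException as error:
        return ("invalid", str(error))
    except client.exceptions.ConflictException as error:
        return ("conflict", str(error))
    except client.exceptions.ResourceNotFoundException as error:
        return ("not_found", str(error))
    except client.exceptions.InternalServerException as error:
        return ("server_error", str(error))
    return ("created", response["applicationId"])


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   client = boto3.client("emr-serverless")
#   outcome = try_create_application(client, releaseLabel="emr-7.0.0", type="SPARK")
```

Passing the client in as an argument keeps the function easy to exercise with a stub object in tests.
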