describe_data_ingestion_job
- LookoutEquipment.Client.describe_data_ingestion_job(**kwargs)
Provides information on a specific data ingestion job such as creation time, dataset ARN, and status.
See also: AWS API Documentation
Request Syntax
response = client.describe_data_ingestion_job(
    JobId='string'
)
- Parameters:
JobId (string) –
[REQUIRED]
The job ID of the data ingestion job.
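As a minimal sketch of the request above, the helper below calls the operation and picks out a few top-level fields. The helper name `describe_job` and the job ID in the usage comment are illustrative assumptions, not part of the API; the client is passed in so the function can be exercised without AWS credentials.

```python
def describe_job(client, job_id):
    """Call describe_data_ingestion_job and return a few top-level fields.

    `client` is expected to be a LookoutEquipment client;
    `describe_job` itself is a hypothetical helper name.
    """
    response = client.describe_data_ingestion_job(JobId=job_id)
    return {
        "JobId": response["JobId"],
        "Status": response["Status"],
        # .get() is used defensively in case a field is absent.
        "DatasetArn": response.get("DatasetArn"),
        "CreatedAt": response.get("CreatedAt"),
    }


# Usage (requires boto3 and AWS credentials; the job ID is a placeholder):
#   import boto3
#   client = boto3.client("lookoutequipment")
#   print(describe_job(client, "example-job-id"))
```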
- Return type:
dict
- Returns:
Response Syntax
{
    'JobId': 'string',
    'DatasetArn': 'string',
    'IngestionInputConfiguration': {
        'S3InputConfiguration': {
            'Bucket': 'string',
            'Prefix': 'string',
            'KeyPattern': 'string'
        }
    },
    'RoleArn': 'string',
    'CreatedAt': datetime(2015, 1, 1),
    'Status': 'IN_PROGRESS'|'SUCCESS'|'FAILED'|'IMPORT_IN_PROGRESS',
    'FailedReason': 'string',
    'DataQualitySummary': {
        'InsufficientSensorData': {
            'MissingCompleteSensorData': {
                'AffectedSensorCount': 123
            },
            'SensorsWithShortDateRange': {
                'AffectedSensorCount': 123
            }
        },
        'MissingSensorData': {
            'AffectedSensorCount': 123,
            'TotalNumberOfMissingValues': 123
        },
        'InvalidSensorData': {
            'AffectedSensorCount': 123,
            'TotalNumberOfInvalidValues': 123
        },
        'UnsupportedTimestamps': {
            'TotalNumberOfUnsupportedTimestamps': 123
        },
        'DuplicateTimestamps': {
            'TotalNumberOfDuplicateTimestamps': 123
        }
    },
    'IngestedFilesSummary': {
        'TotalNumberOfFiles': 123,
        'IngestedNumberOfFiles': 123,
        'DiscardedFiles': [
            {
                'Bucket': 'string',
                'Key': 'string'
            },
        ]
    },
    'StatusDetail': 'string',
    'IngestedDataSize': 123,
    'DataStartTime': datetime(2015, 1, 1),
    'DataEndTime': datetime(2015, 1, 1),
    'SourceDatasetArn': 'string'
}
Response Structure
(dict) –
JobId (string) –
Indicates the job ID of the data ingestion job.
DatasetArn (string) –
The Amazon Resource Name (ARN) of the dataset being used in the data ingestion job.
IngestionInputConfiguration (dict) –
Specifies the S3 location configuration for the data input for the data ingestion job.
S3InputConfiguration (dict) –
The location information for the S3 bucket used for input data for the data ingestion.
Bucket (string) –
The name of the S3 bucket used for the input data for the data ingestion.
Prefix (string) –
The prefix for the S3 location being used for the input data for the data ingestion.
KeyPattern (string) –
The pattern for matching the Amazon S3 files that will be used for ingestion. If the schema was created previously without any KeyPattern, then the default KeyPattern {prefix}/{component_name}/* is used to download files from Amazon S3 according to the schema. This field is required when ingestion is being done for the first time.
Valid Values: {prefix}/{component_name}_* | {prefix}/{component_name}/* | {prefix}/{component_name}[DELIMITER]* (Allowed delimiters: space, dot, underscore, hyphen)
RoleArn (string) –
The Amazon Resource Name (ARN) of an IAM role with permission to access the data source being ingested.
CreatedAt (datetime) –
The time at which the data ingestion job was created.
Status (string) –
Indicates the status of the DataIngestionJob operation.
FailedReason (string) –
Specifies the reason for failure when a data ingestion job has failed.
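The Status and FailedReason fields lend themselves to a simple polling loop. The sketch below assumes that SUCCESS and FAILED are terminal states and that IN_PROGRESS and IMPORT_IN_PROGRESS mean the job is still running; that reading of the status values is an assumption, as are the helper name `wait_for_ingestion` and its default intervals.

```python
import time

# Assumption: these two statuses are terminal; the others mean "still running".
TERMINAL_STATUSES = {"SUCCESS", "FAILED"}


def wait_for_ingestion(client, job_id, poll_seconds=30, max_polls=120,
                       sleep=time.sleep):
    """Poll describe_data_ingestion_job until the job reaches a terminal status.

    Returns the final response on SUCCESS, raises RuntimeError on FAILED,
    and raises TimeoutError if max_polls is exhausted.
    """
    for _ in range(max_polls):
        response = client.describe_data_ingestion_job(JobId=job_id)
        status = response["Status"]
        if status == "FAILED":
            reason = response.get("FailedReason", "no reason given")
            raise RuntimeError(f"Ingestion job {job_id} failed: {reason}")
        if status in TERMINAL_STATUSES:
            return response
        # Still running: wait before asking again.
        sleep(poll_seconds)
    raise TimeoutError(f"Ingestion job {job_id} did not finish in time")
```

The `sleep` parameter is injectable only so the loop can be exercised quickly in tests; in normal use the default `time.sleep` applies.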
DataQualitySummary (dict) –
Gives statistics about a completed ingestion job. These statistics primarily quantify problems in the data, reported under InsufficientSensorData (including MissingCompleteSensorData), MissingSensorData, InvalidSensorData, UnsupportedTimestamps, and DuplicateTimestamps.
InsufficientSensorData (dict) –
Parameter that gives information about insufficient data for sensors in the dataset. This includes sensors whose data is completely missing and sensors with a short date range.
MissingCompleteSensorData (dict) –
Parameter that describes the total number of sensors whose data is completely missing.
AffectedSensorCount (integer) –
Indicates the number of sensors that have data missing completely.
SensorsWithShortDateRange (dict) –
Parameter that describes the total number of sensors that have a short date range of less than 14 days of data overall.
AffectedSensorCount (integer) –
Indicates the number of sensors that have less than 14 days of data.
MissingSensorData (dict) –
Parameter that gives information about data that is missing over all the sensors in the input data.
AffectedSensorCount (integer) –
Indicates the number of sensors that have at least some data missing.
TotalNumberOfMissingValues (integer) –
Indicates the total number of missing values across all the sensors.
InvalidSensorData (dict) –
Parameter that gives information about data that is invalid over all the sensors in the input data.
AffectedSensorCount (integer) –
Indicates the number of sensors that have at least some invalid values.
TotalNumberOfInvalidValues (integer) –
Indicates the total number of invalid values across all the sensors.
UnsupportedTimestamps (dict) –
Parameter that gives information about unsupported timestamps in the input data.
TotalNumberOfUnsupportedTimestamps (integer) –
Indicates the total number of unsupported timestamps across the ingested data.
DuplicateTimestamps (dict) –
Parameter that gives information about duplicate timestamps in the input data.
TotalNumberOfDuplicateTimestamps (integer) –
Indicates the total number of duplicate timestamps.
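Because DataQualitySummary is nested several levels deep, it can be convenient to flatten it before logging or alerting. The sketch below does that for the structure described above; the helper name `flatten_quality_summary` and the output key names are illustrative assumptions, and missing sections default to 0.

```python
def flatten_quality_summary(summary):
    """Flatten a DataQualitySummary dict into one level of integer counts.

    Output key names are hypothetical; absent sections are treated as 0.
    """
    insufficient = summary.get("InsufficientSensorData", {})
    missing = summary.get("MissingSensorData", {})
    invalid = summary.get("InvalidSensorData", {})
    return {
        "sensors_missing_all_data": insufficient.get(
            "MissingCompleteSensorData", {}).get("AffectedSensorCount", 0),
        "sensors_short_date_range": insufficient.get(
            "SensorsWithShortDateRange", {}).get("AffectedSensorCount", 0),
        "sensors_with_missing_values": missing.get("AffectedSensorCount", 0),
        "missing_values": missing.get("TotalNumberOfMissingValues", 0),
        "sensors_with_invalid_values": invalid.get("AffectedSensorCount", 0),
        "invalid_values": invalid.get("TotalNumberOfInvalidValues", 0),
        "unsupported_timestamps": summary.get(
            "UnsupportedTimestamps", {}).get(
            "TotalNumberOfUnsupportedTimestamps", 0),
        "duplicate_timestamps": summary.get(
            "DuplicateTimestamps", {}).get(
            "TotalNumberOfDuplicateTimestamps", 0),
    }


# Usage, given a response from describe_data_ingestion_job:
#   flat = flatten_quality_summary(response.get("DataQualitySummary", {}))
```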
IngestedFilesSummary (dict) –
Gives statistics about how many files have been ingested, and which files have not been ingested, for a particular ingestion job.
TotalNumberOfFiles (integer) –
Indicates the total number of files that were submitted for ingestion.
IngestedNumberOfFiles (integer) –
Indicates the number of files that were successfully ingested.
DiscardedFiles (list) –
Lists the files that were discarded. A file could be discarded because its format is invalid (for example, a jpg or pdf) or because it is not readable.
(dict) –
Contains information about an S3 bucket.
Bucket (string) –
The name of the specific S3 bucket.
Key (string) –
The Amazon Web Services Key Management Service (KMS) key used to encrypt the S3 object. Without this key, data in the bucket is not accessible.
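The fields of IngestedFilesSummary can be condensed into a short report, for example to flag jobs where files were dropped. The sketch below is one way to do that; the helper name `summarize_ingested_files` and the output keys are assumptions, and the Bucket and Key values of each discarded entry are passed through without interpretation.

```python
def summarize_ingested_files(files_summary):
    """Condense an IngestedFilesSummary dict into counts plus discarded entries.

    Output key names are hypothetical. Each discarded entry is returned as a
    (Bucket, Key) tuple exactly as the service reported it.
    """
    discarded = files_summary.get("DiscardedFiles", [])
    return {
        "total": files_summary.get("TotalNumberOfFiles", 0),
        "ingested": files_summary.get("IngestedNumberOfFiles", 0),
        "discarded_count": len(discarded),
        "discarded": [(f.get("Bucket"), f.get("Key")) for f in discarded],
    }


# Usage, given a response from describe_data_ingestion_job:
#   report = summarize_ingested_files(response.get("IngestedFilesSummary", {}))
#   if report["discarded_count"]:
#       print("Discarded:", report["discarded"])
```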
StatusDetail (string) –
Provides details about the status of an ingestion job that is currently in progress.
IngestedDataSize (integer) –
Indicates the size of the ingested dataset.
DataStartTime (datetime) –
Indicates the earliest timestamp corresponding to data that was successfully ingested during this specific ingestion job.
DataEndTime (datetime) –
Indicates the latest timestamp corresponding to data that was successfully ingested during this specific ingestion job.
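Since DataStartTime and DataEndTime are returned as datetime objects, the time span covered by the ingested data can be computed by subtraction. The sketch below assumes both fields may be absent (for instance while a job is still running), in which case it returns None; the helper name `ingested_time_span` is hypothetical.

```python
from datetime import datetime, timedelta


def ingested_time_span(response):
    """Return DataEndTime - DataStartTime as a timedelta, or None if unavailable."""
    start = response.get("DataStartTime")
    end = response.get("DataEndTime")
    if start is None or end is None:
        return None
    return end - start


# Usage, given a response from describe_data_ingestion_job:
#   span = ingested_time_span(response)
#   if span is not None:
#       print(f"Ingested data covers {span.days} days")
```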
SourceDatasetArn (string) –
The Amazon Resource Name (ARN) of the source dataset from which the data used for the data ingestion job was imported.
Exceptions