BETA

Missing Data

This feature is part of our Machine Learning APIs that are available in the Google Cloud Regions except Australia (Google Cloud, Sydney).

Identifies missing data to improve quality.

The Missing Data API endpoints help to ensure data quality by identifying products which lack important data. Currently, you can query for products found to be missing the following:

  • Product attributes
  • Product images
  • Product prices

In the future, these endpoints will provide automated recommendations on how to fill missing data.

Requests for missing data APIs are asynchronous.

This feature is still in beta. If you have feedback or further feature requests, contact Support Portal.

Missing Attributes

A product's ProductType defines its product attributes. However, attributes are only given values when creating a ProductVariant. Sometimes product variants have product attributes with no values or only use a subset of the available product attributes.

The Missing Attributes API identifies products and product variants that:

  • Are missing attributes entirely
  • Are missing attribute values

The default settings identify product variants that meet either condition.

Representations

MissingAttributes

  • product - Reference to a Product
  • productType - Reference to a ProductType
  • variantId - Integer
    ID of a ProductVariant.
  • missingAttributeValues - Array of Strings
    The names of the attributes found without values in a product variant, sorted by attribute importance in descending order.
  • missingAttributeNames - Array of Strings - Optional
    The names of the attributes of the product type that the variant is missing, sorted by attribute importance in descending order.
  • attributeCount - AttributeCount - Optional
  • attributeCoverage - AttributeCoverage - Optional

AttributeCount

  • productTypeAttributes - Integer
    Number of attributes defined in the product type.
  • variantAttributes - Integer
    Number of attributes defined in the variant.
  • missingAttributeValues - Integer
    Number of attributes missing values in the variant.

AttributeCoverage

  • names - Float
    Range: [0.0 - 1.0]
    The fraction of attributes from the product type that are defined in the product variant. A value of 1.0 indicates that the product variant contains all attributes defined in the product type.
  • values - Float
    Range: [0.0 - 1.0]
    The fraction of attributes in the product variant that contain values.
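
The relationship between AttributeCount and AttributeCoverage can be illustrated with a small Python sketch. The formulas below are an assumption inferred from the field descriptions and from the example response later in this document (15 product type attributes, 12 variant attributes, 2 missing values, coverage 0.80 and 0.83); the service may compute or round these values differently. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AttributeCount:
    product_type_attributes: int   # productTypeAttributes
    variant_attributes: int        # variantAttributes
    missing_attribute_values: int  # missingAttributeValues


def attribute_coverage(count: AttributeCount) -> dict:
    """Derive names/values coverage from an AttributeCount (assumed formulas)."""
    # Assumption: a product type without attributes counts as fully covered.
    if count.product_type_attributes == 0:
        names = 1.0
    else:
        names = count.variant_attributes / count.product_type_attributes
    # Assumption: a variant without attributes has no values to miss.
    if count.variant_attributes == 0:
        values = 1.0
    else:
        with_values = count.variant_attributes - count.missing_attribute_values
        values = with_values / count.variant_attributes
    # Rounded to two decimals to match the precision of the example response.
    return {"names": round(names, 2), "values": round(values, 2)}


# Counts taken from the example response in this document.
print(attribute_coverage(AttributeCount(15, 12, 2)))
# prints {'names': 0.8, 'values': 0.83}
```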

MissingAttributesMeta

MissingAttributesProductLevel

  • total - Integer
    Number of products scanned.
  • missingAttributeNames - Integer
    Number of products missing attribute names.
  • missingAttributeValues - Integer
    Number of products missing attribute values.

MissingAttributesVariantLevel

  • total - Integer
    Number of variants scanned.
  • missingAttributeNames - Integer
    Number of variants missing attribute names.
  • missingAttributeValues - Integer
    Number of variants missing attribute values.

MissingAttributesSearchRequest

  • limit - Number - Optional
  • offset - Number - Optional
  • staged - Boolean - Optional
    Default value: false
    If true, searches data from staged products in addition to published products.
  • productSetLimit - Number - Optional
    Default value: 100000 - Range: [1 - 100000]
    Maximum number of products to scan. If you need to scan more than 100000 products, contact Support Portal.
  • includeVariants - Boolean - Optional
    Default value: true
    If true, searches all product variants. If false, only searches master variants.
  • coverageMin - Float - Optional
    Default value: 0.0 - Range: [0.0 - 1.0]
    Minimum attribute coverage of variants to display, applied to both coverage types.
  • coverageMax - Float - Optional
    Default value: 1.0 - Range: [0.0 - 1.0]
    Maximum attribute coverage of variants to display, applied to both coverage types.
  • sortBy - String - Optional
    Default value: coverageAttributeValues - Allowed values: [coverageAttributeValues, coverageAttributeNames]
    With coverageAttributeValues, the product variants with the most missing attribute values are listed first. With coverageAttributeNames, the product variants with the most missing attribute names are listed first.
  • showMissingAttributeNames - Boolean - Optional
    Default value: true
    If true, missingAttributeNames is included in the results.
  • productIds - Array of Strings - Optional
    Filters results by the provided Product IDs. Cannot be applied in combination with any other filter.
  • productTypeIds - Array of Strings - Optional
    Filters results by the provided product type IDs. Cannot be applied in combination with any other filter.
  • attributeName - String - Optional
    Filters results by the provided attribute name. If provided, products are only checked for this attribute. Therefore, only products of product types which define the attribute name are considered. These product type IDs are then listed in MissingAttributesMeta. The attributeCount and attributeCoverage fields are not part of the response when using this filter. Cannot be applied in combination with any other filter.
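
The constraints listed above (value ranges, allowed sortBy values, and mutually exclusive filters) can be checked on the client before a request is sent. The following Python sketch is one possible way to do that; the helper name build_missing_attributes_request is hypothetical and not part of any SDK, and the service remains the authority on what it accepts.

```python
ALLOWED_SORT_BY = {"coverageAttributeValues", "coverageAttributeNames"}
EXCLUSIVE_FILTERS = ("productIds", "productTypeIds", "attributeName")


def build_missing_attributes_request(**fields) -> dict:
    """Validate fields against the documented constraints and return the body."""
    # Drop unset fields so that the service applies its documented defaults.
    body = {name: value for name, value in fields.items() if value is not None}

    product_set_limit = body.get("productSetLimit")
    if product_set_limit is not None and not 1 <= product_set_limit <= 100000:
        raise ValueError("productSetLimit must be within [1, 100000]")

    for name in ("coverageMin", "coverageMax"):
        value = body.get(name)
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within [0.0, 1.0]")

    sort_by = body.get("sortBy")
    if sort_by is not None and sort_by not in ALLOWED_SORT_BY:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORT_BY)}")

    # Each filter cannot be combined with any other filter.
    used = [name for name in EXCLUSIVE_FILTERS if name in body]
    if len(used) > 1:
        raise ValueError(f"filters cannot be combined: {used}")
    return body


print(build_missing_attributes_request(staged=True, limit=1, attributeName="color"))
```

The returned dictionary can be serialized as JSON and sent as the request body.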

Query Missing Attributes

Initiation endpoint

Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/attributes
Method: POST
OAuth 2.0 Scopes: view_products:{projectKey}
Request Representation: MissingAttributesSearchRequest
Response Representation: TaskToken

Status endpoint

Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/attributes/status/{task_id}
Method: GET
OAuth 2.0 Scopes: view_products:{projectKey}
Response Representation: TaskStatus of a PagedQueryResult with results containing an array of MissingAttributes. The results array is sorted first by the selected sortBy coverage value in ascending order and secondly by the other. The meta has the MissingAttributesMeta representation.

Example Request

curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{
    "staged": true,
    "limit": 1
  }'

Example Task Token Response

{
  "taskId": "b37e87f3-1d2b-4550-83ef-bd2e7e7f1b09",
  "uriPath": "/{projectKey}/missing-data/attributes/status/b37e87f3-1d2b-4550-83ef-bd2e7e7f1b09"
}

Request to poll the result

curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes/status/b37e87f3-1d2b-4550-83ef-bd2e7e7f1b09

Example Response

{
  "result": {
    "count": 1,
    "offset": 0,
    "total": 26137,
    "meta": {
      "productLevel": {
        "total": 2968,
        "missingAttributeNames": 2968,
        "missingAttributeValues": 2968
      },
      "variantLevel": {
        "total": 26137,
        "missingAttributeNames": 26137,
        "missingAttributeValues": 26137
      }
    },
    "results": [
      {
        "product": {
          "id": "a8e01ea0-4181-4b00-9a2d-198504c6e4bd",
          "typeId": "product"
        },
        "productType": {
          "id": "e7878071-7713-4f37-9ddd-dfc99b9b33dc",
          "typeId": "product-type"
        },
        "variantId": 1,
        "attributeCount": {
          "productTypeAttributes": 15,
          "variantAttributes": 12,
          "missingAttributeValues": 2
        },
        "attributeCoverage": {
          "names": 0.80,
          "values": 0.83
        },
        "missingAttributeNames": [
          "designer",
          "color",
          "style"
        ],
        "missingAttributeValues": [
          "completeTheLook",
          "lookProducts"
        ]
      }
    ]
  },
  "state": "SUCCESS",
  "expires": "2019-01-19T15:00:56.546614Z"
}
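
The response above returns 1 of 26137 results, so further results need follow-up requests with a different offset. As a rough Python sketch, the offset for the next request could be derived from the count, offset, and total fields of the result. The function name next_offset is hypothetical.

```python
from typing import Optional


def next_offset(result: dict) -> Optional[int]:
    """Return the offset for the next page, or None when no results remain."""
    seen = result["offset"] + result["count"]
    # An empty page also ends the iteration, to avoid looping forever.
    if result["count"] == 0 or seen >= result["total"]:
        return None
    return seen


# Paging fields taken from the example response above.
print(next_offset({"count": 1, "offset": 0, "total": 26137}))  # prints 1
```

Because limit and offset are fields of the search request, each further page is obtained by sending a new search request and polling its status.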

Example Request with attribute filter

curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{
    "staged": true,
    "limit": 1,
    "attributeName": "color"
  }'

Example Task Token Response

{
  "taskId": "37fa717f-7d17-4a27-8593-1a73ea7e4f2c",
  "uriPath": "/{projectKey}/missing-data/attributes/status/37fa717f-7d17-4a27-8593-1a73ea7e4f2c"
}

Request to poll the result

curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes/status/37fa717f-7d17-4a27-8593-1a73ea7e4f2c

Example Response with attribute filter

{
  "result": {
    "count": 1,
    "offset": 0,
    "total": 173,
    "meta": {
      "productLevel": {
        "total": 1268,
        "missingAttributeNames": 47,
        "missingAttributeValues": 0
      },
      "variantLevel": {
        "total": 16137,
        "missingAttributeNames": 173,
        "missingAttributeValues": 0
      },
      "productTypeIds": [
        "e7878071-7713-4f37-9ddd-dfc99b9b33dc"
      ]
    },
    "results": [
      {
        "product": {
          "id": "a8e01ea0-4181-4b00-9a2d-198504c6e4bd",
          "typeId": "product"
        },
        "productType": {
          "id": "e7878071-7713-4f37-9ddd-dfc99b9b33dc",
          "typeId": "product-type"
        },
        "variantId": 1,
        "missingAttributeNames": [
          "color"
        ],
        "missingAttributeValues": []
      }
    ]
  },
  "state": "SUCCESS",
  "expires": "2019-02-12T15:55:26.860951Z"
}
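
Because requests are asynchronous, a client sends the search request once and then polls the status endpoint until the task has finished. The Python sketch below shows one way to structure that loop. It is transport-agnostic: fetch_status is a hypothetical callable that performs the GET on the status endpoint and returns the parsed JSON. The sketch assumes only that a finished task reports the state SUCCESS, as in the examples above, and treats every other state as not yet finished.

```python
import time
from typing import Callable


def wait_for_result(fetch_status: Callable[[], dict],
                    interval_seconds: float = 1.0,
                    max_attempts: int = 30) -> dict:
    """Poll until the TaskStatus reports SUCCESS, then return its result."""
    for attempt in range(max_attempts):
        status = fetch_status()
        if status.get("state") == "SUCCESS":
            return status["result"]
        # Sleep between attempts, but not after the last one.
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"task did not finish within {max_attempts} attempts")


# Stand-in for real HTTP calls; "PENDING" is a placeholder for any
# state other than SUCCESS and is an assumption of this sketch.
responses = iter([
    {"state": "PENDING"},
    {"state": "SUCCESS", "result": {"count": 1, "offset": 0, "total": 173}},
])
print(wait_for_result(lambda: next(responses), interval_seconds=0))
```

The polling interval and the attempt limit are illustrative defaults, not values recommended by the API.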

Missing Images

This API searches for products with missing images. The default settings return product variants which do not have an image.

Additional parameters can search for products that have fewer than a specified number of images (threshold) or fewer than the median number of images per variant in the project (autoThreshold).

Representations

MissingImages

MissingImagesMeta

MissingImagesProductLevel

  • total - Integer
    Number of products scanned.
  • missingImages - Integer
    Number of products missing images.

MissingImagesVariantLevel

  • total - Integer
    Number of product variants scanned.
  • missingImages - Integer
    Number of product variants missing images.

MissingImagesSearchRequest

  • limit - Number - Optional
  • offset - Number - Optional
  • staged - Boolean - Optional
    Default value: false
    If true, searches data from staged products in addition to published products.
  • productSetLimit - Number - Optional
    Default value: 100000 - Range: [1 - 100000]
    Maximum number of products to scan. If you need to scan more than 100000 products, contact Support Portal.
  • includeVariants - Boolean - Optional
    Default value: true
    If true, searches all product variants. If false, only searches master variants.
  • autoThreshold - Boolean - Optional
    Default value: false
    If true, uses the median number of images per product variant as a threshold value.
  • threshold - Number - Optional
    Default value: 1
    The minimum number of images a product variant must have. Anything below this value is considered a product variant with missing images.
  • productIds - Array of Strings - Optional
    Filters results by the provided Product IDs. Cannot be applied in combination with any other filter.
  • productTypeIds - Array of Strings - Optional
    Filters results by the provided product type IDs. Cannot be applied in combination with any other filter.
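
The effect of threshold and autoThreshold can be illustrated with a small local Python sketch. It only mirrors the descriptions above and is not the service's implementation; in particular, how the service computes the median is an assumption here. The function name and the sample data are hypothetical.

```python
from statistics import median


def variants_missing_images(image_counts: dict, threshold: float = 1,
                            auto_threshold: bool = False) -> list:
    """Return the keys of variants whose image count is below the threshold."""
    if auto_threshold and image_counts:
        # Assumption: the median is taken over all scanned variants.
        threshold = median(image_counts.values())
    return [key for key, count in image_counts.items() if count < threshold]


# Keys are (product, variantId) pairs, values are image counts (sample data).
counts = {("p1", 1): 0, ("p1", 2): 2, ("p2", 1): 3, ("p2", 2): 5}
print(variants_missing_images(counts))
# prints [('p1', 1)]
print(variants_missing_images(counts, auto_threshold=True))
# prints [('p1', 1), ('p1', 2)] because the median of 0, 2, 3, 5 is 2.5
```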

Query Missing Images

Initiation endpoint

Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/images
Method: POST
OAuth 2.0 Scopes: view_products:{projectKey}
Request Representation: MissingImagesSearchRequest
Response Representation: TaskToken

Status endpoint

Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/images/status/{task_id}
Method: GET
OAuth 2.0 Scopes: view_products:{projectKey}
Response Representation: TaskStatus of a PagedQueryResult with results containing an array of MissingImages and the meta information of MissingImagesMeta.

Example Request

curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/images \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{
    "staged": true,
    "limit": 1
  }'

Example Task Token Response

{
  "taskId": "3508ab41-59cf-4130-be4f-cfe79c78d436",
  "uriPath": "/{projectKey}/missing-data/images/status/3508ab41-59cf-4130-be4f-cfe79c78d436"
}

Request to poll the result

curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/images/status/3508ab41-59cf-4130-be4f-cfe79c78d436

Example Response

{
  "result": {
    "count": 1,
    "offset": 0,
    "total": 7,
    "meta": {
      "threshold": 1,
      "productLevel": {
        "total": 2968,
        "missingImages": 1
      },
      "variantLevel": {
        "total": 29105,
        "missingImages": 7
      }
    },
    "results": [
      {
        "product": {
          "id": "a2a9db01-00fe-436d-b56e-2545642984c0",
          "typeId": "product"
        },
        "variantId": 2,
        "imageCount": 0
      }
    ]
  },
  "state": "SUCCESS",
  "expires": "2019-01-19T15:09:27.791377Z"
}

Missing Prices

This API identifies products with missing prices. The default settings return product variants that do not contain prices or have some empty prices.

Additional parameters can treat prices set to 0 as missing (zeroAsEmpty) and check whether variants have no valid prices for a specified date range (validFrom, validUntil).

Representations

MissingPrices

MissingPricesMeta

MissingPricesProductLevel

  • total - Integer
    Number of products scanned.
  • missingPrices - Integer
    Number of products missing prices.

MissingPricesVariantLevel

  • total - Integer
    Number of product variants scanned.
  • missingPrices - Integer
    Number of product variants missing prices.

MissingPricesSearchRequest

  • limit - Number - Optional
  • offset - Number - Optional
  • staged - Boolean - Optional
    Default value: false
    If true, searches data from staged products in addition to published products.
  • productSetLimit - Number - Optional
    Default value: 100000 - Range: [1 - 100000]
    Maximum number of products to scan. If you need to scan more than 100000 products, contact Support Portal.
  • includeVariants - Boolean - Optional
    Default value: true
    If true, searches all product variants. If false, only searches master variants.
  • currencyCode - String - Optional
    If used, only checks if a product variant has a price in the provided ISO 4217 currency code.
  • checkDate - Boolean - Optional
    Default value: false
    If true, checks if there are prices for the specified date range and time.
  • validFrom - DateTime - Optional
    Starting date of the range to check. If no value is given, checks prices valid at the time the search is initiated.
  • validUntil - DateTime - Optional
    Ending date of the range to check. If no value is given, it is equal to validFrom.
  • productIds - Array of Strings - Optional
    Filters results by the provided Product IDs. Cannot be applied in combination with the productTypeIds filter.
  • productTypeIds - Array of Strings - Optional
    Filters results by the provided product type IDs. Cannot be applied in combination with the productIds filter.
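
A date-range check combines checkDate, validFrom, and validUntil. The Python sketch below builds such a request body under a few assumptions that are not confirmed by this document: DateTime values are sent as ISO 8601 strings in UTC (the format of the expires field in the example responses), checkDate needs to be true for the date fields to take effect, and validUntil should not precede validFrom. The function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def to_iso_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 string in UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")


def build_missing_prices_request(valid_from: Optional[datetime] = None,
                                 valid_until: Optional[datetime] = None,
                                 currency_code: Optional[str] = None,
                                 product_ids: Optional[list] = None,
                                 product_type_ids: Optional[list] = None) -> dict:
    """Assemble a MissingPricesSearchRequest body from optional inputs."""
    if product_ids and product_type_ids:
        raise ValueError("productIds and productTypeIds cannot be combined")
    body = {}
    if currency_code:
        body["currencyCode"] = currency_code
    if valid_from is not None:
        body["validFrom"] = to_iso_utc(valid_from)
    if valid_until is not None:
        body["validUntil"] = to_iso_utc(valid_until)
    if valid_from is not None and valid_until is not None and valid_until < valid_from:
        raise ValueError("validUntil must not precede validFrom")
    if "validFrom" in body or "validUntil" in body:
        body["checkDate"] = True  # assumption: required for the date range
    if product_ids:
        body["productIds"] = product_ids
    if product_type_ids:
        body["productTypeIds"] = product_type_ids
    return body


start = datetime(2019, 1, 1, tzinfo=timezone.utc)
end = datetime(2019, 1, 31, tzinfo=timezone.utc)
print(build_missing_prices_request(start, end, currency_code="EUR"))
```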

Query Missing Prices

Initiation endpoint

Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/prices
Method: POST
OAuth 2.0 Scopes: view_products:{projectKey}
Request Representation: MissingPricesSearchRequest
Response Representation: TaskToken

Status endpoint

Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/prices/status/{task_id}
Method: GET
OAuth 2.0 Scopes: view_products:{projectKey}
Response Representation: TaskStatus of a PagedQueryResult with results containing an array of MissingPrices and the meta information of MissingPricesMeta.

Example Request

curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/prices \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{
    "staged": true,
    "limit": 2
  }'

Example Task Token Response

{
  "taskId": "34a771e8-cd39-44c4-8989-181e6b588a7f",
  "uriPath": "/{projectKey}/missing-data/prices/status/34a771e8-cd39-44c4-8989-181e6b588a7f"
}

Request to poll the result

curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/prices/status/34a771e8-cd39-44c4-8989-181e6b588a7f

Example Response

{
  "result": {
    "count": 2,
    "offset": 0,
    "total": 122,
    "meta": {
      "productLevel": {
        "total": 2828,
        "missingPrices": 122
      },
      "variantLevel": {
        "total": 2828,
        "missingPrices": 122
      }
    },
    "results": [
      {
        "product": {
          "id": "d637f362-0940-4902-9e8e-a361ff71d569",
          "typeId": "product"
        },
        "variantId": 1
      },
      {
        "product": {
          "id": "9a47c2be-c842-4403-a779-64f8872d587d",
          "typeId": "product"
        },
        "variantId": 1
      }
    ]
  },
  "state": "SUCCESS",
  "expires": "2019-01-19T16:18:45.121950Z"
}
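
Once a task has finished, the results array can be post-processed on the client. As a minimal Python sketch, the following groups the reported variant IDs by product ID, using the field names from the example response above. The function name variants_by_product is hypothetical.

```python
from collections import defaultdict


def variants_by_product(task_status: dict) -> dict:
    """Group the variant IDs in a finished TaskStatus by product ID."""
    grouped = defaultdict(list)
    for entry in task_status["result"]["results"]:
        grouped[entry["product"]["id"]].append(entry["variantId"])
    return dict(grouped)


# Shortened copy of the example response above.
status = {
    "result": {
        "results": [
            {"product": {"id": "d637f362-0940-4902-9e8e-a361ff71d569",
                         "typeId": "product"}, "variantId": 1},
            {"product": {"id": "9a47c2be-c842-4403-a779-64f8872d587d",
                         "typeId": "product"}, "variantId": 1},
        ]
    },
    "state": "SUCCESS",
}
print(variants_by_product(status))
```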