Missing Data
This feature is part of our Machine Learning APIs that are available in the Google Cloud Regions except Australia (Google Cloud, Sydney).
Identifies missing data to improve quality.
The Missing Data API endpoints help to ensure data quality by identifying products which lack important data. Currently, you can query for products found to be missing the following:
- Product attributes
- Product images
- Product prices
In the future, these endpoints will provide automated recommendations on how to fill missing data.
Requests for missing data APIs are asynchronous.
This feature is still in beta. If you have feedback or further feature requests, contact us through the Support Portal.
Missing Attributes
A product's ProductType defines its product attributes. However, attributes are only given values when creating a ProductVariant. Sometimes product variants have product attributes with no values or only use a subset of the available product attributes.
The Missing Attributes API identifies products and product variants that:
- Are missing attributes entirely
- Are missing attribute values
The default settings identify product variants which meet either condition.
Representations
MissingAttributes
productId
- Reference to a Product
productTypeId
- Reference to a ProductType
variantId
- Integer
ID of a ProductVariant.
missingAttributeValues
- Array of Strings
The names of the attributes found without values in a product variant, sorted by attribute importance in descending order.
missingAttributeNames
- Array of Strings - Optional
The names of the attributes of the product type that the variant is missing, sorted by attribute importance in descending order.
attributeCount
- AttributeCount - Optional
attributeCoverage
- AttributeCoverage - Optional
AttributeCount
productTypeAttributes
- Integer
Number of attributes defined in the product type.
variantAttributes
- Integer
Number of attributes defined in the variant.
missingAttributeValues
- Integer
Number of attributes missing values in the variant.
AttributeCoverage
names
- Float
Range: [0.0 - 1.0]
The fraction of attributes from the product type that are defined in the product variant. A value of 1.0 indicates that the product variant contains all attributes defined in the product type.
values
- Float
Range: [0.0 - 1.0]
The fraction of attributes in the product variant that contain values.
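The relationship between AttributeCount and AttributeCoverage can be illustrated with a small calculation. The source does not state the exact formula, so the sketch below is an inference from the field descriptions: `names` is taken as the variant's attributes divided by the product type's attributes, and `values` as the share of the variant's attributes that are not missing a value. The function name and the handling of zero counts are assumptions made for illustration.

```python
def attribute_coverage(product_type_attributes: int,
                       variant_attributes: int,
                       missing_attribute_values: int) -> dict:
    """Derive AttributeCoverage-style ratios from AttributeCount-style counts."""
    # Share of the product type's attributes that the variant defines.
    # Assumption: with nothing to cover, coverage is treated as complete.
    names = (variant_attributes / product_type_attributes
             if product_type_attributes else 1.0)
    # Share of the variant's attributes that actually carry a value.
    with_values = variant_attributes - missing_attribute_values
    values = with_values / variant_attributes if variant_attributes else 1.0
    return {"names": names, "values": values}


# A variant defining 12 of 15 product type attributes, 2 of them without values.
coverage = attribute_coverage(15, 12, 2)
print(round(coverage["names"], 2), round(coverage["values"], 2))
```

Under these assumptions, the counts 15, 12, and 2 give coverage values of 0.8 and roughly 0.83.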
MissingAttributesMeta
productLevel
- MissingAttributesProductLevel
variantLevel
- MissingAttributesVariantLevel
productTypeIds
- Array of Strings - Optional
The IDs of the product types containing the requested attributeName.
MissingAttributesProductLevel
total
- Integer
Number of products scanned.
missingAttributeNames
- Integer
Number of products missing attribute names.
missingAttributeValues
- Integer
Number of products missing attribute values.
MissingAttributesVariantLevel
total
- Integer
Number of variants scanned.
missingAttributeNames
- Integer
Number of variants missing attribute names.
missingAttributeValues
- Integer
Number of variants missing attribute values.
MissingAttributesSearchRequest
limit
- Number - Optional
offset
- Number - Optional
staged
- Boolean - Optional
Default value: false
If true, searches data from staged products in addition to published products.
productSetLimit
- Number - Optional
Default value: 100000
Range: [1 - 100000]
Maximum number of products to scan. If you need to scan more than 100000 products, contact Support Portal.
includeVariants
- Boolean - Optional
Default value: true
If true, searches all product variants. If false, only searches master variants.
coverageMin
- Float - Optional
Default value: 0.0
Range: [0.0 - 1.0]
Minimum attribute coverage of variants to display, applied to both coverage types.
coverageMax
- Float - Optional
Default value: 1.0
Range: [0.0 - 1.0]
Maximum attribute coverage of variants to display, applied to both coverage types.
sortBy
- String - Optional
Default value: coverageAttributeValues
Allowed values: [coverageAttributeValues, coverageAttributeNames]
coverageAttributeValues shows the product variants with the most missing attribute values first, and coverageAttributeNames shows the ones with the most missing attribute names first.
showMissingAttributeNames
- Boolean - Optional
Default value: true
If true, missingAttributeNames is included in the results.
productIds
- Array of Strings - Optional
Filters results by the provided Product IDs. Cannot be applied in combination with any other filter.
productTypeIds
- Array of Strings - Optional
Filters results by the provided product type IDs. Cannot be applied in combination with any other filter.
attributeName
- String - Optional
Filters results by the provided attribute name. If provided, products are only checked for this attribute. Therefore, only products of product types which define the attribute name are considered. These product type IDs are then listed in MissingAttributesMeta. The attributeCount and attributeCoverage fields are not part of the response when using this filter. Cannot be applied in combination with any other filter.
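Because the three filters are mutually exclusive and several fields have fixed ranges, it can help to check a request body before sending it. The following is a minimal client-side sketch, not part of the API: the function name is hypothetical, and the rule that coverageMin must not exceed coverageMax is an assumption rather than documented behavior.

```python
FILTERS = ("productIds", "productTypeIds", "attributeName")
SORT_KEYS = ("coverageAttributeValues", "coverageAttributeNames")


def build_missing_attributes_request(**fields) -> dict:
    """Check documented constraints and return a MissingAttributesSearchRequest body."""
    # Drop unset fields so the API applies its own defaults.
    body = {name: value for name, value in fields.items() if value is not None}

    # Documented rule: each filter cannot be combined with any other filter.
    used_filters = [name for name in FILTERS if name in body]
    if len(used_filters) > 1:
        raise ValueError(f"filters cannot be combined: {used_filters}")

    # Documented range: productSetLimit lies within [1 - 100000].
    if not 1 <= body.get("productSetLimit", 100000) <= 100000:
        raise ValueError("productSetLimit must be within [1 - 100000]")

    # Documented range: both coverage bounds lie within [0.0 - 1.0].
    coverage_min = body.get("coverageMin", 0.0)
    coverage_max = body.get("coverageMax", 1.0)
    # Assumption: a minimum above the maximum is treated as a caller mistake.
    if not 0.0 <= coverage_min <= coverage_max <= 1.0:
        raise ValueError("coverage bounds must satisfy 0.0 <= min <= max <= 1.0")

    # Documented allowed values for sortBy.
    if body.get("sortBy", SORT_KEYS[0]) not in SORT_KEYS:
        raise ValueError(f"sortBy must be one of {SORT_KEYS}")
    return body


request_body = build_missing_attributes_request(
    staged=True, limit=1, attributeName="color"
)
print(request_body)
```

Serializing the returned dictionary with `json.dumps` yields a body suitable for the `-d` argument of a POST request to the initiation endpoint.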
Query Missing Attributes
Initiation endpoint
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/attributes
Method: POST
OAuth 2.0 Scopes: view_products:{projectKey}
Request Representation: MissingAttributesSearchRequest
Response Representation: TaskToken
Status endpoint
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/attributes/status/{task_id}
Method: GET
OAuth 2.0 Scopes: view_products:{projectKey}
Response Representation: TaskStatus of a PagedQueryResult with results containing an array of MissingAttributes. The results array is sorted first by the selected sortBy coverage value in ascending order and then by the other coverage value. The meta field has the MissingAttributesMeta representation.
curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{"staged": true,"limit": 1}'
{"taskId": "b37e87f3-1d2b-4550-83ef-bd2e7e7f1b09","uriPath": "/{projectKey}/missing-data/attributes/status/b37e87f3-1d2b-4550-83ef-bd2e7e7f1b09"}
curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes/status/b37e87f3-1d2b-4550-83ef-bd2e7e7f1b09
{"result": {"count": 1,"offset": 0,"total": 26137,"meta": {"productLevel": {"total": 2968,"missingAttributeNames": 2968,"missingAttributeValues": 2968},"variantLevel": {"total": 26137,"missingAttributeNames": 26137,"missingAttributeValues": 26137}},"results": [{"product": {"id": "a8e01ea0-4181-4b00-9a2d-198504c6e4bd","typeId": "product"},"productType": {"id": "e7878071-7713-4f37-9ddd-dfc99b9b33dc","typeId": "product-type"},"variantId": 1,"attributeCount": {"productTypeAttributes": 15,"variantAttributes": 12,"missingAttributeValues": 2},"attributeCoverage": {"names": 0.80,"values": 0.83},"missingAttributeNames": ["designer","color","style"],"missingAttributeValues": ["completeTheLook","lookProducts"]}]},"state": "SUCCESS","expires": "2019-01-19T15:00:56.546614Z"}
curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{"staged": true,"limit": 1,"attributeName":"color"}'
{"taskId": "37fa717f-7d17-4a27-8593-1a73ea7e4f2c","uriPath": "/{projectKey}/missing-data/attributes/status/37fa717f-7d17-4a27-8593-1a73ea7e4f2c"}
curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/attributes/status/37fa717f-7d17-4a27-8593-1a73ea7e4f2c
{"result": {"count": 1,"offset": 0,"total": 173,"meta": {"productLevel": {"total": 1268,"missingAttributeNames": 47,"missingAttributeValues": 0},"variantLevel": {"total": 16137,"missingAttributeNames": 173,"missingAttributeValues": 0},"productTypeIds": ["e7878071-7713-4f37-9ddd-dfc99b9b33dc"]},"results": [{"product": {"id": "a8e01ea0-4181-4b00-9a2d-198504c6e4bd","typeId": "product"},"productType": {"id": "e7878071-7713-4f37-9ddd-dfc99b9b33dc","typeId": "product-type"},"variantId": 1,"missingAttributeNames": ["color"],"missingAttributeValues": []}]},"state": "SUCCESS","expires": "2019-02-12T15:55:26.860951Z"}
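The examples above follow a two-step flow: the initiation endpoint returns a task token, and the status endpoint is queried until the task reports "state": "SUCCESS". A minimal polling sketch is shown below. It takes the HTTP call as an injected function so that it stays independent of any particular client library. The function name, the retry parameters, and the "PENDING" state used in the demonstration are assumptions; only "SUCCESS" appears in the examples.

```python
import time
from typing import Callable


def wait_for_result(fetch_status: Callable[[], dict],
                    interval_seconds: float = 2.0,
                    max_attempts: int = 30) -> dict:
    """Call the status endpoint repeatedly until the task reports SUCCESS."""
    for attempt in range(max_attempts):
        status = fetch_status()
        if status.get("state") == "SUCCESS":
            return status["result"]
        # Assumption: any other state means the task is still running.
        # A real client should also stop early on a documented failure state.
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"task did not succeed after {max_attempts} attempts")


# Demonstration with canned responses in place of real GET requests.
canned = iter([
    {"state": "PENDING"},  # hypothetical intermediate state
    {"state": "SUCCESS", "result": {"count": 1, "offset": 0, "total": 173}},
])
result = wait_for_result(lambda: next(canned), interval_seconds=0)
print(result["total"])
```

In practice, `fetch_status` would issue the GET request against the `uriPath` returned by the initiation endpoint and return the parsed JSON body.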
Missing Images
This API searches for products with missing images. The default settings return product variants which do not have an image.
Additional parameters can search for products that have fewer than a specified number of images (threshold) or fewer than the median number of images per variant in the project (autoThreshold).
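The threshold rule can be illustrated locally. The sketch below is not the service's implementation: it assumes image counts are already available as a dictionary keyed by (product ID, variant ID), and it assumes that autoThreshold replaces any explicit threshold with the median. The function name and sample data are hypothetical.

```python
from statistics import median


def variants_missing_images(image_counts: dict,
                            threshold: int = 1,
                            auto_threshold: bool = False) -> list:
    """Return the keys of variants whose image count is below the threshold."""
    if auto_threshold and image_counts:
        # Assumption: the median replaces the explicit threshold value.
        threshold = median(image_counts.values())
    # Anything strictly below the threshold counts as missing images.
    return sorted(key for key, count in image_counts.items() if count < threshold)


# Hypothetical image counts keyed by (product ID, variant ID).
counts = {
    ("p1", 1): 3,
    ("p1", 2): 0,
    ("p2", 1): 4,
    ("p2", 2): 2,
    ("p3", 1): 5,
}
print(variants_missing_images(counts))
print(variants_missing_images(counts, auto_threshold=True))
```

With the default threshold of 1, only the variant with no images is reported. With the median (3 in this sample) as the threshold, the variant with two images is reported as well.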
Representations
MissingImages
productId
- Reference to a Product
variantId
- Integer
ID of the ProductVariant.
imageCount
- Integer
Number of images the variant contains.
MissingImagesMeta
productLevel
- MissingImagesProductLevel
variantLevel
- MissingImagesVariantLevel
threshold
- Integer
The minimum number of images a product variant must have. Anything below this value is considered a product variant with missing images.
MissingImagesProductLevel
total
- Integer
Number of products scanned.
missingImages
- Integer
Number of products missing images.
MissingImagesVariantLevel
total
- Integer
Number of product variants scanned.
missingImages
- Integer
Number of product variants missing images.
MissingImagesSearchRequest
limit
- Number - Optional
offset
- Number - Optional
staged
- Boolean - Optional
Default value: false
If true, searches data from staged products in addition to published products.
productSetLimit
- Number - Optional
Default value: 100000
Range: [1 - 100000]
Maximum number of products to scan. If you need to scan more than 100000 products, contact Support Portal.
includeVariants
- Boolean - Optional
Default value: true
If true, searches all product variants. If false, only searches master variants.
autoThreshold
- Boolean - Optional
Default value: false
If true, uses the median number of images per product variant as a threshold value.
threshold
- Number - Optional
Default value: 1
The minimum number of images a product variant must have. Anything below this value is considered a product variant with missing images.
productIds
- Array of Strings - Optional
Filters results by the provided Product IDs. Cannot be applied in combination with any other filter.
productTypeIds
- Array of Strings - Optional
Filters results by the provided product type IDs. It cannot be applied in combination with any other filter.
Query Missing Images
Initiation endpoint
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/images
Method: POST
OAuth 2.0 Scopes: view_products:{projectKey}
Request Representation: MissingImagesSearchRequest
Response Representation: TaskToken
Status endpoint
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/images/status/{task_id}
Method: GET
OAuth 2.0 Scopes: view_products:{projectKey}
Response Representation: TaskStatus of a PagedQueryResult with results containing an array of MissingImages and meta containing MissingImagesMeta.
curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/images \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{"staged": true,"limit": 1}'
{"taskId": "3508ab41-59cf-4130-be4f-cfe79c78d436","uriPath": "/{projectKey}/missing-data/images/status/3508ab41-59cf-4130-be4f-cfe79c78d436"}
curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/images/status/3508ab41-59cf-4130-be4f-cfe79c78d436
{"result": {"count": 1,"offset": 0,"total": 7,"meta": {"threshold": 1,"productLevel": {"total": 2968,"missingImages": 1},"variantLevel": {"total": 29105,"missingImages": 7}},"results": [{"product": {"id": "a2a9db01-00fe-436d-b56e-2545642984c0","typeId": "product"},"variantId": 2,"imageCount": 0}]},"state": "SUCCESS","expires": "2019-01-19T15:09:27.791377Z"}
Missing Prices
This API identifies products with missing prices. The default settings return product variants that do not contain prices or have some empty prices.
Additional parameters can be used to also treat prices set to 0 as missing (zeroAsEmpty) and to check whether there are variants with no valid prices for a specified date range (validFrom, validUntil).
Representations
MissingPrices
productId
- Reference to a Product
variantId
- Integer
ID of the ProductVariant.
MissingPricesMeta
productLevel
- MissingPricesProductLevel
variantLevel
- MissingPricesVariantLevel
MissingPricesProductLevel
total
- Integer
Number of products scanned.
missingPrices
- Integer
Number of products missing prices.
MissingPricesVariantLevel
total
- Integer
Number of product variants scanned.
missingPrices
- Integer
Number of product variants missing prices.
MissingPricesSearchRequest
limit
- Number - Optional
offset
- Number - Optional
staged
- Boolean - Optional
Default value: false
If true, searches data from staged products in addition to published products.
productSetLimit
- Number - Optional
Default value: 100000
Range: [1 - 100000]
Maximum number of products to scan. If you need to scan more than 100000 products, contact Support Portal.
includeVariants
- Boolean - Optional
Default value: true
If true, searches all product variants. If false, only searches master variants.
currencyCode
- String - Optional
If used, only checks if a product variant has a price in the provided ISO 4217 currency code.
checkDate
- Boolean - Optional
Default value: false
If true, checks if there are prices for the specified date range and time.
validFrom
- DateTime - Optional
Starting date of the range to check. If no value is given, checks prices valid at the time the search is initiated.
validUntil
- DateTime - Optional
Ending date of the range to check. If no value is given, it is equal to validFrom.
productIds
- Array of Strings - Optional
Filters results by the provided Product IDs. Cannot be applied in combination with the productTypeIds filter.
productTypeIds
- Array of Strings - Optional
Filters results by the provided product type IDs. Cannot be applied in combination with the productIds filter.
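A request body for a date-range price check can be assembled as sketched below. This is an illustrative helper, not part of the API: the function name is hypothetical, and two behaviors are assumptions rather than documented rules, namely that checkDate must be true for the date fields to take effect and that validUntil should not precede validFrom.

```python
from datetime import datetime, timezone
from typing import Optional, Sequence


def build_missing_prices_request(
    currency_code: Optional[str] = None,
    valid_from: Optional[datetime] = None,
    valid_until: Optional[datetime] = None,
    product_ids: Optional[Sequence[str]] = None,
    product_type_ids: Optional[Sequence[str]] = None,
) -> dict:
    """Assemble a MissingPricesSearchRequest body from optional inputs."""
    # Documented rule: productIds and productTypeIds cannot be combined.
    if product_ids and product_type_ids:
        raise ValueError("productIds and productTypeIds cannot be combined")
    # Assumption: a range that ends before it starts is a caller mistake.
    if valid_from and valid_until and valid_until < valid_from:
        raise ValueError("validUntil must not precede validFrom")

    body: dict = {}
    if currency_code:
        body["currencyCode"] = currency_code
    if valid_from or valid_until:
        # Assumption: the date fields only take effect when checkDate is true.
        body["checkDate"] = True
    if valid_from:
        body["validFrom"] = valid_from.isoformat()
    if valid_until:
        body["validUntil"] = valid_until.isoformat()
    if product_ids:
        body["productIds"] = list(product_ids)
    if product_type_ids:
        body["productTypeIds"] = list(product_type_ids)
    return body


request_body = build_missing_prices_request(
    currency_code="EUR",
    valid_from=datetime(2019, 1, 1, tzinfo=timezone.utc),
    valid_until=datetime(2019, 1, 31, tzinfo=timezone.utc),
)
print(request_body)
```

Fields that are left unset are omitted from the body so that the documented defaults apply.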
Query Missing Prices
Initiation endpoint
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/prices
Method: POST
OAuth 2.0 Scopes: view_products:{projectKey}
Request Representation: MissingPricesSearchRequest
Response Representation: TaskToken
Status endpoint
Host: one of the Machine Learning hosts.
Endpoint: /{projectKey}/missing-data/prices/status/{task_id}
Method: GET
OAuth 2.0 Scopes: view_products:{projectKey}
Response Representation: TaskStatus of a PagedQueryResult with results containing an array of MissingPrices and meta containing MissingPricesMeta.
curl -X POST https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/prices \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '{"staged": true,"limit": 2}'
{"taskId": "34a771e8-cd39-44c4-8989-181e6b588a7f","uriPath": "/{projectKey}/missing-data/prices/status/34a771e8-cd39-44c4-8989-181e6b588a7f"}
curl -H 'Authorization: Bearer {access_token}' https://ml-{mlRegion}.europe-west1.gcp.commercetools.com/{projectKey}/missing-data/prices/status/34a771e8-cd39-44c4-8989-181e6b588a7f
{"result": {"count": 2,"offset": 0,"total": 122,"meta": {"productLevel": {"total": 2828,"missingPrices": 122},"variantLevel": {"total": 2828,"missingPrices": 122}},"results": [{"product": {"id": "d637f362-0940-4902-9e8e-a361ff71d569","typeId": "product"},"variantId": 1},{"product": {"id": "9a47c2be-c842-4403-a779-64f8872d587d","typeId": "product"},"variantId": 1}]},"state": "SUCCESS","expires": "2019-01-19T16:18:45.121950Z"}