Best practices

We recommend the following best practices when implementing the Import API to get the best results.

Using Import Containers effectively

Organizing Import Containers

How you organize your Import Containers is entirely up to you. We recommend breaking them down by data source, by resource type, or using a reusable container for recurring import activities.

Use Case | Possible Import Container breakdown
Import Product and Category | Create separate containers for Product and Category.
Import Price changes daily at 5 PM | Create one reusable container. If there are more than 200,000 imports per day, split them by some other business logic or use a temporary container for the excess.
Import Product changes from multiple sources | Create one container per source for Product imports.
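
For illustration, here is a minimal sketch of creating containers up front, one per resource type and one per data source. It assumes a commercetools-style Import API host, project key, OAuth bearer token, and an endpoint shaped like POST /{projectKey}/import-containers; all values below are placeholders.

```python
import requests

# Placeholder values; replace with your own region, project, and credentials.
IMPORT_API_URL = "https://import.europe-west1.gcp.commercetools.com"
PROJECT_KEY = "my-project"
HEADERS = {"Authorization": "Bearer my-access-token"}

def create_import_container(key: str) -> dict:
    """Create an Import Container with the given key."""
    response = requests.post(
        f"{IMPORT_API_URL}/{PROJECT_KEY}/import-containers",
        json={"key": key},
        headers=HEADERS,
    )
    response.raise_for_status()
    return response.json()

# One container per resource type ...
create_import_container("products-from-erp")
create_import_container("categories-from-erp")
# ... and one container per data source for Product imports.
create_import_container("products-from-pos")
```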

Optimizing performance

We recommend having up to 200,000 Import Requests at a time in a single Import Container. This keeps monitoring activities at the container level inexpensive. Since Import Operations are automatically deleted 48 hours after their creation time, you can reuse Import Containers within this limit over time.

Example:

Day | Import Operation total count | Import Containers (Import Operation count)
Day 1 | 100,000 | container-a (100,000)
Day 2 | 500,000 | container-a (100,000), container-b (200,000), container-c (200,000)
Day 3 | 400,000 | container-a (0), container-b (200,000), container-c (200,000)
Day 4 | 200,000 | container-a (200,000), container-b (0), container-c (0)

On Day 3, container-a is empty because all of its Import Operations have passed the 48-hour lifetime and expired, so you can reuse it. Similarly, container-b and container-c are cleaned up on Day 4.
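
A minimal sketch of this reuse logic follows, assuming the caller tracks (or refreshes from the Import Summary endpoint) how many non-expired Import Operations each container currently holds; the 200,000 figure is the recommendation above, not an enforced limit.

```python
CONTAINER_CAPACITY = 200_000  # recommended maximum Import Operations per container

def pick_container(counts: dict[str, int], base_key: str = "daily-imports") -> str:
    """Return the key of a container that still has room, creating a new key
    (base-key-1, base-key-2, ...) when the existing ones are full.
    Operations older than 48 hours expire automatically, so the caller should
    refresh `counts` (e.g. from the Import Summary endpoint) before calling this."""
    for key, count in counts.items():
        if count < CONTAINER_CAPACITY:
            return key
    return f"{base_key}-{len(counts) + 1}"

# Example: container-a expired its Day 1 operations, so it can be reused.
print(pick_container({"container-a": 0, "container-b": 200_000, "container-c": 200_000}))
# -> container-a
```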

Limits

There is currently no limit on the number of Import Containers, but an expiration or deletion policy may be introduced in the future. There is also no limit on the number of Import Operations or Import Requests per container; however, we recommend splitting containers at around 200,000 Import Operations, especially if you will be querying these containers. If you have an event-based architecture and do not plan to actively monitor these containers, you can keep more Import Operations per container, even 500,000 or more.

Removing unused Import Containers

You can delete Import Containers at your convenience. There is currently no limit on the number of Import Containers, but an expiration or deletion policy may be introduced in the future. We recommend using more containers with less data each rather than fewer containers with more data.

Cleaning up data from Import Containers

You do not need to clean up an Import Container, as Import Operations are automatically deleted 48 hours after they are created. If you need to, you can delete the Import Container itself, which immediately deletes all Import Operations in it. However, data that has already been imported is not deleted from the platform.
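
A minimal sketch of such a deletion, assuming an endpoint shaped like DELETE /{projectKey}/import-containers/{key}; host, project key, and token are placeholders.

```python
import requests

IMPORT_API_URL = "https://import.europe-west1.gcp.commercetools.com"  # placeholder
PROJECT_KEY = "my-project"                                            # placeholder
HEADERS = {"Authorization": "Bearer my-access-token"}                 # placeholder

def delete_import_container(key: str) -> None:
    """Delete an Import Container and, with it, all Import Operations it still holds.
    Resources that were already imported remain in the project."""
    response = requests.delete(
        f"{IMPORT_API_URL}/{PROJECT_KEY}/import-containers/{key}",
        headers=HEADERS,
    )
    response.raise_for_status()

delete_import_container("products-from-pos")
```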

How to send Import Requests to Import Containers

As the batch size is limited to 20 resources per Import Request, we recommend sending Import Requests concurrently (for example, from multiple threads) if you have a large number of resources to import, so that your data reaches the Import Container as quickly as possible. Note that importing starts asynchronously as soon as the Import Container receives the first Import Request.
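
A minimal sketch of such concurrent batching for Price imports, assuming an Import Request endpoint shaped like POST /{projectKey}/prices/import-containers/{containerKey} with a resources array; host, project key, token, container key, and payload fields are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
import requests

IMPORT_API_URL = "https://import.europe-west1.gcp.commercetools.com"  # placeholder
PROJECT_KEY = "my-project"                                            # placeholder
HEADERS = {"Authorization": "Bearer my-access-token"}                 # placeholder
CONTAINER_KEY = "daily-prices"                                        # placeholder
BATCH_SIZE = 20  # maximum resources per Import Request

def chunks(resources, size=BATCH_SIZE):
    """Split the resource list into batches of at most `size` items."""
    for i in range(0, len(resources), size):
        yield resources[i:i + size]

def send_batch(batch):
    """Send one Import Request with up to 20 resources to the container."""
    response = requests.post(
        f"{IMPORT_API_URL}/{PROJECT_KEY}/prices/import-containers/{CONTAINER_KEY}",
        json={"type": "price", "resources": batch},
        headers=HEADERS,
    )
    response.raise_for_status()
    return response.json()

def import_all(resources, max_workers=8):
    """Send batches concurrently; importing starts as soon as the first one arrives."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send_batch, chunks(resources)))
```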

When to use product draft vs product/variant/prices individually

Please see the difference between Product Draft and Product endpoints.

How to effectively monitor the import progress?

  • Two of our monitoring endpoints, Import Summary and Query Import Operations, are container-based. Call the Import Summary endpoint for a quick overview, and later fetch details using the Query Import Operations endpoint.
  • Use the Import Summary endpoint to get an aggregated summary of the progress, which tells you whether you have Import Operations in error, unresolved, or completed states.
  • Use Query Import Operations with filters such as state to target specific situations, for example, to fetch all failed operations so you can fix them, or all unresolved operations so you can resolve their references.
  • If there are operations in unresolved states, you can use debug mode to fetch the unresolved references. This way, you know which references to resolve (see the sketch after this list).
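
A minimal sketch of this monitoring flow, assuming endpoints shaped like GET .../import-containers/{key}/import-summaries and GET .../import-containers/{key}/import-operations with a state filter and a debug flag. The host, project key, token, container key, and response field names (states, results, resourceKey, unresolvedReferences) are assumptions.

```python
import requests

IMPORT_API_URL = "https://import.europe-west1.gcp.commercetools.com"  # placeholder
PROJECT_KEY = "my-project"                                            # placeholder
HEADERS = {"Authorization": "Bearer my-access-token"}                 # placeholder
CONTAINER_KEY = "daily-prices"                                        # placeholder

# 1. Quick aggregated view of the container.
summary = requests.get(
    f"{IMPORT_API_URL}/{PROJECT_KEY}/import-containers/{CONTAINER_KEY}/import-summaries",
    headers=HEADERS,
).json()
print(summary["states"])  # assumed shape: counts per operation state

# 2. Drill down only if the summary shows problems, filtering by state
#    and enabling debug mode to see which references are still unresolved.
if summary["states"].get("unresolved", 0) > 0:
    operations = requests.get(
        f"{IMPORT_API_URL}/{PROJECT_KEY}/import-containers/{CONTAINER_KEY}/import-operations",
        params={"state": "unresolved", "debug": "true"},
        headers=HEADERS,
    ).json()
    for op in operations["results"]:
        print(op["resourceKey"], op.get("unresolvedReferences", []))
```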

How to best utilize the 48-hour lifetime of Import Operations?

Import Operations are kept for 48 hours to give you time to send other referenced data (unresolved references) during this period.

Example:

Suppose one of your teams is responsible for Product imports, but business validation usually delays them by 1-2 days, while another team imports Prices and delivers its data quickly. The Import API keeps the Price data for a maximum of 48 hours and waits for the Products to be imported.
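
As a minimal sketch, the Price Import Request below references a Product by key before that Product has been imported; the resulting Import Operation stays unresolved for up to 48 hours and completes once the Product arrives. Host, endpoint shape, and payload fields are placeholders/assumptions.

```python
import requests

IMPORT_API_URL = "https://import.europe-west1.gcp.commercetools.com"  # placeholder
PROJECT_KEY = "my-project"                                            # placeholder
HEADERS = {"Authorization": "Bearer my-access-token"}                 # placeholder

# The Price references a Product by key; if that Product is imported within
# the next 48 hours, this operation moves from unresolved to imported.
price_resource = {  # field names are illustrative, not an exact schema
    "key": "price-sku-123-eur",
    "sku": "SKU-123",
    "product": {"typeId": "product", "key": "product-not-imported-yet"},
    "value": {"type": "centPrecision", "currencyCode": "EUR", "centAmount": 1999},
}

requests.post(
    f"{IMPORT_API_URL}/{PROJECT_KEY}/prices/import-containers/prices-from-pricing-team",
    json={"type": "price", "resources": [price_resource]},
    headers=HEADERS,
).raise_for_status()
```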

What is a very large import? What are the limitations of the Import API?

What is the recommendation on retries?

You only need to retry if your Import Operation is in the rejected state. In all other cases, you do not need to retry; the Import API handles retries internally.
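
A minimal sketch of such a retry, assuming the same operation-query endpoint with a state filter as above and a hypothetical resend_import_request helper that re-sends the original payload for a given resource key.

```python
import requests

IMPORT_API_URL = "https://import.europe-west1.gcp.commercetools.com"  # placeholder
PROJECT_KEY = "my-project"                                            # placeholder
HEADERS = {"Authorization": "Bearer my-access-token"}                 # placeholder
CONTAINER_KEY = "daily-prices"                                        # placeholder

def resend_import_request(resource_key: str) -> None:
    """Hypothetical helper: look up the original payload for `resource_key`
    and send it again in a new Import Request."""
    ...

# Only operations in the rejected state need a retry from your side;
# all other states are retried internally by the Import API.
rejected = requests.get(
    f"{IMPORT_API_URL}/{PROJECT_KEY}/import-containers/{CONTAINER_KEY}/import-operations",
    params={"state": "rejected"},
    headers=HEADERS,
).json()

for op in rejected.get("results", []):
    resend_import_request(op["resourceKey"])
```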

What not to do

  • Do not send duplicate Import Requests concurrently. Since the Import API imports data asynchronously, the order is currently not guaranteed, and duplicates may also lead to concurrent modification errors.
  • Do not repeatedly query the Import Operations or Import Summary endpoints when errors occur without fixing the underlying problems, as this may slow down the import process. If required, use debug mode to find out more details, fix the problems, and then retry.