Data Pipeline

Use Data Pipeline to sync Stripe data to a data warehouse.

Data Pipeline is a no-code product that sends all your Stripe data to a variety of data storage destinations. This lets you centralise your Stripe data with other business data to help close your books and get more detailed business insights. If you have questions regarding support for your data destination, contact Stripe support.

With Data Pipeline, you can:

Automatically export your complete Stripe data in a fast and reliable manner.
Stop relying on third-party extract, transform, and load (ETL) pipelines or home-built API integrations.
Combine data from all your Stripe accounts into one data warehouse.
Integrate Stripe data with your other business data for more complete business insights.

Caution

Because of data localisation requirements, Stripe doesn’t offer Data Pipeline services to customers, businesses, or users in India.

Destination support

Stripe Data Pipeline supports two types of destination:

Data warehouses (Snowflake, Amazon Redshift)
- For data warehouse destinations, Stripe sends a data share to your data warehouse.
- After you accept the data share, you can access your core Stripe data in Snowflake or Amazon Redshift within 12 hours.
- After the initial load, your Stripe data refreshes regularly, delivering a full load of data every 3 hours.
Cloud storage (Google Cloud Storage, Azure Blob Storage, Amazon S3)
- For our cloud storage destinations, Stripe sends Parquet files directly to a cloud storage location you own.
- After the initial load, your Stripe data refreshes regularly, delivering a new full load of your data every 3 hours.

Database schemas

Your warehouse data is split into two database schemas based on the API mode you used to create the data.

Schema name	Description
`STRIPE`	Data populated from live mode
`STRIPE_TESTMODE`	Data populated from sandboxes and test mode

If you share data from multiple Stripe accounts with the same data warehouse, you can still tell the accounts apart. Every table has a merchant_id column, which lets you filter the data by account.
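As a minimal sketch of filtering by account, the queries below use the balance_transactions table that appears later on this page. The account ID `acct_EXAMPLE` is a placeholder for illustration, not a real identifier, and the selected columns are limited to ones shown in this document.

```sql
-- Count balance transactions per Stripe account in the live mode schema.
select
  merchant_id,
  count(*) as transaction_count
from stripe.balance_transactions
group by merchant_id
order by merchant_id;

-- Restrict a query to a single account.
-- 'acct_EXAMPLE' is a hypothetical placeholder; substitute your own account ID.
select
  id,
  amount,
  fee,
  net
from stripe.balance_transactions
where merchant_id = 'acct_EXAMPLE';
```

Running the first query before the second is a quick way to see which merchant_id values are present in your warehouse.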

Use Organisations to manage multiple data pipelines

If you use Organisations, you can see all of the accounts that are sharing data externally. You can:

Create a new data pipeline if you have the Super Administrator or Administrator role.
Add an account to an existing data warehouse setup without extra verification.
Unsubscribe one or more accounts from a data pipeline.
Delete the pipeline setup.

If you remove an account from an organisation, your data share stops immediately for that account.

Combine proprietary and Stripe data

In some cases, you might want to combine your proprietary data with Stripe data. The following example shows an orders table that lists data about an order for a company. This table doesn’t contain data regarding transaction fees or payouts because that data exists solely within Stripe.

date	order_no	stripe_txn_no	customer_name	price	items
14/08/2025	1	bt_xcVXgHcBfi83m94	John Smith	5	1 book

In Stripe, the balance_transactions table contains the following information, but lacks proprietary data regarding customer names and items purchased:

id	amount	available_on	fee	net	automatic_transfer_id
bt_xcVXgHcBfi83m94	500	14/08/2025	50	450	po_rC4ocAkjGy8zl3j

To access your proprietary data alongside your Stripe data, combine the orders table with Stripe’s balance_transactions table:

select
  orders.date,
  orders.order_no,
  orders.stripe_txn_no,
  bts.amount,
  bts.fee,
  bts.automatic_transfer_id
from mycompany.orders
join stripe.balance_transactions bts
  on orders.stripe_txn_no = bts.id;

After the query completes, the result contains the following information:

date	order_no	stripe_txn_no	amount	fee	automatic_transfer_id
14/08/2025	1	bt_xcVXgHcBfi83m94	500	50	po_rC4ocAkjGy8zl3j
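The same join can also serve as a reconciliation check. The sketch below lists orders that have no matching balance transaction, or whose recorded price disagrees with the Stripe amount. It assumes that orders.price is stored in major currency units (for example, dollars) and that the currency has two decimal places, so that multiplying by 100 gives the smallest-unit amount Stripe records. Adjust the conversion if your data differs.

```sql
-- Find orders that are missing from Stripe or whose amounts disagree.
-- Assumption: orders.price is in major units (e.g. dollars) and
-- bts.amount is in the smallest currency unit (e.g. cents).
select
  orders.order_no,
  orders.stripe_txn_no,
  orders.price,
  bts.amount,
  bts.net
from mycompany.orders
left join stripe.balance_transactions bts
  on orders.stripe_txn_no = bts.id
where bts.id is null                     -- no matching Stripe transaction
   or orders.price * 100 <> bts.amount;  -- recorded price and Stripe amount differ
```

With the sample rows above, the order for 5 matches the Stripe amount of 500, so it wouldn't appear in the result.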

Datasets

You can see a list of available datasets under Datasets in the schema documentation.

Available datasets might vary by region, subject to local product availability and regulations. Data Pipeline separately shares each dataset, which contains one or more warehouse tables, as data becomes available. Data Pipeline updates some tables on different schedules based on the availability of new data. See data freshness for more information on available datasets and refresh schedules.

Sandbox support

You can use a sandbox, which is a risk-free environment, to test Data Pipeline functionality. With a sandbox, you can assess data synchronisation without affecting your live production data. During testing, any free trial of Data Pipeline remains unaffected, ensuring you’re never billed for sandbox activities.

To view sandbox data, access the `STRIPE_TESTMODE` schema and filter by your sandbox’s merchant ID. This setup lets you analyse your test data alongside your existing analytics without financial implications.
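As a rough sketch, a sandbox query might look like the following. It assumes that the `STRIPE_TESTMODE` schema exposes a balance_transactions table with the same columns as the live mode schema. The merchant ID `acct_SANDBOX_EXAMPLE` is a hypothetical placeholder.

```sql
-- Read test data for one sandbox from the test mode schema.
-- 'acct_SANDBOX_EXAMPLE' is a placeholder; use your sandbox's merchant ID.
select
  id,
  amount,
  fee,
  net
from stripe_testmode.balance_transactions
where merchant_id = 'acct_SANDBOX_EXAMPLE';
```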

For more detailed guidance on setting up and managing sandboxes, see Sandboxes.

Turn off Data Pipeline

You can turn off Data Pipeline in the Dashboard by clicking Manage plan.