# How Data Pipeline works

Learn what you can do with Data Pipeline.

[Data Pipeline](https://dashboard.stripe.com/settings/stripe-data-pipeline) is a no-code product that sends all your Stripe data to a variety of data storage destinations. This lets you centralize your Stripe data with other business data to help close your books and get more detailed business insights. If you have questions regarding support for your data destination, contact [Stripe support](https://support.stripe.com/contact/email?topic=third_party_integrations&subject=Stripe%20Data%20Pipeline%20\(SDP\)).

With Data Pipeline, you can:

- Automatically export your complete Stripe data quickly and reliably.
- Stop relying on third-party extract, transform, and load (ETL) pipelines or home-built API integrations.
- Combine data from all your Stripe accounts into one data warehouse.
- Integrate Stripe data with your other business data for more complete business insights.

> Because of data localization requirements, Stripe doesn’t offer Data Pipeline services to customers, businesses, or users in India.

## Destination support 

Stripe Data Pipeline supports two types of destinations:

- [Data warehouses](https://docs.stripe.com/data/access-data-in-warehouse/data-warehouses.md) (Snowflake, Amazon Redshift, Databricks)

  - For data warehouse destinations, Stripe sends a data share to your data warehouse.

  - After you accept the data share, you can access your core Stripe data in Snowflake, Amazon Redshift, or Databricks within 12 hours.

  - After the initial load, your Stripe data [refreshes regularly](https://docs.stripe.com/data/data-pipeline/data-freshness.md), delivering a full load of data every 3 hours.

- [Cloud storage](https://docs.stripe.com/data/access-data-in-warehouse/cloud-storage.md) (Google Cloud Storage, Azure Blob Storage, Amazon S3)

  - For cloud storage destinations, Stripe sends [Parquet](https://parquet.apache.org/) files directly to a cloud storage location you own.

  - After the initial load, your Stripe data [refreshes regularly](https://docs.stripe.com/data/data-pipeline/data-freshness.md), delivering a new full load of your data every 3 hours.

## Database schemas 

Your warehouse data is split into two database schemas based on the API mode you used to create the data.

| Schema name       | Description                                                                                                                                                                                                                                        |
| ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `STRIPE`          | Data populated from *live mode* (Use this mode when you’re ready to launch your app. Card networks or payment providers process payments)                                                                                                          |
| `STRIPE_TESTMODE` | Data populated from *sandboxes* (A sandbox is an isolated test environment that allows you to test Stripe functionality in your account without affecting your live integration. Use sandboxes to safely experiment with new features and changes) |

If you share data from multiple Stripe accounts with the same data warehouse, you can still tell the accounts apart. Every table has a `merchant_id` column, which lets you filter the data by account.
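As a minimal sketch of filtering by account, the following queries count rows per account and then restrict the results to one account. The `stripe` schema, the `balance_transactions` table, and the `merchant_id` column come from this page. The value `acct_placeholder` is hypothetical, so replace it with one of your own account IDs.

```sql
-- Sketch: count balance transactions for each account sharing data
-- with this warehouse. Assumes the live mode schema is named stripe.
select
  merchant_id,
  count(*) as transaction_count
from stripe.balance_transactions
group by merchant_id;

-- Sketch: return rows for a single account only.
-- 'acct_placeholder' is a hypothetical value, not a real account ID.
select id, amount, fee, net
from stripe.balance_transactions
where merchant_id = 'acct_placeholder';
```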

## Use Organizations to manage multiple data pipelines 

If you use [Organizations](https://docs.stripe.com/get-started/account/orgs.md), you can see all of the accounts that are sharing data externally. You can:

- Create a new data pipeline if you have the [Super Administrator or Administrator](https://docs.stripe.com/get-started/account/teams/roles.md) role.
- Add an account to an existing data warehouse setup without extra verification.
- Unsubscribe one or more accounts from a data pipeline.
- Delete the pipeline setup.

If you remove an account from an organization, your data share stops immediately for that account.

## Combine proprietary and Stripe data 

In some cases, you might want to combine information from your proprietary data with Stripe data. The following schema shows an `orders` table that lists data about an order for a company. This table doesn’t contain data regarding transaction fees or *payouts* (A payout is the transfer of funds to an external account, usually a bank account, in the form of a deposit) because that data exists solely within Stripe.

| date      | order_no | stripe_txn_no      | customer_name | price | items  |
| --------- | -------- | ------------------ | ------------- | ----- | ------ |
| 5/27/2026 | 1        | bt_xcVXgHcBfi83m94 | John Smith    | 5     | 1 book |

In Stripe, the `balance_transactions` table contains the following information, but lacks proprietary data regarding customer names and items purchased:

| id                 | amount | available_on | fee | net | automatic_transfer_id |
| ------------------ | ------ | ------------ | --- | --- | --------------------- |
| bt_xcVXgHcBfi83m94 | 500    | 5/27/2026    | 50  | 450 | po_rC4ocAkjGy8zl3j    |

To access your proprietary data alongside your Stripe data, combine the `orders` table with Stripe’s `balance_transactions` table:

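```sql
-- Join each order to its Stripe balance transaction to add
-- the amount, fee, and payout ID to the order data.
select
  orders.date,
  orders.order_no,
  orders.stripe_txn_no,
  bts.amount,
  bts.fee,
  bts.automatic_transfer_id
from mycompany.orders
join stripe.balance_transactions bts
  on orders.stripe_txn_no = bts.id;
```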

After the query completes, it returns the following information:

| date      | order_no | stripe_txn_no      | amount | fee | automatic_transfer_id |
| --------- | -------- | ------------------ | ------ | --- | --------------------- |
| 5/27/2026 | 1        | bt_xcVXgHcBfi83m94 | 500    | 50  | po_rC4ocAkjGy8zl3j    |
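The same join can also roll orders up by payout, which might help with reconciliation. The following sketch groups the joined rows by `automatic_transfer_id` and totals the `amount`, `fee`, and `net` columns from the sample `balance_transactions` table. It assumes that every joined row uses the same currency and that the `mycompany.orders` table is laid out as in the example above. Stripe amounts are expressed in the smallest currency unit, so `500` corresponds to the order price of 5.

```sql
-- Sketch: total the orders, fees, and net proceeds included in each payout.
-- Assumes all joined rows share one currency; otherwise the sums mix units.
select
  bts.automatic_transfer_id,
  count(orders.order_no) as order_count,
  sum(bts.amount) as gross_amount,
  sum(bts.fee) as total_fee,
  sum(bts.net) as net_amount
from mycompany.orders
join stripe.balance_transactions bts
  on orders.stripe_txn_no = bts.id
group by bts.automatic_transfer_id;
```

For the single sample order above, this returns one row for `po_rC4ocAkjGy8zl3j` with an order count of 1, a gross amount of 500, a total fee of 50, and a net amount of 450.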

## Datasets 

You can see a list of available datasets under **Datasets** in the [schema documentation](https://docs.stripe.com/data/schema.md).

Available datasets might vary by region, subject to local product availability and regulations. Data Pipeline separately shares each dataset, which contains one or more warehouse tables, as data becomes available. Data Pipeline updates some tables on different schedules based on the availability of new data. See [data freshness](https://docs.stripe.com/data/data-pipeline/data-freshness.md) for more information on available datasets and refresh schedules.

## Sandbox support 

You can use a sandbox, which is a risk-free environment, to test Data Pipeline functionality. With a sandbox, you can assess data synchronization without affecting your live production data. During testing, any free trial of Data Pipeline remains unaffected, ensuring you’re never billed for sandbox activities.

To view sandbox data, access the `STRIPE_TESTMODE` schema and filter by your sandbox account's `merchant_id`. This setup lets you analyze your test data alongside your existing analytics without financial implications. For more information on setting up and managing sandboxes, see [Sandboxes](https://docs.stripe.com/sandboxes.md).
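A sandbox query might look like the following sketch. It assumes the sandbox schema is the `STRIPE_TESTMODE` schema listed under Database schemas, and that the `balance_transactions` table is available there with the same columns as in live mode. The merchant ID shown is a hypothetical placeholder.

```sql
-- Sketch: list sandbox balance transactions for one sandbox account.
-- 'acct_sandbox_placeholder' is a hypothetical merchant ID; use your own.
select id, amount, fee, net, available_on
from stripe_testmode.balance_transactions
where merchant_id = 'acct_sandbox_placeholder';
```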

## Turn off Data Pipeline 

You can turn off Data Pipeline in the Dashboard by clicking **Manage plan**.

## See also

- [Export data to a data warehouse](https://docs.stripe.com/data/access-data-in-warehouse/data-warehouses.md)
- [Export data to cloud storage](https://docs.stripe.com/data/access-data-in-warehouse/cloud-storage.md)
- [Data freshness](https://docs.stripe.com/data/data-pipeline/data-freshness.md)