Export data to Amazon S3 with Data Pipeline
Automate recurring data exports from Stripe to your AWS S3 Storage bucket with Data Pipeline.
AWS S3 Storage destination
Data Pipeline can deliver copies of all your Stripe data as Parquet files into your AWS S3 Storage bucket. The delivery includes a directory of files for each table, and it's updated every 6 hours.
Prerequisites
Before starting the integration, make sure you have an active AWS account and permission to:
- Create an AWS S3 bucket.
- Create an IAM role enabling Stripe to create objects in the provisioned bucket.
- Access the Stripe Dashboard with an admin or developer role.
Create a bucket
- Navigate to your Amazon S3 console in your chosen account region.
- If needed, create a new storage bucket.
- If you don’t currently have an S3 bucket, follow the AWS guidelines for creating your first bucket. We recommend including “stripe” in the name, such as “<name>-stripe-data”.
- Take note of this bucket name and the region because you’ll need them for future steps.
Start the onboarding process
- Visit the Data Pipeline Dashboard.
- Click Get started.
- Select the AWS S3 logo and click Next.
- On this permissions step, you see code blocks that you can use to create the IAM role and trust policy.
Create a new permission policy
To create a new permission policy:
- In your AWS IAM console, click Policies > Create policy > JSON.
- Paste in the supplied JSON snippet from the Stripe onboarding step.
- In the Resource section of the JSON snippet, replace BUCKET_RESOURCE with the bucket name you set.
- Provide a name for the new policy (for example, stripe-data-pipeline-policy).
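The placeholder substitution can also be done programmatically before you paste the policy. This is a minimal sketch: `SAMPLE_POLICY` is a hypothetical stand-in and not the snippet Stripe supplies, so always start from the JSON shown in your own onboarding step. The function name `fill_bucket_placeholder` is likewise illustrative.

```python
import json

PLACEHOLDER = "BUCKET_RESOURCE"

# Hypothetical stand-in for the snippet from the Stripe onboarding step.
# The real snippet may contain different actions and statements.
SAMPLE_POLICY = json.dumps(
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": "arn:aws:s3:::BUCKET_RESOURCE/*",
            }
        ],
    },
    indent=2,
)


def fill_bucket_placeholder(policy_json: str, bucket_name: str) -> str:
    """Replace the bucket placeholder and confirm the result is valid JSON."""
    if PLACEHOLDER not in policy_json:
        raise ValueError(f"{PLACEHOLDER} not found in the policy snippet")
    filled = policy_json.replace(PLACEHOLDER, bucket_name)
    json.loads(filled)  # raises ValueError if the edit broke the JSON
    return filled


if __name__ == "__main__":
    print(fill_bucket_placeholder(SAMPLE_POLICY, "acme-stripe-data"))
```

Checking that the placeholder is present catches the case where the snippet was already edited by hand, and parsing the result catches stray quotes before the IAM console rejects the policy.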
Create a new trust role using a custom policy
To create a new role using a custom policy:
- In your AWS IAM console, click Roles > Create role > Custom Trust Policy.
- Paste in the supplied JSON snippet from the Stripe onboarding step.
- Click Next on the permissions page, then attach your new policy to the new role.
- Select the newly created policy name (for example, stripe-data-pipeline-policy).
- Save the role with the following name: stripe-data-pipeline-s3-role. You must use this exact name.
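The role steps above can be sketched with the boto3 IAM client, assuming you script them instead of using the console. `create_role` and `attach_role_policy` are boto3 IAM client methods; the helper names `role_setup_calls` and `apply_calls` are illustrative. The trust policy JSON must be the snippet from the Stripe onboarding step, and the policy ARN is the one AWS assigned to the permission policy you created.

```python
import json

# The role must be saved under this exact name.
ROLE_NAME = "stripe-data-pipeline-s3-role"


def role_setup_calls(trust_policy_json: str, policy_arn: str) -> list:
    """Return the IAM client calls, in order, as (method, kwargs) pairs."""
    json.loads(trust_policy_json)  # fail early if the snippet is not valid JSON
    return [
        (
            "create_role",
            {
                "RoleName": ROLE_NAME,
                "AssumeRolePolicyDocument": trust_policy_json,
            },
        ),
        (
            "attach_role_policy",
            {"RoleName": ROLE_NAME, "PolicyArn": policy_arn},
        ),
    ]


def apply_calls(iam, calls: list) -> None:
    """Run the planned calls against an IAM client, e.g. boto3.client("iam")."""
    for method, kwargs in calls:
        getattr(iam, method)(**kwargs)
```

Building the calls as data first lets you review exactly what will be sent before `apply_calls` touches your AWS account.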
Establishing your AWS S3 connection
- Return to the Stripe Data Pipeline onboarding process.
- Enter the AWS Account ID, bucket name and region generated in the previous step.
- Select your data encryption option. If you choose to use a customer managed key, upload your public key. See the step on generating encryption keys to learn how to create one.
- Click Next. Clicking Next sends test data to the bucket you provided, but not production data.
- To confirm test data delivery, go to your S3 bucket.
- Open the bucket, navigate to the penny_test directory, and open the acct_ prefixed sub-directory to locate the delivered account_validation.csv test file.
- Download the account_validation.csv file.
- Upload this test file in your Data Pipeline onboarding step.
- Click Confirm value.
- After you confirm the test value, click Subscribe. This subscribes you to the product and schedules the initial full load of data for delivery to your AWS S3 bucket, a process that can take 6-12 hours.
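Locating the test file can also be scripted rather than browsed in the console. This is a minimal sketch, assuming boto3 and assuming the file sits at a key of the form penny_test/acct_…/account_validation.csv, which is inferred from the directory names in the steps above and may differ in practice. The helper names are illustrative.

```python
def find_validation_keys(keys):
    """Pick out test-file keys from a list of S3 object keys.

    Assumed layout: penny_test/<acct_ prefixed directory>/.../account_validation.csv
    """
    return [
        key
        for key in keys
        if key.startswith("penny_test/acct_")
        and key.endswith("/account_validation.csv")
    ]


def list_keys(s3, bucket: str, prefix: str = "penny_test/"):
    """List object keys under a prefix using an S3 client, e.g. boto3.client("s3")."""
    keys = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


def download_validation_file(s3, bucket: str, dest: str = "account_validation.csv") -> str:
    """Download the first matching test file and return its key."""
    matches = find_validation_keys(list_keys(s3, bucket))
    if not matches:
        raise FileNotFoundError("no account_validation.csv under penny_test/")
    s3.download_file(bucket, matches[0], dest)
    return matches[0]
```

The downloaded file is the one to upload in the onboarding step. If no match is found, the test delivery might not have arrived yet, or the role and policy might not grant Stripe access to the bucket.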