The Stripebot web crawler
Learn how Stripe uses a web crawler to access user websites.
Stripebot is the Stripe automated web crawler that collects data from our users’ websites. We use the collected data to provide services to our users and to comply with financial regulations.
Stripebot uses algorithms to minimize web server traffic. Stripe wants to make sure our crawler doesn’t impact the speed or accessibility of our users’ websites.
Identify Stripebot
Stripebot identifies itself with the following user-agent
information, although the version number might change:
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 (Stripebot/1.0)
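Because the version number might change, it's safer to match the Stripebot token than the full string. A minimal log-side sketch in Python (the regex and function name are our own illustration, not a Stripe-provided pattern):

```python
import re

# The user-agent string documented above.
UA = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 (Stripebot/1.0)")

# Match the Stripebot token with any version number.
STRIPEBOT_RE = re.compile(r"\(Stripebot/[\d.]+\)")

def looks_like_stripebot(user_agent: str) -> bool:
    return STRIPEBOT_RE.search(user_agent) is not None
```

Note that a matching user-agent alone isn't proof of anything, because any client can send that string; DNS verification is what actually confirms the crawler's identity.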
To verify that a web crawler accessing your server is actually Stripebot, use DNS verification to check whether the IP address logged on your server resolves to the Stripe-designated domain:
- Use a command line tool to run a reverse DNS lookup on the logged IP address. Verify that it resolves to a URL within the crawl.stripe.com domain. For example, if the IP address in your logs is 1.2.3.4:
$ host 1.2.3.4
4.3.2.1.in-addr.arpa domain name pointer 1-2-3-4.crawl.stripe.com
The resolved URL is in the crawl.stripe.com domain, so it’s probably Stripebot.
- Make sure that the URL points to the logged IP address by running a forward DNS lookup. For example:
$ host 1-2-3-4.crawl.stripe.com
1-2-3-4.crawl.stripe.com has address 1.2.3.4
The IP address matches the address logged on your server, which indicates that it’s Stripebot.
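The two lookup steps above can be combined into a single check. A sketch in Python using standard-library socket calls (the function name is our own; a production version should also handle DNS timeouts):

```python
import socket

def is_verified_stripebot(ip: str) -> bool:
    """Reverse-then-forward DNS verification of a logged crawler IP."""
    try:
        # Step 1: reverse DNS lookup of the logged IP address.
        hostname, _, _ = socket.gethostbyaddr(ip)
    except OSError:
        return False
    # The resolved name must be in the crawl.stripe.com domain.
    if not hostname.endswith(".crawl.stripe.com"):
        return False
    try:
        # Step 2: the forward lookup must point back at the same IP.
        forward_ips = {info[4][0] for info in socket.getaddrinfo(hostname, None)}
    except OSError:
        return False
    return ip in forward_ips
```

Checking the domain suffix with a leading dot (".crawl.stripe.com") prevents a spoofed name such as "evil-crawl.stripe.com.attacker.net" ending in a lookalike substring from passing.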
Control Stripebot access to pages
Stripebot mostly follows the RFC 9309 Robots Exclusion Protocol. It recognizes the following lines (case-insensitive) in a robots.txt file:
- User-Agent: The bot that the following rule group applies to
- Allow: A URL path that the bot can crawl
- Disallow: A URL path that the bot can’t crawl
Stripebot follows the rules in the first group that has a User-Agent of Stripebot. If it can’t find a matching group, it follows the rules in the first group that has a User-Agent of *. In either case, if it finds multiple matching groups, it only follows the rules in the first one.
For example, the following rule group explicitly allows Stripebot to access the /stripe-stuff path, and blocks it from accessing the /private path:
User-Agent: Stripebot
Allow: /stripe-stuff
Disallow: /private
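You can check how such a rule group is interpreted with Python’s standard-library robots.txt parser, which implements the same protocol (shown as an illustration; Stripebot’s own parser may differ in edge cases):

```python
from urllib import robotparser

RULES = """\
User-Agent: Stripebot
Allow: /stripe-stuff
Disallow: /private
"""

parser = robotparser.RobotFileParser()
parser.parse(RULES.splitlines())

parser.can_fetch("Stripebot", "/stripe-stuff")  # allowed
parser.can_fetch("Stripebot", "/private")       # blocked
```

Paths not covered by any rule in the matching group, such as /other, are allowed by default.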
robots.txt caching
If you update the robots.txt file, caching might prevent Stripebot from immediately recognizing the changes. Also, if robots.txt exists, but attempting to read the file returns an error, Stripebot might use a cached version (if available).
Get help with Stripebot
If you have any questions or concerns about Stripebot, email us at stripebot@stripe.com. If your issue involves any specific domain names, include them in your message.