The Stripebot web crawler
Learn how Stripe uses a web crawler to access user websites.
Stripebot is the Stripe automated web crawler that collects data from our users’ websites. We use the collected data to provide services to our users and to comply with financial regulations.
Stripebot uses algorithms to minimize web server traffic. Stripe wants to make sure our crawler doesn’t impact the speed or accessibility of our users’ websites.
Identify Stripebot
Stripebot identifies itself with the following user-agent
information, although the version number might change:
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 (Stripebot/1.0)
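Because the version number might change, it's safer to match the Stripebot token than the full string. A minimal log-side sketch in Python (the regex and function name are our own illustration, not a Stripe-provided pattern):

```python
import re

# The user-agent string documented above.
UA = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 (Stripebot/1.0)")

# Match the Stripebot token with any version number.
STRIPEBOT_RE = re.compile(r"\(Stripebot/[\d.]+\)")

def looks_like_stripebot(user_agent: str) -> bool:
    return STRIPEBOT_RE.search(user_agent) is not None
```

Note that a matching user-agent alone isn't proof of anything, because any client can send that string; DNS verification is what actually confirms the crawler's identity.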
To verify that a web crawler accessing your server is actually Stripebot, use DNS verification to check whether the IP address logged on your server resolves to the Stripe-designated domain:
- Use a command line tool to run a reverse DNS lookup on the logged IP address. Verify that it resolves to a URL within the crawl.stripe.com domain. For example, if the IP address in your logs is 1.2.3.4:
$ host 1.2.3.4
4.3.2.1.in-addr.arpa domain name pointer 1-2-3-4.crawl.stripe.com
The resolved URL is in the crawl.stripe.com domain, so it’s probably Stripebot.
- Make sure that the URL points to the logged IP address by running a forward DNS lookup. For example:
$ host 1-2-3-4.crawl.stripe.com
1-2-3-4.crawl.stripe.com has address 1.2.3.4
The IP address matches the address logged on your server, which indicates that it’s Stripebot.
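The two lookup steps above can be combined into a single check. A sketch in Python using standard-library socket calls (the function name is our own; a production version should also handle DNS timeouts):

```python
import socket

def is_verified_stripebot(ip: str) -> bool:
    """Reverse-then-forward DNS verification of a logged crawler IP."""
    try:
        # Step 1: reverse DNS lookup of the logged IP address.
        hostname, _, _ = socket.gethostbyaddr(ip)
    except OSError:
        return False
    # The resolved name must be in the crawl.stripe.com domain.
    if not hostname.endswith(".crawl.stripe.com"):
        return False
    try:
        # Step 2: the forward lookup must point back at the same IP.
        forward_ips = {info[4][0] for info in socket.getaddrinfo(hostname, None)}
    except OSError:
        return False
    return ip in forward_ips
```

Checking the domain suffix with a leading dot (".crawl.stripe.com") prevents a spoofed name such as "evil-crawl.stripe.com.attacker.net" ending in a lookalike substring from passing.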
Control Stripebot access to pages
Stripebot mostly follows the RFC 9309 Robots Exclusion Protocol. It recognizes the following lines (case-insensitive) in a robots.txt file:
- User-Agent: The bot that the following rule group applies to
- Allow: A URL path that the bot can crawl
- Disallow: A URL path that the bot can’t crawl
Stripebot follows the rules in the first group that has a User-Agent of Stripebot. If it can’t find a matching group, it follows the rules in the first group that has a User-Agent of *. In either case, if it finds multiple matching groups, it only follows the rules in the first one.
For example, the following rule group explicitly allows Stripebot to access the /stripe-stuff path, and blocks it from accessing the /private path:
User-Agent: Stripebot
Allow: /stripe-stuff
Disallow: /private
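You can check how such a rule group is interpreted with Python’s standard-library robots.txt parser, which implements the same protocol (shown as an illustration; Stripebot’s own parser may differ in edge cases):

```python
from urllib import robotparser

RULES = """\
User-Agent: Stripebot
Allow: /stripe-stuff
Disallow: /private
"""

parser = robotparser.RobotFileParser()
parser.parse(RULES.splitlines())

parser.can_fetch("Stripebot", "/stripe-stuff")  # allowed
parser.can_fetch("Stripebot", "/private")       # blocked
```

Paths not covered by any rule in the matching group, such as /other, are allowed by default.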
robots.txt caching
If you update the robots.txt file, caching might prevent Stripebot from immediately recognizing the changes. Also, if robots.txt exists, but attempting to read the file returns an error, Stripebot might use a cached version (if available).
Get help with Stripebot
If you have any questions or concerns about Stripebot, email us at stripebot@stripe.com. If your issue involves any specific domain names, include them in your message.