Show HN: Free tool to see how much AI bots are costing your site

(botcost.dev)

14 points | by plaintosapp 1 hour ago

4 comments

nottorp 23 minutes ago
I ran it on my personal domain that has just some kind of landing page and nothing else on it, and it's not advertised anywhere:
0% of traffic is non-human · <$0.01/yr projected
Well, 0% of traffic is AI bots. 99% of traffic is vulnerability scanners actually.
- plaintosapp 19 minutes ago
  Ha, that's a fair point — vulnerability scanners are a whole separate category I haven't tackled yet. BotCost is specifically focused on AI training and search crawlers right now (GPTBot, ClaudeBot, Bytespider etc.) rather than security scanners.
  Good news: 0% AI bot traffic on an unadvertised landing page makes sense — those bots tend to follow links and sitemaps. If you run it on a site with real content and traffic you'll likely see a different picture.
  Vulnerability scanners on the other hand... that's a different problem worth solving too.
plaintosapp 39 minutes ago
Also wrote up the background on Dev.to if anyone wants more context on how it works: [https://dev.to/plaintos_app_fd54e75a054e/i-built-a-free-tool...]
- quinncom 29 minutes ago
  The article doesn't really get into the details. Does it analyze the user agent and compare it to a list of known bot user agents? What about all the bots that spoof user agent values – does it do something special to detect those?
  - plaintosapp 15 minutes ago
    Yes exactly — it matches against a database of 18 known AI bot user agent tokens (GPTBot, ClaudeBot, CCBot, Bytespider etc.) plus their known IP ranges where available. GPTBot for example publishes its IP ranges officially so we can match on both UA and IP.
    The spoofing problem is the hard one. Bots that fully spoof Chrome headers are invisible to any UA-based tool including this one. The honest answer is that BotCost catches the "polite" bots that identify themselves — which covers the major AI companies (OpenAI, Anthropic, Google, Meta) since they all self-identify. The truly malicious scrapers that spoof identities are a harder problem requiring behavioral analysis.
    So it's accurate for what it is — catching known AI training and search crawlers — but not a complete bot detection solution.
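For readers wondering what UA-plus-IP matching looks like in practice, here is a minimal sketch. It is not BotCost's actual code. The token list holds only the four names mentioned in this thread, and the IP range is a placeholder from the RFC 5737 documentation block, not a real crawler range. The `classify` function and `KNOWN_RANGES` table are hypothetical names.

```python
import ipaddress

# Tokens named in this thread; the tool's full list of 18 is not shown here.
AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider")

# Placeholder range (RFC 5737 documentation block), NOT a real crawler range.
# A real tool would load the ranges each vendor publishes.
KNOWN_RANGES = {
    "GPTBot": [ipaddress.ip_network("192.0.2.0/24")],
}


def classify(user_agent, ip):
    """Return (token, ip_verified) for a self-identified AI bot, else None.

    ip_verified is True/False when ranges are known for the token,
    and None when there is nothing to check the address against.
    """
    ua = user_agent.lower()
    for token in AI_BOT_TOKENS:
        if token.lower() in ua:
            ranges = KNOWN_RANGES.get(token)
            if ranges is None:
                return token, None
            addr = ipaddress.ip_address(ip)
            return token, any(addr in net for net in ranges)
    # No known token in the UA: a bot spoofing a browser UA lands here too.
    return None
```

A request whose UA claims GPTBot but whose address falls outside the listed ranges comes back as `("GPTBot", False)`, which is one cheap way to flag impostors. Bots that spoof a browser UA still return `None`, matching the limitation described above.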
len_chapaty 43 minutes ago
nice, how are you calculating the cost?
- plaintosapp 37 minutes ago
  Good question. The cost estimate uses two components:
  1. Bandwidth: total bytes served to bots divided by 1GB, multiplied by $0.09/GB (AWS/Cloudflare blended average rate)
  2. Compute: total bot requests divided by 1 million, multiplied by $0.40 (Vercel/Lambda average per million invocations)
  Both rates are configurable assumptions — the real value is seeing the relative breakdown between bots and the order of magnitude of waste. Your actual cost depends on your specific hosting provider.
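The two components above can be written out as a short sketch. This is an illustration of the described arithmetic, not the tool's source. The function name `estimate_cost` is hypothetical, and treating 1 GB as 10^9 bytes is an assumption, since the comment does not say whether decimal or binary gigabytes are used.

```python
GB = 1_000_000_000  # assumption: decimal gigabyte; the thread does not specify


def estimate_cost(bot_bytes, bot_requests,
                  usd_per_gb=0.09, usd_per_million_requests=0.40):
    """Estimate bot cost from the two components described in the thread."""
    # 1. Bandwidth: bytes served to bots, in GB, times the per-GB rate.
    bandwidth = bot_bytes / GB * usd_per_gb
    # 2. Compute: bot requests, in millions, times the per-million rate.
    compute = bot_requests / 1_000_000 * usd_per_million_requests
    return {"bandwidth": bandwidth, "compute": compute,
            "total": bandwidth + compute}


if __name__ == "__main__":
    # Example inputs are made up: 50 GB served across 2 million bot requests.
    print(estimate_cost(50 * GB, 2_000_000))
```

With those made-up inputs the estimate is roughly $4.50 for bandwidth plus $0.80 for compute. Both rates are keyword arguments so they can be swapped for a specific provider's pricing.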
  - len_chapaty 15 minutes ago
    Got it, makes sense. Worth noting intra-region vs inter-region transfer can differ a lot too. As a blended average for an order-of-magnitude estimate, this is really useful.
    - plaintosapp 5 minutes ago
      Good point — intra vs inter-region transfer costs can vary significantly, especially on AWS. The $0.09 is deliberately conservative as a blended estimate. A future version could let users input their actual hosting provider rates for a more precise number. Adding that to the roadmap. Thank you.
  - roysting 11 minutes ago
    Was that response written and/or auto-replied by AI?
    - plaintosapp 7 minutes ago
      Fair question given the context. I used AI tools to help build the product, and I do use AI to help draft responses — but I read, edit, and post every reply myself. Nothing is auto-posted.
- newscombinatorY 39 minutes ago
  Hopefully not by using another AI bot... ( ͡° ͜ʖ ͡°)
  - plaintosapp 36 minutes ago
    Nope, pure JavaScript in your browser. No AI bots were harmed or employed in the making of this tool.
smy_smy 49 minutes ago
interesting!
- plaintosapp 48 minutes ago
  Thanks! Curious what you're running — are you seeing AI bot traffic on your site? Would love to know if the log formats you use are covered (Nginx, Apache, Cloudflare CSV, Vercel JSON supported right now).