Stop AI Bots from Wasting Your Server's Time (and Your Money)

Khalid Abuhakmeh
Summary: AI bots are increasingly wasting server resources and money by hammering web applications designed for humans. A pragmatic solution is a lightweight AiBotBlockerMiddleware for ASP.NET Core, which enforces robots.txt rules by checking the User-Agent of incoming requests. However, since some bots spoof their identities, a layered defense strategy is recommended, incorporating rate limiting and more robust tools such as Cloudflare or Anubis.

A while ago, a developer posted a discussion on the Duende Software community board that caught my attention. They were running a Blazor app with a BFF (Backend for Frontend) layer in front of Duende IdentityServer, a standard, well-architected setup. Then they noticed something wrong: traffic was spiking, and it wasn’t humans.

Automated scripts, written in Python using aiohttp and asyncio, were programmatically logging in, establishing sessions, and hammering their BFF endpoints. Some identified themselves honestly via their User-Agent header. Others spoofed it to look like Chrome.

Their assumption was that the BFF could only be used from a browser. As Wesley, of Duende Customer Success, pointed out, the Duende BFF Security Framework is just an ASP.NET Core application. Anything that can make an HTTP request can hit it. curl, wget, Invoke-WebRequest, a Python script: they’re all valid HTTP clients.
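To see why the User-Agent header proves nothing on its own, here is a minimal sketch using .NET's HttpClient types. The URL and the exact browser string are placeholders I chose for illustration; the request is built but never sent.

```csharp
using System;
using System.Net.Http;

// A browser-like value; the exact string is illustrative only.
const string ChromeLikeUserAgent =
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";

// Build (but do not send) a request against a placeholder URL.
using var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com/");
request.Headers.UserAgent.ParseAdd(ChromeLikeUserAgent);

// The server only ever sees this header value, not the program that produced it.
Console.WriteLine(request.Headers.UserAgent.ToString());
```

Any script can do the same in a couple of lines, which is why a User-Agent check can only ever filter clients that choose to be honest.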

This is a growing problem, and it’s not just about BFF apps. Any web application built for humans is now a target for AI bots.

The Real Cost of Bot Traffic

Let’s be blunt about what these bots are doing to your infrastructure:

Wasted compute. Every request a bot makes follows the same path as a human request: through your middleware pipeline, your authentication layer, and your database queries. Your server doesn’t know (or care) whether the response is being fed into a training pipeline or rendered in a browser. It does the work either way.

Wasted money. Cloud hosting bills scale with traffic. If 30% of your requests are AI crawlers scraping an app that has no public content worth indexing, that’s 30% of your compute bill going to someone else’s business model.

Wasted energy. This one’s easy to overlook. Servers consume electricity. AI bots hammering applications designed for authenticated human users are burning real energy for zero value. Not to the developer, not to the users, and arguably not even to the AI company scraping a login-gated app.

Polluted observability. Your logs, metrics, and dashboards become noisy. Error rates spike from bot requests failing auth. P95 latency gets skewed. When something actually breaks, you’re sifting through bot noise to find the signal.

A Lightweight First Line of Defense

I built a simple ASP.NET Core middleware, AiBotBlockerMiddleware, that addresses the most common case: bots that honestly identify themselves.

The idea is straightforward:

  1. You already have a robots.txt file telling crawlers to go away.
  2. Most AI bots identify themselves honestly in their User-Agent header (even if they ignore robots.txt directives).
  3. So enforce what robots.txt declares. If a bot identifies itself as GPTBot and your robots.txt says Disallow: /, return 403 Forbidden immediately.

The middleware runs early in the pipeline, before routing, static files, or any real work happens. Blocked requests never touch your application logic.

How It Works

The system has four small pieces. You can drop them all into a Middleware folder in your project. The code below relies on ASP.NET Core’s implicit usings (enabled by default in modern project templates), so types like IMiddleware, HttpContext, and IServiceCollection are available without explicit using statements.

1. Options: configure the filename and the rejection message:

Csharp

namespace BlockAIde.Middleware;

public class AiBotBlockerOptions
{
    /// <summary>
    /// The file name (relative to WebRootPath) used to load blocking rules.
    /// Defaults to "robots.txt".
    /// </summary>
    public string FileName { get; set; } = "robots.txt";

    /// <summary>
    /// The plain-text message written to the response body when a bot is blocked.
    /// Defaults to "Forbidden: AI bots are not permitted to access this resource.".
    /// </summary>
    public string BlockedMessage { get; set; } =
        "Forbidden: AI bots are not permitted to access this resource.";
}

2. RobotsTxt parser: reads your robots.txt, extracts the User-Agent/Disallow rules, and watches the file for changes so you can update the block list without restarting your app:

Csharp

using Microsoft.Extensions.FileProviders;
using Microsoft.Extensions.Primitives;

namespace BlockAIde.Middleware;

public sealed class RobotsTxt
{
    private readonly record struct AgentRule(string Agent, string[] DisallowedPaths);

    private volatile AgentRule[] rules = [];
    private readonly ILogger<RobotsTxt> logger;
    private readonly IFileProvider fileProvider;
    private readonly string fileName;

    public RobotsTxt(IWebHostEnvironment env, ILogger<RobotsTxt> logger, AiBotBlockerOptions options)
    {
        this.logger = logger;
        fileProvider = env.WebRootFileProvider;
        fileName = options.FileName;

        // Initial parse
        Reload();

        // Watch for changes and reload
        ChangeToken.OnChange(
            () => fileProvider.Watch(fileName),
            Reload
        );
    }

    public bool IsBlocked(string? userAgent, string? requestPath)
    {
        if (string.IsNullOrEmpty(userAgent))
            return false;

        var snapshot = rules;
        foreach (var (agent, paths) in snapshot)
        {
            if (!userAgent.Contains(agent, StringComparison.OrdinalIgnoreCase))
                continue;

            foreach (var path in paths)
            {
                if (requestPath is not null &&
                    requestPath.StartsWith(path, StringComparison.OrdinalIgnoreCase))
                    return true;
            }
        }

        return false;
    }

    private void Reload()
    {
        try
        {
            var fileInfo = fileProvider.GetFileInfo(fileName);
            if (!fileInfo.Exists)
            {
                logger.LogWarning(
                    "{FileName} not found in wwwroot — no AI bots will be blocked", fileName);
                rules = [];
                return;
            }

            using var stream = fileInfo.CreateReadStream();
            using var reader = new StreamReader(stream);

            var ruleMap = new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);
            string? currentAgent = null;

            while (reader.ReadLine() is { } line)
            {
                line = line.Trim();

                if (line.Length == 0 || line[0] == '#')
                {
                    if (line.Length == 0)
                        currentAgent = null;
                    continue;
                }

                if (line.StartsWith("User-agent:", StringComparison.OrdinalIgnoreCase))
                {
                    currentAgent = line["User-agent:".Length..].Trim();
                }
                else if (line.StartsWith("Disallow:", StringComparison.OrdinalIgnoreCase)
                         && currentAgent is not null
                         && currentAgent != "*")
                {
                    // The wildcard group is skipped on purpose: enforcing it would block every client.
                    // An empty Disallow value means "nothing is disallowed", so it adds no rule.
                    var path = line["Disallow:".Length..].Trim();
                    if (path.Length > 0)
                    {
                        if (!ruleMap.TryGetValue(currentAgent, out var paths))
                        {
                            paths = [];
                            ruleMap[currentAgent] = paths;
                        }
                        paths.Add(path);
                    }
                }
            }

            rules = ruleMap
                .Select(kvp => new AgentRule(kvp.Key, kvp.Value.ToArray()))
                .ToArray();

            logger.LogInformation(
                "Loaded {RuleCount} blocked AI bot agents from {FileName}",
                rules.Length, fileName);
        }
        catch (Exception ex)
        {
            logger.LogError(ex, "Failed to parse {FileName} — keeping previous block list", fileName);
        }
    }
}

3. The middleware itself: checks the User-Agent, short-circuits with 403 if blocked, and still allows bots to read robots.txt (so they at least know they’re unwelcome):

Csharp

namespace BlockAIde.Middleware;

public class AiBotBlockerMiddleware(RobotsTxt blockList, AiBotBlockerOptions options) : IMiddleware
{
    private readonly PathString robotsPath = new("/" + options.FileName.Trim().TrimStart('/'));

    public async Task InvokeAsync(HttpContext context, RequestDelegate next)
    {
        var userAgent = context.Request.Headers.UserAgent.ToString();

        // Let bots read the robots.txt file so they know they're not welcome
        if (context.Request.Path.Equals(robotsPath))
        {
            await next(context);
            return;
        }

        if (blockList.IsBlocked(userAgent, context.Request.Path))
        {
            context.Response.StatusCode = StatusCodes.Status403Forbidden;
            context.Response.ContentType = "text/plain";
            await context.Response.WriteAsync(options.BlockedMessage);
            return;
        }

        await next(context);
    }
}

4. Extension methods: two-line registration:

Csharp

namespace BlockAIde.Middleware;

public static class AiBotBlockerMiddlewareExtensions
{
    public static void AddAiBotBlocker(
        this IServiceCollection serviceCollection,
        Action<AiBotBlockerOptions>? configure = null)
    {
        var options = new AiBotBlockerOptions();
        configure?.Invoke(options);

        serviceCollection.AddSingleton(options);
        serviceCollection.AddSingleton<RobotsTxt>();
        serviceCollection.AddSingleton<AiBotBlockerMiddleware>();
    }

    public static void UseAiBotBlocker(this WebApplication app)
    {
        // Eagerly initialize the blocklist (parse robots.txt + start file watcher)
        app.Services.GetRequiredService<RobotsTxt>();
        app.UseMiddleware<AiBotBlockerMiddleware>();
    }
}

Wiring It Up

In your Program.cs, add it early, before static files and routing:

Csharp

using BlockAIde.Middleware;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddAiBotBlocker();

var app = builder.Build();

app.UseAiBotBlocker();

app.UseStaticFiles();
app.MapGet("/", () => "Hello World!");

app.Run();

The robots.txt File

Place this in your wwwroot folder. It serves a dual purpose: compliant crawlers read it as a standard robots.txt file, and the middleware parses it to enforce blocking at the HTTP level.

# OpenAI
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

# Anthropic
User-agent: ClaudeBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: anthropic-ai
Disallow: /

# Google AI
User-agent: Google-Extended
Disallow: /

User-agent: Google-Agent
Disallow: /

User-agent: GoogleAgent-Mariner
Disallow: /

User-agent: Gemini-Deep-Research
Disallow: /

# Meta
User-agent: FacebookBot
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

User-agent: Meta-ExternalFetcher
Disallow: /

# Apple
User-agent: Applebot-Extended
Disallow: /

# Amazon
User-agent: Amazonbot
Disallow: /

# ByteDance
User-agent: Bytespider
Disallow: /

User-agent: TikTokSpider
Disallow: /

# Perplexity
User-agent: PerplexityBot
Disallow: /

# Common Crawl
User-agent: CCBot
Disallow: /

# Cohere
User-agent: cohere-ai
Disallow: /

# DeepSeek
User-agent: DeepSeekBot
Disallow: /

# Mistral
User-agent: MistralAI-User
Disallow: /

# Other AI crawlers
User-agent: Diffbot
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: Scrapy
Disallow: /

User-agent: FirecrawlAgent
Disallow: /

User-agent: PetalBot
Disallow: /

# Allow everyone else
User-agent: *
Allow: /

You can edit this file at any time. The middleware watches it and reloads automatically. No restart needed.

You can test the middleware with the .http file support in Visual Studio and JetBrains Rider, or in VS Code through an HTTP client extension. Create a new test.http file and paste the following code as its content, replacing port 5247 with the port your app listens on.

### AI Block Smoke Tests
### Run the app first, then execute these requests in order.

### Test 1: Normal request → 200
GET http://localhost:5247/

### Test 2: GPTBot blocked on root → 403
GET http://localhost:5247/
User-Agent: GPTBot/1.0

### Test 3: GPTBot blocked on subpath (Disallow: / covers all) → 403
GET http://localhost:5247/some/deep/path
User-Agent: GPTBot/1.0

### Test 4: robots.txt served to normal clients → 200
GET http://localhost:5247/robots.txt

### Test 5: ClaudeBot can still read robots.txt → 200
GET http://localhost:5247/robots.txt
User-Agent: ClaudeBot

### Test 6: Substring match with full GPTBot UA → 403
GET http://localhost:5247/
User-Agent: Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)

### Test 7: Bytespider blocked → 403
GET http://localhost:5247/
User-Agent: Bytespider

### Test 8: Normal browser passes through → 200
GET http://localhost:5247/
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36

### Test 9: Anthropic blocked → 403
GET http://localhost:5247/
User-Agent: anthropic-ai

### Test 10: PerplexityBot blocked → 403
GET http://localhost:5247/
User-Agent: PerplexityBot/1.0

### Test 11: DeepSeekBot blocked → 403
GET http://localhost:5247/
User-Agent: DeepSeekBot

Design Choices Worth Noting

Single source of truth. A single robots.txt file governs both the standard crawler protocol and the enforcement middleware. You don’t maintain two lists.

Substring matching, not regex. The User-Agent check uses string.Contains() with OrdinalIgnoreCase. It’s simple, fast, and immune to regex backtracking. When GPTBot sends a User-Agent like Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot), the substring match catches it.
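To make the matching rule concrete, here is a small standalone sketch of the same comparison. The helper name ContainsAgentToken is hypothetical and not part of the middleware; it only mirrors the string.Contains call described above.

```csharp
using System;

// Hypothetical helper mirroring the middleware's comparison:
// a case-insensitive, ordinal substring check of the robots.txt token
// against the full User-Agent header value.
static bool ContainsAgentToken(string userAgent, string token) =>
    userAgent.Contains(token, StringComparison.OrdinalIgnoreCase);

var fullHeader = "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)";
var browser = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36";

Console.WriteLine(ContainsAgentToken(fullHeader, "GPTBot"));  // True
Console.WriteLine(ContainsAgentToken(fullHeader, "gptbot"));  // True (case-insensitive)
Console.WriteLine(ContainsAgentToken(browser, "GPTBot"));     // False
```

The trade-off is that a very short or generic token in robots.txt can match inside an unrelated User-Agent, so keep the agent names specific.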

Volatile array swap for thread safety. The RobotsTxt class uses a volatile AgentRule[] field. When the file changes, a new array is built and swapped in atomically. Request threads always read a consistent snapshot, and no locks are needed.
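The swap pattern can be shown in isolation. This is a simplified sketch, assuming a hypothetical SnapshotHolder class that stands in for RobotsTxt and stores plain strings instead of AgentRule values.

```csharp
using System;
using System.Linq;

var holder = new SnapshotHolder();
holder.Replace(new[] { "GPTBot", "ClaudeBot" });
Console.WriteLine(holder.Contains("claudebot")); // True

holder.Replace(Array.Empty<string>());
Console.WriteLine(holder.Contains("claudebot")); // False

// Hypothetical stand-in for the RobotsTxt class: readers never lock,
// and writers publish a fully built array with one reference assignment.
public sealed class SnapshotHolder
{
    private volatile string[] items = Array.Empty<string>();

    // Writer: build the new array completely, then publish it in one step.
    public void Replace(string[] newItems) => items = newItems.ToArray();

    // Reader: copy the reference once, then iterate only the local copy,
    // so a concurrent Replace cannot change the array mid-loop.
    public bool Contains(string value)
    {
        var snapshot = items;
        foreach (var item in snapshot)
        {
            if (string.Equals(item, value, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }
}
```

This works because the published array is never mutated after the assignment; a reader sees either the old list or the new one, never a mix.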

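Fail-open. If the robots.txt file is missing, the middleware logs a warning and lets all traffic through. If reading or parsing the file throws, it logs an error and keeps the previous block list, which is empty when the very first load fails. Either way, a broken file won’t accidentally block legitimate users.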

The Limitation: Bots That Lie

This middleware catches bots that honestly identify themselves, and many do. OpenAI’s GPTBot, Anthropic’s ClaudeBot, Google-Extended, Bytespider: they all use recognizable User-Agent strings because their operators want to maintain some semblance of legitimacy.

But as the developer in Discussion #520 discovered, not all bots play nice. Some change their User-Agent to match Chrome. At that point, User-Agent-based blocking is like a lock on a screen door. It only stops the people who were going to knock anyway.

For the bots that don’t take the hint, you need heavier tools.

When You Need to Escalate

Cloudflare

The most accessible option for most developers. Put Cloudflare in front of your origin, and you get:

  • Bot Management that uses behavioral analysis, TLS fingerprinting, and JS challenges, not just User-Agent strings.
  • WAF rules to block specific patterns, geographies, or request signatures.
  • A free tier with basic bot protection. Paid tiers get the full Bot Management suite.

For most apps, this is the “just do it” answer. Change your DNS, enable the protections, and move on with your life.

Anubis

Anubis is an open-source project (18k+ GitHub stars) that takes a more aggressive approach. It sits as a reverse proxy in front of your application and “weighs the soul of incoming HTTP requests,” requiring clients to solve proof-of-work challenges before they can reach your backend.

The key insight: most scraper bots can’t execute JavaScript. Anubis requires it. If a client can’t solve the browser-based challenge, it never reaches your app.
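For readers unfamiliar with proof-of-work, here is a generic hashcash-style sketch. It is not taken from Anubis, and the challenge string, difficulty value, and function names are all assumptions made for illustration. The client searches for a nonce whose SHA-256 hash starts with a required number of zero hex digits; the server verifies the answer with a single hash.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Server side: one hash checks whether SHA-256(challenge + nonce),
// written as hex, starts with `difficulty` zero digits.
static bool IsValid(string challenge, long nonce, int difficulty)
{
    var hash = SHA256.HashData(Encoding.UTF8.GetBytes(challenge + nonce));
    return Convert.ToHexString(hash)
        .StartsWith(new string('0', difficulty), StringComparison.Ordinal);
}

// Client side: brute-force a nonce. Expected work grows 16x per extra hex digit.
static long Solve(string challenge, int difficulty)
{
    long nonce = 0;
    while (!IsValid(challenge, nonce, difficulty))
        nonce++;
    return nonce;
}

const string Challenge = "example-challenge"; // hypothetical server-issued value
const int Difficulty = 3;                      // leading zero hex digits required

var solution = Solve(Challenge, Difficulty);
Console.WriteLine($"nonce={solution}, valid={IsValid(Challenge, solution, Difficulty)}");
```

The asymmetry is the point: solving costs thousands of hashes, checking costs one. That is negligible for a single human visitor and expensive for a crawler making millions of requests.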

The project is honest about its trade-offs. From their README:

Anubis is a bit of a nuclear response. This will result in your website being blocked from smaller scrapers and may inhibit “good bots” like the Internet Archive.

They offer configurable bot policy definitions to allowlist specific crawlers, and they’re building a curated set of “known good” bots. But the default posture is: prove you’re a browser, or go away.

A Layered Approach

None of these tools is a silver bullet on its own. The right approach is defense in depth:

  1. robots.txt + AiBotBlockerMiddleware catches honest bots before they reach your application logic. Costs you essentially nothing.
  2. Rate limiting throttles suspicious patterns. The Duende blog post on rate-limiting IdentityServer endpoints is a good starting point, and ASP.NET Core has built-in rate-limiting middleware.
  3. Cloudflare or Anubis for bots that ignore the polite hints. Choose based on whether you want a managed service (Cloudflare) or a self-hosted solution (Anubis).
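For the rate-limiting layer, a minimal sketch with ASP.NET Core's built-in rate-limiting middleware (available since .NET 7) might look like the following. The policy name "per-client" and the limit of 60 requests per minute are illustrative assumptions, not recommendations; tune them to your own traffic.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Reply with 429 instead of the default 503 when a client exceeds its budget.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // "per-client" is a hypothetical policy name; the limits are illustrative.
    options.AddPolicy("per-client", httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            // Partition by remote IP. Behind a proxy this is the proxy's address
            // unless forwarded headers are configured.
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 60,
                Window = TimeSpan.FromMinutes(1),
                QueueLimit = 0
            }));
});

var app = builder.Build();

app.UseRateLimiter();

app.MapGet("/", () => "Hello World!").RequireRateLimiting("per-client");

app.Run();
```

If you combine this with the bot blocker, registering UseAiBotBlocker before UseRateLimiter means self-identified bots are rejected before they consume any of a partition's permits.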

Wrapping Up

The developer in Discussion #520 made a reasonable assumption: their BFF app is for browsers, so only browsers should use it. The reality is that anything on the internet is fair game for bots, and AI crawlers are making this worse every month.

The AiBotBlockerMiddleware is a pragmatic starting point. It’s four small files, zero NuGet dependencies, and two lines to wire up. It won’t stop a determined attacker spoofing Chrome, but it will stop the known AI crawlers that honestly identify themselves, and it’ll do it before your app wastes a single cycle on them.

Your app was built for humans. Make the bots prove they deserve to get in.

What’s Next

All the code you need is in this post. Copy the four files into a Middleware folder, drop the robots.txt into wwwroot, and wire it up in Program.cs.

If you’re running the Duende BFF Security Framework, check out the original community discussion that inspired this post for more context on securing BFF endpoints. For rate-limiting strategies specific to identity endpoints, read our post on rate-limiting IdentityServer endpoints.
