# ---------------------------------------------------------------------------
# STRATEGY: Block Training & Aggressive Scraping, Allow Live Search
# ---------------------------------------------------------------------------

# 1. Block Aggressive/Useless Scrapers (Save Server CPU)
User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: FacebookBot
Disallow: /

# 2. Block AI *Training* (Optional - keep these if you don't want to feed the AI)
# GPTBot: OpenAI training bot
User-agent: GPTBot
Disallow: /

# ClaudeBot: Anthropic training bot
User-agent: ClaudeBot
Disallow: /

# 3. ALLOW Live Browsing (Important for Visibility)
# ChatGPT-User is what visits your site when a human asks ChatGPT to "read this link"
User-agent: ChatGPT-User
Allow: /

# Google-Extended is a control token for Gemini (formerly Bard), not a separate
# crawler. It governs whether Gemini may use your content and does not affect
# Google Search.
User-agent: Google-Extended
Allow: /

# ---------------------------------------------------------------------------
# SECTION 2: General Rules for All Other Bots (Google, Bing, etc.)
# ---------------------------------------------------------------------------
# Bots named in their own group above follow only that group and ignore the
# rules in this section.
User-agent: *

# 1. Allow the site content
Allow: /

# 2. Block API, Admin & Utility Pages
# These save "crawl budget" so Google focuses on your actual pages
Disallow: /api/
Disallow: /admin
Disallow: /thank-you

# 3. Block Search/Filter Parameters
# WARNING: This rule blocks every URL that contains a query string.
# Keep it only if your pagination does NOT use '?'.
# If your pagination does use '?' (e.g. /blog?page=2), REMOVE the line below.
Disallow: /*?*

# ---------------------------------------------------------------------------
# SECTION 3: Sitemap
# ---------------------------------------------------------------------------
Sitemap: https://softthenext.com/sitemap.xml
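
# ---------------------------------------------------------------------------
# OPTIONAL EXAMPLE (a sketch only, not part of the active rules above)
# ---------------------------------------------------------------------------
# If your pagination uses '?' but you still want to block other query strings,
# one possible alternative to removing "Disallow: /*?*" is an explicit Allow
# placed inside the "User-agent: *" group. This assumes the parameter is named
# "page" (hypothetical, taken from the /blog?page=2 example) and that it
# appears first in the query string. Crawlers that follow RFC 9309 apply the
# longest matching rule, so the Allow below should take precedence over
# "Disallow: /*?*". Crawlers without wildcard support may ignore both lines,
# so verify the result in your search console before relying on it.
#
# Allow: /*?page=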