Integration docs·aws.amazon.com

Amazon CloudFront.

Track AI bot visits at CloudFront’s edge with Lambda@Edge, then pass requests straight to your real site.

01

Overview

This integration runs a Lambda@Edge function on the CloudFront Viewer Request event to detect AI crawlers by User-Agent and post each visit to xSeek. The request then continues to your origin unchanged, so humans and bots both see your real website while you capture AI bot traffic at the edge.
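The flow described above can be sketched in a few lines. This is a minimal illustration, not the full integration code: `SAMPLE_BOT_PATTERNS`, `detectBot`, `sketchHandler`, and the `report` callback are hypothetical names, and the pattern list is deliberately shortened.

```javascript
// Hypothetical, shortened pattern list for illustration only.
const SAMPLE_BOT_PATTERNS = [
  { name: 'GPTBot', pattern: /GPTBot/i },
  { name: 'claudebot', pattern: /ClaudeBot/i },
  { name: 'perplexitybot', pattern: /PerplexityBot/i },
];

// Return the first matching bot name, or null for ordinary visitors.
function detectBot(userAgent) {
  const match = SAMPLE_BOT_PATTERNS.find((bot) => bot.pattern.test(userAgent));
  return match ? match.name : null;
}

// Viewer Request handler shape: report bots, then return the request untouched.
// `report` stands in for the HTTPS POST to xSeek (assumption for this sketch).
async function sketchHandler(event, report = async () => {}) {
  const request = event.Records[0].cf.request;
  const userAgent = request.headers['user-agent']?.[0]?.value || '';
  const botName = detectBot(userAgent);
  if (botName) {
    await report({ botName, userAgent, uri: request.uri });
  }
  return request; // CloudFront forwards this same request object
}
```

Returning the original request object is what keeps the origin behavior unchanged: the function only observes the request and never rewrites it.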

02

What you get

  • Edge-based bot detection before the origin
  • Logs AI bot visits to xSeek over HTTPS
  • No HTML rewriting required
  • Works with any CloudFront-backed site
  • Keeps your existing cache and origin behavior intact
  • Supports bot-only logging without user impact
  • Fast, global execution at AWS edge locations

03

Requirements

  • AWS account with CloudFront access
  • Lambda@Edge permissions (function must be in us-east-1)
  • API key from xSeek (ai_visits:push)
  • Website ID from your xSeek dashboard

04

Setup process

  1. Create a Lambda function in us-east-1 (N. Virginia) with a Node.js runtime

  2. Have your xSeek API key and website ID ready. Lambda@Edge does not support environment variables, so these values go directly into the function code

  3. Paste the Lambda@Edge Viewer Request code, fill in the two placeholders, and publish a version

  4. In the Lambda console, add a CloudFront trigger and choose your distribution

  5. Set the CloudFront event to Viewer Request

  6. Confirm deploy to Lambda@Edge when prompted

  7. Deploy the distribution and wait for propagation

  8. Verify AI bot events in the xSeek dashboard
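Before attaching the trigger, it can help to run the function once against a sample event in the Lambda console. The sketch below builds a minimal Viewer Request event in the shape CloudFront passes to Lambda@Edge. The helper name `buildViewerRequestEvent`, the host, the IP address, and the User-Agent string are illustrative assumptions, and a real event carries more fields than shown here.

```javascript
// Build a minimal CloudFront viewer-request event for console or local testing.
// All default values below are placeholders, not real traffic.
function buildViewerRequestEvent({
  host,
  uri,
  userAgent,
  querystring = '',
  clientIp = '203.0.113.10', // documentation-only address range
}) {
  return {
    Records: [
      {
        cf: {
          config: { distributionId: 'EXAMPLE', eventType: 'viewer-request' },
          request: {
            clientIp,
            method: 'GET',
            uri,
            querystring,
            // CloudFront lowercases header names and wraps values in arrays.
            headers: {
              host: [{ key: 'Host', value: host }],
              'user-agent': [{ key: 'User-Agent', value: userAgent }],
            },
          },
        },
      },
    ],
  };
}

const sampleEvent = buildViewerRequestEvent({
  host: 'www.example.com',
  uri: '/pricing',
  userAgent: 'Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)',
});

// Paste this JSON into a Lambda console test event.
console.log(JSON.stringify(sampleEvent, null, 2));
```

With a bot User-Agent the function should post a visit to xSeek. With an ordinary browser User-Agent it should only return the request.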

05

Integration setup

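Credentials required
You need two values from xSeek. Lambda@Edge does not support Lambda environment variables, so paste them over the matching placeholders in the function code:
XSEEK_API_KEY (your_api_key) replaces 'PUT YOUR API KEY HERE'
XSEEK_WEBSITE_ID (your_website_id) replaces 'PUT YOUR WEBSITE ID HERE'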

Your API key can be found in your account settings. Make sure it has the ai_visits:push privilege.
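One way to confirm the key works before deploying anything to the edge is to send a single test visit from your own machine. The sketch below mirrors the endpoint, header, and payload fields used by the function code in this guide. The helper names `buildTrackingRequest` and `sendTestVisit` are hypothetical, the response format is not documented here, and the script assumes Node.js 18 or later for the built-in `fetch`.

```javascript
// Assemble the POST that the edge function sends, without sending it.
function buildTrackingRequest(apiKey, websiteId, visit) {
  return {
    url: 'https://www.xseek.io/api/track-ai-bot',
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'x-api-key': apiKey,
      },
      body: JSON.stringify({ ...visit, websiteId }),
    },
  };
}

// Send one clearly labelled test visit and return the HTTP status code.
async function sendTestVisit(apiKey, websiteId) {
  const { url, init } = buildTrackingRequest(apiKey, websiteId, {
    botName: 'GPTBot',
    userAgent: 'GPTBot/1.0 (manual smoke test)',
    url: 'https://www.example.com/',
  });
  const res = await fetch(url, init);
  return res.status;
}

// Usage (makes a real network call, so it is left commented out):
// sendTestVisit('your_api_key', 'your_website_id').then((status) => console.log(status));
```

Note that a successful call is likely to appear in your dashboard as a visit, so use an obviously artificial URL or User-Agent.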

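1. Create a Lambda@Edge function

In AWS Lambda (region us-east-1), create a new function and paste the Viewer Request code below. Replace the 'PUT YOUR API KEY HERE' and 'PUT YOUR WEBSITE ID HERE' placeholders with your own values, because Lambda@Edge functions cannot use environment variables: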

javascript
// AWS Lambda@Edge (Viewer Request) - AI bot tracking (ESM)
// Also supports CloudFront log record events.
// Lambda@Edge does not support environment variables: replace the
// 'PUT YOUR API KEY HERE' and 'PUT YOUR WEBSITE ID HERE' placeholders below.

const AI_BOTS = [
  { name: 'anthropic-ai', pattern: /anthropic-ai/i },
  { name: 'claudebot', pattern: /ClaudeBot/i },
  { name: 'claude-web', pattern: /claude-web/i },
  { name: 'claude-user', pattern: /Claude-User/i },
  { name: 'claude-searchbot', pattern: /Claude-SearchBot/i },
  { name: 'claude-code', pattern: /claude-code\//i },
  { name: 'perplexitybot', pattern: /PerplexityBot/i },
  { name: 'perplexity-user', pattern: /Perplexity-User/i },
  { name: 'grokbot', pattern: /GrokBot(?!.*DeepSearch)/i },
  { name: 'grok-search', pattern: /xAI-Grok/i },
  { name: 'grok-deepsearch', pattern: /Grok-DeepSearch/i },
  { name: 'GPTBot', pattern: /GPTBot/i },
  { name: 'chatgpt-user', pattern: /ChatGPT-User/i },
  { name: 'oai-searchbot', pattern: /OAI-SearchBot/i },
  { name: 'google-extended', pattern: /Google-Extended/i },
  { name: 'Google-Agent', pattern: /Google-Agent/i },
  { name: 'applebot', pattern: /Applebot(?!-Extended)/i },
  { name: 'applebot-extended', pattern: /Applebot-Extended/i },
  { name: 'meta-external', pattern: /meta-externalagent/i },
  { name: 'meta-externalfetcher', pattern: /meta-externalfetcher/i },
  { name: 'bingbot', pattern: /Bingbot(?!.*AI)/i },
  { name: 'bingpreview', pattern: /bingbot.*Chrome/i },
  { name: 'microsoftpreview', pattern: /MicrosoftPreview/i },
  { name: 'cohere-ai', pattern: /cohere-ai/i },
  { name: 'cohere-training-data-crawler', pattern: /cohere-training-data-crawler/i },
  { name: 'youbot', pattern: /YouBot/i },
  { name: 'duckassistbot', pattern: /DuckAssistBot/i },
  { name: 'semanticscholarbot', pattern: /SemanticScholarBot/i },
  { name: 'ccbot', pattern: /CCBot/i },
  // More specific pattern first: /AI2Bot/ also matches "AI2Bot-Dolma".
  { name: 'ai2bot-dolma', pattern: /AI2Bot-Dolma/i },
  { name: 'ai2bot', pattern: /AI2Bot/i },
  { name: 'aihitbot', pattern: /aiHitBot/i },
  { name: 'amazonbot', pattern: /Amazonbot/i },
  { name: 'novaact', pattern: /NovaAct/i },
  { name: 'brightbot', pattern: /Brightbot/i },
  { name: 'bytespider', pattern: /Bytespider/i },
  { name: 'tiktokspider', pattern: /TikTokSpider/i },
  { name: 'cotoyogi', pattern: /Cotoyogi/i },
  { name: 'crawlspace', pattern: /Crawlspace/i },
  { name: 'pangubot', pattern: /PanguBot/i },
  { name: 'petalbot', pattern: /PetalBot/i },
  { name: 'sidetrade-indexer', pattern: /Sidetrade indexer bot/i },
  { name: 'timpibot', pattern: /Timpibot/i },
  // More specific pattern first: /omgili/ also matches "omgilibot".
  { name: 'omgilibot', pattern: /omgilibot/i },
  { name: 'omgili', pattern: /omgili/i },
  { name: 'webzio-extended', pattern: /Webzio-Extended/i },
  { name: 'baiduspider', pattern: /Baiduspider/i },
  { name: 'mistralai-user', pattern: /MistralAI-User/i }
];

import https from 'node:https';

function safeDecode(value) {
  if (!value || value === '-') {
    return '';
  }
  try {
    return decodeURIComponent(value);
  } catch {
    return value;
  }
}

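// POST the visit to xSeek. Always resolves (never rejects) so a tracking
// failure cannot break the viewer's request.
function postToXseek(payload) {
  return new Promise((resolve) => {
    const body = JSON.stringify(payload);
    const req = https.request(
      {
        hostname: 'www.xseek.io',
        path: '/api/track-ai-bot',
        method: 'POST',
        // Viewer Request functions time out after 5 seconds, so give up early.
        // 2000 ms is an assumed budget; adjust it to your needs.
        timeout: 2000,
        headers: {
          'Content-Type': 'application/json',
          'Content-Length': Buffer.byteLength(body),
          'x-api-key': 'PUT YOUR API KEY HERE',
        },
      },
      (res) => {
        res.on('data', () => {}); // drain the response so 'end' fires
        res.on('end', resolve);
      }
    );
    // Abort a stalled request instead of holding up the viewer.
    req.on('timeout', () => {
      req.destroy();
      resolve();
    });
    req.on('error', resolve);
    req.write(body);
    req.end();
  });
}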

function parseViewerRequest(request) {
  const headers = request.headers || {};
  const userAgent = headers['user-agent']?.[0]?.value || '';
  const host = headers.host?.[0]?.value || headers['x-host-header']?.[0]?.value || '';
  const query = request.querystring ? `?${request.querystring}` : '';
  const url = host ? `https://${host}${request.uri}${query}` : `${request.uri}${query}`;
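  // Prefer the address CloudFront observed; at Viewer Request the
  // X-Forwarded-For header is still client-supplied and can be spoofed.
  const ip =
    request.clientIp ||
    headers['x-forwarded-for']?.[0]?.value?.split(',')[0]?.trim() ||
    '';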
  const referer = headers.referer?.[0]?.value || undefined;

  return { userAgent, url, ip, referer };
}

function parseLogRecord(record) {
  const rawUserAgent = record['cs(User-Agent)'] || '';
  const userAgent = safeDecode(rawUserAgent);
  const host = record['cs(Host)'] || record['x-host-header'] || '';
  const uri = record['cs-uri-stem'] || '/';
  const query = record['cs-uri-query'];
  const queryString = query && query !== '-' ? `?${query}` : '';
  const url = host ? `https://${host}${uri}${queryString}` : `${uri}${queryString}`;
  const ip =
    record['c-ip'] ||
    (record['x-forwarded-for'] || '').split(',')[0]?.trim() ||
    '';
  const refererRaw = record['cs(Referer)'];
  const referer = refererRaw && refererRaw !== '-' ? safeDecode(refererRaw) : undefined;

  return { userAgent, url, ip, referer };
}

function extractRequestInfo(event) {
  const viewerRequest = event?.Records?.[0]?.cf?.request;
  if (viewerRequest) {
    return parseViewerRequest(viewerRequest);
  }

  const record = Array.isArray(event) ? event[0] : event?.Records?.[0] ?? event;
  if (record && typeof record === 'object' && (record['cs(User-Agent)'] || record['cs-uri-stem'])) {
    return parseLogRecord(record);
  }

  return null;
}

export const handler = async (event) => {
  const request = event?.Records?.[0]?.cf?.request;
  const info = extractRequestInfo(event);
  const userAgent = info?.userAgent || '';

  let detectedBot = null;
  for (const bot of AI_BOTS) {
    if (bot.pattern.test(userAgent)) {
      detectedBot = bot.name;
      break;
    }
  }

  if (detectedBot && info) {
    await postToXseek({
      botName: detectedBot,
      userAgent,
      url: info.url,
      ip: info.ip || undefined,
      referer: info.referer,
      websiteId: 'PUT YOUR WEBSITE ID HERE',
    });
  }

  // Always continue to the origin (Viewer Request). For log events, return the event.
  return request || event;
};

2. Attach to CloudFront

Publish a version of the function, then associate it with your CloudFront distribution on Viewer Request.

Event type: Viewer Request

Confirm deployment to Lambda@Edge in the trigger settings:

[Screenshot: AWS Lambda trigger settings showing deploy to Lambda@Edge]

3. Deploy and verify

Deploy the CloudFront distribution and check your xSeek dashboard for new AI bot visits.
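Rather than waiting for a real crawler, you can trigger a visit yourself by requesting a page with a bot-like User-Agent. The sketch below is one way to do that from Node.js 18 or later. `buildBotProbe` and `probeSite` are hypothetical helper names, and the User-Agent string is an illustrative value that only needs to match one of the patterns in the function code.

```javascript
// Describe a GET request that looks like an AI crawler to the edge function.
function buildBotProbe(domain, path = '/') {
  return {
    url: `https://${domain}${path}`,
    init: {
      method: 'GET',
      headers: {
        // Illustrative value; matches the /GPTBot/i pattern.
        'User-Agent': 'Mozilla/5.0 (compatible; GPTBot/1.0; xseek-setup-check)',
      },
    },
  };
}

// Send the probe and return the HTTP status from your site.
async function probeSite(domain, path) {
  const { url, init } = buildBotProbe(domain, path);
  const res = await fetch(url, init);
  return res.status;
}

// Usage (makes a real network call, so it is left commented out):
// probeSite('www.example.com', '/pricing').then((status) => console.log(status));
```

If the distribution has finished propagating, the page should load as usual and a matching visit should show up in the xSeek dashboard shortly afterwards.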

All set!

Requests continue to your origin unchanged, and AI bot visits are logged to xSeek.
