Search on Pages Sites
It’s easy to add search functionality to your Pages site.
We recommend using Search.gov, a free site search and search analytics service for federal websites. You will need to register with Search.gov and follow their instructions to integrate the service with your Pages site. For full details, visit Search.gov.
If you’d prefer another solution, you can configure a tool like lunr.js, which builds a search index that runs entirely in the client browser. An example of this approach is the 18F blog. This avoids any dependency on another service, but the search results are not as robust. A rough sketch of the client-side approach follows.
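This sketch assumes your build step emits a `/search-index.json` file of `{ url, title, body }` records and that lunr.js is loaded via a `<script>` tag; the file name and record shape are illustrative, not part of Pages:

```js
// Build a lunr index in the browser from a prebuilt JSON file.
async function buildIndex() {
  const docs = await (await fetch("/search-index.json")).json();
  return lunr(function () {
    this.ref("url");     // value returned for each search hit
    this.field("title"); // fields to index
    this.field("body");
    docs.forEach((doc) => this.add(doc));
  });
}

buildIndex().then((idx) => {
  // Results look like [{ ref: "/some-page/", score: 1.23, ... }, ...]
  console.log(idx.search("accessibility"));
});
```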
Crawl/Index Pages sites
Pages automatically handles search engine visibility for preview URLs via the Pages proxy. For traffic served through a preview site, the Pages proxy automatically serves the appropriate HTTP robots header, `robots: none`; preview URLs are not crawlable or indexable by design. Only webpages on the production domain are served with the `robots: all` directive, indicating to crawlers and bots such as Search.gov that they should index the site and enable search capabilities.
| Priority | Method to manage robot behavior | How to prevent indexing/crawling | How to allow indexing/crawling |
|---|---|---|---|
| 1 | `robots.txt` in your Pages site | `User-agent: *` <br> `Disallow: /directory` | N/A, crawling is allowed by default |
| 2 | `X-Robots-Tag` HTTP header (served by Pages via the Pages proxy) | `robots: none` (this is automatically served to visitors of all Pages preview builds) | `robots: all` (this is automatically served to visitors of custom/production domains) |
| 3 | `<meta name="robots">` in your Pages site webpage HTML | `content="noindex, nofollow"` | N/A, indexing is allowed by default |
If you want to disable crawling and indexing for specific pages of your production site, you can include the `noindex, nofollow` meta tag in the head of those pages, or list those folders in your `robots.txt`, if your site generates one.
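For example, a `robots.txt` served from the root of your production site could exclude specific folders from crawling; the folder paths below are illustrative:

```
User-agent: *
# Keep crawlers out of these folders; everything else remains crawlable.
Disallow: /drafts/
Disallow: /internal/
```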
Conditionally set robots - Eleventy (11ty)
Take advantage of Pages-provided environment variables to enable environment-specific functionality. Hardcode the condition and meta tags to check the branch from the `process.env` environment variable. This differs from how it is handled on a Jekyll site: with Eleventy you can add specificity by checking `process.env.BRANCH`.
You can use this code sample:

```html
<meta name="robots" content="noindex, nofollow">
```
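As a minimal sketch of the Eleventy approach, a global data file can compute the robots directive from `process.env.BRANCH`, and a layout can emit the matching meta tag. The file name `_data/robots.js` and the production branch name `main` are assumptions for illustration, not Pages requirements:

```js
// _data/robots.js — Eleventy exposes this value to all templates as `robots`.
// Pages sets process.env.BRANCH during each build.
module.exports = () => {
  const isProduction = process.env.BRANCH === "main"; // assumed production branch
  // Allow indexing only on production builds; block it everywhere else.
  return isProduction ? "all" : "noindex, nofollow";
};
```

Then in your layout’s `<head>` (Nunjucks or Liquid):

```html
<meta name="robots" content="{{ robots }}">
```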
See additional documentation on build environment variables.