This document describes how the cloud.gov team approaches configuration management of the core platform. Before configuration changes go into production, they need to pass our significant change rubric, as described in our Feature Lifecycle and Story Lifecycle.
What goes into configuration management?
In short, everything needed to run and operate the platform that is not a secret. (See Secret Key Management for that.)
Here are some examples that should be in configuration management:
- CI pipelines (Concourse)
- Infrastructure/network configuration (Terraform)
- VM setup and quantity (BOSH)
- Software configuration (BOSH)
- Cloud.gov-developed code
Special cases: For changing settings that we currently cannot manage as configuration files in version control, such as GitHub repository settings and Nessus scan settings, you must first get agreement from another cloud.gov team member that the change should be made (such as over Slack or Hangouts).
Where should all this configuration go?
All configuration must be stored in GitHub using the following “Change Workflow” unless it is a secret.
How do we test these changes?
If possible, first test the changes locally. Then deploy them to a development environment and run manual or automated tests there. Security tests must also be run in the development environment where the changes are applied.
Change workflow
- All configuration changes must flow through a git repository, centrally managed through GitHub, unless they contain sensitive information. In these cases, sensitive information should be stored in an S3 bucket or CredHub with a proper security policy and encryption, versioned such that changes can be easily rolled back.
- A change is initiated and discussed, following the steps in our Story Lifecycle.
- In the appropriate GitHub repository for the component, a pull request (PR) that addresses the change is created against the main branch. (Note: this branch is sometimes called main and sometimes not, so be sure to check.)
- If the repository contains cloud.gov-developed code, the PR must have automated checks in GitHub Actions or Concourse, which must pass before the PR can be merged.
- The PR is reviewed by someone other than the committer. Pairing via screen-sharing is encouraged and qualifies as a review. Review should include assessment of architectural design, DRY principles, security and code quality. The reviewer approves the PR via GitHub.
- The reviewer merges the approved PR. Further updates invalidate the approval. The committer may merge an approved PR if the changes made are time-sensitive.
- A continuous integration (CI) server handles automated tests and continuous deployment (CD) of the merged changes.
- All changes are deployed to a testing environment, such as development.
- Any and all automated tests are run.
- If all tests pass, changes can be promoted for deployment to production in the pipeline.
- The CI/CD tool uses GitHub repositories and S3-stored sensitive content as the canonical source of truth for what the platform should look like. If there are manual changes, it will reset the state of all systems to match.
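The review and merge rules in the workflow above can be read as a small decision procedure. The Python sketch below is illustrative only. The PullRequest fields and the can_merge function are hypothetical names, not part of any cloud.gov tooling. It assumes that only the reviewer, or the committer for time-sensitive changes, merges a PR, and that checks_passed is set to True for repositories where no automated checks are required.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullRequest:
    """Hypothetical, simplified view of a PR for illustrating the merge rules."""
    committer: str
    approved_by: Optional[str]      # None until someone approves the PR
    commits_after_approval: int     # further updates invalidate the approval
    checks_passed: bool             # assumed True when no checks are required
    time_sensitive: bool = False


def can_merge(pr: PullRequest, merger: str) -> bool:
    """Return True if `merger` may merge `pr` under the workflow described above."""
    if pr.approved_by is None:
        return False                # no review yet
    if pr.approved_by == pr.committer:
        return False                # review must come from someone else
    if pr.commits_after_approval > 0:
        return False                # approval is stale
    if not pr.checks_passed:
        return False                # automated checks must pass first
    if merger == pr.approved_by:
        return True                 # the reviewer merges the approved PR
    if merger == pr.committer:
        return pr.time_sensitive    # committer may merge only if time-sensitive
    return False                    # assumption: nobody else merges


if __name__ == "__main__":
    pr = PullRequest(committer="alice", approved_by="bob",
                     commits_after_approval=0, checks_passed=True)
    print(can_merge(pr, "bob"), can_merge(pr, "alice"))
```

In practice GitHub's branch protection settings enforce most of these conditions, so a function like this is mainly useful as a way to reason about the policy.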
Checklist for new repositories
Before we put a new repository into production:
- Give it a name. Historically these started with cg- when we shared the 18F repo, but no special prefix is needed.
- Add README files (to support open source reuse of our work).
- Configure a protected main branch (CM-9).
- Enable “Require pull request reviews before merging”
- Enable “Dismiss stale pull request approvals when new commits are pushed”
- Enable “Require status checks to pass before merging”
- Enable “Require branches to be up to date before merging”
- Enable “Include administrators”
- Configure permissions (CM-3):
- Set up CI/CD for changes (CM-3)
- Set up static code analysis if it’s a code or configuration repo. This is in flux. Ask in the #cg-platform channel for details.
- Open a PR to add it to the repos list for pre-merge checks
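The protected-branch settings in the checklist above map roughly onto the request body for GitHub's REST API branch protection endpoint (PUT /repos/{owner}/{repo}/branches/{branch}/protection). The following is a minimal sketch of that mapping, not the team's actual tooling. The function name, the ci/concourse check name, and the choice of one required approval are assumptions made for illustration. Verify the field names against GitHub's current API documentation before relying on them.

```python
import json
from typing import Any, Dict, List


def branch_protection_payload(status_checks: List[str]) -> Dict[str, Any]:
    """Build a request body mirroring the five checklist settings.

    `status_checks` lists the names of the checks that must pass;
    the names used by a real repository will differ.
    """
    return {
        "required_status_checks": {
            # "Require branches to be up to date before merging"
            "strict": True,
            # "Require status checks to pass before merging"
            "contexts": list(status_checks),
        },
        # "Include administrators"
        "enforce_admins": True,
        "required_pull_request_reviews": {
            # "Dismiss stale pull request approvals when new commits are pushed"
            "dismiss_stale_reviews": True,
            # "Require pull request reviews before merging" (count is an assumption)
            "required_approving_review_count": 1,
        },
        # No push restrictions in this sketch; the API expects the key to be present.
        "restrictions": None,
    }


if __name__ == "__main__":
    # "ci/concourse" is a hypothetical check name.
    print(json.dumps(branch_protection_payload(["ci/concourse"]), indent=2))
```

The sketch only builds the payload and makes no network call, so it can be inspected or diffed against a repository's existing settings without needing credentials.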
What if a configuration changed and it is not in Configuration Management?
If possible, Configuration Management tools need to be set up to always roll back to a known state. Where that is not possible, these tools need to be able to “recreate” all settings from the known configurations.
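The idea of resetting to a known state can be sketched as a comparison between the configuration stored in version control and what is actually running. This is a simplified illustration, not how Concourse, Terraform, or BOSH implement it. The function names and the flat dictionary representation of settings are assumptions.

```python
from typing import Any, Dict, Tuple

# Sentinel marking a setting that is absent on one side of the comparison.
MISSING = object()


def find_drift(desired: Dict[str, Any],
               actual: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {setting: (actual_value, desired_value)} for every mismatch."""
    drift = {}
    for key in desired.keys() | actual.keys():
        have = actual.get(key, MISSING)
        want = desired.get(key, MISSING)
        if have != want:
            drift[key] = (have, want)
    return drift


def reconcile(desired: Dict[str, Any], actual: Dict[str, Any]) -> Dict[str, Any]:
    """Recreate all settings from the known configuration.

    Manual changes in `actual` are discarded, including settings
    that do not appear in `desired` at all.
    """
    return dict(desired)


if __name__ == "__main__":
    desired = {"instances": 3, "tls": "on"}
    actual = {"instances": 5, "tls": "on", "debug": True}  # manual changes
    print(sorted(find_drift(desired, actual)))
    print(reconcile(desired, actual) == desired)
```

Reporting drift before resetting it is a design choice in this sketch: it leaves a record of what was changed outside configuration management.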
Roles and responsibilities
- All team members
- Follow the configuration management plan.
- Make suggestions (such as in PRs) if you have ideas for improving the plan.
- Cloud Ops (Platform squad)
- Ensure Concourse, Terraform, BOSH, GitHub, AWS, and other resources are correctly set up to implement the technical aspects of the plan.
- Review the plan in our quarterly Security Policy and Account Review meetings.
- Program Manager
- Ensure the team follows the Feature Lifecycle, Story Lifecycle, and other operational aspects of the plan.
- System Owner
- Ensure that team members uphold their responsibilities.
- Approve any major changes to the plan, and coordinate with JAB representatives as necessary.
GitHub contribution guidelines
Because cloud.gov was originally built by 18F, and we maintain close operational alignment with other parts of TTS, we follow the TTS requirements for using GitHub. These are our team practices within those requirements.
Forking vs. branching
Both forking and branching are welcome in our repositories. Contributors inside cloud.gov can use forking or branching according to their personal preferences, and contributors outside cloud.gov can fork repositories.
The team often practices branching. The rationale for branching within a team is that paired collaboration on a single branch avoids certain types of friction:
- Having to create multiple forked PRs in order to contribute to the branch
- Having to add new users to forked repositories as collaborators in order to have people directly contribute on short-lived forked branches
When contributing directly on a branch, we’re able to modify work-in-progress (WIP) pull requests and encourage collaboration across the Cloud Operations team.
For the cloud.gov team, when forking an upstream repository to add a patch or bugfix, the fork should go to your personal GitHub user account. The cloud-gov org is for code maintained by cloud.gov, whether that’s original code or a long-lived fork (discouraged, but sometimes necessary) for code we are running in production.
Squashing commits is allowed but discouraged, except in rare instances.
Rebase or merge
The team prefers rebasing over merging.
When should a PR be created?
Work-in-progress PRs are encouraged. When a PR is ready for review, it should be tagged in GitHub with the review-needed label. If you create a work-in-progress PR, you might also make that plain in the PR name.
Should PRs be assigned?
PRs are typically not assigned in GitHub, unless someone specifically needs to sign off on the change.
You can request a review using GitHub’s built-in tools, mention someone in the PR with the @ notation, or contact them outside the GitHub context to request a review.
When reviewing a PR, should the change be tested locally?
Whenever possible, the proposed changes should be tested locally. Because of the nature of many of the cloud.gov repositories and deployment environments, local testing is not always possible or practical. Visual code review, however, is always required.