S3

You can store application content in S3 using a managed service that provides direct access to S3.

Plans

Plan Name | Description | Price
basic | A single private bucket with unlimited storage | Will be paid per GB per month
basic-public | A single public bucket with unlimited storage, where the files are all public to read | Will be paid per GB per month

Pricing

Instances are priced per GB per month. Learn about managed service pricing.

How to create an instance

First decide if the S3 bucket contents should be private or public. Objects placed in a private bucket are only accessible using the bucket credentials unless specifically shared with others. Objects placed in a public bucket are accessible to anyone with the link.

To create a private bucket use the basic plan:

cf create-service s3 basic <SERVICE_INSTANCE_NAME>

To create a public bucket use the basic-public plan:

cf create-service s3 basic-public <SERVICE_INSTANCE_NAME>

More information

Using S3 from your application

To make the bucket usable from your application, you must bind it:

cf bind-service <APP_NAME> <SERVICE_INSTANCE_NAME>
cf restage <APP_NAME>

This puts the S3 access information in the application’s environment, inside the VCAP_SERVICES variable. You can inspect these values with cf env <APP_NAME> if necessary.
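As a rough sketch of how an application's start script might read these values: Cloud Foundry exposes bound service credentials as JSON in the VCAP_SERVICES environment variable. The helper name s3_credential is hypothetical. The sketch assumes the service is listed under the key s3, that only the first bound instance matters, and that jq is available in the application's environment.

```shell
# Hypothetical helper: print one credential field from the first bound S3 instance.
# Assumes VCAP_SERVICES is set by the platform and jq is installed.
s3_credential() {  # usage: s3_credential FIELD
  printf '%s' "${VCAP_SERVICES}" |
    jq -r --arg field "$1" '.s3[0].credentials[$field]'
}

# Example usage inside a bound application:
#   export AWS_ACCESS_KEY_ID="$(s3_credential access_key_id)"
#   export AWS_SECRET_ACCESS_KEY="$(s3_credential secret_access_key)"
#   export AWS_DEFAULT_REGION="$(s3_credential region)"
#   export BUCKET_NAME="$(s3_credential bucket)"
```

If the application binds more than one S3 instance, selecting by index is fragile; filtering on the instance name would be a safer choice.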

If you get an error, see the managed service documentation.

If you need to access multiple S3 buckets using the same access credentials (for example, to copy files from one bucket to another), you can use the additional_instances option when binding:

cf bind-service <APP_NAME> <SERVICE_INSTANCE_NAME> -c '{"additional_instances": ["<ADDITIONAL_SERVICE_INSTANCE_NAME>"]}'

The credentials created for this binding will have access to two buckets: the one managed by <SERVICE_INSTANCE_NAME> and the one managed by <ADDITIONAL_SERVICE_INSTANCE_NAME>. The latter will be listed in the additional_buckets field of the credentials.
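To illustrate, a bound application could list every bucket its credentials can reach with something like the following. The helper name bound_buckets is hypothetical, and the sketch assumes the service appears under the key s3 in VCAP_SERVICES and that jq is installed.

```shell
# Hypothetical helper: print the primary bucket first, then each entry of
# additional_buckets (if the field is present), one bucket name per line.
bound_buckets() {
  printf '%s' "${VCAP_SERVICES}" |
    jq -r '.s3[0].credentials | .bucket, (.additional_buckets // [])[]'
}

# Example usage: copy everything from the primary bucket to the first additional one.
#   SRC="$(bound_buckets | sed -n 1p)"
#   DST="$(bound_buckets | sed -n 2p)"
#   aws s3 sync "s3://${SRC}" "s3://${DST}"
```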

Interacting with your S3 bucket from outside cloud.gov

You may want to use your S3 service as a repository for file transfer between humans, or for communicating content with other systems hosted outside of cloud.gov. You can manage credentials for accessing your S3 bucket using service keys. The service key details provide you with the credentials that are used with common file transfer programs by humans or configured in external systems. Typically you would create a unique service key for each external client of the bucket to make it easy to rotate credentials in case they are leaked.

You can create a service key by running cf create-service-key <SERVICE_INSTANCE_NAME> <KEY_NAME>. For example, to create a key for user Bob for the S3 bucket mybucket, you would run cf create-service-key mybucket Bob. To later revoke access (for example, when it is no longer required or the key has been compromised), run cf delete-service-key <SERVICE_INSTANCE_NAME> <KEY_NAME>. To get the credentials from the service key, run cf service-key <SERVICE_INSTANCE_NAME> <KEY_NAME>; the output is a JSON description of the credentials.
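The one-key-per-client practice could be wrapped in a pair of small functions, sketched below. The function names are hypothetical; the cf subcommands are the ones named above, plus the -f flag on cf delete-service-key to skip the confirmation prompt.

```shell
# Hypothetical wrapper: create a key named after the client and print its credentials.
grant_client_access() {  # usage: grant_client_access SERVICE_INSTANCE_NAME CLIENT
  cf create-service-key "$1" "$2" &&
    cf service-key "$1" "$2"
}

# Hypothetical wrapper: revoke one client's access without affecting other keys.
revoke_client_access() {  # usage: revoke_client_access SERVICE_INSTANCE_NAME CLIENT
  cf delete-service-key "$1" "$2" -f
}

# Example usage:
#   grant_client_access mybucket Bob
#   revoke_client_access mybucket Bob
```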

Clients will need the following information from this JSON description:

  • access_key_id
  • secret_access_key
  • region
  • bucket

Treat these values as sensitive!

To create a service key that can access multiple buckets, you can use the additional_instances option described above.

Using the S3 credentials

For interactive file transfer tools, you will usually see a dialog box or form that asks for the values above. You can just copy and paste them in.

For automated or CLI processes, you may want to use the jq tool to automate extraction of the necessary values from the service key.

For example, you might want to use the AWS Command Line Interface to add, modify, and download files in a bucket. The AWS CLI requires the values above as environment variables. This script will set them:

SERVICE_INSTANCE_NAME=your-s3-service-instance-name-here
KEY_NAME=your-service-key-name-here

cf create-service-key "${SERVICE_INSTANCE_NAME}" "${KEY_NAME}"
# Drop the first line of the CLI output so that only the JSON remains
S3_CREDENTIALS=$(cf service-key "${SERVICE_INSTANCE_NAME}" "${KEY_NAME}" | tail -n +2)

export AWS_ACCESS_KEY_ID=$(echo "${S3_CREDENTIALS}" | jq -r '.access_key_id')
export AWS_SECRET_ACCESS_KEY=$(echo "${S3_CREDENTIALS}" | jq -r '.secret_access_key')
export BUCKET_NAME=$(echo "${S3_CREDENTIALS}" | jq -r '.bucket')
export AWS_DEFAULT_REGION=$(echo "${S3_CREDENTIALS}" | jq -r '.region')

This will set the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION and BUCKET_NAME environment variables, enabling the AWS CLI tool to add, download and modify files as needed:

# Copy a file
aws s3 cp ./mylocalfile s3://${BUCKET_NAME}/

# Download a file
aws s3 cp s3://${BUCKET_NAME}/mys3file .

# See all files
aws s3 ls s3://${BUCKET_NAME}

Bucket URLs

Objects in your bucket can be accessed via the following endpoint:

  • https://s3-${AWS_DEFAULT_REGION}.amazonaws.com/${BUCKET_NAME}/

If you plan to enable “website mode” for the bucket and use it that way, you will need to use a different version of the URL:

  • http://${BUCKET_NAME}.s3-website-${AWS_DEFAULT_REGION}.amazonaws.com/

Note that “website mode” URLs don’t support HTTPS, and they aren’t appropriate for production use unless fronted by a CloudFront distribution.
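The two URL forms can be assembled with a couple of small helpers. This is only a sketch: the function names are hypothetical, and object keys containing spaces or other special characters would need URL-encoding first, which is not handled here.

```shell
# Hypothetical helper: standard HTTPS endpoint for a bucket or an object in it.
bucket_url() {  # usage: bucket_url REGION BUCKET [KEY]
  printf 'https://s3-%s.amazonaws.com/%s/%s\n' "$1" "$2" "${3:-}"
}

# Hypothetical helper: "website mode" endpoint (HTTP only).
website_url() {  # usage: website_url REGION BUCKET
  printf 'http://%s.s3-website-%s.amazonaws.com/\n' "$2" "$1"
}

# Example usage, with the variables exported by the earlier script:
#   bucket_url "${AWS_DEFAULT_REGION}" "${BUCKET_NAME}" mys3file
#   website_url "${AWS_DEFAULT_REGION}" "${BUCKET_NAME}"
```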

Either way, if the bucket is private, attempting to access resources will result in AccessDenied errors unless your application generates pre-signed URLs for objects that need to be shared.
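For ad hoc sharing from the command line, the AWS CLI can produce a pre-signed URL as well. The sketch below assumes the AWS_* variables and BUCKET_NAME are already exported; the function name share_object is hypothetical.

```shell
# Hypothetical helper: print a time-limited URL for one object in a private bucket.
# The second argument is the lifetime in seconds and defaults to one hour.
share_object() {  # usage: share_object KEY [SECONDS]
  aws s3 presign "s3://${BUCKET_NAME}/$1" --expires-in "${2:-3600}"
}

# Example usage:
#   share_object mys3file        # link valid for one hour
#   share_object mys3file 300    # link valid for five minutes
```

Anyone holding the resulting link can fetch the object until it expires, so a short lifetime is usually the safer choice.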

Allowing client-side web access from external applications

By default, browsers only allow JavaScript to make HTTP requests to the same domain as the page serving the JavaScript, as part of the web’s Same Origin Policy. This means that by default, S3 buckets can only be accessed by JavaScript when that JavaScript is served directly from the same origin as the S3 bucket.

If an application wishes to allow client-side JavaScript in other applications to access its S3 buckets, it can do so by setting a CORS policy. CORS lets the bucket “opt in” to having its contents accessed by JavaScript served from specified origins (or from any origin).

You can set a CORS policy restricting access to a specific list of sites and methods (such as the example below), or allow arbitrary origins (using *) to read bucket data:

# Adjust CORS AllowedOrigins to known origins (scheme plus hostname or IP address)
cat << EOF > cors.json
{
  "CORSRules": [
    {
      "AllowedOrigins": ["https://otherapp.hostname.gov", "https://anotherapp.hostname.com"],
      "AllowedHeaders": ["*"],
      "AllowedMethods": ["HEAD", "GET"],
      "ExposeHeaders": ["ETag"]
    }
  ]
}
EOF

# Upload the CORS policy to the bucket
aws s3api put-bucket-cors --bucket "${BUCKET_NAME}" --cors-configuration file://cors.json

You can add additional method types along with HEAD and GET (such as PUT, POST, and DELETE) as needed.
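One way to check that the policy took effect is to send a request with an Origin header and look at the response headers. The sketch below assumes AWS_DEFAULT_REGION and BUCKET_NAME are exported as shown earlier and that the requested object is readable by the requester (for example, in a basic-public bucket); the function name cors_headers is hypothetical.

```shell
# Hypothetical check: print the Access-Control-* response headers that the
# bucket returns when a given origin requests a given object.
cors_headers() {  # usage: cors_headers ORIGIN KEY
  curl -s -D - -o /dev/null -H "Origin: $1" \
    "https://s3-${AWS_DEFAULT_REGION}.amazonaws.com/${BUCKET_NAME}/$2" |
    grep -i '^access-control-'
}

# Example usage:
#   cors_headers https://otherapp.hostname.gov mys3file
# To read back the stored policy itself:
#   aws s3api get-bucket-cors --bucket "${BUCKET_NAME}"
```

If the function prints nothing, the origin did not match any rule in the policy.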

Backup and retention

S3 data stays in the bucket until you delete it. You must empty the bucket before you can run cf delete-service on the <APP_NAME>-s3 service. This safeguard is not a substitute for backups, however, and it provides no protection against users accidentally deleting the contents.

You can implement a backup scheme by copying your bucket contents under a dated prefix such as /data/year/month/day and keeping multiple copies in S3, using the AWS CLI. Your Org Manager can create a space called “backups”, and your team can repeat the instance-creation steps above in that space to create backup buckets.
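A dated backup copy along these lines might look like the sketch below. The function names are hypothetical, and the sketch assumes the exported AWS credentials can read the source bucket and write to the backup bucket.

```shell
# Hypothetical helper: turn YYYY-MM-DD (default: today, UTC) into data/YYYY/MM/DD.
backup_prefix() {  # usage: backup_prefix [YYYY-MM-DD]
  printf 'data/%s\n' "$(printf '%s' "${1:-$(date -u +%Y-%m-%d)}" | sed 's|-|/|g')"
}

# Hypothetical helper: copy the source bucket into a dated prefix of the backup bucket.
backup_bucket() {  # usage: backup_bucket SOURCE_BUCKET BACKUP_BUCKET [YYYY-MM-DD]
  aws s3 sync "s3://$1" "s3://$2/$(backup_prefix "${3:-}")/"
}

# Example usage:
#   backup_bucket "${BUCKET_NAME}" my-backup-bucket
```

Because each run writes to a new prefix, earlier copies are left untouched; removing old prefixes is a separate housekeeping step.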

Rotating credentials

The S3 service creates unique IAM credentials for each application binding or service key. To rotate credentials associated with an application binding, unbind and rebind the service instance to the application. To rotate credentials associated with a service key, delete and recreate the service key.
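Both rotation procedures can be scripted. The following is a sketch with hypothetical function names; it uses only the cf subcommands already shown in this document, plus cf unbind-service and the -f flag on cf delete-service-key.

```shell
# Hypothetical helper: replace the credentials behind an application binding.
rotate_binding() {  # usage: rotate_binding APP_NAME SERVICE_INSTANCE_NAME
  cf unbind-service "$1" "$2" &&
    cf bind-service "$1" "$2" &&
    cf restage "$1"
}

# Hypothetical helper: replace the credentials behind a service key.
rotate_service_key() {  # usage: rotate_service_key SERVICE_INSTANCE_NAME KEY_NAME
  cf delete-service-key "$1" "$2" -f &&
    cf create-service-key "$1" "$2"
}
```

A binding that was created with the additional_instances option would need the same -c argument when rebinding, which this sketch leaves out. External clients also need the new values from cf service-key after their key is recreated.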

The Broker in GitHub

You can find the broker here: https://github.com/cloudfoundry-community/s3-broker.