How To Jeremy

Some thoughts on learning how to Jeremy

Building a static website with AWS

Everything old is new again.

In the early days of the web, you just had a web server serving out static content, but that seemed pretty lame.

So web frameworks became the hot thing. This flipped the script: now every page was rendered on request. This probably went a bit too far in the opposite direction.

Do you really need a server to render every page on every request?

When you try to scale you realize you need caching. So you end up going back and putting caching around most pages anyway.

Now we are moving back to that static first kind of world.

I think it is a good world: one where you create static, cacheable pages first and add dynamic pages second. A world where you start with the constraint of simple static pages and then build out a backend API to support JavaScript calls.

You can still handle SEO by building those pages on a regular basis.

So I wanted a way to build a basic static website in AWS that can serve as this blog.

The second thing I wanted was a simple template that I could use to spin up any number of new sites.

Goals:

  • Secure (HTTPS)
  • Cacheable
  • Serverless
  • Template for creation/deploy

Infrastructure as code

To reach the goal of having a base template to create many sites I am going with a tool called Terraform.

Terraform is part of a really cool movement to codify all the things including infrastructure.

Often infrastructure is set up using a UI or some command-line tools and then hopefully forgotten. Small changes are only known to the people that made them.

More and more the infrastructure is the app. The infrastructure needs to change at a faster pace than before. We can't just set up boxes and forget them.

Using a UI or command-line tools, it is not easy to track changes or roll them back.

Another perennial problem is that staging and prod are always different. Things work in staging and die in prod because the infrastructure is not the same, and due to the difficulty of keeping them in sync they easily drift apart.

Also, let's say you want a disaster recovery region in AWS. If us-east-1 takes a hit, you can fail over to another region, and with Terraform it is easy to set up an exact duplicate copy. There are also times when you want to move infrastructure across regions, maybe to save costs by putting compute close to the data source.

Terraform makes all of this possible. You can track changes using git, you can easily roll back, and by just using aliasing you can have the exact same config for prod and staging.
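
As a rough sketch of that idea (the module path and variable name here are made up for illustration, not part of the code we will write below), the same Terraform module could be instantiated once per environment:

#Hypothetical sketch: one shared module, instantiated per environment.
#"./modules/site" and "base_name" are made-up names for illustration.
module "prod_site" {
  source    = "./modules/site"
  base_name = "superawesomewebsite"
}

module "staging_site" {
  source    = "./modules/site"
  base_name = "staging-superawesomewebsite"
}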

Terraform works by using the cloud providers' APIs to manage the infrastructure. It works against all the major cloud providers, including AWS.

AWS does have its own version of this, called CloudFormation. The downside to CloudFormation is that it is only for AWS, so if you need to manage multiple cloud providers then Terraform is a better solution.

My other complaint about CloudFormation is that it is JSON based. JSON is not bad, but editing large JSON files is not always fun. You could use YAML on top of it, but I still prefer Terraform's approach.

How Terraform works

At a high level, Terraform reads one or more configuration files. From these files it determines the correct cloud APIs to call.

Terraform keeps a local state file of what it thinks the current state of the infrastructure is.

This allows it to keep track of changes that may occur outside of Terraform. This is great when someone goes into the UI and updates something: Terraform will know that something is not right and flag the difference.

End to end overview

When a user enters the domain name, DNS is resolved by the AWS DNS servers using the domain configuration in Route53, which in turn returns the CloudFront endpoint.

CloudFront is a Content Delivery Network, or CDN. Basically, AWS maintains a large number of servers all over the world, and each server keeps a cached version of the pages of the website. When a user resolves the CloudFront DNS endpoint, they are routed to an edge server that is close to them. So not only can we use CloudFront to handle the HTTPS/SSL portion, we get low latency as well.

CloudFront has caching rules, but if the cache expires or there is a cache miss it will load the request from S3.

S3 is an object store. In some ways you can think of it as a filesystem, but in reality it is more a place where you can put files under a key name. So a file is in reality a key called "css/style.css", not a folder called "css" with a file "style.css" underneath it. The UI and command-line tools let you view things "under" a folder, which may make this seem like a weird distinction, but S3's "folders" are virtual compared to a true filesystem.

So the S3 bucket just holds all the static files, organized pretty much any way you like.
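
For example, using the AWS CLI (the bucket name here is a placeholder), both of these operate on keys, not real directories:

#"my-bucket" is a placeholder bucket name
#this creates a key named "css/style.css", not a "css" folder
aws s3 cp style.css s3://my-bucket/css/style.css

#this lists every key that shares the "css/" prefix
aws s3 ls s3://my-bucket/css/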

Browser -> Route53 (DNS) -> CloudFront CDN -> S3 Bucket -> Static Html/JS/CSS

Prerequisites

  • This tutorial assumes that you have an AWS account: AWS
  • That you bought a domain for your account using Route53: Route53

Also, keep in mind that if you start getting traffic to your site, AWS will start to bill you. If you just created your AWS account, many services have a free tier. CloudFront will give you 50GB of data transfer and 2 million requests per month for free for the first 12 months.

So look through the pricing guides to get a sense of your monthly charges. If the site is small, your bill will be pretty small. Also, the nice thing about serverless is you only pay for usage: if nothing happens, no charge.

Setting up Terraform

We need to install Terraform and then set up an AWS IAM account that it can use.

If you are on a Mac and have Homebrew, then it is super easy.

brew install terraform

If not, you can look at Terraform install

IAM

AWS has a service for Identity and Access Management or IAM. IAM allows you to manage users and groups within your AWS account. Just like you don't want to run as root on your operating system, you don't want to run things under your main AWS account user.

You want to create new users, groups, or roles and then give them only the bare minimum of access rights.

I included an AWS doc at the bottom, but this is pretty straightforward once you go to the IAM service tab.

You create a new user, select the checkbox for "Programmatic access", and hit next. On the next screen it will give you the option to "Attach existing policies directly"; select that and add the following policies.

  • AWSCertificateManagerFullAccess
  • CloudFrontFullAccess
  • AmazonRoute53FullAccess
  • AmazonS3FullAccess

These permissions are for all the different AWS services we are going to configure through Terraform.

On the last page of the setup, the AWS UI will show an "Access key ID" and "Secret access key". Save those somewhere safe for the moment.
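
If you prefer scripting this over clicking through the console, the same setup can be done with the AWS CLI. This is just a sketch: the user name "terraform" is an example, and you would repeat attach-user-policy for each of the four policies above.

aws iam create-user --user-name terraform
aws iam attach-user-policy --user-name terraform \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
#prints the "Access key ID" and "Secret access key"
aws iam create-access-key --user-name terraform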

Setting up local permissions

Terraform works by invoking AWS's APIs to create all the entities in AWS. To do that, the Terraform application running locally on your machine needs a way to authenticate.

There are many ways to do this, but the easiest is to use the access keys you saved before.

We can do that by updating two files (these paths will vary if you are not on Mac/Linux):

  • ~/.aws/credentials
  • ~/.aws/config
For the credentials file, this is where you put the access keys:

[default]
aws_access_key_id=
aws_secret_access_key=

And the config file holds the region and output format:

[default]
region=us-east-1
output=json

The "default" in the brackets is the name of the profile. You can have as many of them as you want and you could change that to anything. They just have to match between files.

Terraform Code

Ok, finally here we go with actual code.

The first thing we need to do is start our Terraform configuration file.

So I would create a new directory with a new file called: main.tf

Once you have that, take the code below, paste into main.tf and save the file.

There are a few sections that you need to fill in with your personal info:

  • The name of the profile you created in the .aws/config file
  • Your AWS Account number which you can find under Account
  • Update base_name with just your domain name - superawesomewebsite
  • If not using .com update base_with_suffix after the . to your domain suffix - org

This code doesn't create or do anything yet in AWS.

This code just tells Terraform which credentials to use with the AWS API and what region to create things in. After these updates you shouldn't have to make any more changes to the code, just copy/paste.

  
provider "aws" {
  region  = "us-east-1" #can change this if you want a different region
  profile = "enter your .aws/config profile here"

  allowed_account_ids = ["enter your aws account id"]
}

provider "aws" {
  alias   = "acm"
  region  = "us-east-1" #this needs to be us-east-1, do not change
  profile = "enter your .aws/config profile here"

  allowed_account_ids = ["enter your aws account id"]
}

locals {
  base_name        = "your website name"
  base_with_suffix = "${local.base_name}.com" #if not .com change to your suffix
  s3_origin_id     = "${local.base_name}_s3_origin"
}
  

Once you have that saved and updated those fields, in the shell run

terraform init

This only needs to be run once. It sets up Terraform, downloads the AWS provider, and initializes its state tracking.

Setting up the buckets

We are going to create two buckets. One is the main bucket where we store all our html/js/css.

The other just serves to redirect requests to www to our main bucket.

This is a bit goofy but without this, if the user types in the domain without www then the lookup will fail.

Having an extra bucket is kind of annoying and there may be a better way to handle this, but I got bored and this is the solution I found, based on the Stack Overflow post linked below.

  
resource "aws_s3_bucket" "content_s3_bucket" {
  bucket = "www.${local.base_with_suffix}"
  acl    = "public-read"

  policy = <<POLICY
{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::www.${local.base_with_suffix}/*"
      }]
  }
  
POLICY

}

#This bucket just serves to redirect the short domain name to the full www.
resource "aws_s3_bucket" "redirect_s3_bucket" {
  bucket = local.base_with_suffix
  acl    = "public-read"

  website {
    redirect_all_requests_to = "https://www.${local.base_with_suffix}"
  }
}
      

Add this code, save, and then in the shell run

terraform plan

Terraform plan looks at your configuration file(s), compares that to its state files, and then talks to AWS to figure out what, if anything, is different.

It then shows you what it would do if you applied those changes.

Once you are happy with what Terraform will do, then you can run

terraform apply

This is the pattern we are going to apply as we go. Add some code, plan, apply and then verify in AWS.

You can verify by going to S3; there should now be two S3 buckets.
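
You can also check from the shell (using whatever profile name you picked); both buckets should show up in the listing:

aws s3 ls --profile default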

Creating the SSL Certificates

One of the awesome advantages of getting your domain through Route53 is that it is easy and free to get certificates.

If you have ever done this on your own it can be a huge pain.

With AWS it is pretty easy and we can do the whole thing through Terraform and not even have to hit the UI.

This code will make a request to AWS for a certificate for your site. We use a wildcard (*) so that, if needed, any subdomains are also covered by this certificate.

    
resource "aws_acm_certificate" "site_cert" {
  domain_name       = "*.${local.base_with_suffix}"
  validation_method = "DNS"

  tags = {
    Environment = "test"
  }

  lifecycle {
    create_before_destroy = true
  }
}

            

Validating the SSL Certificate

So for security reasons, you need to prove that you actually own the domain. That way you can't just generate certificates for facebook.com.

The easiest way to do this, especially since the domain was purchased through AWS, is to add a DNS record in Route53.

Then AWS can check that record and verify ownership. The Terraform code below will handle all of that.

Add this code, then plan, apply, and verify in Certificate Manager.

We are looking to see that the wildcard certificate's Status is "Issued". This means that AWS has generated and approved an SSL certificate for the site.


data "aws_route53_zone" "site_zone" {
  name = local.base_with_suffix
}

resource "aws_route53_record" "cert_validation" {
  zone_id = data.aws_route53_zone.site_zone.id

  name = aws_acm_certificate.site_cert.domain_validation_options[0].resource_record_name
  type = aws_acm_certificate.site_cert.domain_validation_options[0].resource_record_type

  records = [aws_acm_certificate.site_cert.domain_validation_options[0].resource_record_value]
  ttl     = 300
}

resource "aws_acm_certificate_validation" "default" {
  provider        = aws.acm
  certificate_arn = aws_acm_certificate.site_cert.arn
  validation_record_fqdns = [
    aws_route53_record.cert_validation.fqdn,
  ]
}
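
If you would rather check from the shell than the Certificate Manager UI, something like this should list the cert once it has been issued (note the cert lives in us-east-1):

aws acm list-certificates --region us-east-1 \
  --certificate-statuses ISSUED --profile default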


CloudFront

Ok, now we are going to set up the CDN. The Terraform code below will create the basics. In general, the CDN needs to know where to get the content that it will cache and serve out.

The code below will automatically hook to the S3 bucket we created before.

There may be a few things that you want to tweak, like price_class, which tells CloudFront which regions of edge nodes will cache your content. The class below is the cheapest.

The next is restrictions, where you can create either a blacklist or a whitelist of countries to accept requests from. The code below has a blacklist with only one country in it.

You will want to paste, run terraform plan, then apply, and grab a cup of coffee.

This will take 15-30 minutes to apply, as all the edge nodes around the globe need to be set up.

  
 resource "aws_cloudfront_distribution" "cloudfront_distribution" {
  origin {
    domain_name = "www.${local.base_with_suffix}.s3.amazonaws.com"
    origin_id   = local.s3_origin_id
  }

  enabled             = true
  is_ipv6_enabled     = true
  comment             = "Some comment"
  default_root_object = "index.html"

  aliases = ["www.${local.base_with_suffix}"]

  default_cache_behavior {
    allowed_methods  = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = local.s3_origin_id

    forwarded_values {
      query_string = false

      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl                = 0
    default_ttl            = 3600
    max_ttl                = 86400
  }

  price_class = "PriceClass_100" #lowest price class

  restrictions {
    geo_restriction {
      restriction_type = "blacklist"
      locations        = ["AF"]
    }
  }

  tags = {
    Environment = "production"
  }

  #wire the distribution to the ACM certificate created earlier
  viewer_certificate {
    acm_certificate_arn            = aws_acm_certificate.site_cert.arn
    cloudfront_default_certificate = false
    minimum_protocol_version       = "TLSv1.1_2016"
    ssl_support_method             = "sni-only"
  }
}
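
One thing to keep in mind once the distribution is up: with the TTLs above, edge nodes can hold a cached copy for up to a day. If you push a fix and don't want to wait, you can force the edges to refetch with an invalidation (the distribution id below is a placeholder; find yours in the CloudFront console or via aws cloudfront list-distributions):

aws cloudfront create-invalidation --distribution-id E1EXAMPLE123 --paths "/*"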


      

Route53

Ok, now that we have a CDN, when a user requests the domain name in the browser, the DNS entry needs to return the CloudFront CDN address we just created.

The Terraform code below will create DNS records pointing to the new CloudFront distribution.

  
data "aws_s3_bucket" "content_s3" {
  bucket = local.base_with_suffix
}

resource "aws_route53_record" "r53_www" {
  zone_id = data.aws_route53_zone.site_zone.zone_id
  name    = local.base_with_suffix
  type    = "A"

  alias {
    name    = data.aws_s3_bucket.content_s3.website_domain
    zone_id = data.aws_s3_bucket.content_s3.hosted_zone_id

    evaluate_target_health = true
  }
}

resource "aws_route53_record" "r53_www_cdn" {
  zone_id = data.aws_route53_zone.site_zone.zone_id
  name    = "www.${local.base_with_suffix}"
  type    = "A"

  #take the name from resource "aws_cloudfront_distribution"
  alias {
    name    = aws_cloudfront_distribution.cloudfront_distribution.domain_name
    zone_id = aws_cloudfront_distribution.cloudfront_distribution.hosted_zone_id

    evaluate_target_health = true
  }
}
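
After the apply, you can confirm that DNS resolves to CloudFront with dig (swap in your own domain):

#should return A records pointing at CloudFront edge IPs
dig +short www.superawesomewebsite.com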

  

Copying files to S3 using the AWS Command Line Interface

The last thing we need to deal with is uploading files to S3. This config will automatically point the root of the domain to a file called index.html.

So to get going, all you need is an HTML file called index.html in the root of the content S3 bucket. AWS will automatically return that content when a user goes to the root of the site.

While that is a good start, you are probably going to want more files than that, plus CSS and such.

That is mostly straightforward, except that we don't want the user to see URLs like superawesomewebsite.com/sweetblogpost.html.

We need a way to tell the browser that a file without the .html extension does in fact contain HTML. The way to do that is to set the content type, which is returned to the browser in the HTTP response, letting it know what to do with the data.

  • HTML files return a content type of text/html
  • CSS - text/css
  • JavaScript - text/javascript

So let's say that the end result would be something like this in your S3 bucket:

  • index.html
  • sweetblogpost
  • css/style.css

You could do this through the AWS S3 UI, but it is much easier using the AWS CLI; you can find instructions on how to install it: Here

  
    aws s3 cp sweetblogpost s3://your bucket/sweetblogpost --content-type text/html --profile 

    aws s3 cp css/style.css s3://your bucket/css/style.css  --content-type text/css --profile 
  

If you have a bunch of files, you can use the --recursive option to upload and set the content type for many files at the same time.
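
Since --content-type applies one type to everything it copies, one pattern (the paths and bucket name below are placeholders) is to do one recursive pass per file type using --exclude/--include filters:

#first pass: everything except CSS, served as HTML in this sketch
aws s3 cp ./site s3://my-bucket/ --recursive \
  --exclude "*.css" --content-type text/html --profile default

#second pass: just the CSS files
aws s3 cp ./site s3://my-bucket/ --recursive \
  --exclude "*" --include "*.css" --content-type text/css --profile default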

Templating

After you have a basic site, you will most likely start to hate editing a bunch of HTML, or more precisely, duplicating HTML with each blog post.

Most web frameworks have templating built in, but we can use templating without a whole framework like Django.

This could also be used with Terraform to extend its capabilities to be even more dynamic.
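
As a small, hypothetical example of that: Terraform's built-in templatefile function can render HTML from a template file, and the local_file resource can write the result to disk. The file names and the title variable here are made up for illustration.

#post.html.tmpl might contain: <h1>${title}</h1>
resource "local_file" "sweet_blog_post" {
  filename = "site/sweetblogpost"
  content = templatefile("${path.module}/post.html.tmpl", {
    title = "Sweet Blog Post"
  })
}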

There are a ton of templating libs, but the ones that I have looked at are:

Or you could plug in any web frontend framework like React.

Links

Setting up aws cli

Creating IAM user

Great Medium Post on Cert Validation

Solving the www issue

CloudFront Pricing