Flowchart: How should I run containers on AWS?

As of Late 2021, which AWS service should I use to run my new containerized app in production?

Reading time: a couple minutes

Writing blog posts is hard, so I made a flowchart!

I only wanted to post this on Twitter, but the accessibility text could not fit. This blog post is for folks with vision impairment or low vision. Accessibility matters!

Warning

App Runner now supports connecting to services inside a VPC, so this chart is now officially outdated!

An updated flowchart is coming as part of a larger project, but it'll be a while.

The flowchart

Flowchart titled “As of Late 2021, which AWS service should I use to run my new containerized app in production?”

Warning, to the left of the flowchart: “In this flowchart, I am talking about production-level services that are run by teams! Yes, you can run an EKS cluster with 3 nodes for your homelab quite easily. That is completely different from a team of engineers running production services! Folks need to upskill, they need to understand, use, and then they need to maintain these services for the company’s lifetime.”


  1. Q: “Do you have more than 250 engineers?”

    1. If “Yes”, then “Go away, this is not for you”
    2. If “No”, then go to step 2
  2. Q: “Is spending at least $1M per year for tech buzzword bingo worth it?”

    1. If “Yes”, then “EKS”, and end
    2. If “WTF, no !?!”, then go to step 3
  3. Q: “Do you use anything in a VPC?”

    1. If “No”, then go to step 4
    2. If “Yes, of course”, then go to step 5
  4. Q: “Are your apps tiny and horizontally scalable?”

    1. If “Yes”, then “AppRunner”, and end.
    2. If “They're larger than 2vCPU and 4GB”, then go to step 5
  5. Q: “Did you do any dirty hacks to get around Lambda's 250MB limit?”

    1. If “OMG, it was so annoying”, then “Lambda Containers”, and end
    2. If “Huh, what limit?”, then go to step 6
  6. Q: “Do you only get, say, 25 requests per day and can each request be served in less than 15 minutes?”

    1. If “Yes, ±10 requests”, then “Lambda Containers”, and end
    2. If “No”, then go to step 7
  7. Q: “Do you need any of Lambda's features? (spike handling, event integrations)”

    1. If “Yes”, then “Lambda Containers”, and end
    2. If “No”, then “ECS on Fargate” which appears as an end state
  8. After step 7, “But I need … more CPU, RAM, storage, GPUs”, then “re:Invent is coming soon and there’s an open containers roadmap on GitHub ;) Check that out!”, then “ECS on Fargate + ECS on EC2 for that 1 exotic service”, then end.

  9. After step 7, “But I need … software X that is only running on Kubernetes”, then “Valid, k8s has awesome software”, then “ECS on Fargate + EKS for that software (and related apps)”, then end.

  10. After step 7, “But I need … to replace and run my own equivalent of an AWS Service (Load Balancer, CloudWatch)”, then “Oh, you sweet summer child”, then “EKS” which is an end state, then “But I need custom flag X”, then “Kubernetes on EC2”, then end.
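For folks who would rather read code than a chart, steps 1 to 7 can be sketched as a chain of early returns. This is a rough reading of the flowchart, not an official tool: the `Workload` class, its field names, and `choose_service` are all made up for illustration, and the cutoff of 35 requests per day is just one interpretation of “25 requests per day” combined with the “±10 requests” answer.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Answers to the flowchart questions. All field names are hypothetical."""
    engineers: int                         # Q1
    buzzword_bingo_worth_1m: bool          # Q2
    uses_vpc: bool                         # Q3
    tiny_and_horizontally_scalable: bool   # Q4 (at most 2 vCPU and 4 GB)
    hacked_around_lambda_250mb: bool       # Q5
    requests_per_day: int                  # Q6
    max_request_minutes: float             # Q6
    needs_lambda_features: bool            # Q7


def choose_service(w: Workload) -> str:
    """Walk steps 1 to 7 in order and return the first end state reached."""
    if w.engineers > 250:
        return "Go away, this is not for you"
    if w.buzzword_bingo_worth_1m:
        return "EKS"
    # Q3 "No" leads to Q4; Q3 "Yes" skips straight to Q5.
    if not w.uses_vpc and w.tiny_and_horizontally_scalable:
        return "AppRunner"
    if w.hacked_around_lambda_250mb:
        return "Lambda Containers"
    # Assumption: "25 requests per day, ±10" is read as "at most 35".
    if w.requests_per_day <= 35 and w.max_request_minutes < 15:
        return "Lambda Containers"
    if w.needs_lambda_features:
        return "Lambda Containers"
    return "ECS on Fargate"
```

A small team with a busy web app that talks to a database in a VPC falls all the way through to the last line, which matches the chart's default end state of “ECS on Fargate”.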


The flowchart, with comments

Flowchart titled “As of Late 2021, which AWS service should I use to run my new containerized app in production?”, with comments for some steps.

  1. Comment for the title: “Until 2017-ish, containers were rather unstable. From 2017-ish to 2019-ish, the best option was Kubernetes using kops. From 2019-ish to 2020-ish, the best option was Kubernetes using EKS. From 2020-ish to 2021-ish, the best option was and is ECS on Fargate. Maaaybe, from 2023-ish, AppRunner will become the best option. We’ll see.”

Warning, to the left of the flowchart: “In this flowchart, I am talking about production-level services that are run by teams! Yes, you can run an EKS cluster with 3 nodes for your homelab quite easily. That is completely different from a team of engineers running production services! Folks need to upskill, they need to understand, use, and then they need to maintain these services for the company’s lifetime.”

  1. Comment to the warning: “I am an Independent Consultant, so I am not selling a product. Please spare me the bullshit lies folks use to sell shit, like ‘k8s makes it easy to switch clouds’ or the ‘vendor agnostic’ nonsense.”

  1. Q: “Do you have more than 250 engineers?”

    1. If “Yes”, then “Go away, this is not for you”.
      1. Comment: “If you have more than 250 engineers, you likely already have a Platform Team or a ‘golden path’ for running containers. Use that! If you have 250 people that know X and spend all day using X, then you should use X! I admit, I chose the 250 number somewhat arbitrarily: I stole it from Docker. If you have more than 250 engineers, you have to buy a paid subscription to use Docker. At this stage, ‘you’ should not necessarily be making this decision. At this level, there are a lot more things that influence the decision, and most of them are not technical.”
    2. If “No”, then go to step 2
  2. Q: “Is spending at least $1M per year for tech buzzword bingo worth it?”.

    1. Comment: “Look, I worked with companies where ‘we run on Kubernetes’ brought in more money than it cost to run: easier recruiting, easier raising of funds, and so on. Did that lead to success? Meh. I am not saying I agree with it, but it’s a thing. And yes, running Kubernetes will cost you at least one million US dollars every single year. No, I am not exaggerating.”
    2. If “Yes”, then “EKS”, and end
    3. If “WTF, no !?!”, then go to step 3
  3. Q: “Do you use anything in a VPC?”

    1. If “No”, then go to step 4
    2. If “Yes, of course”, then go to step 5
  4. Q: “Are your apps tiny and horizontally scalable?”

    1. If “Yes”, then “AppRunner”, and end.
      1. Comment: “Yeah, it’s a preview-level service right now, but it’s on the right path to be awesome! It may not totally take over ECS on Fargate, but it will take over most common use cases.”
    2. If “They're larger than 2vCPU and 4GB”, then go to step 5
  5. Q: “Did you do any dirty hacks to get around Lambda's 250MB limit?”

    1. If “OMG, it was so annoying”, then “Lambda Containers”, and end.
      1. Comment: “Like EFS, it’s a really really really specific thing for a very specific use case. Abusing it will lead to pain!”
    2. If “Huh, what limit?”, then go to step 6
  6. Q: “Do you only get, say, 25 requests per day and can each request be served in less than 15 minutes?”

    1. If “Yes, ±10 requests”, then “Lambda Containers”, and end
    2. If “No”, then go to step 7
  7. Q: “Do you need any of Lambda's features? (spike handling, event integrations)”

    1. If “Yes”, then “Lambda Containers”, and end
    2. If “No”, then “ECS on Fargate” which appears as an end state.
  8. After step 7, “But I need … more CPU, RAM, storage, GPUs”, then “re:Invent is coming soon and there’s an open containers roadmap on GitHub ;) Check that out!”, then “ECS on Fargate + ECS on EC2 for that 1 exotic service”, then end.

    1. Comment: “Only valid reason to run anything new on ECS on EC2, which is in maintenance-ish mode (AWS announced they’re focusing on ECS on Fargate since they’re heavily using it internally).”
  9. After step 7, “But I need … software X that is only running on Kubernetes”, then “Valid, k8s has awesome software”, then “ECS on Fargate + EKS for that software (and related apps)”, then end.

    1. Comment: “If 5 big companies already collaborate on X software, do you get any value from re-writing and maintaining that internally? Kubernetes really allows you to take advantage of shocking amounts of software.”
  10. After step 7, “But I need … to replace and run my own equivalent of an AWS Service (Load Balancer, CloudWatch)”, then “Oh, you sweet summer child”, then “EKS” which is an end state, then “But I need custom flag X”, then “Kubernetes on EC2”, then end.

    1. Comment: “Be realistic about your technical powers and the alleged business benefits! By running Traefik or nginx-ingress you are saying ‘I can run my own Load Balancer better than AWS can run ALBs/NLBs/ELBs’. By running a Prometheus Push Gateway + Prometheus + metrics-server scaling pipeline you get awesome power, but you are also saying ‘I can run my own CloudWatch and AWS AutoScaling and I can do a better job than AWS’. By running cert-manager you are saying ‘My business gets value from me generating and managing my own SSL certificates’. For some folks the above statements are true. Sometimes AWS is not enough. For most people though? The above statements are not true, no matter how much they pretend otherwise.”
    2. Second comment: “When should you use Kubernetes? You’ll know. Examples, other than awesome software: need to scale on multiple metrics/complex logic that CloudWatch can’t do; need to scale faster, at massive numbers; need to run at scale, with large or massive numbers (multiple k8s clusters are debatably ‘easier’ than immutable infrastructure EC2s); need to optimize cost aggressively, at scale. Kubernetes was like an F1 car: you needed a whole team to even start the car. Starting the car got easier, and now everybody thinks they can race an F1 car and they’re telling high-schoolers to demand F1 cars. Also, please don’t use an F1 car to go to the grocery store. A sports car might work, but an SUV or a bike might be better. Or walking, you know? It’s healthy.”
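
The three “But I need …” branches after step 7 (steps 8 to 10) boil down to a small lookup on top of the “ECS on Fargate” default. Here is a hedged Python sketch of that lookup. The `Need` enum, the `escalate` function, and the `needs_custom_flag` parameter are hypothetical names, and the flowchart does not say what to do when several needs apply at once, so this version only handles one need at a time.

```python
from enum import Enum
from typing import Optional


class Need(Enum):
    """The three 'But I need ...' branches; member names are hypothetical."""
    MORE_RESOURCES = "more CPU, RAM, storage, GPUs"
    K8S_ONLY_SOFTWARE = "software X that is only running on Kubernetes"
    OWN_AWS_EQUIVALENT = "my own equivalent of an AWS service"


def escalate(need: Optional[Need], needs_custom_flag: bool = False) -> str:
    """Map a single extra need onto the end states of steps 8 to 10."""
    if need is None:
        # No extra need: stay on the default end state of step 7.
        return "ECS on Fargate"
    if need is Need.MORE_RESOURCES:
        return "ECS on Fargate + ECS on EC2 for that 1 exotic service"
    if need is Need.K8S_ONLY_SOFTWARE:
        return "ECS on Fargate + EKS for that software (and related apps)"
    # Need.OWN_AWS_EQUIVALENT: EKS, unless "custom flag X" is also required.
    return "Kubernetes on EC2" if needs_custom_flag else "EKS"
```

Note that in the first two branches Fargate stays the default and only the exotic service or the Kubernetes-only software moves elsewhere.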