
I created a cluster, VPC, subnet and a Fargate service using the first-run wizard of ECS in the AWS console, uploaded the image to ECR, and deployed it successfully.

Now I need the service to access a remote database, so I need to add its IP to the firewall's whitelist. I allocated an Elastic IP, created a NAT Gateway and updated the route table following this tutorial.

I stopped the task and tried to run it again, but the new task could not pull the image from ECR and failed with the following error message:

CannotPullContainerError: Error response from daemon: Get https://account-id.dkr.ecr.sa-east-1.amazonaws.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

My setup:

  1. VPC with CIDR 10.0.0.0/16 (created automatically by the ECS wizard)
  2. Subnet with the following route table:

Destination     | Target
----------------|-------------
10.0.0.0/16     | local
0.0.0.0/0       | nat-<nat-id>

  3. NAT Gateway in the VPC and subnets created by the ECS wizard, using the Elastic IP I allocated.

  4. Currently, I'm allowing all traffic in both inbound and outbound rules:

Type | Protocol | Port range | Source    | Description - optional
-----|----------|------------|-----------|------------------------
All  | All      | All        | 0.0.0.0/0 | -

What am I missing? Is this the only way to accomplish what I want, or is there a simpler one? On Stack Overflow I found another way to associate an Elastic IP, using an Application Load Balancer or a Network Load Balancer. Is that a better approach?

Fouyer

2 Answers

1

The ECS wizard creates a VPC with two public subnets, 10.0.0.0/24 and 10.0.1.0/24. They share a single route table (RT) whose default route points to an internet gateway (IGW). From your question, however, it appears that you've modified that RT to use the NAT instead.

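Sadly, this will not work, as you've already experienced. A NAT gateway has to sit in a public subnet whose own default route goes to the IGW; once that route points at the NAT itself, nothing in the subnet can reach the internet, including ECR. To rectify the issue, you could create a third subnet (or more if you need HA). That subnet will be private, with no direct internet connection. Instead it will have a new RT that routes internet-bound traffic to the NAT. Your Fargate tasks would then be launched in the private subnet(s).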

The new RT of the new subnet(s) would be:

    Destination |   Target
----------------|-------------
10.0.0.0/16     | local
0.0.0.0/0       | nat-<nat-id>
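
If you prefer to script this instead of clicking through the console, here is a minimal sketch. It assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`); the function name `create_private_subnet` and the CIDR 10.0.2.0/24 are my own placeholders, so pick any free range inside 10.0.0.0/16.

```python
def create_private_subnet(ec2, vpc_id, nat_gateway_id, cidr_block="10.0.2.0/24"):
    """Create a private subnet whose default route goes through a NAT gateway.

    `ec2` is expected to be a boto3 EC2 client. The default CIDR is an
    assumption: any unused range inside the VPC's 10.0.0.0/16 would do.
    """
    subnet_id = ec2.create_subnet(VpcId=vpc_id, CidrBlock=cidr_block)["Subnet"]["SubnetId"]
    route_table_id = ec2.create_route_table(VpcId=vpc_id)["RouteTable"]["RouteTableId"]

    # The 10.0.0.0/16 -> local route is added by AWS automatically,
    # so only the default route through the NAT gateway is created here.
    ec2.create_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId=nat_gateway_id,
    )

    # Attach the new route table to the new subnet.
    ec2.associate_route_table(RouteTableId=route_table_id, SubnetId=subnet_id)
    return subnet_id, route_table_id
```

The returned subnet ID is the one you would then select for the Fargate service's network configuration.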

The RT of the two original public subnets should be changed back to route internet traffic to the IGW, as it did originally:

    Destination |   Target
----------------|-------------
10.0.0.0/16     | local
0.0.0.0/0       | IGW
Marcin
0

To explain what is happening: you are getting the CannotPullContainerError because the task has no route to the internet. By default, traffic to ECR goes over the internet.

Your Fargate service is running in a private subnet, which has no direct route to the internet. To get internet access, the route table of the subnet where the Fargate task runs must send internet-bound traffic to the NAT gateway (you have done this already). Therefore:

    Destination |   Target
----------------|-------------
10.0.0.0/16     | local
0.0.0.0/0       | Natgateway

The NAT gateway simply forwards the traffic to the internet gateway. It has to be deployed in a public subnet, that is, one whose route table reaches the internet via the internet gateway. Therefore the subnet where the NAT gateway is deployed should have the following routes:

    Destination |   Target
----------------|-------------
10.0.0.0/16     | local
0.0.0.0/0       | InternetGateway
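
To make the two-hop path concrete, here is a small, simplified model in plain Python (no AWS calls). The names `next_hop` and `reaches_internet`, the labels `"nat"` and `"igw"`, and the destination 203.0.113.10 (a documentation-only address) are all made up for this illustration; real VPC routing has more cases than this sketch covers.

```python
import ipaddress


def next_hop(route_table, destination_ip):
    """Return the target of the most specific route matching destination_ip."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # Longest prefix wins, as in a VPC route table.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


def reaches_internet(task_subnet, nat_subnet, route_tables, destination_ip="203.0.113.10"):
    """Follow task subnet -> NAT gateway -> internet gateway for one destination."""
    hop = next_hop(route_tables[task_subnet], destination_ip)
    if hop == "igw":
        return True
    if hop == "nat":
        # The NAT gateway forwards using the route table of the subnet it lives in.
        return next_hop(route_tables[nat_subnet], destination_ip) == "igw"
    return False


# Setup from the question: tasks and NAT share one subnet whose default route is the NAT.
broken = {"public": {"10.0.0.0/16": "local", "0.0.0.0/0": "nat"}}

# Setup described above: private subnet -> NAT, public subnet -> IGW.
fixed = {
    "public": {"10.0.0.0/16": "local", "0.0.0.0/0": "igw"},
    "private": {"10.0.0.0/16": "local", "0.0.0.0/0": "nat"},
}

print(reaches_internet("public", "public", broken))   # False
print(reaches_internet("private", "public", fixed))   # True
```

In the first case the NAT gateway's own subnet sends default traffic back to the NAT, so the request to ECR never leaves the VPC and eventually times out.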

Note: You can also reach ECR privately, without going through the internet, by creating VPC endpoints for ECR.
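
A rough sketch of what that could look like: the helper below only builds the keyword arguments you would pass to boto3's `ec2.create_vpc_endpoint`, one dict per endpoint. The helper name `ecr_endpoint_requests` is hypothetical. Because ECR stores image layers in S3, an S3 gateway endpoint is normally needed next to the two ECR interface endpoints; treat the exact set as something to verify against your platform version and logging setup.

```python
def ecr_endpoint_requests(region, vpc_id, subnet_ids, security_group_ids, route_table_ids):
    """Build keyword arguments for ec2.create_vpc_endpoint, one dict per endpoint."""
    # Interface endpoints for the ECR API and the Docker registry API.
    interface = [
        {
            "VpcEndpointType": "Interface",
            "VpcId": vpc_id,
            "ServiceName": f"com.amazonaws.{region}.{service}",
            "SubnetIds": subnet_ids,
            "SecurityGroupIds": security_group_ids,
            "PrivateDnsEnabled": True,
        }
        for service in ("ecr.api", "ecr.dkr")
    ]
    # Gateway endpoint for S3, where ECR keeps the image layers.
    gateway = {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": route_table_ids,
    }
    return interface + [gateway]


# Usage with a real client (IDs are placeholders):
#   ec2 = boto3.client("ec2", region_name="sa-east-1")
#   for kwargs in ecr_endpoint_requests("sa-east-1", "vpc-...", [...], [...], [...]):
#       ec2.create_vpc_endpoint(**kwargs)
```

This only covers the image pull. Traffic to the remote database would still leave through the NAT gateway and its Elastic IP.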

Arun K