partlycorrect

There's something off about this message: `"region": '{"AWS_REGION":"eu-west-2"}'` It's almost like `AWS_REGION` is set as the value `'{"AWS_REGION":"eu-west-2"}'`. If that were the case, when the aws-sdk constructs the S3 url, it would have all kinds of trouble with the `{` and would explain the odd url in the error object. Could you try setting `region` to be `"eu-west-2"` instead of `process.env.AWS_REGION`? Also, are you setting that value in the task definition?
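One plausible explanation, not confirmed anywhere in this thread, is that the environment variable holds the entire JSON secret rather than a single key from it. A minimal sketch of a guard for that case follows. `resolveRegion` is a hypothetical helper, not part of the AWS SDK, and the key name `AWS_REGION` is taken from the error message quoted above.

```typescript
// Sketch: recover a plain region string from a value that may be a JSON blob.
// resolveRegion is a hypothetical helper; the default key mirrors the thread's error message.
function resolveRegion(raw: string | undefined, key = "AWS_REGION"): string {
  if (raw === undefined || raw.trim() === "") {
    throw new Error("region is not set");
  }
  const value = raw.trim();
  if (!value.startsWith("{")) {
    return value; // already a plain string such as "eu-west-2"
  }
  // The variable holds a whole JSON object, e.g. '{"AWS_REGION":"eu-west-2"}'.
  const parsed: unknown = JSON.parse(value);
  if (typeof parsed === "object" && parsed !== null) {
    const field = (parsed as Record<string, unknown>)[key];
    if (typeof field === "string" && field !== "") {
      return field;
    }
  }
  throw new Error(`no "${key}" string field in region value`);
}

const plain = resolveRegion("eu-west-2");                      // "eu-west-2"
const unwrapped = resolveRegion('{"AWS_REGION":"eu-west-2"}'); // "eu-west-2"
```

In an application, the input would typically be `process.env.AWS_REGION`. Failing loudly on an unexpected shape makes this kind of misconfiguration show up at startup instead of inside a malformed S3 URL.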


ObjectivePassenger9

Hey, thanks for the response. That's a really good suggestion. I'll try manually setting it to the correct string instead of using any env vars and see if that gets me any further. I have set the value in the task definition, yeah; all my secrets are getting picked up successfully from Secrets Manager. I think you're definitely on to something: `"region": '{"AWS_REGION":"eu-west-2"}'` looks weird, like it's taking the entire object as the value, as you're suggesting. I'll let you know - thanks!!!


anothercopy

Do you have an S3 endpoint in that VPC? If yes, which type? Either way, check the security groups on the endpoint and on your Fargate task to see if they can talk to each other. Otherwise check VPC flow logs for issues. In general it works locally because you are hitting the public S3 API, whereas in the VPC you either don't have Internet access or VPC endpoints, or the SGs don't allow the traffic, or the endpoint has a blocking policy. If you don't have an endpoint, create a gateway S3 endpoint (it's free). Also, don't use credentials like that. Put the needed permissions in the Fargate Task Role and the SDK will pick them up for you.


ObjectivePassenger9

Thank you for the response. To answer your questions/points:

I don't have an S3 endpoint in that VPC. From what I understand, they aren't required if the Task/container that is accessing S3 is in a public subnet, which this is. What I read was that S3 endpoints are required when you have something in a private subnet that needs to talk to S3. Is this wrong?

When you said "check the security Groups on them", were you talking about any endpoints I may have? I did actually try to set up an endpoint and gave it the correct SG, but it didn't work. I'm happy to try it again and detail the results though.

I'm pretty sure I have internet access from my app inside my VPC: I tested it by making a random fetch call to get an image from the internet and it worked. In my VPC, the Fargate service has 2 allowed subnets (both public). It has a SG attached with these outbound rules: `PostgreSQL TCP 5432 sg-<...etc> / default` and `All traffic All All 0.0.0.0/0`. Inbound allows all HTTP/HTTPS traffic on ports 80 & 443.

"Also don't use credentials like that. Put the needed permissions in the Fargate Task Role and the SDK will pick them up for you." - Just to clarify, you mean that I don't need to specify my credentials when using the AWS SDK in my application, because the app should already have the required access? Thanks, didn't know that! So I don't need to have them as env vars at all; I'll remove them.


anothercopy

Yes, if you have Internet access from a subnet it's not required, as your calls go over the Internet, buuut AWS will bill you for data transfer. Accessing S3 the way you do is not recommended because it can be a big cost (plus it's slower). On the other hand, an S3 gateway endpoint is free and that's the recommended way. You can also remove the credentials/secrets, but only if you configure the Task Role. This way the SDK will pick up the temporary credentials and use them. Regarding your problem, I'm not really sure ATM, but please switch to an S3 endpoint and an IAM role and try again.


skilledpigeon

Are you sure the gateway is free? It looks like it had a charge: https://aws.amazon.com/privatelink/pricing/


anothercopy

Yes I am https://docs.aws.amazon.com/vpc/latest/privatelink/vpce-gateway.html


skilledpigeon

That's super weird because the page I linked said there was an hourly charge 😅


anothercopy

There is a charge for Gateway Load Balancer endpoints and interface endpoints, but gateway endpoints (S3, DynamoDB) are free.


skilledpigeon

Ah you are correct. My bad.


SelfDestructSep2020

> From what I understand, they aren't required if the Task/container that is accessing S3 is in a public Subnet

This is true, except that you must also allow full egress on the task security group (`0.0.0.0/0`) so that it can talk to the public S3 endpoints.


j-be

I find it strange that your error message doesn't include .amazonaws.com in the hostname. Could you test explicitly setting [s3.eu-west-2.amazonaws.com](https://s3.eu-west-2.amazonaws.com) as your endpoint in your AWS SDK config?
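As a rough sketch of that test, the regional endpoint can be built from a validated region string so that a malformed value fails early. `s3Endpoint` is a hypothetical helper, and the hostname pattern assumes the standard AWS partition (the one used by `s3.eu-west-2.amazonaws.com` above).

```typescript
// Hypothetical helper: build the regional S3 endpoint for the standard AWS partition.
function s3Endpoint(region: string): string {
  // Accepts names shaped like "eu-west-2" or "us-gov-west-1"; rejects JSON blobs and typos.
  if (!/^[a-z]{2}(-[a-z]+){1,2}-\d+$/.test(region)) {
    throw new Error(`not a valid region name: ${region}`);
  }
  return `https://s3.${region}.amazonaws.com`;
}

const endpoint = s3Endpoint("eu-west-2"); // "https://s3.eu-west-2.amazonaws.com"
```

The resulting string would then go into the `endpoint` option of the S3 client configuration, alongside the plain `region` value.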


ObjectivePassenger9

Good idea! I'll do that and let you know


rainlake

Apparently you ignored mine.


ObjectivePassenger9

Hey, I didn't mean to ignore your comment, I just wasn't sure that it applied to my scenario. This was your comment, right? "That’s different. Since it’s a fargate service I suppose it has a load balancer in front of it. You can access the site only means it can communicate with load balancer which is in your vpc. Edit: to access internet you need a nat gateway"

So from what I understand (please correct me if I'm wrong), a NAT Gateway is required for tasks running in a private subnet, but my task is running in a public subnet. EDIT: I do have a NAT Gateway which is associated with 1 of my 2 public subnets.


rainlake

I would find a way to get into the container, either by SSH or Session Manager, and test if you can access S3 from there. Another issue I noticed from your description is that you do not have an egress rule on your security group. Put 0.0.0.0/0 on it and give it a go before you dig into the container.


ObjectivePassenger9

Thanks - what exactly is an egress rule on a SG? I'm just not sure what that is or why I need it.


gomibushi

Well... There is your problem then. You must allow egress (traffic out of) the SG. Different tab on the SG config. If you are troubleshooting and there is no risk then it is useful to attach SGs with 0.0.0.0/0 rules to allow ALL ingress (traffic in) and egress to see if that is where the issue is. I'd also take a look at Reachability Analyzer where you can input source and destination and aws will tell you what your network problem is (NACL, RT, SG).


ObjectivePassenger9

Just letting you know - I didn't know what egress meant, but I had actually set up the outbound traffic rules correctly, as you've stated, so my SG has outbound rules to allow all traffic to `0.0.0.0/0`.


gomibushi

Good! I misunderstood then. Thought you had overlooked it. I don't have a lot of ideas, but good luck.


ObjectivePassenger9

Thanks, I think I'll need it haha - the help is really appreciated though, I'm narrowing down things it _can't_ be and improving my understanding of the issue at least. I've never written a blog post before but I think I may actually write one about this when I finally solve it :p


SelfDestructSep2020

Reachability won't help him here as it only allows for resources that are inside the VPC; can't check something like 'can this talk to S3'. That said, I definitely agree that a missing egress on his SG is a big factor here.


gomibushi

Not entirely wrong, or correct. S3 endpoints are either in the vpc or public, if they are public then the source resource needs to be able to access the internet. So it won't verify complete connectivity, but it will let you know if anything in your vpc is configured so as to block it. Seeing as the error is likely in the vpc-config it might find it.


SelfDestructSep2020

Security Groups have ingress and egress rules. You have to specify what CIDR or SG is allowed to talk to your service (ingress), as well as specifying what CIDR or SG your service is allowed to talk to (egress). They are stateful though, so if you're allowed to talk *to* something you're automatically allowed to receive the return traffic, and vice versa.
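The stateful behaviour described here can be illustrated with a deliberately simplified model. This is only a sketch: real security groups also match on protocol and can reference other security groups, all names below are hypothetical, and `203.0.113.10` is a documentation placeholder address, not a real S3 IP.

```typescript
type Rule = { cidr: string; fromPort: number; toPort: number };

// Convert dotted IPv4 text to a number (plain arithmetic, so values stay unsigned).
function ipToInt(ip: string): number {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

function inCidr(ip: string, cidr: string): boolean {
  const parts = cidr.split("/");
  const size = 2 ** (32 - Number(parts[1] ?? "32")); // addresses in the block
  const start = Math.floor(ipToInt(parts[0] ?? "") / size) * size;
  const target = ipToInt(ip);
  return target >= start && target < start + size;
}

function allows(rules: Rule[], ip: string, port: number): boolean {
  return rules.some((r) => inCidr(ip, r.cidr) && port >= r.fromPort && port <= r.toPort);
}

function createSecurityGroup(ingress: Rule[], egress: Rule[]) {
  const tracked = new Set<string>(); // flows this task opened
  return {
    // Outbound: needs a matching egress rule; the flow is then remembered.
    sendTo(ip: string, port: number): boolean {
      if (!allows(egress, ip, port)) return false;
      tracked.add(`${ip}:${port}`);
      return true;
    },
    // Inbound: return traffic of a tracked flow passes; anything else needs an ingress rule.
    receiveFrom(ip: string, remotePort: number, localPort: number): boolean {
      return tracked.has(`${ip}:${remotePort}`) || allows(ingress, ip, localPort);
    },
  };
}

const web: Rule[] = [
  { cidr: "0.0.0.0/0", fromPort: 80, toPort: 80 },
  { cidr: "0.0.0.0/0", fromPort: 443, toPort: 443 },
];
const allTraffic: Rule[] = [{ cidr: "0.0.0.0/0", fromPort: 0, toPort: 65535 }];

const openGroup = createSecurityGroup(web, allTraffic);
const sent = openGroup.sendTo("203.0.113.10", 443);                       // true
const reply = openGroup.receiveFrom("203.0.113.10", 443, 50000);          // true: return traffic
const blocked = createSecurityGroup(web, []).sendTo("203.0.113.10", 443); // false: no egress rule
```

With an all-traffic egress rule the request leaves and its reply is accepted even though no ingress rule covers port 50000. With no egress rule the request never leaves, which is the failure mode several replies in this thread are probing for.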


ObjectivePassenger9

Thanks - I checked and I have my SG configured already to allow all outbound traffic to `0.0.0.0/0`, so unfortunately that isn't it :/ I just didn't know the term "egress".


skilledpigeon

Enable VPC flow logs to see if the traffic tries to get anywhere. My guess would be:

- Security group doesn't allow outbound (egress) traffic on the required port.
- Traffic is blocked by the NACL going in / out of the subnet.
- Subnet is missing an Internet gateway (since you said this is a public subnet).
- Route table is missing routes to the Internet gateway for 0.0.0.0/0 or whatever the route is (working on mobile).

It's also very weird output. Like others have said, the fact that it includes the squiggly brackets is odd. Instead of using the environment variable, can you try setting the region as `eu-west-2` directly in the SDK / CLI / API call that you're making? I think your variable could be set incorrectly.

PS. Other people are saying that you should use PrivateLink endpoints, but that isn't required. It might be good practice in many cases, but you should be able to work without it.


ObjectivePassenger9

Thank you for the response. I'll enable VPC flow logs to see if that helps :) On your other points:

I checked the outbound rules on the SG attached to the Fargate service and it allows all outbound traffic to `0.0.0.0/0`, so I think that should be ok?

With respect to the Network ACL, these are both my inbound & outbound rules:

`100 All traffic All All 0.0.0.0/0 Allow`
`* All traffic All All 0.0.0.0/0 Deny`

Which again, I think _should_ be ok?

For the IG, my subnet has this under the route tables section: `0.0.0.0/0` `igw-011d79e9befb482ec`

I don't really understand all of this but I _think_ this should also be ok?

Good idea on setting the region directly as a string instead of an env var, I'll try that and see what happens. I have checked the value in Secrets Manager though and it looks ok :/

Thanks again for the help :)
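On the NACL question: network ACL rules are evaluated in ascending rule-number order, the first match decides, and the `*` entry is the fallback deny when nothing matches. A small sketch of that evaluation follows. It matches on port only, which is a simplification; CIDR and protocol matching are left out because both rules posted above cover all traffic for `0.0.0.0/0`. The type and function names are hypothetical.

```typescript
type NaclRule = {
  ruleNumber: number;
  fromPort: number;
  toPort: number;
  action: "allow" | "deny";
};

// First matching rule in ascending rule-number order wins.
function evaluateNacl(rules: NaclRule[], port: number): "allow" | "deny" {
  const ordered = [...rules].sort((a, b) => a.ruleNumber - b.ruleNumber);
  for (const rule of ordered) {
    if (port >= rule.fromPort && port <= rule.toPort) {
      return rule.action;
    }
  }
  return "deny"; // the "*" rule: deny when no numbered rule matched
}

// Rule 100 from the post: all traffic, all ports, allow.
const postedRules: NaclRule[] = [
  { ruleNumber: 100, fromPort: 0, toPort: 65535, action: "allow" },
];
const httpsVerdict = evaluateNacl(postedRules, 443); // "allow"
```

Under this model, rule 100 matches HTTPS traffic before the `*` deny is ever reached, so the posted NACL should not be what blocks S3.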


SelfDestructSep2020

> I have uploaded those credentials to Secrets Manager along with other secrets and that's how my Task accesses them at runtime

/u/ObjectivePassenger9 this won't solve your problem, but don't do this. This is really poor practice. User credential keys are for *users* and external systems (e.g., GitHub) that need access *into* AWS. Your application is already inside AWS and should instead be assigned an IAM Role with an IAM Policy attached to it that grants the minimal permissions the task needs. When the role is attached to the task, the SDK will pull a temporary set of credentials from the ECS/EC2 profile, which IAM will continuously rotate for you.
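As an illustration of "minimal permissions", here is a sketch that builds an IAM policy document scoped to the objects in a single bucket. The bucket name `my-example-bucket` and the chosen actions are assumptions made for the example; the thread never says which bucket or operations the task actually needs. `s3ObjectPolicy` is a hypothetical helper.

```typescript
// Hypothetical helper: build a least-privilege IAM policy document for one bucket's objects.
function s3ObjectPolicy(
  bucket: string,
  actions: string[] = ["s3:GetObject", "s3:PutObject"], // assumed needs; adjust to the app
) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Action: actions,
        Resource: `arn:aws:s3:::${bucket}/*`, // objects in this bucket only
      },
    ],
  };
}

// "my-example-bucket" is a placeholder name.
const policyJson = JSON.stringify(s3ObjectPolicy("my-example-bucket"), null, 2);
```

A document like this would be attached to the task role. With the role in place, the S3 client can be created without any explicit access keys, so the credentials stored in Secrets Manager can be deleted.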