Simplify the networking with VPC Lattice

I started writing this on April 7th, 2023, a few weeks after the release of VPC Lattice (although it's not yet available everywhere). I was curious about what it is and what it's supposed to do, so I asked ChatGPT.

As there isn't much information available yet, I'll rely on the official documentation for now.

According to the documentation, "Amazon VPC Lattice is a fully managed application networking service that you use to connect, secure, and monitor all of your services across multiple accounts and virtual private clouds (VPCs)."

In other words, it allows you to use AWS as one big and complex Kubernetes cluster, while still using separate solutions as services across VPCs. It was likely created to simplify networking management and make administrators' lives easier. It allows for grouping VPCs into logical ServiceNetworks.

Let's check it now then!


Tools used in this episode

  • Python
  • ECS
  • CloudFormation (again!)
  • EC2

Choosing the right infrastructure-as-code tool for VPC Lattice

As a side note, I wanted to use Terraform this time, but unfortunately, when we compare the Lattice documentation for both tools, CloudFormation seems better prepared:

Note that this is a relatively new service, so the situation may change in the coming weeks.

Key concepts

Here are some key concepts related to VPC Lattice:

Service - A set of objects (such as EC2 instances or ECS tasks) that act as one logical object. Importantly, it's an independent solution, so it can be handled by a dedicated team and has logical boundaries.

Target group - A logical representation of our compute resources.

Listener - A process that checks for connection requests on a given protocol and port, and routes them to the targets in a target group.

Rule - A routing rule attached to a listener. It decides which target group a request is forwarded to.

Service network - In my understanding, this is an abstraction on top of VPCs, as it allows us to associate resources in different VPCs.

Service directory - A service discovery solution for all VPC Lattice resources.

Auth policies - Authorization policies for services inside VPC Lattice.


Project architecture

As you probably know, I like complete solutions, or at least working examples. So here, as always, I will try to build something usable. The plan is to have:

  • EC2 with Nginx in VPC 1, private subnet with NAT gateway
  • ECS sample app in VPC 2, behind an internal (non-public) LB
  • VPC Lattice Service

Additionally:

  • static credentials should be replaced with an OIDC config for GitHub Actions

[Figure: Implemented solution]

Implementation

First, let's focus on location. I started with eu-central-1; however, after the first deployment I realized that the service isn't available in Europe yet, so I switched to us-east-1. After that, I implemented a sample EC2-based template, followed by the ECS-based template. As the CloudFormation commands were getting longer and longer, I added a Makefile.

.PHONY: default validate clean-ec2 clean-ecs clean-lattice ec2 ecs ecs-fastapi lattice build clean

default: validate

REGION := us-east-1

# Lint the templates that cfn-lint can handle (the Lattice template is skipped)
validate:
	@echo "Validating CloudFormation templates"
	cfn-lint ec2-formation.yaml
	cfn-lint ecs-formation.yaml

clean-ec2:
	aws cloudformation delete-stack --stack-name lattice-ec2 --region $(REGION)

clean-ecs:
	aws cloudformation delete-stack --stack-name lattice-ecs --region $(REGION)

clean-lattice:
	aws cloudformation delete-stack --stack-name lattice-itself --region $(REGION)

ec2:
	cfn-lint ec2-formation.yaml
	aws cloudformation deploy --stack-name lattice-ec2 --template-file ec2-formation.yaml --region $(REGION) --capabilities CAPABILITY_NAMED_IAM

ecs:
	cfn-lint ecs-formation.yaml
	aws cloudformation deploy --stack-name lattice-ecs --template-file ecs-formation.yaml --region $(REGION) --capabilities CAPABILITY_NAMED_IAM

# Redeploy the ECS stack with the FastAPI image built by the pipeline
ecs-fastapi:
	cfn-lint ecs-formation.yaml
	aws cloudformation deploy --stack-name lattice-ecs --template-file ecs-formation.yaml --region $(REGION) --capabilities CAPABILITY_NAMED_IAM --parameter-overrides ContainerImage=123441.dkr.ecr.us-east-1.amazonaws.com/fastapi-repository:latest HCPath=/healtz

lattice:
	aws cloudformation deploy --stack-name lattice-itself --template-file lattice-formation.yaml --region $(REGION) --capabilities CAPABILITY_NAMED_IAM

build: validate ec2 ecs lattice

clean: clean-ec2 clean-ecs clean-lattice

As you can see, there is no linting on the Lattice template. The reason is simple: cfn-lint support isn't ready yet. Some types are broken, and the lack of general availability is painful.

Then I added a small FastAPI app, to have a REST API to build container images from. That leads us to CI/CD and GitHub Actions. Obviously, the next step was implementing OIDC support. If you are interested in the topic, I can recommend some documentation:

If you're lazy, here is my code:

  GitHubOIDC:
    Type: AWS::IAM::OIDCProvider
    Properties:
      Url: https://token.actions.githubusercontent.com
      ThumbprintList:
        - f879abce0008e4eb126e0097e46620f5aaae26ad # valid until 2023-11-07 23:59:59
      ClientIdList:
        - sts.amazonaws.com
      Tags:
        - Key: Env
          Value: !Ref EnvironmentName

  OIDCRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub "${AWS::StackName}-GitHub-to-${ServiceName}-role"
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Federated: !Ref GitHubOIDC
            Action: sts:AssumeRoleWithWebIdentity
            Condition:
              ForAnyValue:StringEquals:
                token.actions.githubusercontent.com:sub:
                  - !Sub "repo:${OrgName}/${RepoName}:ref:refs/heads/main"
                  - !Sub "repo:${OrgName}/${RepoName}:ref:refs/heads/dev"
                token.actions.githubusercontent.com:aud: sts.amazonaws.com
      ManagedPolicyArns:
        - "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPowerUser"
      Tags:
        - Key: Env
          Value: !Ref EnvironmentName

First, take a look at ThumbprintList. The value is not static: it is based on the SHA-1 of the GitHub certificate, so it changes when the certificate is rotated. You can calculate it relatively easily according to this article. The next important thing is the policy's condition, especially token.actions.githubusercontent.com:sub, which should point to the right repository and branch. You can use ForAnyValue here to allow more than one branch! Seems easy, right? Great. Now you need to obtain the OIDCRole's ARN and paste it into the GitHub Actions secrets tab. A simple workflow could look like this:

name: Building my awesome docker image
run-name: ${{ github.actor }} on GitHub Actions 🚀
on: push

env:
  AWS_REGION: "us-east-1"

permissions:
  id-token: write
  contents: read

jobs:
  Build-FastAPI-docker:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: configure aws credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build, tag, and push docker image to Amazon ECR
        env:
          REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          REPOSITORY: fastapi-repository
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $REGISTRY/$REPOSITORY:$IMAGE_TAG ./ecs-api/
          docker push $REGISTRY/$REPOSITORY:$IMAGE_TAG
          docker tag $REGISTRY/$REPOSITORY:$IMAGE_TAG $REGISTRY/$REPOSITORY:latest
          docker push $REGISTRY/$REPOSITORY:latest
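
Coming back to the ThumbprintList value from the OIDC provider: below is a minimal Python sketch of how such a SHA-1 thumbprint can be computed from a PEM file. It is not the method from the linked article, just an illustration. The function name pem_thumbprint and the file name ca.pem are my own placeholders, and I am assuming you have already exported the right certificate from the chain served by token.actions.githubusercontent.com (as far as I know, AWS expects the top intermediate CA certificate, so please verify that against the IAM documentation).

```python
import hashlib
import ssl


def pem_thumbprint(pem_cert: str) -> str:
    """Return the lowercase hex SHA-1 fingerprint of a PEM-encoded certificate.

    The fingerprint is computed over the binary (DER) form of the
    certificate, not over the PEM text itself.
    """
    # Strip surrounding whitespace so the PEM header/footer checks pass.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_cert.strip())
    return hashlib.sha1(der_bytes).hexdigest()


# Hypothetical usage, assuming the certificate was saved as ca.pem:
#
#     with open("ca.pem") as fh:
#         print(pem_thumbprint(fh.read()))
```

The result is a 40-character hex string, which is the format ThumbprintList expects.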

After that, I moved on to the main point of the article: the VPC Lattice landscape. The implementation was fun. As Lattice is not part of the Dash docset yet, I was forced to use the web docs! And that was where the fun began. For example:

Resource handler returned message: "ALB Target Group does not support health check config (Service:

Or the information about the usable AWS::VpcLattice::TargetGroup types, which was placed under Target, as part of the Id description. Pure joy! Ah, and only internal ALBs are supported: you can't connect an internet-facing one, but you can connect the IP of any solution in the whole universe. So the final implementation was long and rather unexciting:

AWSTemplateFormatVersion: "2010-09-09"
Description: Run CFn Lattice itself

Resources:
  EC2TargetGroup:
    Type: AWS::VpcLattice::TargetGroup
    Properties:
      Name: ec2-lattice-tg
      # INSTANCE | IP | LAMBDA | ALB
      Type: INSTANCE
      Config:
        HealthCheck:
          Enabled: true
          Path: "/"
          Port: 80
          Protocol: HTTP
          Matcher:
            HttpCode: "200"
        Port: 80
        Protocol: HTTP
        ProtocolVersion: HTTP1
        VpcIdentifier: !ImportValue ec2-vpc
      Targets:
        - Id: !ImportValue ec2-instanceid
          Port: 80
      Tags:
        - Key: Owner
          Value: kuba
        - Key: Project
          Value: blogpost

  ECSTargetGroup:
    Type: AWS::VpcLattice::TargetGroup
    Properties:
      Name: ecs-lattice-tg
      # INSTANCE | IP | LAMBDA | ALB
      Type: ALB
      Config:
        # HC not supported for ALB
        #HealthCheck:
        #  Enabled: true
        #  Path: "/"
        #  Port: 80
        #  Protocol: HTTP
        #  Matcher:
        #    HttpCode: "200"
        Port: 80
        Protocol: HTTP
        ProtocolVersion: HTTP1
        VpcIdentifier: !ImportValue ecs-vpc
      Targets:
        - Id: !ImportValue ecs-alb-arn
          Port: 80
      Tags:
        - Key: Owner
          Value: kuba
        - Key: Project
          Value: blogpost

  # One listener, splitting traffic evenly between both target groups
  GeneralListener:
    Type: AWS::VpcLattice::Listener
    Properties:
      Name: ec2-80
      Port: 80
      Protocol: HTTP
      ServiceIdentifier: !Ref Service
      DefaultAction:
        Forward:
          TargetGroups:
            - TargetGroupIdentifier: !Ref EC2TargetGroup
              Weight: 10
            - TargetGroupIdentifier: !Ref ECSTargetGroup
              Weight: 10
      Tags:
        - Key: Owner
          Value: kuba
        - Key: Project
          Value: blogpost

  ECSGeneralSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: lattice-ecs-too-open-security-group
      VpcId: !ImportValue ecs-vpc
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: !ImportValue ecs-vpc-cidr
      Tags:
        - Key: Owner
          Value: kuba
        - Key: Project
          Value: blogpost

  EC2GeneralSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: lattice-ec2-too-open-security-group
      VpcId: !ImportValue ec2-vpc
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: !ImportValue ec2-vpc-cidr
      Tags:
        - Key: Owner
          Value: kuba
        - Key: Project
          Value: blogpost

  ECSVpcAssocination:
    Type: AWS::VpcLattice::ServiceNetworkVpcAssociation
    Properties:
      SecurityGroupIds: [!Ref ECSGeneralSecurityGroup]
      ServiceNetworkIdentifier: !Ref LatticeServiceNetwork
      VpcIdentifier: !ImportValue ecs-vpc
      Tags:
        - Key: Owner
          Value: kuba
        - Key: Project
          Value: blogpost

  EC2VpcAssocination:
    Type: AWS::VpcLattice::ServiceNetworkVpcAssociation
    Properties:
      SecurityGroupIds: [!Ref EC2GeneralSecurityGroup]
      ServiceNetworkIdentifier: !Ref LatticeServiceNetwork
      VpcIdentifier: !ImportValue ec2-vpc
      Tags:
        - Key: Owner
          Value: kuba
        - Key: Project
          Value: blogpost

  LatticeServiceNetwork:
    Type: AWS::VpcLattice::ServiceNetwork
    Properties:
      AuthType: NONE
      Name: awesome-service-network
      Tags:
        - Key: Owner
          Value: kuba
        - Key: Project
          Value: blogpost

  Service:
    Type: AWS::VpcLattice::Service
    Properties:
      AuthType: NONE
      Name: awesome-service
      Tags:
        - Key: Owner
          Value: kuba
        - Key: Project
          Value: blogpost

  ServiceNetworkAssocination:
    Type: AWS::VpcLattice::ServiceNetworkServiceAssociation
    Properties:
      ServiceIdentifier: !Ref Service
      ServiceNetworkIdentifier: !Ref LatticeServiceNetwork
      Tags:
        - Key: Owner
          Value: kuba
        - Key: Project
          Value: blogpost

As you can see, we need a TargetGroup for the Listener, which needs to be linked with a Service. The Service needs to be associated with a ServiceNetwork. The ServiceNetwork, in turn, needs an association with each VPC, along with a SecurityGroup.
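
To make that chain a bit more tangible, here is a small Python sketch that writes down the !Ref relationships from the template as a graph and prints one valid creation order. The logical IDs are copied from the template (typos included). The DEPENDS_ON dictionary and the creation_order helper are only my illustration: CloudFormation works this order out on its own from the references, so you never need to do this by hand.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each resource maps to the set of resources it references via !Ref.
DEPENDS_ON = {
    "EC2TargetGroup": set(),
    "ECSTargetGroup": set(),
    "Service": set(),
    "LatticeServiceNetwork": set(),
    "ECSGeneralSecurityGroup": set(),
    "EC2GeneralSecurityGroup": set(),
    "GeneralListener": {"Service", "EC2TargetGroup", "ECSTargetGroup"},
    "ServiceNetworkAssocination": {"Service", "LatticeServiceNetwork"},
    "ECSVpcAssocination": {"LatticeServiceNetwork", "ECSGeneralSecurityGroup"},
    "EC2VpcAssocination": {"LatticeServiceNetwork", "EC2GeneralSecurityGroup"},
}


def creation_order(graph: dict[str, set[str]]) -> list[str]:
    """Return the resources ordered so that dependencies always come first."""
    return list(TopologicalSorter(graph).static_order())


# Print one possible order; several orders are valid for this graph.
for step, resource in enumerate(creation_order(DEPENDS_ON), start=1):
    print(f"{step:2}. {resource}")
```

Whatever order comes out, the listener always lands after the service and both target groups, and every association lands after the service network.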

Testing

After implementation, I was able to execute the flow:

make build
git push -f # image building
make ecs-fastapi

Next, I copied the service DNS entry to the clipboard and started an SSM session on my standalone instance. The result was as expected:

[root@ip-10-192-20-222 bin]# curl -si awesome-service-03ce89237939acfce.7d67968.vpc-lattice-svcs.us-east-1.on.aws | grep content-type
content-type: text/html
[root@ip-10-192-20-222 bin]# curl -si awesome-service-03ce89237939acfce.7d67968.vpc-lattice-svcs.us-east-1.on.aws | grep content-type
content-type: application/json
[root@ip-10-192-20-222 bin]# curl -si awesome-service-03ce89237939acfce.7d67968.vpc-lattice-svcs.us-east-1.on.aws | grep content-type
content-type: application/json
[root@ip-10-192-20-222 bin]# curl -si awesome-service-03ce89237939acfce.7d67968.vpc-lattice-svcs.us-east-1.on.aws | grep content-type
content-type: text/html

I was able to hit the service endpoint and received two independent responses from two different sources (FastAPI responded with JSON, and Nginx with HTML). Yay!
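
If you would rather count the responses than eyeball them, the same check can be sketched in Python. This is only an illustration of the curl loop above, not part of my repository: the function names are my own, and the script has to run from a machine inside one of the associated VPCs, with the full URL (including http://) of your own service.

```python
from collections import Counter
from typing import Callable
from urllib.request import urlopen


def fetch_content_type(url: str) -> str:
    """GET the URL and return its content type, e.g. 'text/html'."""
    with urlopen(url, timeout=5) as response:
        return response.headers.get_content_type()


def tally_content_types(
    url: str,
    attempts: int,
    fetch: Callable[[str], str] = fetch_content_type,
) -> Counter:
    """Call the endpoint `attempts` times and count the content types seen."""
    return Counter(fetch(url) for _ in range(attempts))


# Hypothetical usage from the EC2 instance (replace the URL with your
# own service DNS entry):
#
#     counts = tally_content_types("http://<service-dns-entry>", 20)
#     for content_type, hits in counts.most_common():
#         print(content_type, hits)
```

Since both target groups have the same Weight in the listener, I would expect the two content types to show up in roughly equal numbers over a longer run.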

Summary

VPC Lattice presents some exciting possibilities for programmers in the networking area. As I dug into the implementation of services, I became convinced that it is a mesh-like solution, heavily inspired by Kubernetes' approach to services.

Additionally, Lattice's networking capabilities offer an excellent opportunity for developers to explore distributed systems and service-oriented architecture. With VPC Lattice, developers can build complex networking applications that enable real-time communication between different services, which is critical in building scalable and robust applications, especially with distributed teams.

I'm waiting for wider service availability and better documentation. However, even in its current state, I was able to implement a working solution, so it's not that bad. Also, use OIDC with your pipeline whenever possible! It's easy, fast, and secure. Reducing the risk of credential leaks is always a good thing.

Code

As always you can find my code here: