This is the second in a series of posts exploring service to service call patterns in some of the application runtimes on Google Cloud. The first in the series explored service to service call patterns in GKE.
This post expands on it by adding a service mesh, specifically Anthos Service Mesh, and explores how the service to service patterns change in the presence of a mesh. The service to service calls will be across services in a single cluster. The next post will explore services deployed to multiple GKE clusters.
Set-Up
The steps to set up a GKE cluster and install Anthos Service Mesh on top of it are described in this document - https://cloud.google.com/service-mesh/docs/unified-install/install. In brief, these are the commands that I had to run in my GCP project to get a cluster running:
export PROJECT=$(gcloud config get-value project)
export ZONE="us-west1-a"
# Create a GKE Standard cluster named "solo"
gcloud container clusters create solo \
  --project=${PROJECT} \
  --zone=${ZONE} \
  --enable-ip-alias \
  --machine-type=e2-standard-4 \
  --num-nodes=2 \
  --workload-pool=${PROJECT}.svc.id.goog
# Download the asmcli utility and make it executable
curl https://storage.googleapis.com/csm-artifacts/asm/asmcli_1.11 > asmcli
chmod +x asmcli
# Install service mesh on the newly created "solo" cluster
./asmcli install \
  --project_id ${PROJECT} \
  --cluster_name solo \
  --cluster_location us-west1-a \
  --fleet_id ${PROJECT} \
  --enable_all \
  --ca mesh_ca
# Determine the ASM revision
ASM_REVISION=$(kubectl get deploy -n istio-system -l app=istiod -o jsonpath={.items[*].metadata.labels.'istio\.io\/rev'}'{"\n"}')
# Install an ingress gateway that can work with the service mesh
kubectl create namespace gw-namespace
kubectl label namespace gw-namespace \
  istio.io/rev=${ASM_REVISION} --overwrite
git clone https://github.com/GoogleCloudPlatform/anthos-service-mesh-packages.git
kubectl apply -n gw-namespace \
  -f anthos-service-mesh-packages/samples/gateways/istio-ingressgateway
# Create a namespace to host applications with mesh proxy injected in
kubectl create namespace istio-apps
kubectl label namespace istio-apps istio-injection- istio.io/rev=${ASM_REVISION} --overwrite
The services that I will be installing are fairly simple: a "caller" calls a "producer", and the producer can be made to behave in certain ways:
- Introduce response time delays
- Respond with certain status codes
The codebase for the "caller" and "producer" is in this repository - https://github.com/bijukunjummen/sample-service-to-service. There are Kubernetes manifests available in the repository to bring up these services.
Behavior 1 - Mutual TLS
The first behavior that I want to see is for the caller and the producer to verify each other's identities by presenting and validating their certificates.
This can be done by adding an Istio DestinationRule for the producer and for the caller, along these lines:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: sample-producer-dl
  namespace: istio-apps
spec:
  host: sample-producer.istio-apps.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: sample-caller-dl
  namespace: istio-apps
spec:
  host: sample-caller.istio-apps.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
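A DestinationRule with ISTIO_MUTUAL only controls how the client-side proxy originates connections. Whether the receiving proxy rejects plaintext traffic is governed by a separate PeerAuthentication policy. The following is a minimal sketch, not part of the original set-up, and it assumes you want to enforce mutual TLS for every workload in the istio-apps namespace. The policy name "default" is a convention rather than a requirement.

```yaml
# Assumption: not part of the original set-up; shown only to illustrate
# server-side enforcement of mutual TLS in the istio-apps namespace.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-apps
spec:
  mtls:
    # STRICT makes sidecars in this namespace accept only mutual TLS traffic
    mode: STRICT
```

With a policy like this in place, a plaintext call from a workload without a sidecar should be refused, which is one way to check that certificates are actually being exchanged.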
Alright, now that the set-up is in place, the following is what gets captured as the request flows from the browser to the Ingress Gateway to the Caller to the Producer.
Behavior 2 - Timeout
The second behavior that I want to explore is timeouts. A request timeout can be set for the call from the Caller to the Producer by creating a VirtualService for the Producer with the timeout value set, along these lines:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: sample-producer-route
  namespace: istio-apps
spec:
  hosts:
  - "sample-producer.istio-apps.svc.cluster.local"
  http:
  - timeout: 5s
    route:
    - destination:
        host: sample-producer
        port:
          number: 8080
If the producer takes longer than the configured 5 seconds to respond, the mesh responds with an HTTP status code of 504 with a message of "Upstream timed out".
Behavior 3 - Circuit Breaker
A circuit breaker is implemented using a DestinationRule resource with outlier detection configured:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: sample-producer-dl
  namespace: istio-apps
spec:
  host: sample-producer.istio-apps.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
    connectionPool:
    outlierDetection:
      consecutive5xxErrors: 3
      interval: 15s
      baseEjectionTime: 15s
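With this configuration in place, a producer instance that returns 3 consecutive 5xx errors is ejected from the load-balancing pool for at least 15 seconds, and requests made during that window see a broken circuit.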
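Separately from outlier detection, the connectionPool block of a DestinationRule can cap how much concurrent load the caller's proxy sends to the producer. The circuit breaker rule in this section leaves connectionPool empty, so no such limits apply. The following is a hedged sketch of what populating it might look like. The numeric limits are illustrative assumptions, not values from this set-up.

```yaml
# Assumption: the connectionPool limits below are illustrative values only;
# the rule in this section leaves connectionPool empty.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: sample-producer-dl
  namespace: istio-apps
spec:
  host: sample-producer.istio-apps.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
    connectionPool:
      tcp:
        # Upper bound on connections to the producer
        maxConnections: 10
      http:
        # Requests allowed to queue while waiting for a connection
        http1MaxPendingRequests: 5
        # Limit on requests served per connection
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutive5xxErrors: 3
      interval: 15s
      baseEjectionTime: 15s
```

Requests beyond these limits are rejected by the proxy instead of being queued indefinitely, which complements the ejection behavior that outlier detection provides.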
Conclusion
The neat thing is that in all scenarios so far, the way the Caller calls the Producer remains exactly the same. It is the mesh that injects the appropriate security controls through mTLS, and the resilience of the calling service through timeouts and a circuit breaker.