Thursday, December 29, 2022

Cloud Run Health Checks — Spring Boot App

Cloud Run services can now configure startup and liveness probes for a running container.


The startup probe determines when a container has started cleanly and is ready to take traffic. The liveness probe kicks in once the container has started and checks that it remains functional; Cloud Run restarts the container if the liveness probe fails.


Implementing Health Check Probes

A Cloud Run service can be described using a manifest file and a sample manifest looks like this:


apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  annotations:
    run.googleapis.com/ingress: all
  name: health-cloudrun-sample
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: '5'
        autoscaling.knative.dev/minScale: '1'
    spec:
      containers:
      - image: us-west1-docker.pkg.dev/sample-proj/sample-repo/health-app-image:latest

        startupProbe:
          httpGet:
            httpHeaders:
            - name: HOST
              value: localhost:8080
            path: /actuator/health/readiness
          initialDelaySeconds: 15
          timeoutSeconds: 1
          failureThreshold: 5
          periodSeconds: 10

        livenessProbe:
          httpGet:
            httpHeaders:
            - name: HOST
              value: localhost:8080
            path: /actuator/health/liveness
          timeoutSeconds: 1
          periodSeconds: 10
          failureThreshold: 5

        ports:
        - containerPort: 8080
          name: http1
        resources:
          limits:
            cpu: 1000m
            memory: 512Mi


Saved as sample-manifest.yaml, this manifest can then be deployed to Cloud Run as follows:

gcloud run services replace sample-manifest.yaml --region=us-west1

Now, coming back to the manifest, the startup probe is defined this way:

startupProbe:
  httpGet:
    httpHeaders:
    - name: HOST
      value: localhost:8080
    path: /actuator/health/readiness
  initialDelaySeconds: 15
  timeoutSeconds: 1
  failureThreshold: 5
  periodSeconds: 10

It makes an HTTP request to the /actuator/health/readiness path. An explicit HOST header is also provided. This is a temporary workaround: at the time of writing, Cloud Run health checks have a bug where this header is missing from the health check requests.

The rest of the properties indicate the following:

  • initialDelaySeconds: how long to wait before performing the first probe
  • timeoutSeconds: timeout for each health check request
  • failureThreshold: number of failed tries before the container is marked as not ready
  • periodSeconds: the interval between probes

Once the startup probe succeeds, Cloud Run marks the container as available to handle traffic.
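
To make the interaction of these settings concrete, here is the startup probe from the manifest again, trimmed to the timing fields and annotated with the values used. The time budget in the closing comment is a rough upper bound derived from the field descriptions above; it is an estimate, not a figure taken from the Cloud Run documentation.

```yaml
startupProbe:
  httpGet:
    path: /actuator/health/readiness
  initialDelaySeconds: 15   # wait 15 s after container start before the first probe
  periodSeconds: 10         # then probe every 10 s
  timeoutSeconds: 1         # each probe request must complete within 1 s
  failureThreshold: 5       # give up after 5 failed probes
# Rough upper bound before startup is declared failed (an approximation):
#   initialDelaySeconds + failureThreshold * periodSeconds = 15 + 5 * 10 = 65 s
```

With these values the application has roughly a minute to report readiness, so a slower-starting Spring Boot app would likely need a larger failureThreshold or periodSeconds.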

A livenessProbe follows a similar pattern:

livenessProbe:
  httpGet:
    httpHeaders:
    - name: HOST
      value: localhost:8080
    path: /actuator/health/liveness
  timeoutSeconds: 1
  periodSeconds: 10
  failureThreshold: 5

From the Spring Boot application's perspective, all that needs to be done is to enable the health check endpoints as described here.
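
As a minimal sketch of what that involves, assuming Spring Boot 2.3 or later with the spring-boot-starter-actuator dependency on the classpath: the liveness and readiness health groups are enabled automatically only when Spring Boot detects a Kubernetes environment, so elsewhere they can be switched on explicitly. The file location shown in the comment is the conventional one and is an assumption about this project.

```yaml
# src/main/resources/application.yml (assumed location)
management:
  endpoint:
    health:
      probes:
        # Exposes /actuator/health/liveness and /actuator/health/readiness,
        # the two paths referenced by the probes in the manifest.
        enabled: true
```

The same setting can be written in application.properties as management.endpoint.health.probes.enabled=true.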


Conclusion

A startup probe ensures that a container receives traffic only when it is ready, and a liveness probe ensures that the container remains healthy during operation and is otherwise restarted by the infrastructure. These health probes are a welcome addition to the already excellent feature set of Cloud Run.

