Thursday, September 30, 2021

Google Cloud Deploy - CD for a Java-based project

This is a short write-up on using Google Cloud Deploy for Continuous Deployment of a Java-based project. 

Google Cloud Deploy is a new entrant to the CD space. It currently facilitates continuous deployment to GKE-based targets, with other Google Cloud application runtime targets to follow in the future.

Let's start with why such a tool is required at all, instead of just an automation tool like Cloud Build or Jenkins. In my mind it comes down to these things:

  1. State - a dedicated CD tool can track the state of an artifact across the environments to which it is deployed. This makes promoting a deployment, rolling back to an older version and rolling forward easy to do. Such tracking can be built into a CI tool, but it would involve a lot of coding effort.
  2. Integration with the deployment environment - a CD tool integrates well with the target deployment platform, without much custom code being needed.

Target Flow

I am targeting a flow which looks like this: any merge to the "main" branch of a repository should:
1. Test and build an image
2. Deploy the image to a "dev" GKE cluster
3. The deployment can be promoted from the "dev" to the "prod" GKE cluster


Building an Image

Running the tests and building the image is handled with a combination of Cloud Build, which provides the build automation environment, and skaffold, which provides the tooling through Cloud Native Buildpacks. It may be easier to look at the code repository to see how the two are wired up - https://github.com/bijukunjummen/hello-skaffold-gke
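
To give a flavor of the skaffold side of things, a minimal "build" section using Cloud Native Buildpacks could look along these lines (the image name and builder here are illustrative, the repository above has the actual configuration):

apiVersion: skaffold/v2beta16
kind: Config
build:
  artifacts:
    # the image is built using Cloud Native Buildpacks, no Dockerfile needed
    - image: hello-skaffold-gke
      buildpacks:
        builder: gcr.io/buildpacks/builder:v1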


Deploying the image to GKE

Now that an image has been baked, the next step is to deploy it into a GKE Kubernetes environment. Cloud Deploy has a declarative way of specifying the environments (referred to as Targets) and how to promote the deployment through those environments. A Google Cloud Deploy pipeline looks like this:


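A sketch of such a "clouddeploy.yaml", along these lines (the pipeline name matches the release command used later in this post, the project and cluster references are placeholders):

apiVersion: deploy.cloud.google.com/v1beta1
kind: DeliveryPipeline
metadata:
  name: hello-skaffold-gke
description: Delivery pipeline for the hello-skaffold-gke application
serialPipeline:
  stages:
    # the deployment progresses from the "dev" target to the "prod" target
    - targetId: dev
    - targetId: prod
---
apiVersion: deploy.cloud.google.com/v1beta1
kind: Target
metadata:
  name: dev
description: development cluster
gke:
  cluster: projects/a-project/locations/us-west1-a/clusters/dev-cluster
---
apiVersion: deploy.cloud.google.com/v1beta1
kind: Target
metadata:
  name: prod
description: production cluster
# promotion to this target needs an explicit approval
requireApproval: true
gke:
  cluster: projects/a-project/locations/us-west1-a/clusters/prod-cluster
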
The pipeline is fairly easy to read. Target(s) describe the environments to deploy the image to, and the pipeline describes how the deployment progresses across those environments.

One thing to notice is that the "prod" target has been marked with a "requires approval" flag, which is a way to ensure that promotion to the prod environment happens only with an explicit approval. The Cloud Deploy documentation covers all of these concepts well. Also, there is a strong dependence on skaffold to generate the Kubernetes manifests and to deploy them to the relevant targets.
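
The deployment side relies on the "deploy" stanza of the same skaffold.yaml to locate the raw Kubernetes manifests that get rendered and applied to each target, a sketch assuming the manifests sit in a "kubernetes" folder:

deploy:
  kubectl:
    # raw manifests that skaffold renders and applies to the target cluster
    manifests:
      - kubernetes/*.yaml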

Given such a deployment pipeline, it can be put in place using:

gcloud beta deploy apply --file=clouddeploy.yaml --region=us-west1

Alright, now that the CD pipeline is in place, a "Release" can be triggered once testing is complete on the "main" branch. A command which looks like this, with a file pointing to the build artifacts, is integrated into the Cloud Build pipeline to do exactly that:

gcloud beta deploy releases create release-01df029 --delivery-pipeline hello-skaffold-gke --region us-west1 --build-artifacts artifacts.json

This deploys the generated Kubernetes manifests, pointing to the right build artifacts, to the "dev" environment


and can then be promoted to additional environments, prod in this instance.
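
Promotion itself is a single command, along these lines (assuming the release name from the earlier step):

gcloud beta deploy releases promote --release=release-01df029 --delivery-pipeline=hello-skaffold-gke --region=us-west1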

Conclusion

This has been a whirlwind tour of Google Cloud Deploy and the features that it offers. It is still early days and I am excited to see where the Product goes. The learning curve is fairly steep; a developer is expected to understand:
  1. Kubernetes, which is the only application runtime currently supported; expect other runtimes to be supported as the Product evolves.
  2. skaffold, which is used for building, tagging and generating the Kubernetes artifacts
  3. Cloud Build and its YAML configuration
  4. Google Cloud Deploy's own YAML configuration

It will get simpler as the Product matures.

Saturday, September 25, 2021

Cloud Build and Gradle/Maven Caching

One of the pain points in all the development projects that I have worked on has been setting up or getting hold of infrastructure for automation. This has typically meant getting access to an instance of Jenkins. I have great respect for Jenkins as a tool, but each deployment of Jenkins tends to become a snowflake over time, with its own set of underlying plugins, software versions, variations of pipeline scripts, etc.

This is exactly the niche that a tool like Cloud Build fills: the deployment is managed by the Google Cloud platform, and the build steps are entirely user driven, based on the image used for each step of the pipeline.

In the first post, I went over the basics of creating a Cloud Build configuration, and in the second post, a fairly comprehensive pipeline for a Java-based project.

This post will conclude the series by showing an approach to caching in the pipeline. This is far from original; I am borrowing generously from a few sample configurations that I have found. So let me start by describing the issue being solved.


Problem

Java has two popular build tools - Gradle and Maven. Each of these tools downloads a bunch of dependencies and caches them at startup:
  1. The tool itself is not a binary, but a wrapper which knows how to download the right version of the tool's binary.
  2. The project's dependencies, specified in tool-specific DSLs, are then downloaded from repositories.
The issue is that across multiple builds, these dependencies tend to get downloaded afresh on every run.

Caching across Runs of a Build

The solution is to cache the downloaded artifacts across the different runs of a build. There is unfortunately no built-in way (yet) in Cloud Build to do this; however, a mechanism can be built along these lines:
  1. Cache the downloaded dependencies into Cloud Storage at the end of the build 
  2. And then use it to rehydrate the dependencies at the beginning of the build, if available

A similar approach should work for any tool that downloads dependencies. The trick, though, is figuring out where each tool places its dependencies, and knowing what to save to Cloud Storage and restore back.

Here is an approach for Gradle and Maven.

Each step of the Cloud Build pipeline mounts the exact same volume:
    
    volumes:
      - name: caching.home
        path: /cachinghome

Then a step explodes the cached content from Cloud Storage into this volume:

    dir: /cachinghome
    entrypoint: bash
    args:
      - -c
      - |
        (
          gsutil cp gs://${_GCS_CACHE_BUCKET}/gradle-cache.tar.gz /tmp/gradle-cache.tar.gz &&
          tar -xzf /tmp/gradle-cache.tar.gz
        ) || echo 'Cache not found'
    volumes:
      - name: caching.home
        path: /cachinghome

Now, Gradle and Maven store their dependencies in a ".gradle" and a ".m2" folder in the user's home directory, respectively. The trick then is to link the $USER_HOME/.gradle and $USER_HOME/.m2 folders to the exploded directory:


  - name: openjdk:11
    id: test
    entrypoint: "/bin/bash"
    args:
      - '-c'
      - |-
        export CACHING_HOME="/cachinghome"
        USER_HOME="/root"
        GRADLE_HOME="$${USER_HOME}/.gradle"
        GRADLE_CACHE="$${CACHING_HOME}/gradle"

        mkdir -p $${GRADLE_CACHE}

        [[ -d "$${GRADLE_CACHE}" && ! -d "$${GRADLE_HOME}" ]] && ln -s "$${GRADLE_CACHE}" "$${GRADLE_HOME}"
        ./gradlew check
    volumes:
      - name: caching.home
        path: /cachinghome

The Gradle tasks should now use the cached content if it is available, or create the cached content if the build is being run for the first time.
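
To complete the picture, a final step at the end of the build can save the (possibly updated) cache content back to Cloud Storage so that subsequent builds rehydrate from it. A sketch for the Gradle cache, assuming the same _GCS_CACHE_BUCKET substitution and volume as above, and using the gcr.io/cloud-builders/gsutil builder (any image with bash, tar and gsutil available would do):

  - name: gcr.io/cloud-builders/gsutil
    id: save-cache
    entrypoint: bash
    dir: /cachinghome
    args:
      - -c
      - |
        # package the gradle cache directory and push it to the cache bucket
        tar -czf /tmp/gradle-cache.tar.gz gradle &&
        gsutil cp /tmp/gradle-cache.tar.gz gs://${_GCS_CACHE_BUCKET}/gradle-cache.tar.gz
    volumes:
      - name: caching.home
        path: /cachinghome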


It may be simpler to see a sample build configuration which is here - https://github.com/bijukunjummen/hello-cloud-build/blob/main/cloudbuild.yaml