Friday, June 22, 2018

Tracing a reactive flow - Using Spring Cloud Sleuth with Boot 2

Spring Cloud Sleuth, which adds Spring instrumentation support on top of OpenZipkin Brave, makes distributed tracing trivially simple for Spring Boot applications. This is a quick write-up on what it takes to add support for distributed tracing using this excellent library.

Consider two applications: a client application that calls an upstream service application, both using Spring WebFlux, the reactive web stack for Spring:


My objective is to ensure that a request flowing from the user to the client application and on to the service application can be traced end to end, with latencies cleanly recorded for each request.


The final topology that Spring Cloud Sleuth enables is the following:


The sampled trace information from the client and the service app is exported to Zipkin via a queuing mechanism like RabbitMQ.


So what are the changes required to the client and the service app? Like I said, it is trivially simple! The following libraries need to be pulled in, in my case via Gradle:

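// Sleuth instrumentation: adds trace and span ids to requests and logs
compile("org.springframework.cloud:spring-cloud-starter-sleuth")
// Reports sampled spans to Zipkin
compile("org.springframework.cloud:spring-cloud-starter-zipkin")
// Spring AMQP RabbitMQ support, used here to ship spans over RabbitMQ
compile("org.springframework.amqp:spring-rabbit")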

The versions are not specified because they are pulled in via the Spring Cloud BOM, thanks to the Spring Gradle Dependency Management plugin:


ext {
    springCloudVersion = 'Finchley.RELEASE'
}

apply plugin: 'io.spring.dependency-management'

dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:${springCloudVersion}"
    }
}

And that is actually it. Any logs from the application should now start recording the trace and span ids; see how the trace id is carried forward in the following logs spanning two different services:

2018-06-22 04:06:28.579  INFO [sample-client-app,c3d507df405b8aaf,c3d507df405b8aaf,true] 9 --- [server-epoll-13] sample.load.PassThroughHandler           : handling message: Message(id=null, payload=Test, delay=1000)
2018-06-22 04:06:28.586  INFO [sample-service-app,c3d507df405b8aaf,829fde759da15e63,true] 8 --- [server-epoll-11] sample.load.MessageHandler               : Handling message: Message(id=5e7ba240-f97d-405a-9633-5540bbfe0df1, payload=Test, delay=1000)
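To make the correlation concrete, here is a minimal sketch in plain Java that pulls the bracketed section out of a log line and checks whether two lines belong to the same trace. It assumes the bracketed section follows the layout seen in the logs above, that is `[application name, trace id, span id, exported flag]`. The class and method names (`TraceContextParser`, `parse`, `sameTrace`) are hypothetical helpers written for this illustration and are not part of Spring Cloud Sleuth.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Spring Cloud Sleuth.
class TraceContextParser {

    // Holds the four comma-separated values found in the bracketed log section.
    static final class TraceContext {
        final String appName;
        final String traceId;
        final String spanId;
        final boolean exported;

        TraceContext(String appName, String traceId, String spanId, boolean exported) {
            this.appName = appName;
            this.traceId = traceId;
            this.spanId = spanId;
            this.exported = exported;
        }
    }

    // Assumed layout: [appName,traceId,spanId,true|false] with lowercase hex ids.
    private static final Pattern CONTEXT =
            Pattern.compile("\\[([^,\\]]*),([0-9a-f]*),([0-9a-f]*),(true|false)\\]");

    // Returns the first trace context found in the line, or empty if there is none.
    static Optional<TraceContext> parse(String logLine) {
        Matcher m = CONTEXT.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new TraceContext(
                m.group(1), m.group(2), m.group(3), Boolean.parseBoolean(m.group(4))));
    }

    // True only when both lines carry a non-empty trace id and the ids match.
    static boolean sameTrace(String firstLine, String secondLine) {
        Optional<TraceContext> a = parse(firstLine);
        Optional<TraceContext> b = parse(secondLine);
        return a.isPresent() && b.isPresent()
                && !a.get().traceId.isEmpty()
                && a.get().traceId.equals(b.get().traceId);
    }

    public static void main(String[] args) {
        String client = "INFO [sample-client-app,c3d507df405b8aaf,c3d507df405b8aaf,true] 9 --- [server-epoll-13] handling message";
        String service = "INFO [sample-service-app,c3d507df405b8aaf,829fde759da15e63,true] 8 --- [server-epoll-11] Handling message";
        System.out.println("same trace: " + sameTrace(client, service));
    }
}
```

Run against the two log lines above, the parser finds the same trace id in both, while the span ids differ: each service records its own span within the shared trace.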

Further, the Zipkin UI records the exported information and can show a sample trace visually, as follows:



This sample is available in my GitHub repository at https://github.com/bijukunjummen/sleuth-webflux-sample and can be started up easily using docker-compose, with all the dependencies wired in.

1 comment:

  1. Thanks for the example, gave me something to chew on :-)

    I unfortunately failed to run docker-compose up - first I had to fix image locations, obviously, as they were pointing to your account on the Docker hub, but when running docker-compose, it fails with this:

    bash-4.4$ docker-compose up
    Pulling sample-service-app (sample-backend-app:0.0.1-SNAPSHOT)...
    ERROR: The image for the service you're trying to recreate has been removed. If you continue, volume data could be lost. Consider backing up your data before continuing.

    Continue with the new image? [yN]y
    Pulling sample-service-app (sample-backend-app:0.0.1-SNAPSHOT)...
    ERROR: pull access denied for sample-backend-app, repository does not exist or may require 'docker login'
    bash-4.4$ docker login
    Authenticating with existing credentials...
    Login Succeeded
    bash-4.4$ docker-compose up
    Pulling sample-service-app (sample-backend-app:0.0.1-SNAPSHOT)...
    ERROR: The image for the service you're trying to recreate has been removed. If you continue, volume data could be lost. Consider backing up your data before continuing.

    Continue with the new image? [yN]y
    Pulling sample-service-app (sample-backend-app:0.0.1-SNAPSHOT)...
    ERROR: pull access denied for sample-backend-app, repository does not exist or may require 'docker login'

    I'm running on Mac OS X, so I went the non-Docker way, but failed with the start of client and server:

    bash-4.4$ ./gradlew -p applications/sample-client-app clean bootRun

    > Connecting to Daemon
    FAILURE: Build failed with an exception.

    * What went wrong:
    Could not create service of type ScriptPluginFactory using BuildScopeServices.createScriptPluginFactory().
    > Could not create service of type PluginResolutionStrategyInternal using BuildScopeServices.createPluginResolutionStrategy().

    I finally switched to running RabbitMQ (on CLI), client and server (in IDEA) manually, and that worked.

    Best regards.
