Summary
Spring Boot 2 with Spring Webflux based application outperforms a Spring Boot 1 based application by a huge margin for IO heavy workloads. The following is a summarized result of a load test - Response time for a IO heavy transaction with varying concurrent users:When the number of concurrent users remains low (say less than 1000) both Spring Boot 1 and Spring Boot 2 handle the load well and the 95 percentile response time remains milliseconds above a expected value of 300 ms.
At higher concurrency levels, the Async Non-Blocking IO and reactive support in Spring Boot 2 starts showing its colors - the 95th percentile time even with a very heavy load of 5000 users remains at around 312ms! Spring Boot 1 records a lot of failures and high response times at these concurrency levels.
Details
My set-up for the performance test is the following:
The sample applications expose an endpoint(/passthrough/message) which in-turn calls a downstream service. The request message to the endpoint looks something like this:
{ "id": "1", "payload": "sample payload", "delay": 3000 }
The downstream service would delay based on the "delay" attribute in the message (in milliseconds).
Spring Boot 1 Application
I have used Spring Boot 1.5.8.RELEASE for the Boot 1 version of the application. The endpoint is a simple Spring MVC controller which in turn uses Spring's RestTemplate to make the downstream call. Everything is synchronous and blocking and I have used the default embedded Tomcat container as the runtime. This is the raw code for the downstream call:public MessageAck handlePassthrough(Message message) { ResponseEntity<MessageAck> responseEntity = this.restTemplate.postForEntity(targetHost + "/messages", message, MessageAck.class); return responseEntity.getBody(); }
Spring Boot 2 Application
Spring Boot 2 version of the application exposes a Spring Webflux based endpoint and uses WebClient, the new non-blocking, reactive alternate to RestTemplate to make the downstream call - I have also used Kotlin for the implementation, which has no bearing on the performance. The runtime server is Netty:import org.springframework.http.HttpHeaders import org.springframework.http.MediaType import org.springframework.web.reactive.function.BodyInserters.fromObject import org.springframework.web.reactive.function.client.ClientResponse import org.springframework.web.reactive.function.client.WebClient import org.springframework.web.reactive.function.client.bodyToMono import org.springframework.web.reactive.function.server.ServerRequest import org.springframework.web.reactive.function.server.ServerResponse import org.springframework.web.reactive.function.server.bodyToMono import reactor.core.publisher.Mono class PassThroughHandler(private val webClient: WebClient) { fun handle(serverRequest: ServerRequest): Mono<ServerResponse> { val messageMono = serverRequest.bodyToMono<Message>() return messageMono.flatMap { message -> passThrough(message) .flatMap { messageAck -> ServerResponse.ok().body(fromObject(messageAck)) } } } fun passThrough(message: Message): Mono<MessageAck> { return webClient.post() .uri("/messages") .header(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE) .header(HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE) .body(fromObject<Message>(message)) .exchange() .flatMap { response: ClientResponse -> response.bodyToMono<MessageAck>() } } }
Details of the Perfomance Test
The test is simple, for different sets of concurrent users (300, 1000, 1500, 3000, 5000), I send a message with the delay attribute set to 300 ms, each user repeats the scenario 30 times with a delay between 1 to 2 seconds between requests. I am using the excellent Gatling tool to generate this load.
Results
These are the results as captured by Gatling:300 concurrent users:
Boot 1 | Boot 2 |
---|---|
1000 concurrent users:
Boot 1 | Boot 2 |
---|---|
1500 concurrent users:
Boot 1 | Boot 2 |
---|---|
3000 concurrent users:
Boot 1 | Boot 2 |
---|---|
5000 concurrent users:
Boot 1 | Boot 2 |
---|---|