Summary
A Spring Boot 2 application with Spring WebFlux outperforms a Spring Boot 1 based application by a huge margin for IO-heavy workloads. The following is a summarized result of a load test measuring response time for an IO-heavy transaction at varying numbers of concurrent users: when the number of concurrent users remains low (say, less than 1000), both Spring Boot 1 and Spring Boot 2 handle the load well, and the 95th percentile response time stays only a few milliseconds above the expected value of 300 ms.
At higher concurrency levels, the async non-blocking IO and reactive support in Spring Boot 2 start to show their strength: the 95th percentile response time, even under a very heavy load of 5000 concurrent users, remains at around 312 ms. Spring Boot 1, in contrast, records a lot of failures and high response times at these concurrency levels.
Details
My set-up for the performance test is the following:
The sample applications expose an endpoint (/passthrough/message) which in turn calls a downstream service. The request message to the endpoint looks something like this:
```json
{
  "id": "1",
  "payload": "sample payload",
  "delay": 3000
}
```
The downstream service would delay based on the "delay" attribute in the message (in milliseconds).
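The downstream service itself is not shown in this post. As a rough sketch of its behavior in plain Java (the `acknowledge` helper and the ack format are hypothetical, not the actual implementation), it amounts to:

```java
// Hypothetical sketch of the downstream service's behavior: wait for the
// requested number of milliseconds, then acknowledge the message id.
public class DownstreamSketch {

    // Simulates processing a message whose "delay" attribute is delayMillis.
    public static String acknowledge(String id, long delayMillis) throws InterruptedException {
        Thread.sleep(delayMillis); // the configured artificial processing delay
        return "ack:" + id;
    }

    public static void main(String[] args) throws InterruptedException {
        // A message with id "1" and a 300 ms delay, as in the load test.
        System.out.println(acknowledge("1", 300)); // prints "ack:1" after ~300 ms
    }
}
```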
Spring Boot 1 Application
I have used Spring Boot 1.5.8.RELEASE for the Boot 1 version of the application. The endpoint is a simple Spring MVC controller which in turn uses Spring's RestTemplate to make the downstream call. Everything is synchronous and blocking, and I have used the default embedded Tomcat container as the runtime. This is the raw code for the downstream call:

```java
public MessageAck handlePassthrough(Message message) {
    ResponseEntity<MessageAck> responseEntity =
            this.restTemplate.postForEntity(targetHost + "/messages", message, MessageAck.class);
    return responseEntity.getBody();
}
```
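A back-of-envelope calculation hints at why this blocking model hits a ceiling: each in-flight request holds a Tomcat worker thread for the full downstream delay, so sustained throughput is capped at pool size divided by service time. A minimal sketch, assuming Tomcat's default pool of 200 worker threads and the ~300 ms downstream delay used in the test:

```java
// Little's law applied to a blocking servlet container (assumption: Tomcat's
// default maxThreads of 200, each request blocking for the full ~300 ms).
public class BlockingCapacity {

    public static double maxRequestsPerSecond(int workerThreads, double serviceTimeSec) {
        // Each thread can serve at most 1/serviceTimeSec requests per second.
        return workerThreads / serviceTimeSec;
    }

    public static void main(String[] args) {
        // 200 threads / 0.3 s per request ≈ 666.7 requests/sec at best.
        System.out.println(maxRequestsPerSecond(200, 0.3));
    }
}
```

Beyond that rate, requests queue up and response times (or failures) climb, which matches the degradation observed at high user counts.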
Spring Boot 2 Application
The Spring Boot 2 version of the application exposes a Spring WebFlux based endpoint and uses WebClient, the new non-blocking, reactive alternative to RestTemplate, to make the downstream call. I have also used Kotlin for the implementation, which has no bearing on the performance. The runtime server is Netty:

```kotlin
import org.springframework.http.HttpHeaders
import org.springframework.http.MediaType
import org.springframework.web.reactive.function.BodyInserters.fromObject
import org.springframework.web.reactive.function.client.ClientResponse
import org.springframework.web.reactive.function.client.WebClient
import org.springframework.web.reactive.function.client.bodyToMono
import org.springframework.web.reactive.function.server.ServerRequest
import org.springframework.web.reactive.function.server.ServerResponse
import org.springframework.web.reactive.function.server.bodyToMono
import reactor.core.publisher.Mono

class PassThroughHandler(private val webClient: WebClient) {

    fun handle(serverRequest: ServerRequest): Mono<ServerResponse> {
        val messageMono = serverRequest.bodyToMono<Message>()
        return messageMono.flatMap { message ->
            passThrough(message)
                .flatMap { messageAck ->
                    ServerResponse.ok().body(fromObject(messageAck))
                }
        }
    }

    fun passThrough(message: Message): Mono<MessageAck> {
        return webClient.post()
            .uri("/messages")
            .header(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
            .header(HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE)
            .body(fromObject<Message>(message))
            .exchange()
            .flatMap { response: ClientResponse ->
                response.bodyToMono<MessageAck>()
            }
    }
}
```
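Why does this scale so much better? In the non-blocking model, an in-flight downstream call is a registered callback rather than a blocked thread, so a handful of event-loop threads can hold thousands of concurrent requests. The following is a rough, self-contained analogy in plain Java (not Spring or Reactor code; the class and method names are my own for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Analogy for the event-loop model: each "downstream call" is a callback
// scheduled on a single thread instead of a thread parked in a blocking call.
public class EventLoopSketch {

    // Fires `requests` simulated downstream calls, each completing after
    // delayMillis, on ONE scheduler thread; returns total elapsed millis.
    public static long completeAll(int requests, long delayMillis) {
        ScheduledExecutorService loop = Executors.newSingleThreadScheduledExecutor();
        List<CompletableFuture<String>> inFlight = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 0; i < requests; i++) {
            CompletableFuture<String> ack = new CompletableFuture<>();
            // Register a completion callback instead of blocking for the delay.
            loop.schedule(() -> ack.complete("ack"), delayMillis, TimeUnit.MILLISECONDS);
            inFlight.add(ack);
        }
        CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
        loop.shutdown();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // 1000 concurrent 300 ms "calls" finish in barely more than 300 ms
        // of wall-clock time, on a single thread.
        System.out.println(completeAll(1000, 300) + " ms");
    }
}
```

A blocking design would need 1000 parked threads to achieve the same concurrency; here one thread suffices because the waiting happens in the scheduler, not on a call stack.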
Details of the Performance Test
The test is simple: for different numbers of concurrent users (300, 1000, 1500, 3000, 5000), I send a message with the delay attribute set to 300 ms; each user repeats the scenario 30 times, with a pause of 1 to 2 seconds between requests. I am using the excellent Gatling tool to generate this load.
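From these parameters, a rough steady-state estimate of the offered load can be derived (my own back-of-envelope arithmetic, assuming each user's cycle is one ~300 ms response plus an average 1.5 s pause):

```java
// Back-of-envelope offered load for a closed-loop Gatling-style test:
// each user issues one request per cycle, where a cycle is the response
// time plus the think-time pause.
public class OfferedLoad {

    public static double requestsPerSecond(int users, double responseSec, double avgPauseSec) {
        double cycleSec = responseSec + avgPauseSec; // one request per user per cycle
        return users / cycleSec;
    }

    public static void main(String[] args) {
        // Heaviest scenario: 5000 users, ~0.3 s response, pauses uniform in
        // [1, 2] s (average 1.5 s) -> roughly 2800 requests/sec.
        System.out.println(requestsPerSecond(5000, 0.3, 1.5));
    }
}
```

That estimated ~2800 req/s is several times what a 200-thread blocking container can sustain with a 300 ms downstream delay (~667 req/s), which is consistent with the failures Boot 1 shows at the top load levels.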
Results
These are the results as captured by Gatling, one side-by-side pair of report screenshots (Boot 1 | Boot 2) per load level:

300 concurrent users: [Gatling reports: Boot 1 | Boot 2]

1000 concurrent users: [Gatling reports: Boot 1 | Boot 2]

1500 concurrent users: [Gatling reports: Boot 1 | Boot 2]

3000 concurrent users: [Gatling reports: Boot 1 | Boot 2]

5000 concurrent users: [Gatling reports: Boot 1 | Boot 2]