Benchmarking GraphQL solutions in the JS/TS landscape

We are definitely past the GraphQL hype these days. Articles with (valid) criticism are being published, and professionals who understand the strengths of this technology use it to their advantage.

If you are in the latter group, you have to decide how you’d like to build your GraphQL stack. What are the pieces of the puzzle? Which libraries should you use?

This article focuses on the operational speed of those stacks.

Performance is important and can vary

I got into the benchmarking topic the same way as others before me - the solution I used was just slow, and I was looking for answers and alternatives.

In the GraphQL world we have a bunch of “primitives” like schemas, models, resolvers, or data loaders, so all solutions naturally gravitate around those. They differ in some approaches, but the core has to be similar because it solves similar problems.

If the solutions are similar, let the fastest win!

Pieces of the puzzle

It might be surprising, but I identified 7 levels of the GraphQL stack. Some are optional but pretty common in mature projects, so they should be taken into account.

7 levels of GraphQL stack
Levels 1-3 are mandatory for GraphQL to work.

Runtime environment

NodeJS is not the only JS runtime we can use on the server. There are alternatives, and quite interesting ones, with increasing popularity.

We will test:

  • NodeJS
  • Deno
  • Bun
Runtimes logos

“Web library/framework”

The HTTP module in those environments is ok but too basic for many common things that servers have to do. That’s why we use libraries like:

  • Express
  • Fastify
  • Koa
Web libraries logos

GraphQL library

Once we have a runtime for JS code and a nice way of handling the traffic, we need to start thinking about something that will handle the GraphQL requests.

Options are:

  • Apollo Server
  • Mercurius (for Fastify)
  • GraphQL Yoga
  • graphql-http
GraphQL libraries logos

That’s not everything available on the market. For example, I was thinking about benchmarking benzene as well, but I noticed the last activity in the repo was 2 years ago. Even if other solutions could rank high, I didn’t want to “recommend” something I’m not sure is still maintained.

Schema first vs code first

Most of those servers support both approaches - schema first and code first. After all, what the server cares about is the schema, whether provided (mostly) manually or generated fully dynamically. So theoretically this shouldn’t matter at runtime, but rather a bit at build time or start time.

The common (modern?) approach is to use TypeScript and generate the schema from the types we already use in the project. This sounds smart, efficient, and shouldn’t affect the runtime speed.

Unfortunately, I found this not to be true - the operation speed is affected for one of the solutions (that’s a spoiler 🙊).

What we will test is:

  • type-graphql
  • pothos
Typing libraries logos

What is interesting about Pothos is that they state the following in the docs:

The core of Pothos adds 0 overhead at runtime

Almost as if it should be something that differentiates them from the other solution 😉
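
To give a feel for the code-first style, here is a minimal Pothos sketch (it uses the public @pothos/core API, but it’s a toy schema, not the one from the benchmark):

import SchemaBuilder from "@pothos/core";

const builder = new SchemaBuilder({});

// Code first: the schema is derived from this code - no SDL file needed.
builder.queryType({
  fields: (t) => ({
    hello: t.string({
      resolve: () => "world",
    }),
  }),
});

export const schema = builder.toSchema();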

Optimising queries execution with graphql-jit

In the path of handling a GraphQL query, the server/engine has to parse and validate the query, then execute it against the schema.

graphql-jit optimises this process by compiling queries into functions and memoizing them, so subsequent executions of the (cached) query are faster. That, of course, should shine in benchmarks that run the same query over and over again, but production systems will see the benefits too, as they receive the same queries frequently.
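
Roughly, the usage looks like this - a minimal sketch built on graphql-jit’s compileQuery API, with the memoization simplified to a plain Map keyed by the raw query string:

import { ExecutionResult, GraphQLSchema, parse } from "graphql";
import { CompiledQuery, compileQuery, isCompiledQuery } from "graphql-jit";

const cache = new Map<string, CompiledQuery>();

export function executeJit(
  schema: GraphQLSchema,
  query: string,
  variables?: Record<string, unknown>
): ExecutionResult | Promise<ExecutionResult> {
  let compiled = cache.get(query);
  if (!compiled) {
    const result = compileQuery(schema, parse(query));
    if (!isCompiledQuery(result)) {
      // Compilation failed (e.g. the document didn’t validate).
      return result;
    }
    cache.set(query, result);
    compiled = result;
  }
  // The compiled query skips parsing and validation on the hot path.
  return compiled.query(undefined, undefined, variables);
}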

So let’s add another thing to the stack when possible:

  • graphql-jit
Just in time

Notes:

  1. GraphQL Yoga has “Parsing and Validation Caching” switched on by default. For a fair comparison, I also tested the performance of Yoga when opting out of it. In the results you’ll see those entries with the -no-pav-cache suffix. But it’s not exactly the same as graphql-jit, so that variant was included as well, with the -jit suffix.
  2. I couldn’t make ApolloServer v4 work with graphql-jit. I found a way to do it with v3, but the executor function was removed in v4. Maybe some hacky solution could be done via Apollo Gateway to make it work, but… I guess ApolloServer is somehow not interested in this approach and instead proposes other “performance techniques” like Server-Side Caching or Automatic Persisted Queries. Anyway, if interested, you can follow this issue on GH.

Observability

In non-hobby projects, it’s really worth investing in tools that help us achieve observability of the system (see my colleagues talking about it). There are a bunch of solutions tied to specific vendors, but I don’t want this to be a comparison of paid solutions, so the only thing we are going to test here is:

  • @opentelemetry
OpenTelemetry
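
For reference, here is a minimal sketch of how the GraphQL instrumentation can be wired up with @opentelemetry/sdk-node and @opentelemetry/instrumentation-graphql - the two flags map to the -itrs and -irs variants you’ll see in the results:

import { NodeSDK } from "@opentelemetry/sdk-node";
import { GraphQLInstrumentation } from "@opentelemetry/instrumentation-graphql";

const sdk = new NodeSDK({
  instrumentations: [
    new GraphQLInstrumentation({
      // Skip spans for fields resolved 1:1 from the parent object
      // (the “-itrs” variants in the results).
      ignoreTrivialResolveSpans: true,
      // Skip resolver spans entirely (the “-irs” variants).
      ignoreResolveSpans: false,
    }),
  ],
});

sdk.start();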

Frameworks

So far we’ve been going from the bottom to the top, adding more and more tools. But this strategy requires knowledge - you have to know what you need in the stack. I very much understand that for folks searching for “a GraphQL server for NodeJS” or “how to do GraphQL on the server”, that’s not the natural way of going through this.

Folks new to the topic will rather look for a more rounded solution (and rightly so). In this spirit, we will also test the most popular JS backend framework:

  • NestJS
NestJS - the most popular JS back-end framework
State of JS 2023 - Back-end Frameworks. Nest is built on top of Express/Fastify and accounts for 30% of responses in the survey. It has ~3.6M weekly downloads from the npm registry.

Benchmarking

The repository with the benchmark is located here: https://github.com/tniezurawski/graphql-benchmarks

I must say I’m inspired by Ben Awad’s work, and he was inspired by Fastify’s benchmarks. I was thinking about contributing to Ben’s fork, but it was quite out of date compared to Fastify’s, and I wanted to capture the current state - whatever that means in terms of techniques, tools, and library versions. I discovered that some newer versions of libraries don’t want to play together anymore, like ApolloServer and JIT, which I mentioned earlier.

I also really wanted to give all runtime environments a try and see if any trends appear.

Feel free to contribute, and I’ll update this post.

Methodology

The plan is simple:

  1. Set up the server
  2. Run autocannon: autocannon -c 100 -d 40 -p 10 localhost:3000/graphql

Autocannon fires requests against the running server and gathers information about how fast it is. Here’s an explanation of the arguments:

  • -c 100 - 100 concurrent connections to the server. It simulates 100 users connecting to the server at the same time.
  • -d 40 - the test will run for 40 seconds. I have a lot of cases to run, and I’m following Fastify here. It should be enough to spot potential issues like memory leaks or the solution choking under a certain consistent load.
  • -p 10 - the pipelining factor. Each connection sends up to 10 requests in sequence, without waiting for responses in between. Useful for testing throughput.

The query is simplistic, yet it will stress the solutions:

{
  authors {
    id
    firstName
    lastName
    fullName # Resolver - string concatenation
    md5 # Resolver - hashing function
    books { # Async resolver
      id
      name
      genre
    }
  }
}

Keep in mind that all the data is mocked. Even when we have to load the books of an author, no network request is made.
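
For illustration, the field resolvers behave roughly like this - a hypothetical sketch (the booksByAuthorId fixture is made up; the real implementations live in the benchmark repo):

import { createHash } from "node:crypto";

// Hypothetical in-memory fixture standing in for the repo’s mocked data.
const booksByAuthorId: Record<string, { id: string; name: string; genre: string }[]> = {};

type Author = { id: string; firstName: string; lastName: string };

const resolvers = {
  Author: {
    // String concatenation
    fullName: (author: Author) => `${author.firstName} ${author.lastName}`,
    // Hashing function
    md5: (author: Author) =>
      createHash("md5").update(author.firstName + author.lastName).digest("hex"),
    // Async resolver - but the data comes from mocks, so no network request
    books: async (author: Author) => booksByAuthorId[author.id] ?? [],
  },
};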

The dataset is small - 20 authors are returned. The response has 486 lines of formatted JSON.

~9 KB of response as uncompressed text

The results

The fastest solution is 6x faster than the slowest!

Results of the benchmark
Results on a bar chart - click to enlarge. See below for a tabular view.

The winner of this race is Fastify + Mercurius running on NodeJS with JIT switched on.

The difference is even more dramatic if we compare the first result that doesn’t use telemetry with the last result which does - then the first is almost 20x faster! 😮

Let’s dive in (tabular data available after hot takes).

Hot takes

If you need bite-sized takes, here’s what’s worth knowing:

  • Environment:
    • the results are quite unpredictable in my opinion. Some stacks work better on NodeJS, some on Bun, and some on Deno - do you see any correlation that could explain this? I guess I know too little about the differences between environments to draw any solid conclusions here.
  • “Web library/framework”:
    • fastify is the fastest, way ahead of express
    • I’m not sure if koa should even be part of the comparison. I got a lot of errors of this kind when stressing the solution:
Error: write EPIPE
at afterWriteDispatched (node:internal/stream_base_commons:159:15)
at writevGeneric (node:internal/stream_base_commons:142:3)
at Socket._writeGeneric (node:net:955:11)
at Socket._writev (node:net:964:8)
at doWrite (node:internal/streams/writable:596:12)
at clearBuffer (node:internal/streams/writable:775:5)
at Writable.uncork (node:internal/streams/writable:531:7)
at ServerResponse.end (node:_http_outgoing:1141:19)
at respond (--PATH--/graphql-benchmarks/node_modules/koa/lib/application.js:302:44)
at handleResponse (--PATH--/graphql-benchmarks/node_modules/koa/lib/application.js:184:34)

To my understanding, it fails under the heavy load of autocannon. It was still able to handle the traffic to some extent, but I’m not sure about those results and would stay away from Koa when using GraphQL. The errors were thrown for both the koa-apollo and koa-graphql-api stacks.

  • GraphQL library:
    • apollo-server is the slowest and drags down the rest of the stack
    • mercurius, together with fastify, is very fast
  • JIT:
    • graphql-jit - every solution with JIT is faster, and all the fastest solutions use it
    • with the above in mind, it’s really bad that ApolloServer doesn’t support it 🤷‍♂️
  • Schema first vs code first
    • I’m surprised to learn that type-graphql adds an overhead. It’s not super high, though, and oscillates between 3% and 5%. Would you accept a 3-5% performance drop to use types with type-graphql?
    • Pothos, on the other hand, is responsible for ~1% performance degradation, so I think it might be just noise in the results. It very likely keeps its promise of adding 0 overhead at runtime.
  • Observability:
    • gathering telemetry data is costly - as expected - but an average speed drop of 80% across solutions is ridiculous!
      • the issue is with gathering data about resolvers
      • not tracking trivial resolver spans (results with the -itrs suffix) makes it a bit better - whenever a field is returned 1:1, without a resolver function involved, we don’t track it. TBH, there’s not much interesting info in those spans anyway; from my experience they just show a resolver finishing within 1 ms
      • not tracking resolver spans at all (results with the -irs suffix) is a big deal - when you stop tracking resolvers, the telemetry overhead shrinks to a performance drop of only about 13%
      • so not tracking resolvers seems to be a must - otherwise, who can accept an 80% slowdown?
  • Frameworks:
    • If you follow the NestJS tutorial on building a GraphQL server, you’ll end up with express and ApolloServer, which is the slowest combination here
    • Even the absolute killer stack (fastify-mercurius-jit) is dragged down by NestJS and sees a 40% decrease in performance
    • I debugged the framework and see two bottlenecks. One of them can be mitigated by not using the @Parent() decorator in resolvers (see the sketch after this list), but the other is unavoidable as it touches the resolvers themselves. “Deparentification” accounts for an almost 30% performance boost, so it’s definitely worth doing. It’s a pity that the poor performance of @Parent() isn’t documented properly.
  • Absolute values of latency:
    • Comparing requests per second is important, but keep in mind the absolute latency values. The slowest solution with telemetry needs 526 ms to process a request without making any calls to a DB or external/internal services! That’s very bad.
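
For context, here is what the slow pattern looks like - a minimal sketch of a code-first NestJS resolver using the standard @nestjs/graphql decorators (the Author model is hypothetical; the benchmark’s “deparentification” variants avoid the @Parent() decorator):

import { Parent, ResolveField, Resolver } from "@nestjs/graphql";
import { Author } from "./author.model"; // hypothetical model class

@Resolver(() => Author)
export class AuthorResolver {
  // Every call to this field resolver goes through NestJS’s
  // parameter-resolution machinery - one of the bottlenecks above.
  @ResolveField(() => String)
  fullName(@Parent() author: Author): string {
    return `${author.firstName} ${author.lastName}`;
  }
}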

In a table

Stack (all 45) | Requests/s | Latency (ms) | Throughput (MB/s)
nodejs-fastify-mercurius-jit | 12754.2 | 77.8 | 110.2
nodejs-fastify-mercurius-pothos-jit | 12433.2 | 79.8 | 107.5
deno-yoga-jit | 12092.0 | 82.1 | 104.2
bun-fastify-mercurius-jit | 11745.8 | 84.6 | 101.0
bun-yoga-jit | 11057.8 | 89.8 | 95.0
nodejs-yoga-jit | 10941.2 | 90.8 | 94.5
nodejs-fastify-mercurius-type-graphql-jit | 9991.8 | 99.5 | 86.3
nodejs-nestjs-fastify-mercurius-jit-deparentification | 9904.6 | 100.3 | 85.6
deno-fastify-mercurius-jit | 9608.6 | 103.5 | 82.8
nodejs-nestjs-fastify-mercurius-jit | 7679.1 | 129.5 | 66.4
bun-yoga | 7501.2 | 132.6 | 64.5
bun-fastify-mercurius | 6784.0 | 146.7 | 58.3
nodejs-express-yoga-jit | 6765.6 | 147.0 | 58.8
nodejs-fastify-mercurius | 5786.2 | 172.0 | 50.0
deno-yoga | 5773.2 | 172.4 | 49.8
nodejs-fastify-mercurius-pothos | 5712.1 | 174.2 | 49.4
nodejs-fastify-mercurius-type-graphql | 5472.2 | 181.9 | 47.3
nodejs-yoga | 5443.6 | 182.8 | 47.0
nodejs-yoga-type-graphql | 5332.6 | 186.6 | 46.1
bun-graphql-http | 5319.1 | 187.1 | 45.7
nodejs-fastify-mercurius-open-telemetry-irs | 4976.9 | 199.0 | 43.0
deno-fastify-mercurius | 4828.4 | 206.1 | 41.6
nodejs-express-yoga | 4352.3 | 224.1 | 37.8
nodejs-express-yoga-pothos | 4124.5 | 235.2 | 35.9
nodejs-koa-graphql-api | 3925.4 | 245.2 | 33.9
nodejs-nestjs-fastify-mercurius | 3915.6 | 245.8 | 33.8
nodejs-yoga-no-pav-cache | 3913.3 | 246.0 | 33.8
nodejs-graphql-http | 3843.1 | 249.8 | 33.3
nodejs-express-yoga-type-graphql | 3625.2 | 262.7 | 31.5
nodejs-graphql-http-type-graphql | 3580.5 | 266.8 | 31.0
nodejs-express-graphql-http | 3157.2 | 293.7 | 27.4
nodejs-express-graphql-http-type-graphql | 3037.1 | 300.4 | 26.4
nodejs-koa-apollo | 2963.5 | 304.4 | 25.8
nodejs-fastify-mercurius-open-telemetry-itrs | 2772.8 | 314.5 | 24.0
nodejs-express-yoga-no-pav-cache | 2629.1 | 319.2 | 22.9
nodejs-express-apollo | 2612.8 | 321.2 | 22.9
nodejs-express-apollo-pothos | 2587.5 | 323.4 | 22.7
nodejs-express-apollo-type-graphql | 2478.8 | 316.6 | 21.7
nodejs-nestjs-fastify-apollo | 2467.7 | 311.0 | 21.4
nodejs-express-apollo-open-telemetry-irs | 2313.2 | 301.0 | 20.3
nodejs-nestjs-express-apollo | 2095.5 | 299.1 | 18.3
nodejs-nestjs-express-apollo-open-telemetry | 1553.0 | 377.3 | 13.6
nodejs-fastify-mercurius-open-telemetry | 751.6 | 462.8 | 6.5
nodejs-express-apollo-open-telemetry-itrs | 648.4 | 535.7 | 5.7
nodejs-express-apollo-open-telemetry | 641.5 | 526.3 | 5.6

Conclusions

Whether you use a framework like NestJS or not, fastify, mercurius, and graphql-jit should be your choice. When gathering telemetry data, don’t run on the defaults - switch off resolver tracking, and switch it back on only when you really have to dive into an issue.
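
To make that concrete, here is a minimal sketch of the winning combination - Mercurius exposes graphql-jit via its jit option (the schema below is a stand-in, not the benchmark’s):

import Fastify from "fastify";
import mercurius from "mercurius";

const schema = `
  type Query {
    hello: String
  }
`;

const resolvers = {
  Query: {
    hello: async () => "world",
  },
};

const app = Fastify();

// jit: 1 compiles a query with graphql-jit after it has been seen
// once, so every repeated query hits the compiled fast path.
app.register(mercurius, { schema, resolvers, jit: 1 });

app.listen({ port: 3000 });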

Best of luck!