An Analysis of the Performance of WebSockets in Various Programming Languages (2021)

62 points by max0563 10 hours ago

If the author is reading this, I think a single repository would be more appropriate than multiple repos [1]. It would be nice to set things up so we can simply git pull, docker run, and execute the benchmarks for each language sequentially.

Something that stood out to me is the author’s conclusion that "Node.js wins." However, both the Node.js and C++ versions use the same library, uWebSockets! I suspect the actual takeaway is this:

"uWebSockets wins, and the uWebSockets authors know their library well enough that even their JavaScript wrapper outperforms my own implementation in plain C++ using the same library!" :-p

Makes me wonder if there’s something different that could be done in Go to achieve better performance. Alternatively, this may highlight which language/library makes it easier to do the right thing out of the box (for example, it seems easier to use uWebsockets in nodejs than in C++). TechEmpower controversies also come to mind, where "winning" implementations often don’t reflect how developers typically write code in a given language, framework, or library.

1: https://github.com/matttomasetti?tab=repositories&q=websocke...

latch 8 hours ago

Their explanation for why Go performs badly didn't make any sense to me. I'm not sure if they don't understand how goroutines work, if I don't understand how goroutines work or if I just don't understand their explanation.

Also, in the end, they didn't use the JSON payload. It would have been interesting if they had just written a static string. I'm curious how much of this is really measuring JSON [de]serialization performance.
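
One rough way to check how much of the cost is JSON work is to time a handler that decodes and re-encodes every message against one that returns a prebuilt payload. The sketch below is only an illustration: the `event` struct, its field names, and the iteration count are assumptions, not the benchmark's actual message format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// event is a hypothetical payload shape; the benchmark's real message
// format may differ.
type event struct {
	Count int   `json:"c"`
	TS    int64 `json:"ts"`
}

// staticReply is a prebuilt response that needs no serialization per message.
var staticReply = []byte(`{"c":1,"ts":0}`)

// jsonReply decodes an incoming message and encodes a fresh response,
// which is the per-message work a JSON-based handler has to do.
func jsonReply(in []byte) ([]byte, error) {
	var e event
	if err := json.Unmarshal(in, &e); err != nil {
		return nil, err
	}
	e.Count++
	return json.Marshal(e)
}

// timeIt runs fn n times and returns the elapsed wall-clock time.
func timeIt(n int, fn func()) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		fn()
	}
	return time.Since(start)
}

func main() {
	in := []byte(`{"c":0,"ts":0}`)
	const n = 100000
	withJSON := timeIt(n, func() { _, _ = jsonReply(in) })
	static := timeIt(n, func() { _ = staticReply })
	fmt.Printf("json: %v, static: %v\n", withJSON, static)
}
```

The absolute numbers will vary by machine; the point is only the gap between the two paths.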

Finally, it's worth pointing out that WebSocket is a standard. It's possible that some of these implementations follow the standard better than others. For example, WebSocket requires that a text message be valid UTF8. Personally, I think that's a dumb requirement (and in my own websocket server implementation for Zig, I don't enforce this - if the application wants to, it can). But it's completely possible that some implementations enforce this and others don't, and that (along with every other check) could make a difference.
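
To make the UTF-8 point concrete, here is a minimal sketch of the check a strict server would run on every text message. The opcode values and close code 1007 come from RFC 6455; the `checkMessage` function and its `strict` flag are hypothetical names for illustration, not any particular library's API.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// Data-frame opcodes defined by RFC 6455.
const (
	opText   = 0x1
	opBinary = 0x2
)

// closeInvalidPayload is the RFC 6455 close code for payload data that is
// inconsistent with the message type, such as non-UTF-8 bytes in a text message.
const closeInvalidPayload = 1007

// checkMessage returns 0 when the message is acceptable, or a close code when
// strict validation is on and a text message is not valid UTF-8.
// Binary messages are never validated.
func checkMessage(opcode byte, payload []byte, strict bool) int {
	if strict && opcode == opText && !utf8.Valid(payload) {
		return closeInvalidPayload
	}
	return 0
}

func main() {
	bad := []byte{0xff, 0xfe}
	fmt.Println(checkMessage(opText, []byte("héllo"), true)) // valid text
	fmt.Println(checkMessage(opText, bad, true))             // strict server rejects
	fmt.Println(checkMessage(opText, bad, false))            // lenient server accepts
}
```

A strict implementation pays for that scan on every text message, which is the kind of per-message cost that could differ between the libraries compared.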

klabb3 36 minutes ago

> Their explanation for why Go performs badly didn't make any sense to me.
To me, the whole paper is full of misunderstanding, at least the analysis. There's just speculation based on caricatures of the language, like "node is async", "c++ is low level" etc. The fact that their C++ impl using uWebSockets was significantly slower than Node, which used uWebSockets bindings, should have led them to question the test setup (they probably used threads, which defeats the purpose of uWebSockets).
Anyway... The "connection time" is just the HTTP handshake. It could be included as a side note. What's important in WS deployments are:
- Unique message throughput (the only thing measured afaik).
- Broadcast/"multicast" throughput, i.e. say you have 1k subscribers you wanna send the same message.
- Idle memory usage (for say chat apps that have low traffic - how many peers can a node maintain)
To me, the champion is uWebSockets. That's the entire reason why "Node" wins - those language bindings were written by the same genius who wrote that lib. Note that uWebSockets doesn't have TLS support, so whatever reverse proxy you put in front is gonna dominate usage because all of them have higher overheads, even nginx.
Interesting to note is that uWebSockets perf (especially memory footprint) can't be achieved even in Go, because of the goroutine overhead (there's no way in Go to read/write from multiple sockets from a single goroutine, so you have to spend 2 goroutines for realtime r/w). It could probably be achieved with Tokio though.

tgv 5 hours ago

> I'm curious how much of this is really measuring JSON [de]serialization performance.
Well, they did use the standard library for that, so quite a bit, I suppose. That thing is slow. I've got no idea how fast those functions are in other languages, but you're right that it would ruin the idea behind the benchmark.

vandot 8 hours ago

They didn’t use goroutines, which explains the poor perf. https://github.com/matttomasetti/Go-Gorilla_Websocket-Benchm...
Also, this paper is from Feb 2021.
- windlep 8 hours ago
  
  I was under the impression that the underlying net/http library uses a new goroutine for every connection, so each websocket gets its own goroutine. Or is there somewhere else you were expecting goroutines in addition to the one per connection?
  - donjoe 6 hours ago
    
    Which is perfectly fine. However, you will be able to process only a single message per connection at once.
    What you would do in Go is:
    - either start a new goroutine per message
    - or set up a worker pool with a fixed number of goroutines that accepts messages for processing
    
    jand 2 hours ago
    
    Another option is to have a read-, and a write-pump goroutine associated with each gorilla ws client. I found this useful for gateways wss <--> *.
- initplus 4 hours ago
  
  http.ListenAndServe is implemented under the hood with a new goroutine per incoming connection. You don't have to explicitly use goroutines here, it's the default behaviour.
  - necrobrit 4 hours ago
    
    Yes _however_ the nodejs benchmark at least is handling each message asynchronously, whereas the go implementation is only handling connections asynchronously.
    The client fires off all the requests before waiting for a response: https://github.com/matttomasetti/NodeJS_Websocket-Benchmark-... so the comparison isn't quite apples to apples.
    Edit to add: looks like the same goes for the C++ and Rust implementations. So I think what we might be seeing in this benchmark (particularly Node vs C++, since it is the same library) is that asynchronously handling each message is beneficial, and the Go standard library's JSON parser is slow.
    Edit 2: Actually I think the C++ version is async for each message! Don't know how to explain that then.
    
    josephg 2 hours ago
    
    Well, tcp streams are purely sequential. It’s the ideal use case for a single process, since messages can’t be received out of order. There’s no computational advantage to “handling each message asynchronously” unless the message handling code itself does IO or something. And that’s not the responsibility of the websocket library.
ikornaselur 6 hours ago

Yeah I thought this looked familiar.. I went through this article about a year and a half ago when exploring WebSockets in Python for work. With some tuning and using different libraries + libuv we were easily able to get similar performance to NodeJS.
I had a blog post somewhere to show the testing and results, but can't seem to find it at the moment.

simpaticoder 7 hours ago

Interesting that https://github.com/uNetworking/uWebSockets.js (which is C++ with node bindings) outperforms the raw C++ uWebSockets implementation.

It's also interesting that https://github.com/websockets/ws does not appear in this study, given that in the node ecosystem it is ~3x more likely to be used (not a perfect measurement but ws has 28k github stars vs uWebSockets 8k stars)

travisgriggs 8 hours ago

Thanks for the free access links. I did read through a bit.

The title is misleading because exactly one implementation was chosen for each of the tested languages. They conclude “do not use Python” because the Python websockets library performs pretty poorly.

Each language is scored based on the library chosen. I have to believe there are more options for some of these languages.

As someone who is implementing an Elixir LiveView app right now, I was particularly curious to see how Elixir performed given LiveView’s reliance on websockets, but Elixir didn’t make the cut.

nelsonic 6 hours ago

Was also surprised they omitted Elixir/Erlang from the list of languages. Crazy considering how many messaging apps use OTP on the backend.

wuschel 2 hours ago

Is this a peer reviewed paper? It does not seem to be. At a first glance, the researchgate URI and the way the title was formulated made me think it would be the case.

fnordpiglet 8 hours ago

(2021) Was surprised it used a deprecated Rust crate until I noticed how out of date it is

frizlab an hour ago

Not including Swift in such research seems to be a big oversight to me.

5Qn8mNbc2FNCiVV 8 hours ago

Too bad that uWebsockets was used for Node because a lot of higher level libraries are built on top of https://www.npmjs.com/package/ws

windlep 8 hours ago

I was able to make a uWebsockets adapter for NestJS pretty easily. It's a bit sensitive of a library to integrate though, a single write when the connection is gone and you get a segfault, which means a lot of checking before writing if you've yielded since you last checked. This was a few years ago, perhaps they fixed that.

cess11 an hour ago

I'd like to know why Elixir and Erlang were excluded.

cess11 an hour ago

Seems the author went silent after this, maybe he decided to run a café or something instead.

zo1 5 hours ago

Was this published as-is to some sort of prominent CS journal? I honestly can't tell from the link. If that's the case, I'm very disappointed and would have a few choice words about the state of "academia".

ndusart 3 hours ago

Yes, that would be concerning indeed...
The author couldn't tell why he didn't manage to get the C or Python programs to run, but figured the language is probably to blame for some obscure reason.
He also mentioned that he should have implemented multithreading in C++ to be comparable with Node, but meh, that's probably not his concern either, let's compare them as is ^^
Also he doesn't mention the actual language of the library used, but that would have voided the interest of the article, so I can quite understand that omission :P
But in the end, nothing can be learned from this, and it is hard to believe this is what "research" can produce.
- josephg 2 hours ago
  
  Yeah it’s a rubbish paper. It’s just a comparison of some websocket implementations at some particular point in time. It tells you how fast some of the fastest WS implementations are in absolute terms, but there are no broad conclusions you can make other than the fact that there’s more room for optimisation in a few libraries. Whoopty doo. News at 11.

paulgb 9 hours ago

The SSRN link doesn’t have a login-wall: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3778525

chrisweekly 9 hours ago

Thanks! Here's the direct link to the ungated PDF: https://download.ssrn.com/21/02/03/ssrn_id3778525_code456891...
TL;DR: NodeJS is the clear winner, and Python far and away the worst of the bunch.

indulona 4 hours ago

The DX for websockets in Go (gorilla) is horrible. But I do not believe these numbers one bit.