Impressive benchmarks. Reaching 1M msgs/sec on a 2 vCPU instance is a great showcase for Crystal. As a Crystal core member, I've always seen LavinMQ as a prime example of performance-oriented engineering, especially with how it handles I/O and minimizes syscalls to get the most out of the hardware.
If you want to see Crystal's concurrency and type system in a serious production environment, this is the project to check out. Kudos to the LavinMQ team for their work since 2020.
When a leader fails, does a publisher confirm guarantee the message survived on at least one follower?
I'm Anders, a LavinMQ core developer.
We send the confirm as soon as the message hits the mmap and is appended to the local replication queue because that keeps publish latency at memory-copy speed, and the loss window is narrow: only messages still in the page cache or sitting in the replication queue when the leader dies. Config options are coming to opt into stronger guarantees (msync on the leader, waiting for follower ack, or follower fsync), at the cost of added latency.
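To make the ordering concrete, here is a rough sketch in Python (not Crystal, and not LavinMQ's actual code). `LeaderSketch`, `publish`, `replicate_one`, and `at_risk` are hypothetical names invented for illustration, and an anonymous mmap stands in for a segment file. It models only the replication-queue half of the loss window, not unflushed page cache.

```python
import mmap
from collections import deque


class LeaderSketch:
    """Hypothetical leader: confirms after the mmap write and the enqueue."""

    def __init__(self, size: int = 4096) -> None:
        # Anonymous mapping as a stand-in for an mmap'd segment file.
        self.segment = mmap.mmap(-1, size)
        self.size = size
        self.pos = 0
        # Messages confirmed to the publisher but not yet sent to a follower.
        self.replication_queue: deque = deque()

    def publish(self, msg: bytes) -> bool:
        end = self.pos + len(msg)
        if end > self.size:
            raise ValueError("segment full")
        self.segment[self.pos:end] = msg    # plain memory copy, no msync
        self.pos = end
        self.replication_queue.append(msg)  # queued, follower has not acked
        return True                         # the confirm is sent at this point

    def replicate_one(self, follower: list) -> None:
        # Ship the oldest queued message to a follower (here, just a list).
        if self.replication_queue:
            follower.append(self.replication_queue.popleft())

    def at_risk(self) -> list:
        # Confirmed messages that would be lost if the leader died right now.
        return list(self.replication_queue)


leader = LeaderSketch()
follower: list = []
for m in (b"a", b"b", b"c"):
    leader.publish(m)
leader.replicate_one(follower)
print(leader.at_risk())  # [b'b', b'c']
```

All three publishes are confirmed, but only `b"a"` has reached the follower, so the other two are the loss window. The stronger options mentioned above would move the `return True` to after an msync or a follower ack.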
Hi Anders, thank you for the clarity! That makes sense, it sounds like the "publisher confirm" feature for RabbitMQ that we used before! And it's good to know the stronger options are coming :D
Could it be worth looking into the Raft algorithm?
Yes, we do use Raft for leader election.