Aim
In this post, I would like to see how to build a concurrency message queue in C++ that is fast. The fastest message queue I can make at this moment is the Lock Free Ring Buffer message queue. There is more details in the result part.
Source
Testing method
There is a speed test framework implemented in SpeedTest class. No matter what type of message queue we are using, a number of message will be pushed into the MsgQ by multiple number of threads at the same time there are the same number of threads reading message from that MsgQ.
We then have a timer to count the total time that is needed to process all the messages.
We have different type of message queue:
Lock Free MsgQ
- MsgQLockFree with Ring Buffer
Using Linux Concurrency library
- MsgQOneLock with Ring Buffer
- MsgQTwoLock with Ring Buffer
- MsgQOneLock with LinkList
- MsgQTwoLock with LinkList
Using C++ standard library
- MsgQOneLock with Ring Buffer
- MsgQTwoLock with Ring Buffer
- MsgQOneLock with LinkList
- MsgQTwoLock with LinkList
As one can see that there are currently 3 types of message queue:
- Lock free
- One lock
- Two lock
We would like to see their difference in speed.
More on Queue
Lock Free
I learn this from
here. I made a simplified version. This is a concurrency message queue but only support 2 threads. 1 thread is pushing message and 1 thread is getting message.
In this message queue we have no mutex lock. There are 2 pointers, one is pointing to the next empty slot(Tail) and the other is pointing to the next ready message slot(Head). The length of the message queue is a power of 2. Since there is no mutex lock, __sync_val_compare_and_swap() is used.
One Lock
The other type of message queue is a queue that is using one lock.
There is one mutex protecting the Head and Tail pointers. More than 2 threads can be supported.
Two Lock
There are two mutex protecting the Head and Tail pointers. More than 2 threads can be supported.
The Test
Hardware
- Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
- 4 core
- 4 threads
- Ubuntu 20.04 LTS
Test Info
- 300000 messages
- Threads
- 1 in, 1 out
- 2 in , 2 out
- 4 in, 4 out
- 10 in, 10 out
- 5 total tests and get the average time spend
Result
MsgQ Type | Thread IN | Thread OUT | Total Message | Total test | Time spend Msec |
Lock Free | 1 | 1 | 300000 | 5 | 46 |
One Lock | 1 | 1 | 300000 | 5 | 115 |
Two Lock | 1 | 1 | 300000 | 5 | 102.5 |
The above grape comparing 1 thread in and 1 thread out case. The lock free message queue is the fastest one. This is expected.
Thread IO | Total Message | Total test | Msec One Lock | Msec Two Lock |
Two | 300000 | 5 | 122 | 135.5 |
Four | 300000 | 5 | 133.5 | 119 |
Ten | 300000 | 5 | 150.5 | 124 |
The above graph shows that the speed of MsgQ with 2 locks is better. We have 2 threads IN and OUT, 4 threads and 10 threads in these tests.
Raw Result
MsgQLockFree_Linux | 1 | 1 | 300000 | 5 | 46 |
MsgQLockFree_LinkList_Std | 1 | 1 | 300000 | 5 | 100 |
MsgStackLockFree_LinkList_Std | 1 | 1 | 300000 | 5 | 296 |
MsgStackLockFree_LinkList_Std | 2 | 2 | 300000 | 5 | 406 |
MsgStackLockFree_LinkList_Std | 4 | 4 | 300000 | 5 | 806 |
MsgStackLockFree_LinkList_Std | 10 | 10 | 300000 | 5 | 568 |
MsgQOneLock_RingBuf_Linux | 1 | 1 | 300000 | 5 | 111 |
MsgQOneLock_RingBuf_Linux | 2 | 2 | 300000 | 5 | 109 |
MsgQOneLock_RingBuf_Linux | 4 | 4 | 300000 | 5 | 135 |
MsgQOneLock_RingBuf_Linux | 10 | 10 | 300000 | 5 | 186 |
MsgQTwoLock_RingBuf_Linux | 1 | 1 | 300000 | 5 | 86 |
MsgQTwoLock_RingBuf_Linux | 2 | 2 | 300000 | 5 | 135 |
MsgQTwoLock_RingBuf_Linux | 4 | 4 | 300000 | 5 | 132 |
MsgQTwoLock_RingBuf_Linux | 10 | 10 | 300000 | 5 | 128 |
MsgQOneLock_RingBuf_Std | 1 | 1 | 300000 | 5 | 119 |
MsgQOneLock_RingBuf_Std | 2 | 2 | 300000 | 5 | 137 |
MsgQOneLock_RingBuf_Std | 4 | 4 | 300000 | 5 | 106 |
MsgQOneLock_RingBuf_Std | 10 | 10 | 300000 | 5 | 115 |
MsgQTwoLock_RingBuf_Std | 1 | 1 | 300000 | 5 | 100 |
MsgQTwoLock_RingBuf_Std | 2 | 2 | 300000 | 5 | 136 |
MsgQTwoLock_RingBuf_Std | 4 | 4 | 300000 | 5 | 106 |
MsgQTwoLock_RingBuf_Std | 10 | 10 | 300000 | 5 | 120 |
MsgQOneLock_LinkList_Linux | 1 | 1 | 300000 | 5 | 229 |
MsgQOneLock_LinkList_Linux | 2 | 2 | 300000 | 5 | 201 |
MsgQOneLock_LinkList_Linux | 4 | 4 | 300000 | 5 | 195 |
MsgQOneLock_LinkList_Linux | 10 | 10 | 300000 | 5 | 197 |
MsgQTwoLock_LinkList_Linux | 1 | 1 | 300000 | 5 | 179 |
MsgQTwoLock_LinkList_Linux | 2 | 2 | 300000 | 5 | 268 |
MsgQTwoLock_LinkList_Linux | 4 | 4 | 300000 | 5 | 225 |
MsgQTwoLock_LinkList_Linux | 10 | 10 | 300000 | 5 | 213 |
MsgQOneLock_LinkList_Std | 1 | 1 | 300000 | 5 | 250 |
MsgQOneLock_LinkList_Std | 2 | 2 | 300000 | 5 | 189 |
MsgQOneLock_LinkList_Std | 4 | 4 | 300000 | 5 | 209 |
MsgQOneLock_LinkList_Std | 10 | 10 | 300000 | 5 | 223 |
MsgQTwoLock_LinkList_Std | 1 | 1 | 300000 | 5 | 166 |
MsgQTwoLock_LinkList_Std | 2 | 2 | 300000 | 5 | 267 |
MsgQTwoLock_LinkList_Std | 4 | 4 | 300000 | 5 | 212 |
MsgQTwoLock_LinkList_Std | 10 | 10 | 300000 | 5 | 219 |
MsgQOneLock_Queue_Linux | 1 | 1 | 300000 | 5 | 203 |
MsgQOneLock_Queue_Linux | 2 | 2 | 300000 | 5 | 158 |
MsgQOneLock_Queue_Linux | 4 | 4 | 300000 | 5 | 152 |
MsgQOneLock_Queue_Linux | 10 | 10 | 300000 | 5 | 172 |
MsgQOneLock_Queue_Std | 1 | 1 | 300000 | 5 | 178 |
MsgQOneLock_Queue_Std | 2 | 2 | 300000 | 5 | 151 |
MsgQOneLock_Queue_Std | 4 | 4 | 300000 | 5 | 150 |
MsgQOneLock_Queue_Std | 10 | 10 | 300000 | 5 | 154 |