block/blk-mq: use atomic_t for quiesce_depth to avoid lock contention on RT #552
blktests-ci[bot] wants to merge 1 commit into linus-master_base
… on RT

In RT kernel (PREEMPT_RT), commit 6bda857 ("block: fix ordering between checking QUEUE_FLAG_QUIESCED request adding") causes severe performance regression on systems with multiple MSI-X interrupt vectors.

The above change introduced spinlock_t queue_lock usage in blk_mq_run_hw_queue() to synchronize QUEUE_FLAG_QUIESCED checks with blk_mq_unquiesce_queue(). While this works correctly in standard kernel, it causes catastrophic serialization in RT kernel where spinlock_t converts to sleeping rt_mutex.

Problem in RT kernel:
- blk_mq_run_hw_queue() is called from IRQ thread context
- With multiple MSI-X vectors, all IRQ threads contend on the same queue_lock
- queue_lock becomes rt_mutex (sleeping) in RT kernel
- IRQ threads serialize and enter D-state waiting for lock
- Throughput drops from 640 MB/s to 153 MB/s

Solution:
Convert quiesce_depth to atomic_t and use it directly for quiesce state checking, eliminating QUEUE_FLAG_QUIESCED entirely. This removes the need for any locking in the hot path.

The atomic counter serves as both the depth tracker and the quiesce indicator (depth > 0 means quiesced). This eliminates the race window that existed between updating the depth and the flag.

Memory ordering is ensured by:
- smp_mb__after_atomic() after modifying quiesce_depth
- smp_rmb() before re-checking quiesce state in blk_mq_run_hw_queue()

Performance impact:
- RT kernel: eliminates lock contention, restores full throughput
- Non-RT kernel: atomic ops are similar cost to the previous spinlock acquire/release, no regression expected

Test results on RT kernel:
Hardware: Broadcom/LSI MegaRAID 12GSAS/PCIe Secure SAS39xx (megaraid_sas driver, 128 MSI-X vectors, 120 hw queues)
- Before: 153 MB/s, IRQ threads in D-state
- After: 640 MB/s, no IRQ threads blocked

Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: 6bda857 ("block: fix ordering between checking QUEUE_FLAG_QUIESCED request adding")
Cc: stable@vger.kernel.org
Signed-off-by: Ionut Nechita <ionut.nechita@windriver.com>
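The idea of one atomic counter acting as both depth tracker and quiesce indicator can be illustrated with a small userspace sketch. This is not the patch itself: `struct demo_queue` and every `demo_*` function are hypothetical names, and C11 sequentially consistent atomics stand in for the kernel's atomic_t together with the smp_mb__after_atomic() and smp_rmb() barriers named in the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request queue; not the kernel's struct. */
struct demo_queue {
	atomic_int quiesce_depth;	/* depth > 0 means quiesced */
	atomic_int dispatched;		/* number of simulated dispatches */
};

static void demo_queue_init(struct demo_queue *q)
{
	atomic_init(&q->quiesce_depth, 0);
	atomic_init(&q->dispatched, 0);
}

/* Nested quiesce: each call raises the depth by one. */
static void demo_quiesce(struct demo_queue *q)
{
	/* seq_cst RMW; assumed analogue of atomic_inc() plus a full barrier */
	atomic_fetch_add(&q->quiesce_depth, 1);
}

/*
 * Drops one quiesce level. Returns true only when this call took the
 * depth from 1 to 0, i.e. when the caller would have to re-run the
 * hardware queues.
 */
static bool demo_unquiesce(struct demo_queue *q)
{
	int old = atomic_fetch_sub(&q->quiesce_depth, 1);

	assert(old > 0);	/* unbalanced unquiesce is a caller bug */
	return old == 1;
}

/* Lockless quiesce check: a single atomic load, no flag and no lock. */
static bool demo_queue_quiesced(struct demo_queue *q)
{
	return atomic_load(&q->quiesce_depth) > 0;
}

/* Hot path: dispatch only if the queue is not quiesced. */
static bool demo_run_hw_queue(struct demo_queue *q)
{
	if (demo_queue_quiesced(q))
		return false;

	atomic_fetch_add(&q->dispatched, 1);
	return true;
}
```

Because the quiesced state is derived from the counter, there is no separate flag that could be observed out of step with the depth, and the hot path never takes a lock that would become a sleeping rt_mutex on PREEMPT_RT.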
Pull request for series with
subject: block/blk-mq: use atomic_t for quiesce_depth to avoid lock contention on RT
version: 3
url: https://patchwork.kernel.org/project/linux-block/list/?series=1053247