Skip to content

virtio: remove per-work Arc<Mutex> clone from queue completion path#3173

Merged
jstarks merged 13 commits intomicrosoft:mainfrom
jstarks:virtio-no-arc
Apr 3, 2026
Merged

virtio: remove per-work Arc<Mutex> clone from queue completion path#3173
jstarks merged 13 commits intomicrosoft:mainfrom
jstarks:virtio-no-arc

Conversation

@jstarks
Copy link
Copy Markdown
Member

@jstarks jstarks commented Apr 1, 2026

Every VirtioQueueCallbackWork currently clones an Arc<Mutex<VirtioQueueUsedHandler>> so that it can independently call complete() on itself. This means every descriptor processed by every virtio device pays for an atomic ref-count increment/decrement pair plus a mutex lock on the completion path—overhead that adds up on high-throughput devices like virtio-net and virtio-blk.

This PR moves completion responsibility from the work item to the queue: callers now invoke VirtioQueue::complete(work, bytes_written) instead of work.complete(bytes_written). Because the queue already owns the used-ring handler, the Arc<Mutex> clone per work item, the completed tracking flag, and the Drop guard on VirtioQueue are all eliminated.

All in-tree virtio device crates are updated to the new pattern. For virtio-vsock, the refactor also fixes a latent bug where write_packet could early-return on error without completing the work item; the caller now always completes it.

jstarks added 11 commits April 1, 2026 18:26
Add a complete() method on VirtioQueue that delegates to the internal
used handler. This coexists with VirtioQueueCallbackWork::complete()
during the migration—devices will be switched one at a time, then the
old method will be removed.
Change process_rng_request to return the byte count instead of
completing the descriptor internally. The caller completes via
queue.complete().
Change process_pmem_request to return the byte count instead of
completing the descriptor internally. The caller completes via
queue.complete().
The caller already receives the byte count from process_9p_request.
Switch from work.complete(bytes) to queue.complete(&mut work, bytes).
Change process_virtiofs_request to take &VirtioQueueCallbackWork and
return the byte count instead of completing the descriptor internally.
The caller completes via queue.complete().
At each call site, split work.consume().complete(n) into a let-binding
followed by queue.complete(&mut work, n). After consume() releases the
queue borrow, the queue is available for completion.
Pass &mut VirtioQueue to finish_io and poll_drain so they can call
queue.complete() instead of work.complete(). In the main event loop,
this comes from state.queue. In stop_queue, use worker.get_mut() to
borrow both the BlkWorker and BlkQueueState simultaneously.
TX: complete_tx_packet borrow-splits self to access both
virtio_state.tx_queue and active_state.pending_tx_packets.

RX: change VirtioWorkPool::complete_packet to return (work, bytes)
instead of completing internally. process_endpoint_rx completes via
virtio_state.rx_queue.
…-work clone

All devices now use VirtioQueue::complete(). Remove:
- work.complete() method
- Arc<Mutex<VirtioQueueUsedHandler>> from VirtioQueueCallbackWork
- completed: bool tracking field
- VirtioQueue Drop impl (Arc::get_mut check is dead now)

Update test callback signatures to pass &mut VirtioQueue so tests
can call queue.complete().
Now that no work item clones the Arc<Mutex<VirtioQueueUsedHandler>>,
the indirection is pure overhead. Inline QueueCoreCompleteWork and
Interrupt directly into VirtioQueue and delete the handler struct.

VirtioQueue::complete() now calls complete_descriptor and delivers
the interrupt inline, with no lock acquisition.
Move complete() calls from VirtioQueueCallbackWork to VirtioQueue,
matching the new API where completion is done on the queue rather than
the work item.

Refactor write_packet so the caller (handle_host_rx) always completes
the work item, ensuring completion even on early error returns.
Copilot AI review requested due to automatic review settings April 1, 2026 18:47
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 1, 2026

⚠️ Unsafe Code Detected

This PR modifies files containing unsafe Rust code. Extra scrutiny is required during review.

For more on why we check whole files, instead of just diffs, check out the Rustonomicon

@github-actions github-actions bot added the unsafe Related to unsafe code label Apr 1, 2026
@jstarks jstarks marked this pull request as ready for review April 1, 2026 18:48
@jstarks jstarks requested a review from a team as a code owner April 1, 2026 18:48
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors virtio queue completion so that completion is performed by VirtioQueue (via VirtioQueue::complete(&mut work, bytes_written)) rather than by each VirtioQueueCallbackWork, eliminating per-descriptor Arc<Mutex<...>> cloning/locking overhead on the hot completion path.

Changes:

  • Moves descriptor completion responsibility into VirtioQueue, removing per-work used-ring handler ownership from VirtioQueueCallbackWork.
  • Updates in-tree virtio devices and virtio tests to the new completion pattern.
  • Adjusts a few device helper APIs (e.g., returning bytes_written) so the caller can complete exactly once.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
vm/devices/virtio/virtio/src/common.rs Introduces VirtioQueue::complete and removes per-work completion handler plumbing.
vm/devices/virtio/virtio/src/tests.rs Updates test callbacks to complete work via VirtioQueue.
vm/devices/virtio/virtiofs/src/virtio.rs Makes request processing return bytes_written; completion is done by the queue.
vm/devices/virtio/virtio_vsock/src/lib.rs Refactors RX/TX completion flow; ensures RX work is always completed even on error.
vm/devices/virtio/virtio_rng/src/lib.rs Returns bytes_written from request processing; caller completes via queue.
vm/devices/virtio/virtio_pmem/src/lib.rs Returns bytes_written; caller completes via queue.
vm/devices/virtio/virtio_p9/src/lib.rs Completes 9p requests via state.queue.complete.
vm/devices/virtio/virtio_net/src/lib.rs Updates TX/RX completion to go through the queue; adapts RX completion flow.
vm/devices/virtio/virtio_net/src/buffers.rs Changes RX “completion” helper to return (work, bytes) so caller completes via queue.
vm/devices/virtio/virtio_console/src/lib.rs Updates peek/consume paths to complete via queue.
vm/devices/virtio/virtio_blk/src/lib.rs Routes IO completion through VirtioQueue::complete, including drain-on-stop logic.

Comment on lines +357 to +365
/// Complete a descriptor previously obtained from this queue.
///
/// Writes `bytes_written` to the used ring and delivers an interrupt
/// to the guest (unless interrupt suppression is active).
pub fn complete(&mut self, work: &mut VirtioQueueCallbackWork, bytes_written: u32) {
match self.complete.complete_descriptor(&work.work, bytes_written) {
Ok(true) => {
self.notify_guest.deliver();
}
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VirtioQueue::complete() no longer enforces the “exactly once” completion contract. Previously, VirtioQueueCallbackWork::complete() asserted it hadn’t already been completed; now a caller can accidentally complete the same work multiple times (or via the wrong queue), which can corrupt the used ring and (for packed queues) can trigger a panic due to mismatched completion context. Consider restoring a guard (e.g., a completed flag in VirtioQueueCallbackWork set by VirtioQueue::complete, or changing complete to consume the work item and/or validate it belongs to this queue).

Copilot uses AI. Check for mistakes.
Comment on lines 98 to 113
@@ -107,7 +109,7 @@ impl VirtioWorkPool {
} else {
packet.len + header_size() as u32
};
packet.work.complete(payload_len);
(packet.work, payload_len)
}
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VirtioWorkPool::complete_packet() no longer completes anything; it now returns the work item + computed length for the caller to complete. This name is misleading and makes it easy to accidentally assume the descriptor has already been completed. Consider renaming to reflect the new behavior (e.g., take_packet_for_completion / take_completed_packet_work) or updating the API to actually perform completion.

Copilot uses AI. Check for mistakes.
jstarks added 2 commits April 1, 2026 19:11
Take VirtioQueueCallbackWork by value instead of &mut, enforcing the
exactly-once completion contract at compile time via move semantics
rather than a runtime assert.

Also rename VirtioWorkPool::complete_packet to take_rx_work to reflect
that it no longer completes the descriptor itself.
Copilot AI review requested due to automatic review settings April 1, 2026 20:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

@jstarks jstarks merged commit 2715fb4 into microsoft:main Apr 3, 2026
61 checks passed
@jstarks jstarks deleted the virtio-no-arc branch April 3, 2026 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

unsafe Related to unsafe code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants