diff --git a/README.md b/README.md
index b28785eb..b04a8316 100644
--- a/README.md
+++ b/README.md
@@ -211,21 +211,22 @@ pip install TransferQueue
-### Simple Case: Regular Tensor Only
+### Simple Case: Regular Tensor
-
+
### Complex Case: Regular Tensor + NestedTensor + NonTensor
-
+
-> Note: Optimization for MooncakeStore and other backends are still in process. Warmly welcome contributions from the community!
+> Note: The openYuanrong benchmark uses only a single NPU, so it doesn't reflect multi-NPU scalability. Additionally, openYuanrong was tested on a different hardware setup than the other backends.
-For detailed performance benchmarks, please refer to [this blog](https://www.yuque.com/haomingzi-lfse7/lhp4el/tml8ke0zkgn6roey?singleDoc#).
+For detailed performance benchmarks, please refer to [the full benchmark report](https://www.yuque.com/haomingzi-lfse7/lhp4el/mywsxovevynra42u?singleDoc#).
-We also provide a [stress test report](https://www.yuque.com/haomingzi-lfse7/lhp4el/mt0vedqy7c337pgg?singleDoc#) that demonstrates more than **8192 concurrent clients writing 2 TB of data** into TransferQueue across 4 nodes. The system remains stable without any crashes or data loss.
+### Stress Test
+Beyond throughput, we also validated stability under high concurrency. We provide a [stress test report](https://www.yuque.com/haomingzi-lfse7/lhp4el/mt0vedqy7c337pgg?singleDoc#) that demonstrates more than **8192 concurrent clients writing 2 TB of data** into TransferQueue across 4 nodes. The system remains stable without any crashes or data loss.
🛠️ Customize TransferQueue
diff --git a/transfer_queue/version/version b/transfer_queue/version/version
index 4c6f32c3..699c6c6d 100644
--- a/transfer_queue/version/version
+++ b/transfer_queue/version/version
@@ -1 +1 @@
-0.1.8.dev0
+0.1.8