Skip to content

sched: add a reference count to the TCB to prevent it from being deleted.#17468

Closed
hujun260 wants to merge 1 commit intoapache:masterfrom
hujun260:apache_12
Closed

sched: add a reference count to the TCB to prevent it from being deleted.#17468
hujun260 wants to merge 1 commit intoapache:masterfrom
hujun260:apache_12

Conversation

@hujun260
Copy link
Contributor

@hujun260 hujun260 commented Dec 10, 2025

Summary

Implement reference counting for task control blocks (TCBs) to enable fine-grained
synchronization and reduce reliance on global task locks. This mechanism allows code
to safely access TCBs without holding large locks, reducing contention and lock
recursion while ensuring TCBs remain valid during access.

should merge with apache/nuttx-apps#3246

Problem Statement

Current implementation relies on large global locks to protect task data structures.
The new reference counting approach provides a lightweight alternative that only
guarantees TCB validity rather than exclusive access, reducing:

  • Lock contention in multi-threaded scenarios
  • Critical section failed do to wait
  • Overhead of global synchronization

Key Changes

1. Core Implementation

  • Enhanced nxsched_get_tcb() and nxsched_put_tcb() for reference counting
  • Modified TCB release mechanism to respect reference counts
  • Added reference count field to task control block structure

2. Scheduler Updates (72 files)

  • Task management and lifecycle
  • Pthread operations
  • Signal and group handling
  • Memory and filesystem operations
  • Device drivers

Usage Pattern

// New: Reference counting
struct tcb_s *tcb = nxsched_get_tcb(pid);
if (tcb != NULL)
{
// ... safely access tcb ...
nxsched_put_tcb(tcb);
}

Impact

tcb release

Test Objective nucleo-g431rb:nsh

after this patch
size nuttx
text data bss dec hex filename
127964 904 8644 137512 21928 nuttx

before this patch
size nuttx
text data bss dec hex filename
127232 904 8612 136748 2162c nuttx

flash size increase 732 byte

Testing

test in hardware

esp32s3-devkit:nsh

user_main: scheduler lock test
sched_lock: Starting lowpri_thread at 97
sched_lock: Set lowpri_thread priority to 97
sched_lock: Starting highpri_thread at 98
sched_lock: Set highpri_thread priority to 98
sched_lock: Waiting...
sched_lock: PASSED No pre-emption occurred while scheduler was locked.
sched_lock: Starting lowpri_thread at 97
sched_lock: Set lowpri_thread priority to 97
sched_lock: Starting highpri_thread at 98
sched_lock: Set highpri_thread priority to 98
sched_lock: Waiting...
sched_lock: PASSED No pre-emption occurred while scheduler was locked.
sched_lock: Finished

End of test memory usage:
VARIABLE BEFORE AFTER
======== ======== ========
arena 5d8bc 5d8bc
ordblks 7 6
mxordblk 548a0 548a0
uordblks 5014 5014
fordblks 588a8 588a8

Final memory usage:
VARIABLE BEFORE AFTER
======== ======== ========
arena 5d8bc 5d8bc
ordblks 1 6
mxordblk 59238 548a0
uordblks 4684 5014
fordblks 59238 588a8
user_main: Exiting
ostest_main: Exiting with status 0
nsh> u
nsh: u: command not found
nsh>
nsh>
nsh>
nsh> uname -a
NuttX 12.11.0 ef91333e3ac-dirty Dec 10 2025 16:11:04 xtensa esp32s3-devkit
nsh>

@github-actions github-actions bot added Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Area: Drivers Drivers issues Area: File System File System issues Area: Memory Management Memory Management issues Area: OS Components OS Components issues Size: L The size of the change in this PR is large labels Dec 10, 2025
@hujun260 hujun260 force-pushed the apache_12 branch 3 times, most recently from f192057 to cae0ffd Compare December 11, 2025 06:54
@hujun260 hujun260 requested a review from acassis as a code owner December 11, 2025 06:54
@hujun260 hujun260 marked this pull request as ready for review January 26, 2026 06:19
@hujun260
Copy link
Contributor Author

It seems that not all implementations have replaced nxsched_get_tcb_noref, which means the system performance will still degrade. My test case is only intended to make you aware of the performance degradation caused by nxsched_get_tcb. The pthread_mutex case is just one of them—if I provide other test cases, your patch will have many more performance bottlenecks requiring optimization. could we address this issue from the perspective of architectural implementation?

Some kernel features cannot be made optional via configuration switches, as disabling them would introduce bugs.

Copy link
Contributor

@anchao anchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't let this submission be merged:

  1. code size will increase.
  2. performance will degrade.

@anchao
Copy link
Contributor

anchao commented Jan 31, 2026

If you can't resolve these 2 issues, please stop changing the status of this PR.

@anchao anchao marked this pull request as draft January 31, 2026 09:56
@hujun260
Copy link
Contributor Author

hujun260 commented Feb 1, 2026

I won't let this submission be merged:

  1. code size will increase.
  2. performance will degrade.

However, currently, using nxsched_get_tcb cannot guarantee safety even within critical sections, which is more critical than other factors. Furthermore, this patch reduces the scope of critical sections, thereby improving system real-time performance. As for the code size, it remains acceptable because the added safety justifies the increase.

@hujun260 hujun260 marked this pull request as ready for review February 1, 2026 03:18
@anchao
Copy link
Contributor

anchao commented Feb 2, 2026

I won't let this submission be merged:

  1. code size will increase.
  2. performance will degrade.

However, currently, using nxsched_get_tcb cannot guarantee safety even within critical sections, which is more critical than other factors. Furthermore, this patch reduces the scope of critical sections, thereby improving system real-time performance. As for the code size, it remains acceptable because the added safety justifies the increase.

Zephyr is no more safety either. If your proposal can be applied to any other RTOS, I will approve this submission.

@anchao anchao marked this pull request as draft February 2, 2026 02:21
@hujun260
Copy link
Contributor Author

hujun260 commented Feb 2, 2026

Zephyr is no more safety either. If your proposal can be applied to any other RTOS, I will approve this submission.

We found an issue with NuttX, so why drag other RTOSes into this? I'm not even developing on Zephyr.

@hujun260 hujun260 marked this pull request as ready for review February 2, 2026 02:29
@anchao
Copy link
Contributor

anchao commented Feb 2, 2026

Zephyr is no more safety either. If your proposal can be applied to any other RTOS, I will approve this submission.

We found an issue with NuttX, so why drag other RTOSes into this? I'm not even developing on Zephyr.

  1. None of the leading mainstream RTOSes have adopted this change, as it would introduce unnecessary performance overhead.
  2. Your commit will affect the project I’m maintaining and cause a slowdown of the whole system.
  3. A number of development boards in the community are IoT devices, whose clock speed and memory space are insufficient to support this feature.

Additionally, please stop setting the status manually – GitHub will send out unnecessary push notifications and emails because of this.

@anchao anchao marked this pull request as draft February 2, 2026 03:04
@hujun260
Copy link
Contributor Author

hujun260 commented Feb 2, 2026

Zephyr is no more safety either. If your proposal can be applied to any other RTOS, I will approve this submission.

We found an issue with NuttX, so why drag other RTOSes into this? I'm not even developing on Zephyr.

  1. None of the leading mainstream RTOSes have adopted this change, as it would introduce unnecessary performance overhead.
  2. Your commit will affect the project I’m maintaining and cause a slowdown of the whole system.
  3. A number of development boards in the community are IoT devices, whose clock speed and memory space are insufficient to support this feature.

Additionally, please stop setting the status manually – GitHub will send out unnecessary push notifications and emails because of this.

I will proceed with submitting my proposal.
1 On NuttX, this is a fundamental strategy to ensure the safe usage of TCBs, and it also reduces interrupt disable time.
2 This submission will not impact the core system performance; I have already provided RTOS benchmark data.
3 The increase in code size is controllable and will not cause compilation failures for developers in the community.

@hujun260 hujun260 marked this pull request as ready for review February 2, 2026 03:14
@anchao
Copy link
Contributor

anchao commented Feb 2, 2026

Zephyr is no more safety either. If your proposal can be applied to any other RTOS, I will approve this submission.

We found an issue with NuttX, so why drag other RTOSes into this? I'm not even developing on Zephyr.

  1. None of the leading mainstream RTOSes have adopted this change, as it would introduce unnecessary performance overhead.
  2. Your commit will affect the project I’m maintaining and cause a slowdown of the whole system.
  3. A number of development boards in the community are IoT devices, whose clock speed and memory space are insufficient to support this feature.

Additionally, please stop setting the status manually – GitHub will send out unnecessary push notifications and emails because of this.

I will proceed with submitting my proposal. 1 On NuttX, this is a fundamental strategy to ensure the safe usage of TCBs, and it also reduces interrupt disable time. 2 This submission will not impact the core system performance; I have already provided RTOS benchmark data. 3 The increase in code size is controllable and will not cause compilation failures for developers in the community.

If the only way you can avoid hitting performance is to add a noref method, then what’s the point of having ref in the first place? Should we use noref everywhere instead? Why do I have to bear the extra overhead for the implementation of ref?

@xiaoxiang781216
Copy link
Contributor

I won't let this submission be merged:

  1. code size will increase.
  2. performance will degrade.

without refcount, the used after free will happen under the stress testing of task/thread create/destroy. Used after free is a critical bug which must be fixed. refcount is one of well known method to fix this type of error, if you have better suggestion/method, please point out.

@hujun260
Copy link
Contributor Author

hujun260 commented Feb 2, 2026

Zephyr is no more safety either. If your proposal can be applied to any other RTOS, I will approve this submission.

We found an issue with NuttX, so why drag other RTOSes into this? I'm not even developing on Zephyr.

  1. None of the leading mainstream RTOSes have adopted this change, as it would introduce unnecessary performance overhead.
  2. Your commit will affect the project I’m maintaining and cause a slowdown of the whole system.
  3. A number of development boards in the community are IoT devices, whose clock speed and memory space are insufficient to support this feature.

Additionally, please stop setting the status manually – GitHub will send out unnecessary push notifications and emails because of this.

I will proceed with submitting my proposal. 1 On NuttX, this is a fundamental strategy to ensure the safe usage of TCBs, and it also reduces interrupt disable time. 2 This submission will not impact the core system performance; I have already provided RTOS benchmark data. 3 The increase in code size is controllable and will not cause compilation failures for developers in the community.

If the only way you can avoid hitting performance is to add a noref method, then what’s the point of having ref in the first place? Should we use noref everywhere instead? Why do I have to bear the extra overhead for the implementation of ref?

Setting aside the naming issue for now,
Currently, nxsched_get_tcb_noref is only used to implement nxsched_verify_pid.
If we adopt two different usage patterns for TCBs in the future:
1 nxsched_get_tcb_noref + enter_critical_section (unsafe)
2 nxsched_get_tcb and nxsched_put_tcb
This would break code consistency and make maintenance difficult. I believe the pattern (nxsched_get_tcb_noref + enter_critical_section) should only be reserved for extreme, isolated cases where performance is critical.
Our priority should be correctness first, with performance being optimized on a case-by-case basis.

@anchao
Copy link
Contributor

anchao commented Feb 2, 2026

I won't let this submission be merged:

  1. code size will increase.
  2. performance will degrade.

without refcount, the used after free will happen under the stress testing of task/thread create/destroy. Used after free is a critical bug which must be fixed. refcount is one of well known method to fix this type of error, if you have better suggestion/method, please point out.

  1. This is because in your business logic, Task A still holds the resources of Task B when it exits. Why can't the resources be released properly?
  2. Most IoT and wearable devices use a fixed thread pool. If your business process involves the frequent creation and destruction of a large number of threads, why not design them as deterministic tasks and implement the creation and exit logic with wait/notify mechanisms?
  3. The only acceptable solution to me is to make this feature configurable via Kconfig – no other options are on the table.

@anchao
Copy link
Contributor

anchao commented Feb 2, 2026

Zephyr is no more safety either. If your proposal can be applied to any other RTOS, I will approve this submission.

We found an issue with NuttX, so why drag other RTOSes into this? I'm not even developing on Zephyr.

  1. None of the leading mainstream RTOSes have adopted this change, as it would introduce unnecessary performance overhead.
  2. Your commit will affect the project I’m maintaining and cause a slowdown of the whole system.
  3. A number of development boards in the community are IoT devices, whose clock speed and memory space are insufficient to support this feature.

Additionally, please stop setting the status manually – GitHub will send out unnecessary push notifications and emails because of this.

I will proceed with submitting my proposal. 1 On NuttX, this is a fundamental strategy to ensure the safe usage of TCBs, and it also reduces interrupt disable time. 2 This submission will not impact the core system performance; I have already provided RTOS benchmark data. 3 The increase in code size is controllable and will not cause compilation failures for developers in the community.

If the only way you can avoid hitting performance is to add a noref method, then what’s the point of having ref in the first place? Should we use noref everywhere instead? Why do I have to bear the extra overhead for the implementation of ref?

Setting aside the naming issue for now, Currently, nxsched_get_tcb_noref is only used to implement nxsched_verify_pid. If we adopt two different usage patterns for TCBs in the future: 1 nxsched_get_tcb_noref + enter_critical_section (unsafe) 2 nxsched_get_tcb and nxsched_put_tcb This would break code consistency and make maintenance difficult. I believe the pattern (nxsched_get_tcb_noref + enter_critical_section) should only be reserved for extreme, isolated cases where performance is critical. Our priority should be correctness first, with performance being optimized on a case-by-case basis.

The top priority is to ensure the kernel incurs no performance or footprint impact, right? For the sake of this so-called unnecessary security, making all developers bear the extra performance overhead—isn’t that the biggest issue here?

@anchao
Copy link
Contributor

anchao commented Feb 2, 2026

One more update for you all:

  1. In the process of promoting NuttX right now, most business stakeholders have extremely high requirements for kernel performance and will opt for Zephyr as their first choice. Developers also have numerous doubts about the stability and performance of the NuttX kernel.

  2. When working on your internal solutions, you’d better research how other companies have implemented their corresponding solutions. Even Bestechnic, your Tier 1 supplier, does not recommend using your framework for headphone and Wi-Fi/BT module solutions.

  3. The next-generation OS will be built on the first principles rather than being based on excessive assumptions. Any business implementations with poor performance or flaws should be phased out and optimized, instead of having the operating system do a great deal of unnecessary fallback work for them.

@hujun260
Copy link
Contributor Author

hujun260 commented Feb 2, 2026

One more update for you all:

  1. In the process of promoting NuttX right now, most business stakeholders have extremely high requirements for kernel performance and will opt for Zephyr as their first choice. Developers also have numerous doubts about the stability and performance of the NuttX kernel.
  2. When working on your internal solutions, you’d better research how other companies have implemented their corresponding solutions. Even Bestechnic, your Tier 1 supplier, does not recommend using your framework for headphone and Wi-Fi/BT module solutions.
  3. The next-generation OS will be built on the first principles rather than being based on excessive assumptions. Any business implementations with poor performance or flaws should be phased out and optimized, instead of having the operating system do a great deal of unnecessary fallback work for them.

It just means we are encountering different usage scenarios. While enter_critical_section might work for very simple cases, But, nuttX needs to evolve.

Memory safety issues are not imaginary.
Let's take the function nxsched_foreach as an example. We cannot guarantee that the tcb parameter passed to the handler is always valid. Neither enter_critical_section nor sched_lock can solve this problem. If the handler performs a wait operation, it will invalidate the entire critical section and the scheduling lock.

image

@xiaoxiang781216
Copy link
Contributor

xiaoxiang781216 commented Feb 2, 2026

I won't let this submission be merged:

  1. code size will increase.
  2. performance will degrade.

without refcount, the used after free will happen under the stress testing of task/thread create/destroy. Used after free is a critical bug which must be fixed. refcount is one of well known method to fix this type of error, if you have better suggestion/method, please point out.

  1. This is because in your business logic, Task A still holds the resources of Task B when it exits. Why can't the resources be released properly?
  2. Most IoT and wearable devices use a fixed thread pool. If your business process involves the frequent creation and destruction of a large number of threads, why not design them as deterministic tasks and implement the creation and exit logic with wait/notify mechanisms?

No, I just consider POSIX spec, not private business logic. POSIX spec allow appliation create/destroy task/thread dynamically, we need fix the problem. Let's hightlight the inviolable rule here:
https://github.com/apache/nuttx/blob/master/INVIOLABLES.md#strict-posix-compliance

  1. The only acceptable solution to me is to make this feature configurable via Kconfig – no other options are on the table.

it's bug fix, I don't unerstand why we add an option to choice whether to fix a critical(memory corruption) bug.

@anchao
Copy link
Contributor

anchao commented Feb 2, 2026

One more update for you all:

  1. In the process of promoting NuttX right now, most business stakeholders have extremely high requirements for kernel performance and will opt for Zephyr as their first choice. Developers also have numerous doubts about the stability and performance of the NuttX kernel.
  2. When working on your internal solutions, you’d better research how other companies have implemented their corresponding solutions. Even Bestechnic, your Tier 1 supplier, does not recommend using your framework for headphone and Wi-Fi/BT module solutions.
  3. The next-generation OS will be built on the first principles rather than being based on excessive assumptions. Any business implementations with poor performance or flaws should be phased out and optimized, instead of having the operating system do a great deal of unnecessary fallback work for them.

It just means we are encountering different usage scenarios. While enter_critical_section might work for very simple cases, But, nuttX needs to evolve.

Memory safety issues are not imaginary. Let's take the function nxsched_foreach as an example. We cannot guarantee that the tcb parameter passed to the handler is always valid. Neither enter_critical_section nor sched_lock can solve this problem. If the handler performs a wait operation, it will invalidate the entire critical section and the scheduling lock.

image
image

Shouldn't we optimize the implementation here?

@anchao
Copy link
Contributor

anchao commented Feb 2, 2026

I won't let this submission be merged:

  1. code size will increase.
  2. performance will degrade.

without refcount, the used after free will happen under the stress testing of task/thread create/destroy. Used after free is a critical bug which must be fixed. refcount is one of well known method to fix this type of error, if you have better suggestion/method, please point out.

  1. This is because in your business logic, Task A still holds the resources of Task B when it exits. Why can't the resources be released properly?
  2. Most IoT and wearable devices use a fixed thread pool. If your business process involves the frequent creation and destruction of a large number of threads, why not design them as deterministic tasks and implement the creation and exit logic with wait/notify mechanisms?

No, I just consider POSIX spec, not private business logic. POSIX spec allow appliation create/destroy task/thread dynamically, we need fix the problem. Let's hightlight the inviolable rule here: https://github.com/apache/nuttx/blob/master/INVIOLABLES.md#strict-posix-compliance

  1. The only acceptable solution to me is to make this feature configurable via Kconfig – no other options are on the table.

it's bug fix, I don't unerstand why we add an option to choice whether to fix a critical(memory corruption) bug.

But this is not a bug on my end. There is no scenario in my commercial solution where threads are dynamically created or deleted. If you insist on merging this commit, make it configurable—this way it won’t block existing users. This is my final concession.

@hujun260
Copy link
Contributor Author

hujun260 commented Feb 2, 2026

One more update for you all:

  1. In the process of promoting NuttX right now, most business stakeholders have extremely high requirements for kernel performance and will opt for Zephyr as their first choice. Developers also have numerous doubts about the stability and performance of the NuttX kernel.
  2. When working on your internal solutions, you’d better research how other companies have implemented their corresponding solutions. Even Bestechnic, your Tier 1 supplier, does not recommend using your framework for headphone and Wi-Fi/BT module solutions.
  3. The next-generation OS will be built on the first principles rather than being based on excessive assumptions. Any business implementations with poor performance or flaws should be phased out and optimized, instead of having the operating system do a great deal of unnecessary fallback work for them.

It just means we are encountering different usage scenarios. While enter_critical_section might work for very simple cases, But, nuttX needs to evolve.
Memory safety issues are not imaginary. Let's take the function nxsched_foreach as an example. We cannot guarantee that the tcb parameter passed to the handler is always valid. Neither enter_critical_section nor sched_lock can solve this problem. If the handler performs a wait operation, it will invalidate the entire critical section and the scheduling lock.
image

image Shouldn't we optimize the implementation here?

This is just one example. Your suggestion hardly solves this category of issues at the root. The fundamental problem is that critical sections are simply not a safe solution in complex scenarios. Take task_fssync for instance, which involves a wait operation.

image

I think you are somewhat overstating the performance impact of this patch. The so-called core performance of an RTOS primarily revolves around context switching and mutex/pthread_mutex operations, where this change has little effect. Moreover, the usage scenarios for nxsched_get_tcb are mostly non-frequently-called interfaces.

@hujun260 hujun260 marked this pull request as draft February 3, 2026 11:23
…deleted.

To replace the large lock with smaller ones and reduce the large locks related to the TCB,
in many scenarios, we only need to ensure that the TCB won't be released
instead of locking, thus reducing the possibility of lock recursion.

Signed-off-by: hujun5 <hujun5@xiaomi.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Arch: arm Issues related to ARM (32-bit) architecture Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Area: Drivers Drivers issues Area: File System File System issues Area: Memory Management Memory Management issues Area: OS Components OS Components issues Board: arm Size: L The size of the change in this PR is large

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants