The func ofproto_try_ref takes a lot of CPU time. #360

@danieldin95

I profiled a revalidator thread with perf and found that __aarch64_cas4_relax takes a lot of CPU time.

   51.17% revalidator260  libofproto-2.16.so.0.0.2      [.] __aarch64_cas4_relax
   8.46%  revalidator260  libofproto-2.16.so.0.0.2      [.] ofproto_try_ref
   8.41%  revalidator260  libofproto-2.16.so.0.0.2      [.] __aarch64_ldadd4_rel
   4.91%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] classifier_lookup__
   4.08%  revalidator260  libofproto-2.16.so.0.0.2      [.] __aarch64_ldadd4_relax
   2.35%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] ccmap_find
   2.10%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] cmap_find
   1.61%  revalidator260  libpthread-2.28.so            [.] 0x0000000000014660
   1.51%  revalidator260  libpthread-2.28.so            [.] 0x00000000000148d0
   1.27%  revalidator260  libofproto-2.16.so.0.0.2      [.] __aarch64_ldadd8_relax
   0.76%  revalidator260  libofproto-2.16.so.0.0.2      [.] do_xlate_actions
   0.62%  revalidator260  libc-2.28.so                  [.] 0x000000000010ccb0
   0.34%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] ovs_mutex_lock_at
   0.32%  revalidator260  libofproto-2.16.so.0.0.2      [.] ukey_lookup.isra.31
   0.32%  revalidator260  [kernel.kallsyms]             [k] sched_group_set_shares
   0.31%  revalidator260  libofproto-2.16.so.0.0.2      [.] xlate_table_action
   0.30%  revalidator260  libc-2.28.so                  [.] 0x00000000000847f0
   0.26%  revalidator260  libc-2.28.so                  [.] 0x00000000000847e0
   0.25%  revalidator260  [kernel.kallsyms]             [k] find_vpid
   0.24%  revalidator260  libc-2.28.so                  [.] 0x00000000000847f8
   0.24%  revalidator260  libc-2.28.so                  [.] 0x00000000000847e4
   0.22%  revalidator260  libc-2.28.so                  [.] 0x00000000000847ec
   0.21%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] cmap_next_position
   0.20%  revalidator260  libofproto-2.16.so.0.0.2      [.] xlate_push_stats_entry
   0.20%  revalidator260  libc-2.28.so                  [.] 0x00000000000847f4
   0.20%  revalidator260  libc-2.28.so                  [.] 0x00000000000847fc
   0.20%  revalidator260  libofproto-2.16.so.0.0.2      [.] rule_dpif_lookup_from_table
   0.19%  revalidator260  libc-2.28.so                  [.] 0x00000000000847e8
   0.18%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] mf_set_flow_value
   0.17%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] dp_netdev_flow_to_dpif_flow

The function __aarch64_cas4_relax atomically compares a 32-bit value in memory with an expected value and, if they match, replaces it with a new value, all with relaxed memory ordering.
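As a rough illustration of those semantics, here is a minimal sketch using C11 atomics. The helper name cas32_relaxed is hypothetical and is not part of OVS or libgcc; it only mirrors what a 32-bit relaxed compare-and-swap does from the caller's point of view.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not an OVS or libgcc function.
 *
 * If *ptr equals *expected, store 'desired' into *ptr and return true.
 * Otherwise copy the current value of *ptr into *expected and return false.
 * Both the success and failure paths use relaxed memory ordering. */
static bool
cas32_relaxed(_Atomic uint32_t *ptr, uint32_t *expected, uint32_t desired)
{
    return atomic_compare_exchange_strong_explicit(ptr, expected, desired,
                                                   memory_order_relaxed,
                                                   memory_order_relaxed);
}
```

On failure the caller gets the value currently in memory back through expected, which is why a retry loop does not need a separate re-read before the next attempt.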

static inline bool
ovs_refcount_try_ref_rcu(struct ovs_refcount *refcount)
{
    unsigned int count;

    atomic_read_explicit(&refcount->count, &count, memory_order_relaxed);
    do {
        if (count == 0) {
            return false;
        }
    } while (!atomic_compare_exchange_weak_explicit(&refcount->count, &count,
                                                    count + 1,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));
    return true;
}

I can't understand why atomic_compare_exchange_weak_explicit takes so much CPU time on aarch64. Some explanations suggest that a weak CAS has a high probability of failing on the aarch64 architecture. Can we use the strong variant instead of the weak one?

#define atomic_compare_exchange_weak            \
    atomic_compare_exchange_strong
#define atomic_compare_exchange_weak_explicit   \
    atomic_compare_exchange_strong_explicit
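Before changing the macros, it may help to measure how often the CAS actually fails, since a high retry count would point at contention on the counter rather than at the weak/strong distinction. The following is only a sketch under that assumption: struct refcount_sketch and try_ref_count_retries are hypothetical names, and the loop is a standalone C11 copy of the one in ovs_refcount_try_ref_rcu, not OVS code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ovs_refcount. */
struct refcount_sketch {
    atomic_uint count;
};

/* Same logic as the try-ref loop above, but also reports through *retries
 * how many compare-and-swap attempts failed before the call returned. */
static bool
try_ref_count_retries(struct refcount_sketch *rc, unsigned int *retries)
{
    unsigned int count = atomic_load_explicit(&rc->count,
                                              memory_order_relaxed);

    *retries = 0;
    for (;;) {
        if (count == 0) {
            return false;       /* Object is already being destroyed. */
        }
        if (atomic_compare_exchange_weak_explicit(&rc->count, &count,
                                                  count + 1,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
            return true;
        }
        /* On failure 'count' now holds the value seen in memory. */
        ++*retries;
    }
}
```

Summing retries per revalidator thread would show whether the time is spent in repeated failed attempts or in the cost of each individual atomic operation on a shared cache line.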
