Investigate: is single-threaded bucket accumulation in Pippenger a bottleneck? #1656

There is a TODO in `scalar_multiplication.cpp` questioning whether the single-threaded bucket accumulation loop is a performance concern for large MSMs with many threads:

https://github.com/AztecProtocol/aztec-packages/blob/next/barretenberg/cpp/src/barretenberg/ecc/scalar_multiplication/scalar_multiplication.cpp#L478-L486

```cpp
// Accumulate results. This part needs to be single threaded, but amount of work done here should be small
// TODO(@zac-williamson) check this? E.g. if we are doing a 2^16 MSM with 256 threads this single-threaded part
// will be painful.
```

**Action items:**
1. Benchmark the bucket accumulation phase as a share of total MSM time for representative sizes (2^16, 2^20) and high thread counts (the TODO's example is 256 threads) to determine whether it is actually a bottleneck.
2. If it is, explore parallelization options. Note that `BitVector::set()` in `bitvector.hpp` performs a non-atomic read-modify-write, so any parallelization of the bucket loop must keep `BucketAccumulators` per-thread or use atomic operations.
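The two options in item 2 can be sketched as follows. This is a minimal illustration only: `ToyBitVector`, `mark_per_thread`, and `mark_atomic` are hypothetical names invented here and do not reflect the actual `BitVector` or `BucketAccumulators` interfaces in barretenberg. The sketch assumes `num_threads >= 1` and that every index is smaller than `num_buckets`.

```cpp
#include <atomic>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for a bit vector; NOT barretenberg's BitVector.
class ToyBitVector {
  public:
    explicit ToyBitVector(size_t num_bits) : words_((num_bits + 63) / 64, 0) {}

    // Plain (non-atomic) read-modify-write: safe only if one thread owns this object.
    void set(size_t i) { words_[i / 64] |= (uint64_t{ 1 } << (i % 64)); }
    bool get(size_t i) const { return ((words_[i / 64] >> (i % 64)) & 1U) != 0; }

    // OR another vector of the same size into this one (single-threaded merge step).
    void merge(const ToyBitVector& other)
    {
        for (size_t w = 0; w < words_.size(); ++w) {
            words_[w] |= other.words_[w];
        }
    }

    // Number of set bits.
    size_t count() const
    {
        size_t total = 0;
        for (uint64_t w : words_) {
            total += std::bitset<64>(w).count();
        }
        return total;
    }

    const std::vector<uint64_t>& words() const { return words_; }

  private:
    std::vector<uint64_t> words_;
};

// Option A: every thread marks buckets in its own bit vector; merge after joining.
ToyBitVector mark_per_thread(const std::vector<size_t>& bucket_indices, size_t num_buckets, size_t num_threads)
{
    std::vector<ToyBitVector> locals(num_threads, ToyBitVector(num_buckets));
    std::vector<std::thread> workers;
    for (size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            // Strided split of the input; each thread writes only to locals[t].
            for (size_t i = t; i < bucket_indices.size(); i += num_threads) {
                locals[t].set(bucket_indices[i]);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    ToyBitVector result(num_buckets);
    for (const auto& local : locals) {
        result.merge(local);
    }
    return result;
}

// Option B: one shared array of words, updated with an atomic fetch_or.
std::vector<uint64_t> mark_atomic(const std::vector<size_t>& bucket_indices, size_t num_buckets, size_t num_threads)
{
    std::vector<std::atomic<uint64_t>> shared((num_buckets + 63) / 64);
    for (auto& word : shared) {
        word.store(0, std::memory_order_relaxed);
    }
    std::vector<std::thread> workers;
    for (size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            for (size_t i = t; i < bucket_indices.size(); i += num_threads) {
                const size_t b = bucket_indices[i];
                // Atomic OR: concurrent updates to the same word cannot be lost.
                shared[b / 64].fetch_or(uint64_t{ 1 } << (b % 64), std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join(); // join() orders the relaxed writes before the reads below
    }
    std::vector<uint64_t> out;
    out.reserve(shared.size());
    for (const auto& word : shared) {
        out.push_back(word.load(std::memory_order_relaxed));
    }
    return out;
}
```

Option A costs one extra bit vector per thread plus a merge pass, while Option B keeps a single shared structure but pays for an atomic operation on every update. Which trade-off is better for the real bucket loop would need to be measured as part of item 1.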
