Fix blockinner+OpenMP and blocklevels=2 with high-order stencils#2856
Fix blockinner+OpenMP and blocklevels=2 with high-order stencils#2856
Conversation
…tencils Two bugs fixed: 1. blockinner+OpenMP: When a loop had both uindices (e.g. Modulo for time stepping) and OpenMP pragmas, the generated for-loop header contained multiple variables in the init clause, violating OpenMP canonical form. Fix: compute uindex values inside the loop body when pragmas are present, keeping them thread-private. 2. blocklevels=2 with space_order>=4: BlockDimension._arg_values crashed with AttributeError when the parent block's step was a literal Integer (not a Symbol). Fix: catch AttributeError and use int(self.parent.step) directly. Added two new tests: - test_cache_blocking_openmp_execution: verifies blockinner+OpenMP compiles and produces correct results (not just structure checks) - test_cache_blocking_hierarchical_high_order: verifies blocklevels=2 with space_order=8 executes correctly
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2856 +/- ##
==========================================
- Coverage 78.98% 78.94% -0.04%
==========================================
Files 248 248
Lines 50980 51031 +51
Branches 4411 4417 +6
==========================================
+ Hits 40267 40288 +21
- Misses 9911 9939 +28
- Partials 802 804 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| # value from the parent dimension (the loop variable) | ||
| value = i.symbolic_incr | ||
| else: | ||
| # IncrDimension: derive value from loop variable |
There was a problem hiding this comment.
I don't understand where this is coming from, it's like it's trying to add new logic, which is not the point of this PR
There was a problem hiding this comment.
I mean this whole block of code...
| return {name: args[self.parent.step.name]} | ||
| except AttributeError: | ||
| # Parent's step is a literal (e.g. Integer), use it directly | ||
| return {name: int(self.parent.step)} |
| parent_step_name = self.parent.step.name | ||
| parent_value = args[parent_step_name] | ||
| except AttributeError: | ||
| # Parent's step is a literal (e.g. Integer) |
There was a problem hiding this comment.
why?
I think AI got pretty badly confused
It feels like it's completely missing the multi-layer structure of the compiler for some reasons... since the CGen visitor that it attempted to fix above is at the very end of compilation, while these changes are for very high level objects...
| 'blockinner': blockinner, | ||
| 'par-collapse-ncores': 1})) | ||
|
|
||
| u.data[:] = 0.1 |
There was a problem hiding this comment.
This is unnecessary, as is the v.data[:] = 0.1 below
| def test_cache_blocking_openmp_execution(blockinner): | ||
| """ | ||
| Verify that blockinner + OpenMP actually compiles and produces | ||
| correct results. This catches canonical-form violations in the |
There was a problem hiding this comment.
This catches canonical-form violations in the
generated for-loop headers (e.g. uindices in the init clause).
I'm not convinced it does. This test should really inspect the generated code/operator to verify this.
| f'Illegal block size `{name}={value}`: sub-block sizes ' | ||
| 'must divide the parent block size evenly ' | ||
| f'(`{self.parent.step.name}={parent_value}`)' | ||
| f'(`{parent_step_name}={parent_value}`)' |
| parent_value = args[parent_step_name] | ||
| except AttributeError: | ||
| # Parent's step is a literal (e.g. Integer) | ||
| parent_step_name = str(self.parent.step) |
There was a problem hiding this comment.
parent_step_name is getting repeated a bunch and quite inconsistently for reasons that escape me
| for i in o.uindices: | ||
| op = '=' if i.is_Modulo else '+=' | ||
| ustep.append(f'{i.name} {op} {self.ccode(i.symbolic_incr)}') | ||
| ustep.append( |
There was a problem hiding this comment.
Why change the layout here?
| # value from the parent dimension (the loop variable) | ||
| value = i.symbolic_incr | ||
| else: | ||
| # IncrDimension: derive value from loop variable |
Summary
uindices(e.g. Modulo for time stepping) and OpenMP pragmas, the generated for-loop header contained multiple variables in the init clause, violating OpenMP canonical form. Fix moves uindex computation inside the loop body when pragmas are present, keeping variables thread-private.BlockDimension._arg_valuesraisedAttributeErrorwhen the parent block's step was a literalInteger(not aSymbol), which occurs withspace_order>=4. Fix catches the error and usesint(self.parent.step)directly.Test plan
test_cache_blocking_openmp_execution— compiles and runsblockinner=Truewith OpenMP, compares againstnoopbaselinetest_cache_blocking_hierarchical_high_order— runsblocklevels=2withspace_order=8, compares againstnoopbaselinetest_dle.pysuite: 176/177 passed (1 pre-existing MPI config failure)