Skip to content

Comments

Fix blockinner+OpenMP and blocklevels=2 with high-order stencils#2856

Open
ggorman wants to merge 1 commit intomainfrom
blockinner_fix
Open

Fix blockinner+OpenMP and blocklevels=2 with high-order stencils#2856
ggorman wants to merge 1 commit intomainfrom
blockinner_fix

Conversation

@ggorman
Copy link
Contributor

@ggorman ggorman commented Feb 19, 2026

Summary

  • blockinner+OpenMP compilation failure: When a loop had both uindices (e.g. Modulo for time stepping) and OpenMP pragmas, the generated for-loop header contained multiple variables in the init clause, violating OpenMP canonical form. Fix moves uindex computation inside the loop body when pragmas are present, keeping variables thread-private.
  • blocklevels=2 crash with high-order stencils: BlockDimension._arg_values raised AttributeError when the parent block's step was a literal Integer (not a Symbol), which occurs with space_order>=4. Fix catches the error and uses int(self.parent.step) directly.
  • Added two new regression tests that exercise both code paths with actual execution (not just structure checks).

Test plan

  • test_cache_blocking_openmp_execution — compiles and runs blockinner=True with OpenMP, compares against noop baseline
  • test_cache_blocking_hierarchical_high_order — runs blocklevels=2 with space_order=8, compares against noop baseline
  • Full test_dle.py suite: 176/177 passed (1 pre-existing MPI config failure)

…tencils

Two bugs fixed:

1. blockinner+OpenMP: When a loop had both uindices (e.g. Modulo for
   time stepping) and OpenMP pragmas, the generated for-loop header
   contained multiple variables in the init clause, violating OpenMP
   canonical form. Fix: compute uindex values inside the loop body
   when pragmas are present, keeping them thread-private.

2. blocklevels=2 with space_order>=4: BlockDimension._arg_values
   crashed with AttributeError when the parent block's step was a
   literal Integer (not a Symbol). Fix: catch AttributeError and
   use int(self.parent.step) directly.

Added two new tests:
- test_cache_blocking_openmp_execution: verifies blockinner+OpenMP
  compiles and produces correct results (not just structure checks)
- test_cache_blocking_hierarchical_high_order: verifies blocklevels=2
  with space_order=8 executes correctly
@codecov
Copy link

codecov bot commented Feb 20, 2026

Codecov Report

❌ Patch coverage is 62.50000% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.94%. Comparing base (c581d00) to head (08d04fb).

Files with missing lines Patch % Lines
devito/ir/iet/visitors.py 15.78% 15 Missing and 1 partial ⚠️
devito/types/dimension.py 50.00% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2856      +/-   ##
==========================================
- Coverage   78.98%   78.94%   -0.04%     
==========================================
  Files         248      248              
  Lines       50980    51031      +51     
  Branches     4411     4417       +6     
==========================================
+ Hits        40267    40288      +21     
- Misses       9911     9939      +28     
- Partials      802      804       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@FabioLuporini FabioLuporini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

about

# value from the parent dimension (the loop variable)
value = i.symbolic_incr
else:
# IncrDimension: derive value from loop variable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand where this is coming from, it's like it's trying to add new logic, which is not the point of this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean this whole block of code...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seconded

return {name: args[self.parent.step.name]}
except AttributeError:
# Parent's step is a literal (e.g. Integer), use it directly
return {name: int(self.parent.step)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

parent_step_name = self.parent.step.name
parent_value = args[parent_step_name]
except AttributeError:
# Parent's step is a literal (e.g. Integer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

I think AI got pretty badly confused

It feels like it's completely missing the multi-layer structure of the compiler for some reasons... since the CGen visitor that it attempted to fix above is at the very end of compilation, while these changes are for very high level objects...

'blockinner': blockinner,
'par-collapse-ncores': 1}))

u.data[:] = 0.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary, as is the v.data[:] = 0.1 below

def test_cache_blocking_openmp_execution(blockinner):
"""
Verify that blockinner + OpenMP actually compiles and produces
correct results. This catches canonical-form violations in the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This catches canonical-form violations in the
    generated for-loop headers (e.g. uindices in the init clause).

I'm not convinced it does. This test should really inspect the generated code/operator to verify this.

f'Illegal block size `{name}={value}`: sub-block sizes '
'must divide the parent block size evenly '
f'(`{self.parent.step.name}={parent_value}`)'
f'(`{parent_step_name}={parent_value}`)'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

parent_value = args[parent_step_name]
except AttributeError:
# Parent's step is a literal (e.g. Integer)
parent_step_name = str(self.parent.step)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parent_step_name is getting repeated a bunch and quite inconsistently for reasons that escape me

for i in o.uindices:
op = '=' if i.is_Modulo else '+='
ustep.append(f'{i.name} {op} {self.ccode(i.symbolic_incr)}')
ustep.append(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change the layout here?

# value from the parent dimension (the loop variable)
value = i.symbolic_incr
else:
# IncrDimension: derive value from loop variable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seconded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants