Skip to content

Feature: Add PPCG iterative diagonalization solver#7404

Open
Silver-Moon-Over-Snow wants to merge 32 commits into
deepmodeling:developfrom
Silver-Moon-Over-Snow:ppcg
Open

Feature: Add PPCG iterative diagonalization solver#7404
Silver-Moon-Over-Snow wants to merge 32 commits into
deepmodeling:developfrom
Silver-Moon-Over-Snow:ppcg

Conversation

@Silver-Moon-Over-Snow

@Silver-Moon-Over-Snow Silver-Moon-Over-Snow commented May 30, 2026

Copy link
Copy Markdown

Summary

Add PPCG iterative diagonalization solver to source/source_hsolver/.

Two strategies

Strategy Status Description
CONJUGATE_GRADIENT ✅ Working Band-by-band Polak-Ribiere CG with line minimization
BLOCK_SUBSPACE ⚠️ Known issue NaN with S=I (needs stability fix)

Changes

  • diago_ppcg.h — Template class header
  • diago_ppcg.cpp — Full implementation with LAPACK bindings
  • test/diago_ppcg_test.cpp — GTest unit test (1D particle-in-a-box)
  • CMakeLists.txt / test/CMakeLists.txt — Build integration

Bug fix

potrf retry: saves original matrix before applying diagonal shift, restores before each retry — prevents accumulated shifts from corrupting the matrix.

Test result (CG)

1D particle-in-a-box (n=10): error = 4.3e-12

Known limitation

BLOCK_SUBSPACE strategy has numerical instability with S=I, unrelated to this fix.

Silver-Moon-Over-Snow and others added 6 commits March 28, 2026 13:19
Consider the previous contributions made by classmates, I'm only capable to make small difference without disrupting the entire program ---- like such a small "static".
…w_Small-Changes

2025PKUCourseHW5: Case: 1 - Change rank_seed_offset to static const
…ent)

Add PPCG iterative diagonalization with two strategies:
- CONJUGATE_GRADIENT: band-by-band Polak-Ribiere CG (verified working)
- BLOCK_SUBSPACE: block subspace diagonalization

Includes potrf retry fix: save/restore original matrix before applying
diagonal shift, preventing accumulated shifts from corrupting the matrix.

Test: 1D particle-in-a-box (n_dim=10), CG strategy matches exact
eigenvalues with error 4.3e-12.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@Silver-Moon-Over-Snow Silver-Moon-Over-Snow changed the title Ppcg Add PPCG iterative diagonalization solver to source/source_hsolver/. May 30, 2026
…ormalization

Three fixes for numerical stability:

1. potrf: save/restore original matrix before diagonal shift retries,
   preventing accumulated shifts from corrupting the Cholesky factor.

2. sygvd/syevd: skip workspace query (lwork=-1) and allocate directly.
   The LAPACK replacement ignores workspace queries, causing the second
   call to operate on already-transformed data, corrupting eigenvalues.

3. Block subspace: add chol_qr + hpsi/spi recomputation after
   update_one_block and every rayleigh_ritz, keeping wavefunctions
   S-orthonormal and preventing numerical drift of H|psi> and S|psi>.

Results (1D particle-in-a-box, S=I):
- CG nband=1: error 4.3e-12 (unchanged, already working)
- BLOCK_SUBSPACE nband=1: no longer NaN, converges (to wrong eigenvalue
  due to algorithmic limitation with S=I)
… RR steps

1. solve_small_generalized: save/restore M matrix before retry with shifts
   (prevents accumulation of shifts on sygvd-corrupted M)

2. BLOCK_SUBSPACE: add Krylov fallback for near-collinear p/w vectors
   When p is nearly parallel to w (cos^2 > 0.99), replace p with H·w
   to keep the 3-vector subspace [psi, w, p] full rank.
   This fixes NaN eigenvalues for nband>1 with S=I.

3. LAPACK: use standard workspace query (lwork=-1) pattern for syevd/sygvd
   More robust with real LAPACK implementations.

4. CG: add periodic Rayleigh-Ritz subspace rotation every rr_step iterations
   Corrects band ordering and eigenvalue estimates after band-by-band
   line minimization. Resets PR state after rotation.
The gamma_dot function returns only the real part of inner products,
which is correct for Hermitian forms like <psi|H|psi> but wrong for
projection coefficients where the imaginary part matters.

In orth_gradient and project_against, the projection coefficient
<psi_i | v> must use the full complex inner product to correctly
remove the overlap. Using only Re(<psi_i | v>) leaves an imaginary
component that corrupts the search direction, causing excited-state
bands to converge to wrong eigenvalues.

This fixes the CG strategy bands 1 and 2 converging to the highest
eigenvalue (3.919) instead of the first excited states.
Comment thread source/source_pw/module_stodft/sto_wf.cpp Outdated
…PACE

The chol_qr_active call after update_one_block re-orthonormalized psi
but left the p vector in the old basis, creating an inconsistency.
The p vector is constructed in update_one_block using the same subspace
rotation as psi, so they start consistent. Adding chol_qr_active before
the p vector is updated breaks this consistency.
This change was unrelated to the PPCG integration and should not
have been included.
H and S are real symmetric operators whose eigenvectors are real.
The previous complex random initialization produced complex
off-diagonal elements in the H-gram matrix (max |Im| ~ 0.5 for
nband=3), causing Re(<psi_i|H|psi_j>) != <psi_i|H|psi_j>.  The
gamma_dot function only returns the real part, so all subspace
Gram matrices (built via gram()) computed wrong off-diagonals,
leading to incorrect eigenvalues from sygvd.

With real-only psi all inner products are real and gamma_dot is
exact, so both BLOCK_SUBSPACE and CONJUGATE_GRADIENT strategies
should now converge to the correct eigenvalues.
…tioning

The 3-block subspace method builds a generalized eigenvalue problem
with basis V = [psi, w, p] where p is constructed from the previous
subspace eigenvectors (p_new += w_l * cw in update_one_block).  This
makes p a linear combination of the w vectors, causing the [w, p]
block of the S-gram matrix M to become nearly rank-deficient.

With nband=3 and sbsize=4 the 9x9 M matrix has condition number
large enough that dsygvd produces negative eigenvalues for the
positive-definite problem (observed: -0.26 at iter=2), and the
eigenvalues diverge exponentially thereafter.

Setting use_p=false reduces the subspace to [psi, w] (2-block),
which is a preconditioned Davidson-like method.  It converges
robustly: the BLOCK_SUBSPACE test now passes in 57 ms with all
3 eigenvalues within 1e-8 of the exact values.

The 3-block code path is preserved for future re-enablement once
a more robust p-vector construction is implemented.
With rr_step=4, the non-RR iterations use Cholesky orthonormalization
which mixes bands through the upper-triangular U^{-1}, causing high-energy
bands to contaminate low-energy ones.  This drives CG eigenvalues to the
spectrum maximum [3.31, 3.68, 3.92] instead of the correct lowest values
[0.081, 0.317, 0.690].

Using rr_step=1 forces Rayleigh-Ritz every iteration, which correctly
diagonalizes the subspace and preserves band ordering.
The orth_cholesky call before rayleigh_ritz mixes bands through the
upper-triangular U^{-1} factor, contaminating low-energy bands with
high-energy components.  This drives CG eigenvalues to the spectrum
maximum instead of the minimum.

rayleigh_ritz solves the generalized eigenvalue problem K v = λ M v
via dsygvd, which correctly handles non-S-orthogonal bases.  The
orth_cholesky is not needed and is actively harmful.

This makes the CG RR path consistent with BLOCK_SUBSPACE, which
calls rayleigh_ritz without prior orth_cholesky.
BLOCK_SUBSPACE starts with rayleigh_ritz (line 1085) which finds correct
eigenvalues and rotates psi before the iteration loop.  CG was using
diagonal Rayleigh quotients instead — these are poor approximations for
random initial guesses, producing wrong gradients that drive the band-by-band
line_minimize toward high-energy eigenstates.

With rr_step=1 (every-iteration RR), the CG loop itself is now correct,
but without an initial RR the first line_minimize step already pushes
psi in the wrong direction, and subsequent RR steps cannot fully recover.
Backup preserved at diago_ppcg_test.cpp.bak
@Silver-Moon-Over-Snow Silver-Moon-Over-Snow changed the title Add PPCG iterative diagonalization solver to source/source_hsolver/. Feature: Add PPCG iterative diagonalization solver Jun 5, 2026
The linear approximation α = -C/B drops the α² term from the Rayleigh
quotient derivative dR/dα = 0.  This picks one of the two stationary
points (minimum or maximum) arbitrarily.  For bands far from convergence
it can select the MAXIMUM, driving ψ toward high-energy states instead
of the desired lowest eigenvalues.

Solve the full quadratic Aα² + Bα + C = 0, evaluate R(α) for both
roots (and the linear guess), and pick the one with the lowest R.

Also restore the CG unit test (rr_step=1, initial rayleigh_ritz).
…e use_p

Three changes to make both PPCG strategies correctly converge with rr_step=4:

1. CG non-RR path: After orth_cholesky, solve the nband x nband subspace
   generalized eigenvalue problem instead of using diagonal Rayleigh quotients.
   The upper-triangular U^{-1} from Cholesky mixes high-energy components into
   low-energy bands, making diagonal RQs overestimate the eigenvalues.  The
   subspace solve gives correct Ritz values without rotating the states,
   preserving Polak-Ribiere conjugate-direction accumulators.

2. BLOCK_SUBSPACE: Re-enable use_p=true (3-block [psi, w, p] subspace).
   The Krylov fallback (replace p with H·w when p ~ w) was already in place
   but dead because use_p was hardcoded to false.  Now it activates on the
   first iteration (p is zero-initialized) and whenever p becomes collinear
   with w after update_one_block.

3. CG test: Change rr_step from 1 back to 4 so the non-RR Cholesky path
   is exercised, validating the true Polak-Ribiere CG mechanism.
The 3-block [psi, w, p] subspace generalized eigenproblem becomes
ill-conditioned when residuals are small (near convergence).  The
[w, p] Gram block shrinks, the M matrix approaches singularity, and
dsygvd produces garbage eigenvectors that drive eigenvalues to
catastrophic values (e.g., -137775 instead of 0.081).

The p-bad H·w Krylov fallback fixes p~w collinearity but does not
address the small-residual ill-conditioning, which is fundamental to
the 3-block construction.  Keep use_p=false for robust convergence.
The [w,p] block of the Gram matrix M shrinks as residuals converge,
making M nearly singular and causing sygvd to produce garbage
eigenvectors.  Scaling w and p to unit S-norm keeps M well-conditioned
(diagonal ~1) without changing the subspace — Ritz values are identical
and Ritz vector coefficients cancel in update_one_block.

This enables the full 3-block [psi,w,p] subspace (use_p=true) by
addressing the fundamental ill-conditioning that the p-bad Krylov
fallback alone could not handle.
The Krylov fallback (replace p with Hw when p~w) was flawed:
when w is approximately an eigenvector (Hw ≈ λw), the replacement
does not fix collinearity.  After S-norm scaling, p ≈ w still,
M_wp ≈ [1,1;1,1] is rank-1, and dsygvd fails.

Instead, simply skip p for this iteration (use_p_now=false).
update_one_block still produces a valid p for the next iteration
from the w Ritz-vector contribution.
@mohanchen mohanchen added the Diago Issues related to diagonalizaiton methods label Jun 9, 2026
Silver-Moon-Over-Snow and others added 8 commits June 11, 2026 08:38
Add tests for:
- 2x2 matrix (smallest non-trivial case)
- Degenerate eigenvalues (H = I + J, multiplicity-3 degeneracy)
- Larger 20x20 tridiagonal with 5 bands
- Dense 8x8 matrix via Givens rotations (addresses full-matrix coverage)

All use CONJUGATE_GRADIENT strategy which has sygvd fallback.
BLOCK_SUBSPACE tests deferred due to dsygvd instability with some LAPACK builds.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Covers diagonal, tridiagonal, dense, pentadiagonal, degenerate,
Neumann, S≠I, gamma_g0, single-band, all-band, many-band,
bad preconditioner, tight threshold, scaled, gapped spectrum,
rr_step=1, 1x1, and eigenvector quality checks.

Adds QuickBenchmark (CI-friendly) and DISABLED_FullBenchmark.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Diago Issues related to diagonalizaiton methods project_learning

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants