Iacono, Mauro - ed. ; Kołodziej, Joanna - ed.
The Lanczos algorithm is among the most frequently used iterative techniques for computing a few dominant eigenvalues of a large sparse non-symmetric matrix. At the same time, it serves as a building block within biconjugate gradient (BiCG) and quasi-minimal residual (QMR) methods for solving large sparse non-symmetric systems of linear equations. It is well known that, when implemented on distributed-memory computers with a huge number of processes, the synchronization time spent on computing dot products increasingly limits the parallel scalability. ; Therefore, we propose synchronization-reducing variants of the Lanczos, as well as BiCG and QMR methods, in an attempt to mitigate these negative performance effects. These so-called s-step algorithms are based on grouping dot products for joint execution and replacing time-consuming matrix operations by efficient vector recurrences. The purpose of this paper is to provide a rigorous derivation of the recurrences for the s-step Lanczos algorithm, introduce s-step BiCG and QMR variants, and compare the parallel performance of these new s-step versions with previous algorithms.
Zielona Góra: Uniwersytet Zielonogórski
AMCS, volume 25, number 4 (2015) ; kliknij tutaj, żeby przejść
Biblioteka Uniwersytetu Zielonogórskiego
2024-05-06
2024-05-06
44
https://zbc.uz.zgora.pl/publication/88917