Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New type of BlockTridiagonalKron (or something) #37

Open
jlperla opened this issue May 18, 2019 · 6 comments
Open

New type of BlockTridiagonalKron (or something) #37

jlperla opened this issue May 18, 2019 · 6 comments

Comments

@jlperla
Copy link

jlperla commented May 18, 2019

Combination code and basic linear algebra question for @dlfivefifty and any other experts: We have a very particular structure for some of my sparse matrices, which is common enough in economics to be worth specializing types:

  • Take a sequence of Tridiagonal matrices of size MxM, L_1, ... L_N and a square (possibly dense) matrix Q of size NxN, which is an intensity matrix of a markov chain (i.e. q_ii = - sum(q_ij | j != i))
  • Then you can construct a sparse matrix with code like the following
L = blockdiag(L_1, L_2, ..., L_N) + kron(Q, sparse(I, M, M))
  • The structure of this is: a block matrix with tridiagonal matrices down the diagonal, and a bunch of constant coefficient diagonal matrices on all off-diagonal blocks. Lots of structure.

While I can represent this in a BlockBanded matrix, it is inefficient in space (since it has to have the same bandwidth for each block). More importantly, there is structure there that might be exploited to speed up linear solves.

So, with that in mind, what do you think about both the feasibility and payoff of creating a specialized matrix type of that form that can exploit the structure for fast linear solves?

This is where I am entirely ignorant... But looking at https://github.com/JuliaMatrices/BlockBandedMatrices.jl/blob/master/src/blockskylineqr.jl#L19 it seems like the QR decomposition can be done in blocks in some way? The QR for every block other than the diagonal is trivial. Is this both something feasible without too much specialized knowledge, and likely to pay off big for the speed of linear solves?

If you think it worthwhile, we could create this type and add it to this package.

@dlfivefifty
Copy link
Member

Why not do it the other way around: kron(sparse(I, M, M), Q). Then you have dense blocks on the diagonal, with diagonal off-diagonal blocks. This would be fairly efficient as a BlockBandedMatrix, though you don't take advantage of sparsity of the off-diagonal blocks.

QR decomposition for BlockSkylineMatrix (which BlockBandedMatrix is a special case) does indeed use LAPACK for the QR: it takes a view of all blocks diagonal and below (A[Block.(k:k+A.l),Block(k)]) which is memory-compatible with a strided matrix, and hence LAPACK can be used. I've been meaning to write a blog post on this (but swamped with other things, unfortunately).

@jlperla
Copy link
Author

jlperla commented May 19, 2019

Sorry, that was just a simple way to generate the matrix as something sparse, but there may be other ones. Your suggestion generated a different matrix, unless I made a mistake

To be concrete here is some working code to create the matrices.

using LinearAlgebra, SparseArrays, BlockBandedMatrices

# in reality, these would be more complicated and huge
L_1 = Tridiagonal([1, 1, 1], [-2, -2, -2, -2], [1, 1, 1])
L_2 = Tridiagonal([1, 1, 1], [-2, -2, -2, -3], [1, 1, 1])
Ls = (L_1, L_2)
N = length(Ls)
M = size(L_1, 1)

# fully dense in this case, but maybe sparse later and probably banded
Q = [-0.1 0.1; 0.2 -0.2]
Q_bandwidths = (1,1)
@assert N == size(Q, 1) == size(Q, 2)

L = blockdiag(sparse.(Ls)...) + kron(Q, sparse(I, M, M))

Can we set this up as a BandedBlockBandedMatrix? Sure:

using BlockBandedMatrices
Lblock = BandedBlockBandedMatrix{Float64}(Matrix(L), ([M,M],[M,M]), (1,1), Q_bandwidths)

But looking first at storage (with Q dense for now)

  • With the banded setup above, approximately 3M N^2 nonzeros
  • With sparse setup, approximately 3MN + M(N^2 - N)
  • But if we consider that the off-diagonals are constant: 3MN + N^2

For a system of M=1000 and N=500, the differences are:

  • 750,000,000, with bandedblockbanded with the shared bandwidth
  • 251,000,000, with sparse setup, since it doesn't force the higher bandwidth for off-block diagonals
  • 1,750,000, with banded diagonal and specialized storage that off-block diagonals are constant diagonals

I know how to do the matrix-vector multiplications to specialize the type, but as for algorithms for linear solves, there I am ignorant.

But it seems to me that LU or QR algorithms could exploit that the off-block-diagonals bandwidths are just diagonals, or (even better) that off-block-diagonals are constant diagonals?

@dlfivefifty
Copy link
Member

My suggestion is equivalent to permuting the rows/columns to interchange the notion of blocks and entries in the blocks. Essentially a “transpose” if you reinterpret a blocked vector as a matrix. So the linear system is equivalent and you can take advantage of BlockBandedMatrix

@jlperla
Copy link
Author

jlperla commented May 19, 2019

I think I "kind of" get what you are saying here for the kron part, but not sure about how the rest of it would work (i.e. the blockdiag in the way I have constructed it)

L = blockdiag(sparse.(Ls)...) + kron(Q, sparse(I, M, M))

This may be out of my paygrade.

But lets make a chance to my setup: Instead of Q being dense, lets say that it is also banded, and of roughly the same bandwidth as the Ls. If so, then does the transposing with the kron help?

And, if not, and the key really is in the use of BlockBanded, are there any examples of how to construct a block banded with different matrix types for the diagonal vs. the off-diagonal blocks?

@jlperla
Copy link
Author

jlperla commented May 19, 2019

I take that back. I now fully get what you are saying about interchanging the dimensions. But I still don't understand the code to work with different matrix types in the block-diagonal vs. off-diagonals.

@dlfivefifty
Copy link
Member

You probably want to use a BlockArray wrapping a BandedMatrix{Union{DTyp,OffDTyp}} and make sure it implements the block-Banded interface. None of this has been done yet, and will only help with mat * vec and mat * mat: linear solves have full in.

Alternatively, your matrix will be Banded with bandwidth proportional to the block size, so you could just wrap a BandedMatrix{Float64} with a PseudoBlockMatrix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants