UPDATE - I've since realized that a 4x4 matrix simply isn't going to work for what I want to do.

The linear system that I was working on (fun.txt, funfun.txt) is flawed. The maximum number of control points that can be used to create a 4x4 xform matrix is 5 (not 8). Any more than that, and the system becomes overspecified. This does nothing to improve the accuracy of the resulting matrix, it just makes it harder to solve.

There *are* deformation techniques that use 8 control points (Sederberg and Parry, 1986), but these techniques do *not* produce a deformation matrix as their result; they operate directly on the model data (interpolation, using (bezier ?) curves, the same stuff done in image morphing software), and store their changes right in the polygon mesh. This doesn't work for me. One of the big advantages to my system (at the moment) is that a unique copy of each building does *not* need to be stored.

Is near as I can tell, the *only* way to get non-parallel edges after performing a transformation on a rectangle is through the use of the "taper" values in the transformation matrix (a31 & a32 in 3x3, a41, a42 & a43 in 4x4). Additionally, there's no way to prevent the side-effect of z-value distortion (tilted rooftops), as each of the taper transformations act on a plane (a41 acts on y and z, etc).

I've left the system as it is; users will just have to keep in mind that very skewed blocks won't look right. Anyway, this description of the problem will remain here. Perhaps someone with more experience will think of some way around this.
(end of update)

The problem

The routine used to stretch/rotate/scale building models to fit into their assigned block divisions is based on an algorithm I found here .

The code is in matrix_find_transform_matrix()

Note that it is a 2d algorithm. This is a problem. The result is a 3x3 transformation matrix. We need a 4x4 transformation matrix, as our transformations will be made in 3d.

My (naive) solution was to turn the 3x3 matrix into a 4x4 matrix. See matrix_convert_4x4_from_3x3().
        |       a11     a12     a13     |
        |       a21     a22     a23     |
        |       a31     a32     a33     |
becomes this:
        |       a11     a12     0       a13     |
        |       a21     a22     0       a23     |
        |       0       0       1       0       |
        |       a31     a32     0       1       |
The a31 and a32 entries are the ones to really pay attention to here. As this page explains, these values have the effect of tapering points toward the x and y axes. This is fine for 2d, but in 3d, there is an unwanted (for me, at least) side effect: building-to-subblock transformation matrices that have large values for a31 and a32 (resulting from placing a building in a very skewed, non-orthogonal subblock) will also cause the tops of the buildings to be tilted or skewed. This is because the same tapering effect that a31 and a32 have (in order to make the building fit nicely) also cause the roof of a building to be tapered toward the x and y axes.

In short, it looks weird.

The solution

The solution is obvious: do what the author of the original algorithm did, but extend it to 3 dimensions, and 8 control points.

Easier said than done, though. Since there are now 8 required control points, and there are 3 values per point (x,y,z) the intermediate matrix has 24 rows. And since we now need a 4x4 matrix, the intermediate matrix has 16 columns. That's a 24x16 matrix, representing a system of equations. My head hurts already.

Here's how far I've gotten: fun.txt funfun.txt