$\begin{bmatrix}
1 & a \\ 0 & b
\end{bmatrix}$

We first see that the multiplication rule is:
$\begin{bmatrix}
1 & a \\
0 & b
\end{bmatrix}
\begin{bmatrix}
1 & p \\
0 & q
\end{bmatrix}
=
\begin{bmatrix}
1 & p + aq \\
0 & bq
\end{bmatrix}$

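so these are closed under matrix multiplication. The identity matrix
is one among these matrices (take $a = 0$, $b = 1$), and thus we have the identity.
The inverse of such a matrix, which exists whenever $b \neq 0$, is again of this kind:
its first column is $(1, 0)$, its top-right entry is $-a/b$, and its bottom-right entry is $1/b$.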
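
As a quick sanity check of the closure, identity, and inverse claims, here is a minimal Python sketch. The helper names `mat`, `matmul`, and `inverse` are made up for this illustration, and the sample entries are arbitrary; exact rational arithmetic is used so that equality tests are reliable.

```python
from fractions import Fraction as F

def mat(a, b):
    """Return the matrix [[1, a], [0, b]] as nested tuples (b is assumed nonzero)."""
    return ((F(1), F(a)), (F(0), F(b)))

def matmul(m, n):
    """Plain 2x2 matrix product."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(m):
    """Inverse of [[1, a], [0, b]], which is [[1, -a/b], [0, 1/b]]."""
    a, b = m[0][1], m[1][1]
    return mat(-a / b, 1 / b)

identity = mat(0, 1)
t, u = mat(2, 3), mat(5, 7)   # a=2, b=3 and p=5, q=7

# Product rule: [[1, a], [0, b]] [[1, p], [0, q]] = [[1, p + a*q], [0, b*q]].
product = matmul(t, u)
assert product == mat(5 + 2 * 7, 3 * 7)

# The inverse has the same shape and really is a two-sided inverse.
assert matmul(t, inverse(t)) == identity
assert matmul(inverse(t), t) == identity
```
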
$\begin{bmatrix}
1 & 0 \\
0 & b
\end{bmatrix}
\begin{bmatrix}
1 & 0 \\
0 & q
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 \\
0 & bq
\end{bmatrix}$

These diagonal matrices are isomorphic
to $\mathbb R^*$, since the only degree of freedom is the bottom-right entry,
which gets multiplied during matrix multiplication. A matrix with bottom-right
entry $\delta$ transforms a vector $(x, y)$ into the vector $(x, \delta y)$.
Informally, the $D$ matrices are responsible for scaling the $y$-axis.
$\begin{bmatrix}
1 & a \\
0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & p \\
0 & 1
\end{bmatrix}
=
\begin{bmatrix}
1 & (a+p) \\
0 & 1
\end{bmatrix}$

These are isomorphic to $(\mathbb R, +)$, the additive group of the reals, since the
only degree of freedom is their top-right entry, which gets added on matrix multiplication.
A matrix with top-right entry $\delta$ transforms a vector $(x, y)$ into $(x + \delta y, y)$.
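
The two subgroup descriptions can be checked numerically. The sketch below is only an illustration: `diag`, `shear`, `matmul`, and `apply` are hypothetical helper names, and the sample numbers are arbitrary.

```python
from fractions import Fraction as F

def matmul(m, n):
    """Plain 2x2 matrix product."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def diag(d):
    """Diagonal matrix [[1, 0], [0, d]]."""
    return ((F(1), F(0)), (F(0), F(d)))

def shear(s):
    """Shear matrix [[1, s], [0, 1]]."""
    return ((F(1), F(s)), (F(0), F(1)))

def apply(m, v):
    """Apply a 2x2 matrix to the column vector v = (x, y)."""
    x, y = v
    return (m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)

# Diagonal entries multiply; shear entries add.
assert matmul(diag(2), diag(5)) == diag(2 * 5)
assert matmul(shear(2), shear(5)) == shear(2 + 5)

# Action on a vector: scaling touches only y, shearing touches only x.
scaled = apply(diag(3), (F(1), F(4)))    # (x, 3y)
sheared = apply(shear(3), (F(1), F(4)))  # (x + 3y, y)
assert scaled == (1, 12)
assert sheared == (13, 4)
```
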
$T \equiv
\begin{bmatrix}
1 & a \\
0 & b
\end{bmatrix}$

$[d] \equiv
\begin{bmatrix}
1 & 0 \\
0 & d
\end{bmatrix}; ~~~
[s] \equiv
\begin{bmatrix}
1 & s \\
0 & 1
\end{bmatrix}$

$\begin{aligned}
&[s^{-1}][d][s](x, y) \\
&= [s^{-1}][d] (x+sy, y) \\
&= [s^{-1}](x+sy, dy) \\
&= (x+sy-sdy, dy) \\
\end{aligned}$

This doesn't leave us with another diagonal transform (unless $s = 0$ or $d = 1$), so the diagonal matrices are not a normal subgroup.
$\begin{aligned}
&[d^{-1}][s][d](x, y) \\
&= [d^{-1}][s] (x, dy) \\
&= [d^{-1}](x + sdy, dy) \\
&= (x + sdy, dy \times 1/d) \\
&= (x + sdy, y)
\end{aligned}$

See that the final result we end up with is a shear transform which
shears by $sd$: it sends $(x, y)$ to $(x + sdy, y)$. So, we can write the equation
$D^{-1}SD = S'$, where $S'$ is again a shear: conjugating
a shear by a scaling leaves us with a shear.
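
Both conjugations can be verified with explicit matrices. The following is a small sketch with hypothetical helpers `diag`, `shear`, and `matmul` and arbitrary sample values $d = 3$, $s = 2$. Note that applying $[d]$ first, then $[s]$, then $[d^{-1}]$ to a vector corresponds to the matrix product $D^{-1} S D$.

```python
from fractions import Fraction as F

def matmul(m, n):
    """Plain 2x2 matrix product."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def diag(d):
    """Diagonal matrix [[1, 0], [0, d]]."""
    return ((F(1), F(0)), (F(0), F(d)))

def shear(s):
    """Shear matrix [[1, s], [0, 1]]."""
    return ((F(1), F(s)), (F(0), F(1)))

d, s = F(3), F(2)

# Conjugating a shear by a scaling: D^{-1} S D is the shear by s*d.
d_inv_s_d = matmul(diag(1 / d), matmul(shear(s), diag(d)))
assert d_inv_s_d == shear(s * d)

# Conjugating a scaling by a shear: S^{-1} D S is NOT diagonal.
# Its top-right entry is s - s*d, matching (x + sy - sdy, dy).
s_inv_d_s = matmul(shear(-s), matmul(diag(d), shear(s)))
assert s_inv_d_s == ((1, s - s * d), (0, d))
assert s_inv_d_s[0][1] != 0
```
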
A matrix `[a b; c d]` can be viewed as taking the fraction `x/y` to `(ax+by)/(cx+dy)`. In our case, we have:

- Diagonal: `[1 0; 0 b]`, which takes `x/y` to `x/(by)`.
- Shear: `[1 a; 0 1]`, which takes `x/y` to `(x + ay)/y`.

```
x/y -diagonal->
x/dy -shear->
(x+sdy)/dy
= (x+sy')/y'   (with y' = dy)
```

```
x/y -shear->
(x+sy)/y -diagonal->
(x+sy)/dy
= (x+(s/d)y')/y'
```
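
The two chains above can be replayed on concrete numbers. This is a rough sketch: the fraction `x/y` is modelled as the pair `(x, y)`, the helper names `act`, `diag`, and `shear` are invented for the example, and the values of `x`, `y`, `d`, `s` are arbitrary.

```python
from fractions import Fraction as F

def diag(d):
    """Diagonal matrix [[1, 0], [0, d]]."""
    return ((F(1), F(0)), (F(0), F(d)))

def shear(s):
    """Shear matrix [[1, s], [0, 1]]."""
    return ((F(1), F(s)), (F(0), F(1)))

def act(m, frac):
    """Send the fraction x/y, stored as (x, y), to (ax + by)/(cx + dy)."""
    (a, b), (c, d) = m
    x, y = frac
    return (a * x + b * y, c * x + d * y)

x, y, d, s = F(1), F(2), F(3), F(5)
y_new = d * y  # this is y' = dy in the text

# x/y -diagonal-> x/dy -shear-> (x + s*y')/y'
diag_then_shear = act(shear(s), act(diag(d), (x, y)))
assert diag_then_shear == (x + s * y_new, y_new)

# x/y -shear-> (x + sy)/y -diagonal-> (x + (s/d)*y')/y'
shear_then_diag = act(diag(d), act(shear(s), (x, y)))
assert shear_then_diag == (x + (s / d) * y_new, y_new)
```

In both orders the result has the shape of a shear applied to `x/y'`; only the shear amount changes (`s` versus `s/d`).
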

So, when we compose shears with diagonals, we are left with "twisted shears".
The "main objects" are the shears (which are normal), and the "twists" are
provided by the diagonal.
The intuition for why the twisted object (shears) should be normal
is that the twisting (by conjugation) should continue to give us twisted
objects (shears). The "only way" this can reasonably happen is if the twisted
subgroup is normal: i.e., invariant under all twistings/conjugations.
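
To make the normality claim concrete, the sketch below conjugates one shear by a few sample elements of the full group and checks that the result is always a shear. It is a spot check on hand-picked samples, not a proof; `mat`, `shear`, `inverse`, `matmul`, and `is_shear` are hypothetical helper names.

```python
from fractions import Fraction as F

def mat(a, b):
    """General group element [[1, a], [0, b]] (b is assumed nonzero)."""
    return ((F(1), F(a)), (F(0), F(b)))

def shear(s):
    """Shear matrix [[1, s], [0, 1]]."""
    return mat(s, 1)

def matmul(m, n):
    """Plain 2x2 matrix product."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(m):
    """Inverse of [[1, a], [0, b]], which is [[1, -a/b], [0, 1/b]]."""
    a, b = m[0][1], m[1][1]
    return mat(-a / b, 1 / b)

def is_shear(m):
    """True when m has the form [[1, *], [0, 1]]."""
    return m[0][0] == 1 and m[1][0] == 0 and m[1][1] == 1

s = F(4)
samples = [mat(0, 2), mat(3, 1), mat(-5, F(1, 2)), mat(7, -3)]

# Conjugate the shear by each sample T, forming T S T^{-1}.
conjugates = [matmul(t, matmul(shear(s), inverse(t))) for t in samples]
all_shears = all(is_shear(c) for c in conjugates)
assert all_shears

# For T = [[1, a], [0, b]] the conjugate is the shear by s / b.
assert all(c == shear(s / t[1][1]) for c, t in zip(conjugates, samples))
```
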