Quaternions Revisited

It has admittedly been quite a while since my last post, over a year ago. I thought I would restart by revisiting one of the first topics I discussed on this website: quaternions. On review, my previous post says little about what quaternions are or what they are used for, so I will attempt to show both in this post.

Formalism

Quaternions are a generalization of the complex numbers (\mathbb{C}  ), one kind of hypercomplex number, and they are denoted \mathbb{H}  . Below I write both in their general form.

\displaystyle \mathbb{C}: a+bi 

\displaystyle \mathbb{H}: a+bi+cj+dk 
where  \displaystyle ijk=i^2=j^2=k^2=-1 

Now, there are 3 “imaginary” components and they are defined by that relation at the bottom. This is super interesting! What does this even mean though? A real number with some sort of 3-dimensional imaginary component?

Let’s first look at the algebra of this new type of number by investigating the results of that relation. Assuming real numbers commute with i , j , and k and that multiplication is associative, observe the following identities that follow from the relation above.

\displaystyle -1=ijk \implies -k=ijk^2=-ij \implies ij=k
\displaystyle -1=ijk \implies -i=i^2jk=-jk \implies jk=i 
\displaystyle ijk=j^2 \implies i^2jk^2= ij^2k \implies (-1)j(-1)=(ij)(jk) \implies ki=j 
\displaystyle j=ki \implies j=iji \implies ij=-ji 
\displaystyle k=ij \implies k=jkj \implies jk=-kj 
\displaystyle i=jk \implies i=kik \implies ki=-ik

Note that this algebra eerily resembles the cross product (🤔), which means it’s also not commutative. This makes the algebra complicated. Look at how two quaternions multiply using these rules.

\displaystyle (a_1+b_1i+c_1j+d_1k)(a_2+b_2i+c_2j+d_2k) = (a_1a_2-b_1b_2-c_1c_2-d_1d_2)+
\displaystyle (a_1b_2+b_1a_2+c_1d_2-d_1c_2)i+
\displaystyle (a_1c_2-b_1d_2+c_1a_2+d_1b_2)j+
\displaystyle (a_1d_2+b_1c_2-c_1b_2+d_1a_2)k
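As a sanity check, the component formula above can be transcribed directly into code. This is only a sketch: the function name `hamilton` and the choice to store a quaternion as a plain 4-tuple (a, b, c, d) are assumptions made for this example.

```python
def hamilton(q1, q2):
    """Multiply two quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# Basis elements (names chosen for this example only).
I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# The identities derived earlier: ij = k, but ji = -k.
assert hamilton(I, J) == K
assert hamilton(J, I) == (0, 0, 0, -1)
# The defining relation ijk = -1.
assert hamilton(hamilton(I, J), K) == (-1, 0, 0, 0)
```

Swapping the arguments flips the sign of the result for distinct imaginary units, which is the non-commutativity noted above.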

Ew. This definitely calls for more friendly notation. We adopt the following.

Let q=a+bi+cj+dk  be written q=(a,\vec{b})  where \vec{b}=(b,c,d)  . Now consider multiplying two quaternions q_1=(0,\vec{b}_1)  and q_2=(0,\vec{b}_2) . Recall our earlier cross product observation. We know, for example, ij=k\leftrightarrow  \hat{i}\times\hat{j}=\hat{k}  . On the other hand, ii=-1 while \hat{i}\times\hat{i}=\vec{0} , so the squares do not follow this trend. The squares are exactly what the dot product picks up (with a minus sign), which yields the following nice relation.

\displaystyle (0,\vec{b}_1)(0,\vec{b}_2)=(-\vec{b}_1\cdot\vec{b}_2,\vec{b}_1\times\vec{b}_2)

Quaternion multiplication can then be simplified immensely by considering the fact that (a,\vec{b})=(a,\vec{0})+(0,\vec{b}) .

\displaystyle (a_1,\vec{b}_1)(a_2,\vec{b}_2)=((a_1,\vec{0})+(0,\vec{b}_1))((a_2,\vec{0})+(0,\vec{b}_2))

Distribute.

\displaystyle = (a_1,\vec{0})(a_2,\vec{0})+(a_1,\vec{0})(0,\vec{b}_2)+(0,\vec{b}_1)(a_2,\vec{0})+(0,\vec{b}_1)(0,\vec{b}_2) 

Apply known rules.

\displaystyle = (a_1a_2,\vec{0})+(0,a_1\vec{b}_2)+(0,a_2\vec{b}_1)+(-\vec{b}_1\cdot\vec{b}_2,\vec{b}_1\times\vec{b}_2)

Combine.

\displaystyle = (a_1a_2-\vec{b}_1\cdot\vec{b}_2,a_1\vec{b}_2+a_2\vec{b}_1+\vec{b}_1\times\vec{b}_2)
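The scalar-vector form just derived is short enough to code up directly. The sketch below assumes a quaternion is stored as a pair (a, (b, c, d)); the helper names `dot`, `cross`, and `qmul_sv` are made up for this illustration.

```python
def dot(u, v):
    """Ordinary dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    """Ordinary cross product of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def qmul_sv(q1, q2):
    """(a1, b1)(a2, b2) = (a1 a2 - b1.b2, a1 b2 + a2 b1 + b1 x b2)."""
    (a1, b1), (a2, b2) = q1, q2
    c = cross(b1, b2)
    vec = tuple(a1 * y + a2 * x + z for x, y, z in zip(b1, b2, c))
    return (a1 * a2 - dot(b1, b2), vec)


# Two purely imaginary quaternions: i times j gives k.
assert qmul_sv((0, (1, 0, 0)), (0, (0, 1, 0))) == (0, (0, 0, 1))
# A general product: (1+2i+3j+4k)(5+6i+7j+8k) = -60+12i+30j+24k.
assert qmul_sv((1, (2, 3, 4)), (5, (6, 7, 8))) == (-60, (12, 30, 24))
```

The only non-commutative ingredient is the cross product term, which changes sign when the two factors are swapped.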

Compare this with complex number multiplication.

\displaystyle \mathbb{C}: (a_1,b_1)(a_2,b_2)= (a_1a_2-b_1b_2,a_1b_2+a_2b_1)

\displaystyle \mathbb{H}: (a_1,\vec{b}_1)(a_2,\vec{b}_2)= (a_1a_2-\vec{b}_1\cdot\vec{b}_2,a_1\vec{b}_2+a_2\vec{b}_1+\vec{b}_1\times\vec{b}_2)

They are very similar. (Note: from here on, we define \textup{Re}((a,\vec{b})):=a  and \textup{Im}((a,\vec{b})):=\vec{b}  .)

I leave it to the reader to convince themselves that the following definitions of the conjugate and norm for q=(a,\vec{b})  are natural and closely follow well-known properties.

\displaystyle \overline{q}:=(a,-\vec{b})

|q|:=\sqrt{q\overline{q}}\in \mathbb{R}
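For readers who would rather check than derive, here is a small numerical sketch of the conjugate and norm. It stores quaternions as 4-tuples (a, b, c, d) and uses the helper names `hamilton`, `conj`, and `norm`, all of which are assumptions of this example.

```python
import math


def hamilton(q1, q2):
    """Hamilton product of quaternions stored as (a, b, c, d)."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def conj(q):
    """Conjugate: keep the real part, negate the imaginary vector."""
    a, b, c, d = q
    return (a, -b, -c, -d)


def norm(q):
    """|q| = sqrt(q conj(q)); the product is real, so take its real part."""
    return math.sqrt(hamilton(q, conj(q))[0])


q1, q2 = (1, 2, 3, 4), (5, 6, 7, 8)

# q times its conjugate is a non-negative real: 1 + 4 + 9 + 16 = 30.
assert hamilton(q1, conj(q1)) == (30, 0, 0, 0)
# The norm is multiplicative, just like the complex modulus.
assert math.isclose(norm(hamilton(q1, q2)), norm(q1) * norm(q2))
# Conjugation reverses the order of a product.
assert conj(hamilton(q1, q2)) == hamilton(conj(q2), conj(q1))
```

The order reversal in the last check is the one place where the quaternion conjugate behaves differently from the complex one, and it is what makes the later composition of rotations work.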

Euler’s formula for complex numbers also extends to quaternions, which will come in handy later.

\displaystyle \mathbb{C}: e^{(a,b)}=e^a(\cos{b},\sin{b})

\displaystyle \mathbb{H}: e^{(a,\vec{b})}=e^a(\cos{|\vec{b}|},\hat{b}\sin{|\vec{b}|})
where  \displaystyle \hat{b}=\vec{b}/|\vec{b}| 

Showing this works requires a lot more math than we have space for. The Pauli matrix formulation (see the footnote at the end of this post), however, makes it fairly straightforward with matrix exponentials.
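Short of a proof, the formula can at least be spot-checked numerically. The sketch below assumes that e^q means the usual power series 1 + q + q^2/2! + ... evaluated with the Hamilton product, and compares a partial sum against the closed form. The names `hamilton`, `qexp`, and `qexp_series`, and the test value of q, are choices made for this example.

```python
import math


def hamilton(q1, q2):
    """Hamilton product of quaternions stored as (a, b, c, d)."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qexp(q):
    """Closed form: e^(a, b) = e^a (cos|b|, b_hat sin|b|)."""
    a, b, c, d = q
    n = math.sqrt(b * b + c * c + d * d)  # |b|
    if n == 0.0:
        return (math.exp(a), 0.0, 0.0, 0.0)  # purely real input
    s = math.exp(a) * math.sin(n) / n  # e^a sin|b| / |b|
    return (math.exp(a) * math.cos(n), s * b, s * c, s * d)


def qexp_series(q, terms=40):
    """Partial sum of 1 + q + q^2/2! + ... using the Hamilton product."""
    total = (0.0, 0.0, 0.0, 0.0)
    term = (1.0, 0.0, 0.0, 0.0)  # q^0 / 0!
    for n in range(terms):
        total = tuple(t + u for t, u in zip(total, term))
        term = tuple(x / (n + 1) for x in hamilton(term, q))  # q^(n+1)/(n+1)!
    return total


q = (0.3, 0.5, -1.2, 0.7)  # arbitrary test value
assert all(
    math.isclose(x, y, abs_tol=1e-9) for x, y in zip(qexp(q), qexp_series(q))
)
```

Agreement on one value is of course not a proof, but it is a cheap way to catch a sign or normalization error in the closed form.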

Application

These hypercomplex numbers are remarkably applicable. Say we wish to rotate a vector \vec{v}  around the axis \hat{n}  (|\hat{n}|=1  ) by an angle \theta . Then we define the following.

\displaystyle r:=e^{\left(0,\frac{\theta}{2} \hat{n}\right)}
\displaystyle q:=(0,\vec{v})

All our rotation information is encoded in the rotation quaternion r  and our target vector is encoded in the target quaternion q  . From this, we obtain the desired rotated quaternion q'  .

q' = rq\overline{r}

The desired vector is obtained by taking \textup{Im}(q') . Note: \textup{Re}(q')=0 in this case. If we want to apply two rotations, first r_1 and then r_2 , we get

q' = r_2r_1q\overline{r}_1 \overline{r}_2 = r_2r_1q\overline{r_2r_1}

This is equivalent to applying the single rotation r_2r_1  .
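The rotation recipe and the composition rule can both be tried out in a few lines. The following is a minimal sketch, assuming quaternions are stored as 4-tuples (a, b, c, d) and that the axis passed in is already a unit vector; `rotation_quaternion` and `rotate` are names invented for this example.

```python
import math


def hamilton(q1, q2):
    """Hamilton product of quaternions stored as (a, b, c, d)."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)


def rotation_quaternion(axis, theta):
    """r = e^(0, theta/2 n) = (cos(theta/2), n sin(theta/2)) for a unit axis n."""
    s = math.sin(theta / 2)
    return (math.cos(theta / 2), s * axis[0], s * axis[1], s * axis[2])


def rotate(r, v):
    """Return Im(r q conj(r)) for the target quaternion q = (0, v)."""
    q = (0.0, v[0], v[1], v[2])
    return hamilton(hamilton(r, q), conj(r))[1:]


r1 = rotation_quaternion((0, 0, 1), math.pi / 2)  # quarter turn about z
r2 = rotation_quaternion((1, 0, 0), math.pi / 2)  # quarter turn about x
v = (1.0, 0.0, 0.0)

one_by_one = rotate(r2, rotate(r1, v))  # apply r1, then r2
combined = rotate(hamilton(r2, r1), v)  # apply the single rotation r2 r1
assert all(
    math.isclose(x, y, abs_tol=1e-12) for x, y in zip(one_by_one, combined)
)
```

Note the order: the rotation applied first sits on the right of the product, so the combined quaternion is `hamilton(r2, r1)`, not `hamilton(r1, r2)`.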

Allow me to convince you of this with a very familiar case of 3D rotations. Let our axis be \hat{k}  and let our vector \vec{v} be completely contained within the xy-plane, so that \hat{k}\cdot\vec{v}=0 and \hat{k}\times(\hat{k}\times\vec{v})=-\vec{v} . First, we set up our quaternions.

\displaystyle r=e^{\left(0,\frac{\theta}{2}\hat{k}\right)}=\left(\cos{\frac{\theta}{2}}, \hat{k}\sin{\frac{\theta}{2}}\right)
\displaystyle q=(0,\vec{v})

Apply the formula q'=rq\overline{r} .

\displaystyle q' = \left(\cos{\frac{\theta}{2}}, \hat{k}\sin{\frac{\theta}{2}}\right)(0,\vec{v}) \left(\cos{\frac{\theta}{2}}, -\hat{k}\sin{\frac{\theta}{2}}\right)

\displaystyle =\left(\cos{\frac{\theta}{2}}, \hat{k}\sin{\frac{\theta}{2}}\right)\left(0,\cos{\frac{\theta}{2}}\vec{v}+\hat{k}\times\vec{v}\sin{\frac{\theta}{2}}\right)

\displaystyle = \left(0,\cos^2{\frac{\theta}{2}}\vec{v}+2\hat{k}\times\vec{v}\sin{\frac{\theta}{2}}\cos{\frac{\theta}{2}}+ \hat{k}\times(\hat{k}\times\vec{v})\sin^2{\frac{\theta}{2}}\right)

\displaystyle = \left(0,\left(\cos^2{\frac{\theta}{2}}-\sin^2{\frac{\theta}{2}}\right)\vec{v}+2\sin{\frac{\theta}{2}}\cos{\frac{\theta}{2}}\,\hat{k}\times\vec{v}\right)

\displaystyle = (0, \vec{v}\cos{\theta}+\hat{k}\times\vec{v}\sin{\theta})

This is exactly our desired result. Amazing!
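As a concrete instance of that final line (the values are picked purely for illustration), take \vec{v}=\hat{i} and \theta=\pi/2 :

```latex
\displaystyle q' = \left(0,\ \hat{i}\cos{\frac{\pi}{2}}+\hat{k}\times\hat{i}\,\sin{\frac{\pi}{2}}\right) = (0,\ \hat{j})
```

A quarter turn about the z-axis carries the x-axis onto the y-axis, as it should.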

This amazing simplification of 3D rotations alone makes quaternions extremely useful in physics and computer graphics. Quaternions also hold an interesting position in algebra as a noncommutative division ring, but that leads into a very different topic.

Extension

Naturally, one might ask the question why not add more imaginary components? Or why did we jump from 1 to 3? Why don’t we just consider a general n-component case? This is where algebraic theory becomes important.

Cayley-Dickson Construction and hyperhyperhypercomplex numbers

There are a variety of directions one can go when constructing hypercomplex numbers, but I chose a very popular and desirable one: quaternions. They are the result of the Cayley-Dickson construction.

The Cayley-Dickson construction takes an algebra over the real numbers in n  dimensions and constructs one in 2n  dimensions. It happens in the following way.

Take an algebra over the reals A  with defined addition, multiplication, and involution operator (conjugate). We then define an algebra A^* in the following way.

\displaystyle A^* = \{(a,b)|a,b \in A\}

\displaystyle \textup{(A1)                    } (a_1,b_1)(a_2,b_2)=(a_1a_2-b_2^*b_1, b_2a_1+b_1a_2^*)

\displaystyle \textup{(A2)                    } (a,b)^*=(a^*,-b)

Some notes:

  • Order is very important here!
  • The norm is universally defined as |a|:= \sqrt{a^*a}  and it always yields a non-negative real number (check for yourself)
  • Things like the identity and distributivity are passed down in a trivial manner as well.
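Rules (A1) and (A2) are recursive, so they translate almost word for word into code. The sketch below is one possible encoding, not a canonical one: an element of a 2^n-dimensional algebra is stored as a flat list of 2^n real numbers, whose first half is a and second half is b. The names `conj` and `mul` are chosen for this example.

```python
def conj(x):
    """Conjugate (A2): (a, b)* = (a*, -b); a real number is its own conjugate."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-t for t in x[h:]]


def mul(x, y):
    """Product (A1): (a1, b1)(a2, b2) = (a1 a2 - b2* b1, b2 a1 + b1 a2*)."""
    if len(x) == 1:
        return [x[0] * y[0]]  # base case: ordinary real multiplication
    h = len(x) // 2
    a1, b1, a2, b2 = x[:h], x[h:], y[:h], y[h:]
    first = [s - t for s, t in zip(mul(a1, a2), mul(conj(b2), b1))]
    second = [s + t for s, t in zip(mul(b2, a1), mul(b1, conj(a2)))]
    return first + second


# Length 2 behaves like the complex numbers: (1 + 2i)(3 + 4i) = -5 + 10i.
assert mul([1, 2], [3, 4]) == [-5, 10]

# Length 4, read as [a, b, c, d] = a + bi + cj + dk, gives ij = k and ji = -k.
i, j, k = [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
assert mul(i, j) == k
assert mul(j, i) == [0, 0, 0, -1]
```

Because the operand order inside `mul` follows (A1) exactly, swapping any of the factors there would silently produce a different algebra once the inputs stop commuting.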

Now let’s apply this construction.

We start with A=\mathbb{R}  . Then

\displaystyle A^* = \{(a,b)|a,b \in \mathbb{R}\}
\displaystyle (a_1,b_1)(a_2,b_2)=(a_1a_2-b_2^*b_1, b_2a_1+b_1a_2^*)=(a_1a_2-b_2b_1, b_2a_1+b_1a_2)
\displaystyle (a,b)^*=(a^*,-b)=(a,-b)

From these properties, it is clear that A^* = \mathbb{C}  . Now, let A=\mathbb{C}  . Then,

\displaystyle A^* = \{(a,b)|a,b \in \mathbb{C}\}
\displaystyle (a_1,b_1)(a_2,b_2)=(a_1a_2-b_2^*b_1, b_2a_1+b_1a_2^*)
\displaystyle (a,b)^*=(a^*,-b)

See if you can convince yourself that A^* = \mathbb{H}  . Hint: in the Pauli formulation of quaternions referenced earlier, a quaternion q=a+bi+cj+dk can be expressed, for suitable complex numbers z and w , as

\displaystyle q=\begin{bmatrix} z & w \\ -w^* & z^* \end{bmatrix}

From this point on, the exercise becomes fairly robotic but there is one noteworthy point. We lose algebraic convenience with each construction. Note the following convenient algebraic properties. The algebras in parentheses are the levels of the Cayley-Dickson construction that have each property.

\displaystyle \begin{matrix} a^ma^n=a^{m+n} &(\mathbb{R}, \mathbb{C}, \mathbb{H}, \mathbb{O},\mathbb{S}, \dots) & \textup{Power Associativity}\\ (aa)b=a(ab) &(\mathbb{R}, \mathbb{C}, \mathbb{H}, \mathbb{O}) & \textup{Alternativity} \\ (ab)c=a(bc) &(\mathbb{R}, \mathbb{C}, \mathbb{H}) & \textup{Associativity}\\ ab=ba & (\mathbb{R}, \mathbb{C}) & \textup{Commutativity} \\ a^*=a & (\mathbb{R}) \end{matrix}

\mathbb{O}  represents the octonions (a construction up from quaternions) and \mathbb{S}  represents sedenions (another step up). Note that at each step, algebra gets more difficult and from sedenions onwards, the only property that remains is power associativity. Algebra becomes impossibly difficult.
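The ladder of lost properties can be spot-checked by brute force. The sketch below repeats the Cayley-Dickson product with elements stored as flat lists of length 2^n (an encoding assumed for this example), then tests commutativity and associativity on basis elements and alternativity on sums of two basis elements. All function names are made up here, and a finite check like this is a sanity test rather than a proof.

```python
from itertools import product


def conj(x):
    """(a, b)* = (a*, -b), with real numbers as the base case."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-t for t in x[h:]]


def mul(x, y):
    """(a1, b1)(a2, b2) = (a1 a2 - b2* b1, b2 a1 + b1 a2*)."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a1, b1, a2, b2 = x[:h], x[h:], y[:h], y[h:]
    first = [s - t for s, t in zip(mul(a1, a2), mul(conj(b2), b1))]
    second = [s + t for s, t in zip(mul(b2, a1), mul(b1, conj(a2)))]
    return first + second


def basis(n):
    """Standard basis of the n-dimensional algebra."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def is_commutative(n):
    return all(mul(a, b) == mul(b, a) for a, b in product(basis(n), repeat=2))


def is_associative(n):
    return all(
        mul(mul(a, b), c) == mul(a, mul(b, c))
        for a, b, c in product(basis(n), repeat=3)
    )


def is_alternative(n):
    """Test (aa)b = a(ab) for a = e_r + e_s and b a basis element."""
    e = basis(n)
    for r in range(n):
        for s in range(r, n):
            a = [u + v for u, v in zip(e[r], e[s])]
            aa = mul(a, a)
            if any(mul(aa, b) != mul(a, mul(a, b)) for b in e):
                return False
    return True


assert is_commutative(2) and not is_commutative(4)    # C yes, H no
assert is_associative(4) and not is_associative(8)    # H yes, O no
assert is_alternative(8) and not is_alternative(16)   # O yes, S no
```

Sums of basis elements are used for the alternativity test because that identity is not linear in a, so testing single basis elements alone could miss a failure.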

Remember that distributivity is not mentioned because it is more about relating addition and multiplication than describing multiplication itself.

In fact, there is a more general Cayley-Dickson construction but I think this is a good stopping point in the discussion of theory. Now to another basic question.

Why not 2 imaginary components, or 4? Why must it be 2^n  dimensions?

I am not going to sketch out the full general reasoning, but I will provide an example that may allow you to intuit why only these dimensions are notable. Assume there existed an algebra of numbers a+bi+cj  . To make this hypercomplex, we require i^2=j^2=-1 . Then we basically have 5 reasonable options for ij .

1. ij =ij .

Then just call it k ! It’s a quaternion.

2. ij=i .

Then i^2j=i^2 \implies -j=-1 \implies j=1 . So j is not a new imaginary unit at all (this even clashes with j^2=-1 ), and we are left with just the complex numbers!

3. ij=j .

Then ij^2=j^2 \implies -i=-1 \implies i=1 . Same problem, this time for i . Still the complex numbers …

4. ij=1 .

You get it by now … ij=1 \implies i=-j . Complex numbers.

5. ij=-1 .

ij=-1 \implies i=j . Complex numbers.

No matter how we define the product, it basically reduces to complex numbers or quaternions. The same will happen for higher dimensions which is why the hypercomplex algebras only have meaning for dimensions of the form 2^n  .

From here, if you wish to learn more, try looking into more forms of the Cayley-Dickson construction like split-complex numbers, split-quaternions, etc., though I am sure this post alone was enough to digest. Have fun exploring!


Footnote 1: A quaternion q=(a,\vec{b}) can be represented with 2 \times 2  complex matrices.

1\mapsto I
i\mapsto -i\sigma_1
j\mapsto -i\sigma_2
k\mapsto -i\sigma_3

 \displaystyle q = aI+\vec{b}\cdot(-i\vec{\sigma}) = \textup{Re}(q)I +\textup{Im}(q)\cdot(-i\vec{\sigma})

\sigma_p  are the Pauli matrices and \vec{\sigma}  is the Pauli vector. Here the i in -i\sigma_p is the ordinary complex imaginary unit, not the quaternion i . I leave it to the reader to convince themselves of this.
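If you would like a numerical nudge before convincing yourself, the mapping above can be checked on an example. The sketch below uses Python's built-in complex numbers and stores 2 \times 2 matrices as nested tuples; `to_matrix`, `matmul`, and `hamilton` are names assumed for this illustration. It confirms that multiplying the matrices agrees with multiplying the quaternions.

```python
def hamilton(q1, q2):
    """Hamilton product of quaternions stored as (a, b, c, d)."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


I2 = ((1, 0), (0, 1))
SIGMA = (
    ((0, 1), (1, 0)),     # sigma_1
    ((0, -1j), (1j, 0)),  # sigma_2
    ((1, 0), (0, -1)),    # sigma_3
)


def to_matrix(q):
    """q = a I + b.(-i sigma), built entry by entry."""
    a, vec = q[0], q[1:]
    return tuple(
        tuple(
            a * I2[r][c] + sum(-1j * x * s[r][c] for x, s in zip(vec, SIGMA))
            for c in range(2)
        )
        for r in range(2)
    )


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


q1, q2 = (1, 2, 3, 4), (5, 6, 7, 8)
lhs = to_matrix(hamilton(q1, q2))           # multiply, then convert
rhs = matmul(to_matrix(q1), to_matrix(q2))  # convert, then multiply
assert all(abs(lhs[r][c] - rhs[r][c]) < 1e-12 for r in range(2) for c in range(2))
```

One example does not establish the general statement, but it does confirm that the signs in the mapping are mutually consistent.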
