An introduction to the Qubit

This page will introduce two key properties of Qubits: Superposition and Entanglement. This page uses Expounder to show tangents without cluttering the page. Expounder is a tool that will display text on click. Thanks to Stelios Petrakis for making his code available. Click on text with a dashed underline to see more information on the idea. Below, the three columns are split to offer some of the different ways to express and calculate quantum states. Information will be roughly equivalent at given vertical positions. On the left, Matrix and Vector notation will offer explanations of how qubits work. In the center, Bra-Ket notation is used. On the right, the focus is on visualizations and explanations of quantum behavior. If you would like to understand qubits, but don't want to consider the mathematics behind them, you can remove the math.

Remove Math Columns
Explore all tangents

Starting notation and background knowledge

Matrix Multiplication and Tensor Product

Matrices are a very clear way to show quantum states.

A matrix is a way to represent multiple numbers as a part of a system. One of the simplest matrices is called the Identity Matrix; the Identity Matrices for 2x2 and 3x3 look like this:

1 0
0 1
1 0 0
0 1 0
0 0 1

The noticeable pattern here is the 1's along the diagonal from top left to bottom right, and the 0's everywhere else. This is important because of how matrix multiplication works. When we multiply a matrix by another matrix, or by a vector, we use the rows of the first, and the columns of the second. We sum the resulting products.

1 0 0
0 1 0
0 0 1
x
0
5
-2
=
0x1+5x0+-2x0
0x0+5x1+-2x0
0x0+5x0+-2x1
=
0
5
-2

Crucially, we can see the identity matrix multiplied by anything will return the vector or matrix it was multiplied by. This includes negative numbers, but also complex numbers. This is due to the geometric arrangement of the 1's and 0's. The numbers along the diagonal are the simpliest way to interact with matrices, but the most efficient and strangest quantum algorithms make use of the (non-diagonal) corners and outer edges of the matrix, to get more complex behavior with their systems.

There's another type of multiplication used for quantum systems, and that is Tensor Product(⨂). Remember that matrix multiplication sums products of the two components to give a matrix that has the number of rows of the first matrix, and the number of columns of the second. Tensor product, on the other hand, will multiply the rows and columns to get a larger dimensioned output matrix. We can think of Tensor Product as nesting the second matrix inside each element of the first. Multiply the each element of the first matrix by the entire second matrix:

1
0
1
0
=
1
1
0
0
1
0
=
1
0
0
0

Try clicking on the input matrices on the left, to see how the output matrix changes.

Bra-Ket Notation

Bra-Ket notation is used by physicists, especially for large quantum systems.

Many of the matrices used in quantum computing get quite large, because of how many times products can be tensored. A Qubit's vector is 2x1, so every time we add a Qubit to a system, the dimensions of the system are doubled. For example,

1
0
|0>
,
1
0
|0>
,
1
0
|0>

when tensored would be

1
0
0
0
0
0
0
0
or
|0>|0>|0>
=
|000

This notation is much more efficient, by showing the components that will be tensored, inside the
| and >. This notation is called a Ket, and along with its corresponding Bra, makes up Bra-Ket Notation. Bra-Ket Notation is a way to show vectors and matrices, and different operations between them. A Bra is arranged like < a |, while the ket is | b >. There is a ton of depth in this notation, and it is extremely powerful. This page isn't going to explore everything you can do with Bra-Ket notation, but it will show all the steps taken represented in Kets, to help familiarize you.

Addition between kets is commutative, and is often used to represent different measurement outcomes. A coefficient to the outside of a Ket will represent the amplitude associated with that Ket's outcome, which is related to the probability of that outcome occurring (more on this after). We can represent a system of Kets like:

( α|w> + β|x> + γ|y> + δ|z>)

This notation represents a quantum state with an α amplitude of vector w, and a β amplitude of vector x, and so on. Bra-Ket notation can be confusing at first, but its a much more compact way to show many vectors, rather than writing out the direction of the vector every time. Every time a Qubit is added to a system, the dimensions of the resulting vector double. The three qubit system above is already long with an 8 element vector, but what about a 50 qubit system? The nature of exponential growth would make writing the vectors and matrices for associated logic gates impossible for larger systems. Bra-Ket notation is an exponential reduction on notation space when compared with vectors and matrices.

Classical Bits:

Classical bits are easy to understand: each bit can be a one or a zero.

In practice, these ones and zeroes represent differences in charge measured in different components of cpus or flash memory, or as different readings of magnetism in a hard disk drive. For our purposes, we only care about the way we use these systems to store data, whether its magnets, electricity, or anything else. Similarly, we could use different voltages to have 3 or 4 or more distinct levels of electricity, allowing for more than two options, but in practice, we use a single bit, or either charge or no charge, at the fundamental level. You can click on the bits to change their values.

4

2

1

=
0

These three bits are each very simple, but together, we can use them to represent 8 distinct values. The reason we get 8 possible values is because each bit represents two possible values, and there are three of them, so 2^3 or 8. Every bit you add will double the number of possibilities. The first bit represents 4. The second is 2, the third is 1. We multiply the value of the bit by its representative digit, and sum the total. Just like how 135 is the same as 100 + 30 + 5, 111 is the same as 100 + 10 + 1. In binary, this means that 7 = 4 + 2 + 1, or 5 = 4 + 0 + 1.

This page will explore two concepts of Qubits that don't apply to Classical bits: Superposition and Entanglement. There's a lot of data inside of a qubit, but upon measurement, it will output to a classical bit, either 1 or 0. All the operations done in quantum computing are made with the goal of getting a 1 or 0, often as part of a system like the one above.

Quantum systems will start out as a wavefunction. These waves can interact with each other in ways that are impossible with a classical system. The goal of quantum computing is to take advantage of the different ways that qubits can interact with each other. One of the biggest challenges is that we can only ever get some of the information out of a quantum system, so we have to be careful when creating a system. A quantum state can only be measured once before it's gone forever. The best quantum algorithms maximize the complex interactions done inside the quantum system, and minimize the information lost when transferring it to a classical output.

Meet the Qubit: 0 and 1

Qubits can do everything classical bits can. This section will show 0 and 1 states, and how to flip between them.

The matrix for 0 is:

1
0

The matrix for 1 is:

0
1

We use a 2x1 matrix, or a 2 dimensional vector, to represent a single qubit. The first number represents the probability of measuring a 0, and the second is the probability of a 1.

Remember, if we multiply either of these states by the identity matrix above, we would get the same state. But what if we wanted to switch states?

0 1
1 0

This is the Pauli-X matrix, or a Not Gate.

0 1
1 0
x
1
0
=
0x1 + 1x0
1x1 + 0x1
=
0
1

Here we see a Not gate, multiplied by a 0 state, and the output is a 1 state. The reverse is also true. In fact, the reverse is always true. Quantum gates are always represented by Unitary Matrices (U), so multiplying a resultant vector by the conjugate transpose of the Matrix (U*) will always result in the starting vector. This is because all quantum processes are fundamentally reversible, which creates some interesting necessities when constructing quantum circuits.

Matrix Multiplication is used to show what happens to a qubit as it passes through a quantum logic gate. We use the same math to show what happens if multiple qubits pass through a larger system of quantum gates, or even to show the transformations that a series of logic gates would have.

The Ket for 0 is:

|0>

The Ket for 1 is:

|1>

There's a little bit of dressing up we can do to this notation. For example, the ket is multiplied by the amplitude associated with that state. Here, for a single state, the amplitude would be 1. You can also express any ket in terms of 0 and 1, so for ket 0, we could have:

|0>
=
( 1|0>+ 0|1>)
1

This tells us that we have a 100% chance of measuring 0, and a 0% chance of measuring 1. More on how to calculate the coefficients and denominator below.

To show a Pauli-X or Not gate with these kets, we can show that for any state

X( α|0> + β|1>)
( β|0> + α|1>)

All this means is we are swapping the probabilities associated with 0 and 1.

Visualization of 0:

Visualization of 1:

The closer the position of the qubit is to the top of the circle, the greater the probability of measuring 0. The closer the qubit is the bottom of the circle, then the greater the probability of measuring 1.

These are approximations, for a couple of reasons. We are using a two dimensional circle here for 0 and 1. A 3-D Bloch Sphere is a better approximation, as there is a third axis of information that is part of a qubit. That being said, a circle is both easier to understand, and easier to clearly visualize. For many quantum interactions, a 2-D circle is plenty. There are also limitations to a Bloch Sphere, as it works to visualize the relationship between quantum states in a physical sense, like putting 0 and 1 on opposite sides. But mathematically, the 0 and 1 vectors are actually orthogonal. This is key because orthogonal vectors have special properties, especially in expressing opposition or interaction. What is important to show is that a Qubit's value can be anything inside or on the edges of the circle. This means theoretically, there are infinite possibilities for the value of a qubit. There are a number of limitations that guide the practical possibilities though. If you wanted a qubit to potentially express unlimited values, then you would need the physical capability to adjust the particle by a fraction of a degree. A popular standard is 1/8th rotations, simply because of the physical limitations, and due to the nature of measurement of a qubit. A theoretical quantum computer could make any number of nuanced movements across this space, but the position of the qubit will only modify its probability of measuring 0 or 1. And, after it's been measured, the original state is replaced with the output of the measurement. Since you only have once chance to measure the particle, there are diminishing returns to additional fine tuned control.

To change a qubit from 0 to 1, or 1 to 0, we use the Pauli-X or Not gate. This is very similar to inverting the classical bits above. A Pauli-X gate can be thought of as flipping the circle vertically over the center. Try it out on the states below.

Normalization

Representing superposition requires Normalization, to properly represent multiple components:

All quantum states are a vector, with a magnitude or Norm of 1. Since 0 and 1 are mathematically orthogonal, we can construct a right triangle out of their components. That means that the values at 0 and 1 must follow A^2 + B^2 = C^2 where C and C^2 = 1. Thus, a system like:

1
1

must be normalized, so that the the square of its components sum to 1.

1
1
12
x
1
1
=
12
12

Often, normalizations are kept on the outside of the matrix, to aid in readability. Sometimes they are unwritten entirely, but its important to remember that any quantum state has a magnitude of 1.

12
12

The first number in the matrix represents the amplitude associated with measuring 0 and the second for 1. because the numbers are the same, the probability of measuring either value is 50%. The probability is the square root of the amplitude value inside the matrix.

Normalization in Bra-Ket Notation is very similar to vectors. All coefficients must square and then sum to 1. Here, the coefficients of each ket represent the probability of their respective ket state. We can use the denominator to apply the same normalization to each component.

( |0>+ |1>)
( |0>+ |1>)
2
=
( 1/√2|0> + 1/√2|1>)

Remember, the coefficients must square and add to 1. If the coefficients are unwritten, then they are assumed to be one, which often means normalization is necessary.

One of the interesting results of this system is that we use the square root of the probability as our coefficients. We call these the Amplitudes, and they are closely related to how quantum wavefunctions work. Manipulating amplitudes is one of the most subtle and powerful tools that quantum computing has. A negative wave would measure just as real as the inverse positive wave, but if they were both present in a system, the waves would cancel each other out, and that outcome would be annihilated. Similarly, a faint wave can be amplified with similar waves being added to it. That would increase the probability of that respective measurement.

The formula for probability from amplitude is:

P = |A^2|

It may seem curious that we take the absolute value of the Amplitude squared. After all, even a negative amplitude will square to a positive probability. However, an amplitude can also be a complex number, so we want to avoid the nonsensical expression of a negative probability.

Our goal with normalization is to keep our quantum state as something that can be expressed within our system. This time, we are going to use a different visualization, with 0 and 1 as perpendicular lines. For perpendicular vectors like 0 and 1, we say they are orthogonal.

Here, the point is outside the bounds of possibility. We can't be both fully 0 and fully 1. In fact, to maximize the 0 value, the 1 value has to be minimized, and vice versa. But what would this same ratio look like if we normalized the point?

0
1
1.0, 1.0
Normalize

We would get 1/√2 ≈ 0.7071... for each value. This also follows A^2 + B^2 = C^2, or 45° 45° 90° triangle rules. Fundamentally, the values we use as the coordinates here, or as the coefficients of 0 and 1 represent amplitudes. These amplitudes are the square root of the probability, so they must square and sum to 1. We can't have a probability greater than 1, just like we can't have a negative probability. 1/√2 represents a 50% chance of either 0 or 1, this is because 1/√2 is the square root of 1/2, or 50%.

Superposition

Qubit states that aren't 0 or 1 are a combination of 0 and 1.

We can represent more states than just 0 and 1. This vector represents the state +:

12
x
1
1

The state - is similar:

12
x
1
-1

To get these states from 0 and 1, we use the Hadamard gate, whose matrix looks like this:

12
x
1 1
1 -1

As an example, we show a 0 vector going through a Hadamard gate.

12
x(
1 1
1 -1
x
1
0
) =
12
x
1
1

1 going through Hadamard:

12
x(
1 1
1 -1
x
0
1
) =
12
x
1
-1

Remember, the Hadamard operator is reversible:

12
x(
1 1
1 -1
x
12
1
1
) =
1
0
12
x(
1 1
1 -1
x
12
1
-1
) =
0
1

To represent an even superposition of 0 and 1, there are a couple popular expressions:

( |0>+ |1>)
2
( |0>- |1>)
2

These represent the + and - states respectively. To get these states from the 0 and 1, we use the Hadamard operator.

H( 1|0>+ 0|1>)
1
( |0>+ |1>)
2
H( 0|0>+ 1|1>)
1
( |0>- |1>)
2

These Hadamard outputs are reversible, so hadamarding a + will get you 0. Just as we can express + and - in terms of 0 and 1, we can express 0 and 1 in terms of + and -.

|+>
=
( |0>+ |1>)
2
,
|->
=
( |0>- |1>)
2

With this, we can now provide an expression equal to 0:

|+>+ |->
2
=
( |0>+ |1>) + ( |0>- |1>)
2 x √2
=
( 2|0>+ |1>- |1>)
2
=
( 1|0>+ 0|1>)
1

Similarly:

|+>- |->
2
=
( 0|0>+ 1|1>)
1

Superposition is one of the most interesting quantum properties. First, Superposition is fundamentally probabalistic. If we have a qubit with a superposition of 0 and 1, then that means there is an associated probability of measuring a 0 and another for a 1. Second, The Qubit will exhibit properties of both states while in superposition (before measuring). Avoid simplifying superposition to just "both 0 and 1". A key element is that 0 and 1 are fundamentally opposites. Being 0 and 1 at the same time is impossible in a sense. When we measure a qubit, it is never both 0 and 1. But, before measuring, a superposition is possible. Consider two new states, + and -. If 0 and 1 are North and South, then + and - are East and West.

Any state along the center horizontal line, including + and -, will have a 50% probability of measuring 0 and 50% to measure 1.

One of the simplest ways to create a superposition is with the Hadamard gate. The Hadamard gate will map 0 to + and 1 to -. The reverse of each of these relations is also true. You can think of this operation as flipping the qubit over a diagonal line that goes from Southwest to Northeast. Click on any state below to apply the Hadamard operator to it, click on it again to apply it a second time.

So how does a qubit in superposition behave when we measure it?

Measure

This allows you to measure more than once, so you can understand the random nature of the measurement. This process is truly random, unlike the JavaScript library behind the pseudorandom animation above. Quantum computers are far more useful than just a random number generator, but the randomness inside their systems is very real. Keep in mind that a real qubit will change its state to whatever it outputs in measurement- if you wanted to do this with a real qubit, you'd have to hadamard it between measurements.

Superposition's probabilistic uncertainty raises some big questions. There are many theories and thinking experiments about this uncertainty, the most famous of which is Schrodinger's Cat. We know through rigorous testing that this quantum uncertainty is very real, but classical entities don't exhibit it. There's a lot of warranted discussion around the underlying mechanisms that may or may not cause this sort of "quantum weirdness", but the reason it's prompted so many disagreements and theories is because the experimental results have been so convincing. Some physicists may disagree about Why quantum particles behave differently, but they aren't disagreeing on what the particles are doing. Quantum computers can utilize these quantum properties, even without answers to all the questions in quantum mechanics.

Entanglement

When we work with two qubit systems, we use Tensor Product(⨂). Tensor the two vectors together, to show a vector that represents the whole system. Consider a two qubit system system like:

1
0
⨂(
12
x
1
1
)=
12
x
1
1
0
0

This output vector represents a 0 for the first qubit, and + for the second. In other words, a 50% chance of 00 and a 50% chance of 01. Lets look at something more complex:

12
x
1
0
0
1

This vector represents a 1/2 probability of measuring 00, and a 1/2 probability of measuring 11. There exists no 2x1 matrices that would tensor to this vector. This is the definition of an entangled system: any system that cannot be achieved by simply combining component vectors.

If we cannot achieve the output by combining smaller input vectors together, that means that the vectors making up the system are affecting each other's values. The degree to which the inputs affect each other determines its entanglement. Entanglement is a range, from no entanglement, to maximally entangled, like the system above. In fact, this is one of the four Bell Pairs, which represent the ways two particles can be maximally entangled with each other.

To show two qubits in Bra-Ket notation, we can put the values of each Ket next to each other.

|0>|0>
=
|00>

Two adjacent Kets are tensored, but we don't have to write the ⨂. The same applies to two values inside a Ket. What if we want to show two Kets with different values?

|0>|+>
=
( 1|0>+ 0|1>)
1
( 1|0>+ 1|1>)
2
=
( 1|00>+ 1|01>)
2

Here we have one qubit that is 0 and a second that is +, an even superposition of 0 and 1. At the end, we can represent both the superposition of the second qubit, and the certainty of the first.

Lets look at something more interesting:

( 1|00>+ 1|11>)
2

This is a Bell Pair, one of the four states representing a pair of maximally entangled particles. What this expression means is that each qubit has a 50% probability of measuring 0 or 1, and that specifically, if one measures 0, the other will as well. Even though each individual value is maximally uncertain, their values will always be the same as each other.

An entangled state like this one cannot be achieved by combining two simpler states. When combining states, we are just aggregating the individual values of each qubit. Here, the values of the qubits are affecting each other.

At its simplest, Entanglement is a relationship between two particles. Lets imagine a very simple entangled state: A pair of particles that always measure the same. This is an example of a Bell pair, two maximally entangled particles. Entangled particles can be moved apart from each other, but the entanglement will remain.

This has profound implications, especially as it pertains to communication and cryptography. A recent proof developed by Zhengfeng Ji, Anand Natarajan, Thomas Vidick, John Wright and Henry Yuen showed that MIP* = RE, meaning that multiple quantum computer provers, each with access to infinite entanglement and computational power, would allow a external system to solve the hardest problems in Computer Science, including the Halting Problem(!) in polynomial time. This also allows correlation (not causation) at any distance, what Einstein famously called "spooky action at a distance". The reason entangled particles are not causal is because it doesn't matter what order you measure them in. Once either particle in the above bell pair is measured, the other particle is guaranteed to get the same value.

There are difficulties when visualizing entangled particles. Every system could be represented with a single sphere, but more dimensions are required for every additional particle. Here, we show each possible outcome of this Bell pair on a different row. Lets explore the possible outcomes of the Bell state above. Remember, this system will either measure 00 or 11, so they always measure the same.

Despite the potential complexity added by entanglement, this system is quite simple. A bell pair like this is another fundamental building block of quantum computing- used in pseudo-telepathy experiments, or in quantum cryptography or communication over a quantum internet. There's even opportunities for greater than 1 bit of information transferred through an entangled pair like this.

Entanglement isn't an on/off switch: its a scale from no entanglement (outcome is just a sum of the amplitudes associated with the component qubits), to a maximally entangled system like the one above.

Quantum Circuit for a Bell State

To achieve this system, we need to introduce a new tool: the Controlled Gate.

Identity:

1 0
0 1

Pauli-X (Not):

0 1
1 0

Consider the familiar Identity and Not matrices. With these, we can construct a gate that will Not the second qubit, only if the first qubit is 1.

1 0 0 0
0 1 0 0
0 0 0 1
0 0 1 0

The first two columns (00 and 01) look like the identity matrix, while the last two columns (10 and 11) look like the not matrix. Often when constructing a conditional matrix, we start with the identity matrix, identifying the columns where the condition applies. Then, apply the gate's matrix to those columns. Step by step for the reverse Conditional Not Gate (not gate on the first qubit, if the second qubit is 1):

1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
1 0 0 0
0 0 0 1
0 0 1 0
0 1 0 0
1 0 0 0
0 0 0 1
0 0 1 0
0 1 0 0

First, we start with the identity matrix, notice the 2nd and 4th columns are highlighted. These are 01 and 11, or the vectors where the second qubit is 1. We then apply the Not matrix to just these columns, switching the positions of the 1's in them.

We cannot achieve an entangled state by individually operating on two different qubits. Instead, we have to consider the system as a whole, and allow one qubit to have an effect on the other. Start with 00:

1
0
1
0
=
1
0
0
0

Now what we need is a gate that will apply Hadamard to the first qubit, and keep the second qubit the same as before. To get this gate, we will tensor Hadamard and Identity: The order that we tensor the matrices here is crucial. because the hadamard applies to the first qubit, it goes first, to be tensored by identity. If we wanted to hadamard the second qubit only, we would put the Identity matrix first.

12
x
1 1
1 -1
1 0
0 1
=
12
x
1 0 1 0
0 1 0 1
1 0 -1 0
0 1 0 -1

Now lets pass our 00 vector through this new gate:

12
x
1 0 1 0
0 1 0 1
1 0 -1 0
0 1 0 -1
x
1
0
0
0
=
12
x
1
0
1
0

Now, we can apply the controlled Not gate to this new vector.

1 0 0 0
0 1 0 0
0 0 0 1
0 0 1 0
x
12
x
1
0
1
0
=
12
x
1
0
0
1

To allow for this complex interaction, we need to have a Controlled Not gate. If the Not gate switches the amplitudes for 0 and 1, a Controlled Not gate will only switch the amplitudes of a qubit if a control qubit is 1.

CX( α|10> + β|11> + γ|00> + δ|01>)
( β|10> + α|11> + γ|00> + δ|01>)

Notice that the amplitudes for 10 and 11 switch, but not 00 and 01. The second Qubit will only change values if the first qubit is 1. Now, start with 00 and hadamard the first qubit.

H|0>|0>
( 1|00>+ 1|10>)
2

From here, CNot the system:

CX( 1|00>+ 1|10>)
2
( 1|00>+ 1|11>)
2

Because the Controlled Not only flipped the second qubit when the first was also 1, we end up with 00 and 11.

How do we achieve a Bell state if we start with 00? Remember, if we Hadamard Both qubits, we would just get an even spread of 00, 01, 10, 11. We need some way to link the values of one qubit to another. A Controled Not gate, or CX, only flips a qubit if a control qubit is 1. Lets make the left qubit our control, and the right the subject of the CX gate. Click on the left qubit to switch its state, and click on the right to see what happens to it. The gate's action will depend on the left qubit's value.

Control

Subject

The qubit on the right will only switch if the qubit on the left has a value of 1. What if we were to consider a more complex state for the control? Imagine we hadamard the first qubit:

Control

Subject

The control qubit is in a superposition of 0 and 1: only the subject qubit in the second instance can be switched, because that's where the control qubit is 1.

Controlled gates like these allow for entangled states, because they allow one qubit to affect others' values. Can you make the 00 11 bell pair in the above example?

The full circuit for a 00 11 Bell pair from a starting set of 00 is to hadamard the first qubit, then use the first qubit as a control for a controlled not on the second. This will flip the second qubit from 0 to 1 only if the first qubit is 1, ensuring that both qubits have the same value at the end.

Acknowledgements

I want to recognize the resources that made this page possible, especially Scott Aaronson's lecture notes, which were invaluable in understanding this material. Other useful information include this StackExchange answer on the Bloch Sphere, emphasizing the strengths and limitations of quantum visualization. This Berkeley Class has lectures and homeworks available. Many Wikipedia pages were used, often split screen to understand new math concepts. They were inconsistent in their readability, but this Wikipedia Page on Quantum Gates is an amazing reference on quantum gates. Also, Microsoft's series on Quantum Computing was useful as an introduction.

For Quantum Simulation, I recommend:
Quirk is an in browser Quantum simulator. One of its best features is that the circuits are stored in the url, so you can share or store this circuit that generates a Bell pair, or this link of their Grover Search implementation.
Microsoft's Q# is a programming language that works inside of Python or C#, and simulates quantum states on your machine.
There's a lot of attention around cloud access to quantum computers, but for the most part, your machine can simulate as well as any quantum computer accessible on the cloud. (This is written in early 2020, I hope it's outdated advice soon). The cloud can give you access to faster classical hardware, which may be helpful with larger quantum simulations.