Grover's Algorithm, Quantum Fourier Transform, Quantum Phase Estimation and Resource Estimates

#lecture #grover #quantum-fourier-transform #quantum-phase-estimation #resource-estimates

Lecture: Grover's Algorithm, Quantum Fourier Transform, Quantum Phase Estimation and Resource Estimates

Introduction

In the first part of this course, we explored the physical implementations of quantum computing: neutral atoms in optical tweezers, trapped ions, superconducting circuits cooled to millikelvin temperatures, photons, and electron spins in quantum dots.
Each platform offers different engineering trade-offs, coherence times, and scalability prospects.

Last week, we took our first deep dive into quantum algorithms with Shor's factoring algorithm, the poster child for exponential quantum speedup.
We saw how factoring an integer $N$ reduces to the problem of finding the period $r$ of the function $f (x) = a^{x} mod N$ , and how quantum computers can solve this period-finding problem exponentially faster than any known classical algorithm.
Along the way, we briefly mentioned two powerful quantum subroutines that made this possible: the Quantum Fourier Transform (QFT) and Quantum Phase Estimation (QPE).
In that lecture, we treated QFT and QPE primarily as black box tools that we plugged into Shor's algorithm to extract the period.
We didn't fully unpack how they worked or why they're so powerful beyond this specific application.
Today's lecture aims at opening those black boxes, as well as another well known quantum algorithm, and understand the quantum primitives that power not just Shor's algorithm, but the landscape of quantum algorithms:

Grover's Algorithm: Last week we mentioned that Grover provides (only) a quadratic speedup (vs Shor's exponential) for unstructured search.
We'll see exactly how amplitude amplification works by relying on a geometric rotation in Hilbert space.
Quantum Fourier Transform (QFT): We'll go beyond using QFT as a subroutine and understand its circuit implementation, its connection to classical Fourier analysis, and why it achieves an exponential speedup.
Quantum Phase Estimation (QPE): We'll formalize this eigenvalue extraction technique and see why it's the workhorse behind not just Shor's algorithm, but also quantum chemistry simulations, solving linear systems, and many other applications.

Beyond understanding these individual algorithms in depth, we'll step back to survey the broader (but partial) quantum algorithm landscape: What classes of problems admit quantum speedups?
Where are the boundaries?
Which real-world applications might actually benefit from quantum hardware in the near term?

Finally, we'll confront the resource estimation reality check.
Shor's algorithm can in theory factor a 2048-bit RSA key but how many physical qubits does that actually require?
How long would it take?
What error rates can we tolerate?
These aren't just academic questions but determine whether quantum advantage is achievable with current or near-term hardware.
Spoiler alert: the resource estimates will reveal that we cannot run most of these algorithms on today's noisy devices without bridging the gap between current machines' error rates vs what is required for high-level applications.
This sets the floor for the quantum error correction lectures that will follow, where we'll see exactly how to bridge this gap between the imperfect physical qubits from our first lectures and the pristine logical qubits that algorithms require.
Error correction isn't a just nice-to-have feature, it's the essential bridge from quantum computing theory to quantum computing reality.

These notes were written for the Experimental Quantum Computing and Error Correction course of M2 Master QLMN, see the course's website for more information.
The notebooks used to illustrate the algorithms are adapted from the now archived Qiskit/textbook GitHub repository (see IBM Quantum Learning for the new URL).

For next time 2026-02-11

You are not expected to solve any exercice from today's lecture.
Instead, read Chen et al, Exponential suppression of bit or phase errors with cyclic error correction , Nature 2021 (https://www.nature.com/articles/s41586-021-03588-y) 1.
We'll discuss it during next lecture.

Let's begin by diving into Grover's algorithm, the quintessential example of quadratic quantum speedup.

Grover's Algorithm

Quiz

Last time, we asked you to read the paper 2.
Let's test our understanding of this pioneering (from 1998!) NMR implementation of Grover's algorithm with some questions covering the physical implementation, experimental techniques, and the specific dynamics of the $N = 4$ case.

Q1: Physical Hamiltonian and Qubit Encoding

Regarding the physical Hamiltonian and qubit encoding used in this experiment (select all correct answers):

A. The experiment was conducted using a homonuclear spin system at cryogenic temperatures.

B. The two qubits were represented by the nuclear spins of Hydrogen ( $^{1}$ H) and Carbon-13 ( $^{13}$ C) in chloroform molecules.

C. The interaction term in the Hamiltonian responsible for creating entanglement is of the Ising form $H_{i n t} \propto I_{z A} I_{z B}$ .

D. The scalar coupling constant $J$ was approximately 215 Hz, allowing for gate operations in the millisecond regime.

Answer

Correct Answers: B, C, D

The experiment used a heteronuclear system ( $^{1}$ H and $^{13}$ C) in chloroform at room temperature.
The Hamiltonian (Eq. 1 in the paper) is dominated by the scalar coupling term $2 π ℏ J I_{z A} I_{z B}$ , and the measured coupling $J$ was 215 Hz.

Q2: Pure State Preparation

How was the "pure" initial state $| 00 ⟩$ prepared from the thermal equilibrium state (select all correct answers)?

A. The sample was cooled to the ground state using optical pumping.

B. A technique called "temporal labeling" was used, which involves summing the results of three separate experiments with permuted populations.

C. The experiment utilized a single-shot projective measurement to collapse the ensemble into the ground state.

D. The effective pure state was defined using deviation density matrices, where the identity part of the thermal state is ignored as it does not contribute to the NMR signal.

Answer

Correct Answers: B, D

True pure state preparation is difficult in NMR at room temperature.
The authors used "temporal labeling" (Eq. 2 in the paper) to simulate a pure state by permuting populations across three experiments and summing the results.
NMR measures the deviation density matrix (traceless part), effectively ignoring the identity component.

Q3: Grover Operator and Oracle Implementation

Regarding the implementation of the Grover operator $U$ and the Oracle (select all correct answers):

A. The conditional phase shift (the Oracle) was implemented using the natural scalar coupling evolution of the spins ( $2 π J I_{z A} I_{z B} t$ ).

B. For the $N = 4$ case, the algorithm theoretically requires approximately $\frac{π}{4} \sqrt{N} \approx 1.57$ iterations to maximize probability.

C. The "inversion about the mean" operator $D$ was compiled into a pulse sequence represented as $D = W P W$ .

D. The Walsh-Hadamard transform was applied to both spins using the optimized pulse sequence $H = X {\bar{Y}}_{\frac{1}{2}}$ (rotation about X after a rotation about -Y).

Answer

Correct Answers: A, C, D

The oracle uses the coupled-spin evolution (free evolution).
For $N = 4$ , the exact solution is found in one iteration (not 1.57).
The inversion operator $D$ is decomposed into Walsh-Hadamard transforms sandwiching a conditional phase shift (Eq. 5 in the paper).

Q4: Algorithm Dynamics and Periodicity

What do the experimental results demonstrate regarding the algorithm's dynamics and periodicity (select all correct answers)?

A. The amplitude of the solution state $| 11 ⟩$ (for marked state $x_{0} = 3$ ) reaches a maximum after exactly 1 iteration.

B. The density matrix exhibits a periodicity of 3 iterations, consistent with the theoretical prediction for $N = 4$ .

C. The experiment showed that the algorithm is unstable and loses periodicity after the first cycle due to decoherence.

D. The classical expected number of queries for this search is 2.25, whereas the quantum implementation solved it in a single step.

Answer

Correct Answers: A, B, D

The paper confirms that for $N = 4$ , the answer is found in 1 step (Eq. 6).
The periodicity of the probability (density matrix) is 3 iterations (Fig. 1 in the paper).
The classical average for searching 4 items is indeed $(1 + 2 + 3 + 4) / 4 \times prob \approx 2.25$ .

Q5: Read-out Method and Error Analysis

Regarding the read-out method and error analysis (select all correct answers):

A. The final state was determined by full state tomography, which required reconstructing the density matrix from 9 different rotation experiments.

B. The longest computation (7 iterations) took approximately 35ms, which was significantly longer than the coherence time $T_{2}$ .

C. The signal strength in the NMR spectrum represents the fraction of the population in a given state, rather than a collapse to an eigenstate.

D. The primary source of experimental error (7-44%) was identified as the inhomogeneity of the magnetic field.

Answer

Correct Answers: A, C, D

Readout in ensemble NMR does not collapse the state but measures magnetization (expectation values).
Tomography required 9 settings (27 total runs with temporal labeling).
The computation time (35ms) was well within the coherence time ( $T_{2} \approx 300 - 400$ ms), not longer.
Field inhomogeneity was cited as the primary error source.

Introduction

Grover's algorithm addresses one of the most fundamental problems in computer science: searching through an unstructured database.
Let's start by explaining the problem at stake.
Imagine you have a list of $N = 2^{m}$ items, and somewhere in that list is a special element that we call $w$ (aka the "winner") that satisfies some unique property.
Classically, finding this needle in items haystack requires checking them one by one: on average $N / 2$ queries, and in the worst case, all $N$ items.

\underset{on m qubits}{\underset{⏟}{\begin{array}{ccccccc} 1 & 2 & \dots & w & \dots & N - 1 & N \end{array}}}

Grover's algorithm 3 provides a quadratic speedup, finding the winner in only $O (\sqrt{N})$ queries.
While this may seem modest compared to the exponential speedup of Shor's algorithm, Grover's approach is widely applicable which make it so valuable.
The underlying technique of amplitude amplification serves as a general subroutine for any problem where solutions can be verified efficiently.

Algorithm Overview

The algorithm consists of three main components that work together to concentrate probability amplitude on the winning state:

State Preparation: We initialize the quantum register in a uniform superposition over all possible items in our search space.
Oracle: A black-box operation that "marks" the correct answer by flipping its phase (without revealing which state it is).
Diffusion Operator: An amplification step that increases the probability of measuring the marked state while suppressing all others.

Grover's algorithm circuit showing initialization, oracle, and diffusion operator|600 — Grover's algorithm circuit showing initialization, oracle, and diffusion operator

See Fig. for the circuit structure for Grover's algorithm.
The circuit begins by the state preparation phase in red with Hadamard gates creating the uniform superposition, followed by the oracle (yellow) marking the winner state, and then the diffusion operator (blue) that amplifies the marked state's amplitude.
For small databases like $N = 4$ as we will see later, a single iteration achieves near-perfect success probability.
For other settings, we need to repeat Oracle+Diffuser.
Let's go through all the steps of this algorithm one by one.

The Initial Superposition

Since we have no prior knowledge about which item is the winner, we begin by preparing a uniform superposition over all $N$ basis states.
This is achieved by applying Hadamard gates, introduced in the first lecture, to each of the $m$ qubits initialized to $| 0 ⟩$ :

| s ⟩ = H^{\otimes m} | 0 ⟩^{\otimes m} = \frac{1}{\sqrt{N}} \sum_{x = 0}^{N - 1} | x ⟩

At this stage, if we were to measure immediately, each outcome $| x ⟩$ would appear with equal probability $1 / N$ .
The probability of successfully finding the winner in a single shot is therefore $1 / N$ , no better than classical random guessing.
To see why this is problematic, consider that the expected number of queries $X$ to find the item classically is:

E (X) = \sum_{k = 1}^{N} k \cdot P (X = k) = \sum_{k = 1}^{N} k \cdot \frac{1}{N} = \frac{1}{N} \frac{N (N + 1)}{2} = \frac{N + 1}{2}

Our goal is thus to transform this uniform distribution into one heavily concentrated on $| w ⟩$ , so that measurement yields the correct answer with high probability after only $O (\sqrt{N})$ iterations.

The Oracle or the Yes/No Verifier Blackbox Function

Grover's Algorithm is a verification algorithm

We emphasize that Grover's algorithm does not search through items one by one.
Instead, it assumes we already have access to a function that can verify whether a given input is the solution, and uses quantum interference to amplify the correct answer.

Grover's algorithm does not "look inside" the database.
Rather, it relies on an oracle, a black-box function that can recognize the winner when presented with it.
The oracle's job is simple: given any state $| x ⟩$ , flip its phase if and only if $x = w$ .

This distinction between finding and verifying is fundamental in complexity theory.
Recall from lecture 1 that many problems in the class NP (Non-deterministic Polynomial time) are hard to solve but easy to verify.
One example to understand this subtle difference is to consider Sudoku.
Checking whether a completed grid is valid takes seconds, but finding the solution from scratch can be much harder.
Grover's algorithm exploits this asymmetry, working for any problem where verification is efficient.
To get some intuition of what the oracle is, let's go through multiple definitions and interpretations.

Mathematical Definition of the Oracle

As we just mentioned, the oracle $U_{w}$ acts on computational basis states as follows:

U_{w} | x ⟩ = {\begin{cases} | x ⟩ & if x \neq w \\ - | x ⟩ & if x = w \end{cases}

This can be written compactly as $U_{w} = I - 2 | w ⟩ ⟨ w |$ , which is a reflection operator about the vector state $| w ⟩$ .
In matrix form, for a 2-qubit system where the winner is $| 10 ⟩$ , we can write:

U_{| 10 ⟩} = (\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & - 1 & 0 \\ 0 & 0 & 0 & 1 \end{matrix})

Notice that only the diagonal entry corresponding to the winner state carries a $- 1$ .
This phase flip is invisible if you measure immediately (the probabilities $| amplitude |^{2}$ are unchanged), but it sets the stage for interference in the next step.

Exercise 1: Oracle Matrix Construction

For a 3-qubit system ( $N = 8$ ), construct the explicit $8 \times 8$ oracle matrix $U_{w}$ when the winner state is $| w ⟩ = | 101 ⟩$ .
Verify that $U_{w}$ is unitary by checking $U_{w} U_{w}^{†} = I$ .

Answer

The oracle matrix is diagonal with all entries equal to 1 except for the entry corresponding to $| 101 ⟩ = | 5 ⟩$ :

U_{w} = diag (1, 1, 1, 1, 1, - 1, 1, 1)

To verify unitarity, note that $U_{w}$ is diagonal and real, so $U_{w}^{†} = U_{w}$ .
Then:

U_{w} U_{w}^{†} = U_{w}^{2} = diag (1, 1, 1, 1, 1, 1, 1, 1) = I ✓

Exercise 2: Expected Number of Classical Queries

Show that for a database of size $N$ , the expected number of classical queries to find a unique item is $(N + 1) / 2$ .
What is the variance of the number of queries?

Answer

Let $X$ be the number of queries.
Then $P (X = k) = 1 / N$ for $k = 1, \dots, N$ (uniform distribution).
Expected value:

E (X) = \sum_{k = 1}^{N} k \cdot \frac{1}{N} = \frac{1}{N} \cdot \frac{N (N + 1)}{2} = \frac{N + 1}{2}

Variance:

E (X^{2}) = \sum_{k = 1}^{N} k^{2} \cdot \frac{1}{N} = \frac{1}{N} \cdot \frac{N (N + 1) (2 N + 1)}{6} = \frac{(N + 1) (2 N + 1)}{6}

Var (X) = E (X^{2}) - [E (X)]^{2} = \frac{(N + 1) (2 N + 1)}{6} - \frac{(N + 1)^{2}}{4} = \frac{N^{2} - 1}{12}

Building the Oracle from a Classical Function

Another way to get intuition on this oracle function is to express it from a classical function.
Indeed, in practice, we construct the oracle from a classical boolean function $f (x)$ that identifies the winner:

f (x) = {\begin{cases} 0 & if x \neq w \\ 1 & if x = w \end{cases}

The oracle's action can then be written as $U_{w} | x ⟩ = (- 1)^{f (x)} | x ⟩$ .
But how do we actually implement this phase flip using quantum gates?

The standard approach uses phase kickback.
Suppose we have a quantum circuit $U_{f}$ that computes $f$ in the usual way: $| x ⟩ | 0 ⟩ \overset{U_{f}}{\to} | x ⟩ | f (x) ⟩$ .
The trick is to prepare the target qubit not in $| 0 ⟩$ , but in a particular state, for instance $| - ⟩ = \frac{1}{\sqrt{2}} (| 0 ⟩ - | 1 ⟩)$ , that turns out to be an eigenstate of the unitary transformation.
When we apply $U_{f}$ with this special target state, the following happens.
If $f (x) = 0$ , the target remains $| - ⟩$ .
But if $f (x) = 1$ , the XOR operation flips $| - ⟩$ to $- | - ⟩$ , introducing a global phase of $- 1$ .
The result is:

| x ⟩ | - ⟩ \overset{U_{f}}{\to} (- 1)^{f (x)} | x ⟩ | - ⟩

The phase has been "kicked back" onto the input register, while the target qubit is left unchanged (up to a phase we can ignore).
This is the standard technique for converting any classical verification function into a quantum phase oracle.
Let's explain it in more details.

Phase Kickback: The Underlying Oracle Mechanism

To understand this more generally, consider a controlled- $U$ gate where the target is prepared in an eigenstate $| ψ ⟩$ satisfying $U | ψ ⟩ = e^{i ϕ} | ψ ⟩$ .

Key Concept

Phase kickback occurs when a controlled operation acts on an eigenstate of the target unitary.
The eigenvalue phase gets transferred to the control qubit's relative phase.

If the control qubit starts in a superposition:

C - U (\frac{| 0 ⟩ + | 1 ⟩}{\sqrt{2}} \otimes | ψ ⟩) = \frac{| 0 ⟩ | ψ ⟩ + e^{i ϕ} | 1 ⟩ | ψ ⟩}{\sqrt{2}} = (\frac{| 0 ⟩ + e^{i ϕ} | 1 ⟩}{\sqrt{2}}) \otimes | ψ ⟩

The target qubit remains in $| ψ ⟩$ , but the control qubit has acquired a new relative phase $e^{i ϕ}$ .
This mechanism allows us to extract information about $U$ (its eigenvalue) and encode it as a phase on another register (here the control register), a principle that underlies not just Grover's oracle, but also quantum phase estimation and many other algorithms, we will thus mention it again below.

Exercise 3: Phase Kickback with Pauli Gates

Consider the controlled- $Z$ gate acting on $\frac{1}{\sqrt{2}} (| 0 ⟩ + | 1 ⟩) \otimes | 1 ⟩$ .
Show that the phase gets kicked back to the control qubit.
What is the resulting state?

Answer

The $Z$ gate has $| 1 ⟩$ as an eigenstate with eigenvalue $- 1$ : $Z | 1 ⟩ = - | 1 ⟩$ .
Applying the controlled- $Z$ :

C Z (\frac{| 0 ⟩ + | 1 ⟩}{\sqrt{2}} \otimes | 1 ⟩) = \frac{| 0 ⟩ \otimes | 1 ⟩ + | 1 ⟩ \otimes Z | 1 ⟩}{\sqrt{2}}

= \frac{| 0 ⟩ \otimes | 1 ⟩ - | 1 ⟩ \otimes | 1 ⟩}{\sqrt{2}} = \frac{| 0 ⟩ - | 1 ⟩}{\sqrt{2}} \otimes | 1 ⟩

The target remains $| 1 ⟩$ , but the control qubit has picked up a relative phase of $- 1$ .

Exercise 4: Oracle Action on Superposition

Consider a 2-qubit system with winner $| w ⟩ = | 11 ⟩$ . Calculate $U_{w} | s ⟩$ explicitly, where $| s ⟩ = \frac{1}{2} (| 00 ⟩ + | 01 ⟩ + | 10 ⟩ + | 11 ⟩)$ .

Answer

The oracle flips only the phase of $| 11 ⟩$ :

U_{w} | s ⟩ = U_{w} \cdot \frac{1}{2} (| 00 ⟩ + | 01 ⟩ + | 10 ⟩ + | 11 ⟩)

= \frac{1}{2} (| 00 ⟩ + | 01 ⟩ + | 10 ⟩ - | 11 ⟩)

Notice that all amplitudes remain $1 / 2$ in magnitude, but $| 11 ⟩$ now has a negative sign.

Geometric Interpretation of Grover's Algorithm

We have not yet covered all the 3 steps of Grover's algorithm, we are still missing the Diffuser one.
But before diving into this last part, we take a moment to focus on a geometric interpretation that will help us introduce this last part.
Grover's algorithm is indeed best understood through geometry.
The key insight that we will discover is that two reflections compose to give a rotation, a fact from elementary geometry that becomes the engine of Grover's quantum speedup.

See Fig. for an step-by-step illustration of how amplitudes evolve during Grover's algorithm.
Before the oracle, all states have equal positive amplitude.
After the oracle, the winner state's amplitude becomes negative, creating an asymmetry.
The diffusion operator then reflects all amplitudes about their mean, dramatically boosting the winner's amplitude while slightly decreasing all others.

Amplitude evolution during Grover's algorithm showing state preparation (red), oracle (orange) and diffusion (blue) steps|600 — Amplitude evolution during Grover's algorithm showing state preparation (red), oracle (orange) and diffusion (blue) steps

Setting Up the 2D Subspace

Let's focus on the top row of Fig..
Although our Hilbert space has dimension $N = 2^{m}$ , the entire algorithm takes place in a two-dimensional subspace spanned by just two states:

$| w ⟩$ : the target state we wish to find
$| s^{'} ⟩$ : the uniform superposition over all non-winner states, defined as $| s^{'} ⟩ = \frac{1}{\sqrt{N - 1}} \sum_{x \neq w} | x ⟩$

The initial state $| s ⟩$ lies in this plane because it can be written as a linear combination of $| w ⟩$ and $| s^{'} ⟩$ .
Since $| s ⟩$ has only a tiny component along $| w ⟩$ (amplitude $1 / \sqrt{N}$ ), it starts almost parallel to $| s^{'} ⟩$ .
More precisely, we can write any state in this subspace as:

| ψ ⟩ = \sin θ | w ⟩ + \cos θ | s^{'} ⟩

For the initial state, the angle from the $| s^{'} ⟩$ axis is:

θ_{0} = \arcsin \frac{1}{\sqrt{N}} \approx \frac{1}{\sqrt{N}} (for large N)

This tiny angle (roughly $1 / \sqrt{N}$ radians) is the starting point of our journey toward $| w ⟩$ .

Exercise 5: Initial Angle Calculation

For $N = 16$ (4 qubits), calculate the exact initial angle $θ_{0} = \arcsin (1 / \sqrt{N})$ in both radians and degrees. Compare this to the approximation $θ_{0} \approx 1 / \sqrt{N}$ .

Answer

$θ_{0} = \arcsin (\frac{1}{\sqrt{16}}) = \arcsin (\frac{1}{4}) \approx 0.2527 rad \approx 14.48 °$

The approximation gives:

θ_{0} \approx \frac{1}{\sqrt{16}} = 0.25 rad \approx 14.32 °

The error is about 1%, which is acceptable for large $N$ but noticeable for small $N$ .

Step 1: the Oracle as a Reflection

Now let's switch to the second row of Fig..
We mentioned several interpretations for the oracle $U_{w}$ , the final one we will use is its geometric one.
The oracle $U_{w} = I - 2 | w ⟩ ⟨ w |$ performs a reflection about the hyperplane perpendicular to $| w ⟩$ .
Geometrically, if you picture $| w ⟩$ as the vertical axis in our 2D subspace, see Fig., the oracle flips the state vector across the horizontal axis $| s^{'} ⟩$ .
The effect is simple: if the state was at angle $θ$ above the $| s^{'} ⟩$ axis, after the oracle it sits at angle $- θ$ (below the axis).
The distance to $| w ⟩$ hasn't changed, but the state has been "flipped" to the other side of $| s^{'} ⟩$ .

Exercise 6: Oracle Reflection Verification

Verify that $U_{w} | x ⟩ = | x ⟩$ when $x \neq w$ and $U_{w} | w ⟩ = - | w ⟩$ by direct calculation using $U_{w} = I - 2 | w ⟩ ⟨ w |$ .

Answer

Using $U_{w} = I - 2 | w ⟩ ⟨ w |$ , we verify both cases easily:

Case 1: When $x \neq w$ :

U_{w} | x ⟩ = (I - 2 | w ⟩ ⟨ w |) | x ⟩ = I | x ⟩ - 2 | w ⟩ ⟨ w | x ⟩

Since $| w ⟩$ and $| x ⟩$ are orthogonal basis states when $x \neq w$ , we have $⟨ w | x ⟩ = 0$ .
Therefore:

U_{w} | x ⟩ = | x ⟩ - 2 | w ⟩ \cdot 0 = | x ⟩

Case 2: When $x = w$ :

U_{w} | w ⟩ = (I - 2 | w ⟩ ⟨ w |) | w ⟩ = I | w ⟩ - 2 | w ⟩ ⟨ w | w ⟩

Since $⟨ w | w ⟩ = 1$ (normalized state):

U_{w} | w ⟩ = | w ⟩ - 2 | w ⟩ \cdot 1 = | w ⟩ - 2 | w ⟩ = - | w ⟩

This confirms that the oracle acts as a reflection about the hyperplane perpendicular to $| w ⟩$ , leaving all orthogonal states unchanged while flipping the sign of $| w ⟩$ itself.

Step 2: The Diffusion Operator as a Second Reflection

Now we briefly mention the diffuser's geometric action to understand its effect intuitively.
As can be seen in the last row of Fig., the diffuser is a second reflection, now performed about the initial state $| s ⟩$ itself.

Exercise 7: Diffusion Operator Matrix Form

For a 2-qubit system ( $N = 4$ ), construct the explicit $4 \times 4$ diffusion operator matrix $U_{s} = 2 | s ⟩ ⟨ s | - I$ , where $| s ⟩ = \frac{1}{2} (| 00 ⟩ + | 01 ⟩ + | 10 ⟩ + | 11 ⟩)$ .

Answer

First calculate $| s ⟩ ⟨ s |$ :

| s ⟩ ⟨ s | = \frac{1}{4} (\begin{matrix} 1 \\ 1 \\ 1 \\ 1 \end{matrix}) (\begin{matrix} 1 & 1 & 1 & 1 \end{matrix}) = \frac{1}{4} (\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{matrix})

Then:

U_{s} = 2 | s ⟩ ⟨ s | - I = \frac{1}{2} (\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{matrix}) - (\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{matrix}) = (\begin{matrix} - 1 / 2 & 1 / 2 & 1 / 2 & 1 / 2 \\ 1 / 2 & - 1 / 2 & 1 / 2 & 1 / 2 \\ 1 / 2 & 1 / 2 & - 1 / 2 & 1 / 2 \\ 1 / 2 & 1 / 2 & 1 / 2 & - 1 / 2 \end{matrix})

The Diffusion Operator: Amplifying the Marked State

After applying the oracle, our quantum state looks similar to where we started, at least on the surface.
Indeed, if we measured at this point, we would still get each outcome with equal probability, since $| 1 / \sqrt{N} |^{2} = | - 1 / \sqrt{N} |^{2}$ .
But what is crucial is that every amplitude remains $1 / \sqrt{N}$ in magnitude, except the winner, which now carries a negative sign!

State	$\| 000 ⟩$	$\| 001 ⟩$	...	$\| 100 ⟩$	$\| 101 ⟩$
Amp. Before Oracle	$1 / \sqrt{N}$	$1 / \sqrt{N}$	...	$1 / \sqrt{N}$	$1 / \sqrt{N}$
Amp. After Oracle	$1 / \sqrt{N}$	$1 / \sqrt{N}$	...	$- 1 / \sqrt{N}$	$1 / \sqrt{N}$
Goal	0	0	...	1	0

The final step is the diffusion operator $U_{s}$ , also known as the Grover diffusion or "inversion about the mean."
As we just saw geometrically, this operator reflects all amplitudes about their average value $| s ⟩$ , which has the effect of boosting the marked state (whose amplitude is below the mean) while suppressing all others.
Mathematically, the diffusion operator is defined as:

U_{s} = 2 | s ⟩ ⟨ s | - I

where $| s ⟩$ is still the uniform superposition.
The full Grover's algorithm is composed of the oracle and diffuser, but how many times?
We repeat it to get as close as possible to the winner state $| w ⟩$ .

Sign Convention

The difference in signs betweenn $U_{w} = I - 2 | w ⟩ ⟨ w |$ and $U_{s} = 2 | s ⟩ ⟨ s | - I$ comes from which vector you use to define the reflection.

Mathematically, both operators perform the exact same geometric action: reflection about an axis.
The general formula for reflecting a vector across an axis defined by a projector $P$ is:

Reflection = 2 P - I

Here is why the signs look different for the two operators:

The Diffuser ( $U_{s}$ ): Reflect the state about the axis parallel to $| s ⟩$ .
We have the projector for this axis: $P_{s} = | s ⟩ ⟨ s |$ .
Using the standard formula:

U_{s} = 2 P_{s} - I = 2 | s ⟩ ⟨ s | - I

(This is a standard reflection about the vector $| s ⟩$ .)

The Oracle ( $U_{ω}$ ): Reflect the state about the axis orthogonal to the winner $| ω ⟩$ (i.e., the axis of "non-winners") that we called $| s^{'} ⟩$ .
Its projector is $P_{s^{'}}$ .
Using the standard formula, the operator should be:

U_{ω} = 2 P_{s^{'}} - I

However, we don't build the circuit using "non-winners"; we build it using the winner $| ω ⟩$ .
We know that the sum of projectors in the space must be the Identity:

P_{s^{'}} + P_{ω} = I ⟹ P_{s^{'}} = I - P_{ω}

Substitute this into our reflection formula:

U_{ω} = 2 (I - P_{ω}) - I = 2 I - 2 P_{ω} - I = I - 2 | ω ⟩ ⟨ ω |

Summary: The sign flips because of the basis we use to write the operator.
$U_{s}$ is written using the axis we reflect about (positive $P$ ).
$U_{ω}$ is written using the axis normal to the reflection plane (negative $P$ ).

Iteration and Optimal Stopping

The following Fig. builds on Fig. and depicts the geometric view of the full Grover's algorithm in the 2D subspace spanned by $| w ⟩$ and $| s^{'} ⟩$ where Oracle and Diffuser are repeated.
Composing these two reflections produces a rotation by $2 θ$ towards the target state $| w ⟩$ . After $k$ iterations, the state has rotated through angle $(2 k + 1) θ$ .
The amplitude of the winner state grows while all other amplitudes shrink, this is the amplification we sought.

Geometric view of Grover's algorithm as rotations in 2D subspace|600 — Geometric view of Grover's algorithm as rotations in 2D subspace

We want to stop when this angle reaches approximately $π / 2$ , at which point the state is nearly aligned with $| w ⟩$ and measurement succeeds with high probability.
This can be see as $k^{*} = ⌊ \frac{Total distance}{Step size} ⌋ = ⌊ \frac{π / 2}{θ} ⌋$ .
Solving $(2 k^{*} + 1) θ \approx π / 2$ for large $N$ (where $θ \approx 1 / \sqrt{N}$ ):

k^{*} \approx \frac{π}{4} \sqrt{N}

This is the origin of the $\sqrt{N}$ scaling.
Each iteration rotates by a tiny angle $\sim 1 / \sqrt{N}$ , and we need $\sim \sqrt{N}$ such rotations to traverse a quarter-circle.

Exercise 8: Optimal Iterations for Different Database Sizes

Calculate $k^{*}$ for database sizes $N = 4, 16, 64, 256$ and the associated computational speedup over classical search.

Answer

Using $k^{*} \approx \frac{π}{4} \sqrt{N}$ :

$N = 4$ : $k^{*} \approx 1.57 \approx 2$ vs classical $N / 2 = 2$ (no advantage)
$N = 16$ : $k^{*} \approx 3.14 \approx 3$ vs classical $N / 2 = 8$ (2.7× speedup)
$N = 64$ : $k^{*} \approx 6.28 \approx 6$ vs classical $N / 2 = 32$ (5.3× speedup)
$N = 256$ : $k^{*} \approx 12.57 \approx 13$ vs classical $N / 2 = 128$ (9.8× speedup)

Exercise 9: Success Probability After k Iterations

The success probability after $k$ Grover iterations is $P (k) = \sin^{2} ((2 k + 1) θ_{0})$ , where $θ_{0} = \arcsin (1 / \sqrt{N})$ .
For $N = 16$ , calculate the success probability after $k = 1, 2, 3, 4$ iterations.
Which value of $k$ gives the highest probability?

Answer

For $N = 16$ : $θ_{0} = \arcsin (1 / 4) \approx 0.2527$ rad

$k = 1$ : $P (1) = \sin^{2} (3 θ_{0}) = \sin^{2} (0.758) \approx 0.50$ (50%)
$k = 2$ : $P (2) = \sin^{2} (5 θ_{0}) = \sin^{2} (1.264) \approx 0.91$ (91%)
$k = 3$ : $P (3) = \sin^{2} (7 θ_{0}) = \sin^{2} (1.769) \approx 0.997$ (99.7%) ← optimal
$k = 4$ : $P (4) = \sin^{2} (9 θ_{0}) = \sin^{2} (2.274) \approx 0.73$ (73%) (over-rotated!)

The optimal is $k = 3$ , close to the theoretical $k^{*} \approx \frac{π}{4} \sqrt{16} \approx 3.14$ .

Concrete Example: 20 qubits

For a database of $N = 2^{20} \approx 10^{6}$ items:

k^{*} = \frac{π}{4} \sqrt{2^{20}} \approx 800 iterations

Compared to the classical $N / 2 \approx 500,000$ queries, this is a substantial improvement.

Don't Over-Rotate!

Unlike classical algorithms where more iterations can lead to better results, Grover's algorithm has a "sweet spot."
If you iterate beyond $k^{*}$ , the state rotates past $| w ⟩$ and the success probability decreases.
You must measure at the right moment.

Generalization: Multiple Winners

We briefly mention the case where multiple states are considered as winners.
The algorithm extends naturally to the case where there are $m$ solutions rather than just one.
We redefine the "winner state" as the uniform superposition over all solutions:

| w ⟩ = \frac{1}{\sqrt{m}} \sum_{x \in Solutions} | x ⟩

The geometric picture remains the same, but now the initial angle is larger: $\sin θ = \sqrt{m / N}$ instead of $1 / \sqrt{N}$ .
Since we start closer to the target subspace, fewer iterations are needed:

k^{*} = ⌊ \frac{π}{4} \sqrt{\frac{N}{m}} ⌋

This makes sense: if half the items are winners ( $m = N / 2$ ), we only need about one iteration.

Exercise 10: Multiple Winners Initial Angle

For a 4-qubit system ( $N = 16$ ) with $m = 4$ winner states, calculate the initial angle $θ$ and the optimal number of iterations $k^{*}$ .
Compare this to the single-winner case ( $m = 1$ ).

Answer

For $m = 4$ winners in $N = 16$ items:

\sin θ = \sqrt{\frac{m}{N}} = \sqrt{\frac{4}{16}} = \frac{1}{2}

θ = \arcsin (1 / 2) = \frac{π}{6} \approx 0.524 rad \approx 30 °

Optimal iterations:

k^{*} = ⌊ \frac{π}{4} \sqrt{\frac{16}{4}} ⌋ = ⌊ \frac{π}{4} \cdot 2 ⌋ = ⌊ 1.57 ⌋ = 1

For comparison, single winner ( $m = 1$ ): $θ \approx 14.5 °$ and $k^{*} \approx 3$ .
With more winners, we start closer to the target and need fewer iterations.

Exercise 11: Multi-Winner Search Angle

Consider a 3-qubit system ( $N = 8$ ) with two winner states: $w \in {| 101 ⟩, | 110 ⟩}$ .

Write down the initial state $| s ⟩$ after applying Hadamards.
Design an oracle that marks both winning states (hint: it should flip the phase of $| 101 ⟩$ and $| 110 ⟩$ ).
Calculate the initial angle $θ_{0}$ and determine the optimal number of iterations.
Draw the quantum circuit.

Answer

1. Initial state:
After applying Hadamard gates to all three qubits starting from $| 000 ⟩$ :

| s ⟩ = H^{\otimes 3} | 000 ⟩ = \frac{1}{\sqrt{8}} \sum_{x = 0}^{7} | x ⟩ = \frac{1}{2 \sqrt{2}} (| 000 ⟩ + | 001 ⟩ + | 010 ⟩ + | 011 ⟩ + | 100 ⟩ + | 101 ⟩ + | 110 ⟩ + | 111 ⟩)

2. Oracle design:
The oracle must flip the phase of both $| 101 ⟩ = | 5 ⟩$ and $| 110 ⟩ = | 6 ⟩$ :

U_{w} = I - 2 (| 101 ⟩ ⟨ 101 | + | 110 ⟩ ⟨ 110 |)

In matrix form, this is a diagonal matrix with entries:

U_{w} = diag (1, 1, 1, 1, 1, - 1, - 1, 1)

The oracle can be implemented as a combination of controlled gates that detect both patterns.

3. Initial angle and optimal iterations:
With $m = 2$ winners in $N = 8$ items:

\sin θ_{0} = \sqrt{\frac{m}{N}} = \sqrt{\frac{2}{8}} = \sqrt{\frac{1}{4}} = \frac{1}{2}

θ_{0} = \arcsin (1 / 2) = \frac{π}{6} \approx 0.524 rad \approx 30 °

Optimal number of iterations:

k^{*} = ⌊ \frac{π}{4} \sqrt{\frac{N}{m}} ⌋ = ⌊ \frac{π}{4} \sqrt{\frac{8}{2}} ⌋ = ⌊ \frac{π}{4} \cdot 2 ⌋ = ⌊ 1.57 ⌋ = 1

Only one iteration is needed!

4. Quantum circuit:
The circuit consists of:

Initialization: Three Hadamard gates on qubits 0, 1, 2
Oracle: A multi-controlled gate structure that flips the phase when the state is either $| 101 ⟩$ or $| 110 ⟩$
Diffuser: $H^{\otimes 3}$ , followed by a multi-controlled Z gate (with X gates on controls to detect $| 000 ⟩$ ), then $H^{\otimes 3}$ again
Measurement: Measure all three qubits

The oracle can be decomposed as:

Oracle = (CZ on qubits that detect |101⟩) + (CZ on qubits that detect |110⟩)

Understanding the Amplitude Dynamics: Inversion About the Mean

The geometric picture is clear, but it's also helpful to see how the amplitudes evolve algebraically.
This reveals the mechanism of "inversion about the mean" that gives the diffusion operator its amplifying power.
The final state after one iteration is:

| ψ_{final} ⟩ = \underset{U_{s}}{\underset{⏟}{(2 | s ⟩ ⟨ s | - I)}} \underset{U_{w}}{\underset{⏟}{(I - 2 | w ⟩ ⟨ w |)}} \underset{| s ⟩}{\underset{⏟}{H^{\otimes n} | 0 ⟩^{\otimes n}}}

After applying the oracle, the state becomes $| s_{w} ⟩ = \frac{1}{\sqrt{N}} \sum_{x} (- 1)^{f (x)} | x ⟩$ , where $f (x) = 1$ only for the winner.
Let's compute what happens when we apply the diffusion operator $U_{s} = 2 | s ⟩ ⟨ s | - I$ :

| ψ_{final} ⟩ = 2 | s ⟩ ⟨ s | s_{w} ⟩ - | s_{w} ⟩ .

Using the fact that $⟨ s | x ⟩ = \frac{1}{\sqrt{N}}$ , the full expression for the amplitude becomes:

| ψ_{final} ⟩ = \frac{1}{\sqrt{N}} \sum_{x} [\frac{2}{N} \underset{Sum over all phases}{\underset{⏟}{\sum_{z} (- 1)^{f (z)}}} - (- 1)^{f (x)}] | x ⟩

The key observation is that $⟨ s | s_{w} ⟩ = \frac{1}{N} \sum_{x} (- 1)^{f (x)} = \frac{N - 2}{N}$ (since $N - 1$ terms contribute $+ 1$ and one term contributes $- 1$ ).
After some algebra, the amplitude of state $| x ⟩$ becomes:

\frac{2 (N - 2)}{N} - (- 1)^{f (x)} \cdot 1

State	$f (x)$	$(- 1)^{f (x)}$	New Amplitude	Effect
Loser	0	+1	$\frac{2}{N} (N - 2) - 1 = 1 - \frac{4}{N}$	From $1 \to$ Decrease
Winner	1	-1	$\frac{2}{N} (N - 2) - (- 1) = 3 - \frac{4}{N}$	From $1 \to$ Roughly $\times 3$

The winner's amplitude roughly triples with each iteration (for large $N$ ), while loser amplitudes decrease slightly.
This is the quantitative manifestation of "inversion about the mean": the winner amplitude, being far below the mean, gets reflected far above it.

The Physical Origin of Interference

To finish discussing the algorithm and before switching to examples and experimental realisations, let's wonder where the quantum interference takes place.
You remember from previous lecture that a quantum computer's advantage does not rely in running the same calculation in parallel over all the superposed states.
Rather, you only want a single state in your final state distribution.
The speedup in Grover's algorithm comes from this quantum interference, and it's worth understanding where this interference arises:

The Oracle Step:
- By flipping the winner's phase to negative, the oracle creates a situation where the winner amplitude is now "out of step" with all the others.
- The mean amplitude drops slightly because of this negative contribution.
The Diffusion Step: The diffusion operator reflects all amplitudes about this new (lower) mean.
- For the loser states, which sit just above the mean, reflection barely moves them.
- But the winner state sits far below the mean, so reflection pushes it upward, effectively collecting constructive contributions from all the other states, the $\times 3$ factor we saw in the previous section.

This is constructive interference for the winner and destructive interference for the losers, achieved not by path differences (as in optical interference) but by manipulation of quantum phases.

Intuition on the Origin of Speedup: Pythagoras

Where does the speedup come from? You can see it from Pythagoras' theorem.
In the classical world, different observable states are like pure coordinate directions (the axes of our Hilbert space). To get from your starting point to the correct answer, you are essentially "walking the edges" of an $N$ -dimensional space.

Classical (Manhattan distance): You have to traverse $N$ items one by one. You are constrained to move along the axes (checking one bit string at a time). To reach the "opposite corner" of a unit cube in $N$ dimensions, you would have to walk $N$ units. Distance $\propto N$ .
Quantum (Euclidean distance): Quantum mechanics gives us access to diagonal directions (superpositions) that are unavailable to a classical system.
- By allowing the state vector to move "diagonally" through the Hilbert space, we can take a shortcut.
- In an $N$ -dimensional space, the distance from the origin to the opposite corner $(1, 1, \dots, 1)$ is $\sqrt{1^{2} + 1^{2} + \dots + 1^{2}} = \sqrt{N}$ .
- The Grover iteration effectively traces a quarter-circle arc from our initial state to the target, utilizing these "diagonal" states to achieve a square-root shortcut.
- The state vector slowly walks along this arc, taking a path that would be entirely unavailable if we were limited to pure coordinate directions.

A word of caution: the "Quadratic Trap"

It is proven (see BBBV Theorem) that $O (\sqrt{N})$ is the lower bound: we cannot search faster than square root.
When we will discuss resource estimation and space-time overhead (how many qubits and how much time) of applications, we will see that the $O (\sqrt{N})$ for quantum vs $O (N)$ for classical complexity will not be enough: we need to take the prefactors into account if we ever want to have a quantum advantage over classical computers, eg $C \cdot N$ vs $Q \cdot \sqrt{N}$ .
The "Quadratic Trap" occurs when the quantum computer's clock speed is slower than the classical computer.
If the "Quantum constant" $Q$ (due to error correction overheads that we will cover in the next lectures) is $10^{8}$ times larger than the "Classical constant" $C$ , then for small $N$ , the classical computer remains much faster.
Said in another way:

C \cdot N < Q \cdot \sqrt{N} ⟹ \sqrt{N} < Q / C

If $Q / C = 10^{8}$ , Grover only wins if the database size $N > 10^{16}$ .

Example: 2-Qubit Search ( $N = 4$ )

Let's work through Grover's algorithm for the smallest non-trivial case: searching among 4 items with 2 qubits.
In this example, we'll suppose the winner is $| w ⟩ = | 11 ⟩$ .

Detailed circuit implementation of Grover's algorithm with gate decomposition|600 — Detailed circuit implementation of Grover's algorithm with gate decomposition

Initial Setup

The starting angle is $θ_{0} = \arcsin (1 / \sqrt{4}) = \arcsin (1 / 2) = π / 6$ (30 degrees).
Since we need to rotate from $π / 6$ to $π / 2$ , the optimal number of iterations is $k^{*} = 1$ .
This is a special case where a single Grover iteration gives perfect success probability.

The Oracle

For winner state $| 11 ⟩$ , the oracle flips only this state's phase:

U_{w} | s ⟩ = U_{w} \cdot \frac{1}{2} (| 00 ⟩ + | 01 ⟩ + | 10 ⟩ + | 11 ⟩) = \frac{1}{2} (| 00 ⟩ + | 01 ⟩ + | 10 ⟩ - | 11 ⟩)

In matrix form, this oracle is simply the controlled-Z gate: $U_{w} = C Z$ , seen in lecture 1:

U_{w} = (\begin{matrix} 1 \\ 1 & (0) \\ (0) & 1 \\ - 1 \end{matrix}) = C Z

Implementing the Diffusion Operator

The diffusion operator $U_{s} = 2 | s ⟩ ⟨ s | - I$ might look complicated, but there's a clear circuit decomposition.
The key insight is that reflecting about $| s ⟩$ is equivalent to:

Rotating $| s ⟩$ to $| 0 ⟩$ : $H^{\otimes n}$
Reflecting about $| 0 ⟩$ : $U_{0}$
Rotating back: $H^{\otimes n}$

Where $U_{0}$ flips the sign of $| 00. . .0 ⟩$ and leaves all other states unchanged.
The operator $U_{0}$ can be implemented as a multi-controlled Z gate (flip sign when all control qubits are 0).
For 2 qubits, this decomposes to $U_{0} = Z_{1} Z_{2} \cdot C Z$ :

U_{0} \frac{1}{2} (| 00 ⟩ + | 01 ⟩ + | 10 ⟩ + | 11 ⟩) = \frac{1}{2} (| 00 ⟩ - | 01 ⟩ - | 10 ⟩ - | 11 ⟩)

The complete circuit has three stages: Initialization (Hadamards to create $| s ⟩$ ), Oracle (CZ gate marking the winner), and Diffuser (H-Z-CZ-H sequence), see Fig..

A more complex example with 5 qubits and one winner can be seen using Quirk.

Qiskit Examples

Let's look at examples that you can run yourself.

Examples with 2 And 3 Qubits

First go to Colab, a colaborative jupyter notebook website.
Then click on "Open Notebook" for a new Colab notebook and paste the Github url from a Grover implementation edited from the official qiskit release: qiskit-demo/presentations/grover.ipynb at main · francois-marie/qiskit-demo.

Application: Sudoku

How would you encode a $2 \times 2$ Sudoku puzzle as a Grover search problem?
Let's do this using Qiksit and Colab!

Experimental Implementations

We end this first part of the lecture by looking at some experimental implementations.
Grover's algorithm has been demonstrated on many quantum computing platforms: historically NMR but also superconducting (SC), trapped ions, photonics, and spins.
The table below summarizes key experimental results, showing the gap between theoretical and achieved success probabilities, a measure of hardware quality.

Platform	Reference	Qubits	Database items	Winners	Exp. Success Rate	Theo. Success Rate
SC	4	3	8	-	-
SC	5	6	64	-	4%
SC	5	4	32	-	45%
Trapped Ions	6	3	8	1	0.39	0.78
Trapped Ions	6	3	8	2	0.70	1
Photonics	7	2	4	1	0.71
Silicon Spins	8	3	8	1	0.93	0.98
Silicon Spins	8	3	8	2	0.96	1
NMR	2	2	4	1	-

Notice that silicon spin qubits achieve high fidelities, approaching theoretical limits.
The gap between achieved and theoretical success rates reflects gate errors and decoherence in current hardware.

Quantum Fourier Transform and Phase Estimation

Having explored Grover's quadratic speedup for unstructured search, we now turn back to last week's lecture, to a different class of algorithms that achieve exponential advantages.
The Quantum Fourier Transform and Quantum Phase Estimation are the workhorses behind Shor's factoring algorithm but also quantum chemistry simulations, and many other applications.
While Grover taught us about amplitude amplification through geometric rotations, QFT and QPE reveal how quantum computers can extract hidden periodicities and eigenvalues exponentially faster than classical approaches.

Classical Fourier Transform

Let's start by recalling some intuition on the Fourier transform.
The Fourier transform is one of the most powerful tools in mathematics and engineering.
It converts signals from the time domain to the frequency domain, revealing hidden periodicities, see Fig. that illustrates the concept of Fourier transformation, adapted from Wikipédia's page on the Fourier transform.
A time-domain signal $f$ containing multiple frequency components (in black, top row) is decomposed into its constituent frequencies (red, green and blue, middle row).
The Fourier transform reveals the amplitude and phase of each frequency component (bottom row), making hidden periodicities visible.
The key insight is that periodic structure in the time domain becomes localized peaks in the frequency domain.
Mathematically, the transform $\hat{f} (ξ)$ tells us "how much" of each frequency $ξ$ is present in the original signal:

\hat{f} (ξ) = \int_{- \infty}^{+ \infty} f (x) e^{- 2 π i ξ x} d x

The quantum version performs this transformation on quantum amplitudes rather than classical signals but the idea remains the same.

The Fourier transform relates the time domain, in black, with a function in the domain of the frequency. The component frequencies are shown as red green and blue peaks in the domain of the frequency.|600 — The Fourier transform relates the time domain, in black, with a function in the domain of the frequency. The component frequencies are shown as red green and blue peaks in the domain of the frequency.

Exercise 12: Comparing Classical and Quantum Fourier Transform Complexity

The classical Fast Fourier Transform (FFT) on $N = 2^{n}$ points requires $O (N \log N)$ operations.
The Quantum Fourier Transform on $n$ qubits requires $O (n^{2})$ gates.
Compare these complexities for $n = 10, 20, 30$ qubits in terms of number of operations and compute the speedup factor.

Answer

For $n$ qubits, $N = 2^{n}$ :

Classical FFT: $O (N \log N) = O (2^{n} \cdot n)$ operations
QFT: $O (n^{2})$ gates

Comparison:

$n$	$N = 2^{n}$	Classical: $N \log_{2} N$	Quantum: $n^{2}$	Speedup Factor
10	1,024	10,240	100	100×
20	1,048,576	20,971,520	400	52,000×
30	$10^{9}$	$3.2 \times 10^{10}$	900	$3.6 \times 10^{7}$ ×

Discrete Fourier Transform (DFT)

For computational purposes, we can work with discrete vectors rather than continuous functions.
Given an input vector $(x_{0}, x_{1}, \dots, x_{N - 1})$ , the DFT produces an output vector $(y_{0}, y_{1}, \dots, y_{N - 1})$ where:

y_{k} = \frac{1}{\sqrt{N}} \sum_{n = 0}^{N - 1} x_{n} ω_{N}^{n k}, where ω_{N} = e^{2 π i / N}

The quantity $ω_{N}$ is the primitive $N$ th root of unity, a complex number that, when raised to the $N$ th power, equals 1.

The Quantum Fourier Transform

Going from DFT to QFT is simply switching from vectors to vector states: the QFT performs the DFT on the amplitudes of a quantum state.
If we start with a state $| ψ ⟩ = \sum_{j = 0}^{N - 1} x_{j} | j ⟩$ , the QFT produces:

Q F T | ψ ⟩ = \sum_{k = 0}^{N - 1} y_{k} | k ⟩, where y_{k} = \frac{1}{\sqrt{N}} \sum_{j = 0}^{N - 1} x_{j} ω_{N}^{j k}

For a computational basis state $| j ⟩$ , this becomes:

Q F T | j ⟩ = \frac{1}{\sqrt{N}} \sum_{k = 0}^{N - 1} ω_{N}^{j k} | k ⟩

The QFT creates a superposition where each basis state $| k ⟩$ acquires a phase that depends on both $j$ (the input) and $k$ (the output index).

Matrix Representation

Mathematically, the QFT can be expressed as an $N \times N$ unitary matrix:

F_{N} = \frac{1}{\sqrt{N}} (\begin{matrix} 1 & 1 & 1 & \dots & 1 \\ 1 & ω & ω^{2} & \dots & ω^{N - 1} \\ 1 & ω^{2} & ω^{4} & \dots & ω^{2 (N - 1)} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ 1 & ω^{N - 1} & ω^{2 (N - 1)} & \dots & ω^{(N - 1)^{2}} \end{matrix})

Each row represents a different "frequency", and the entries oscillate faster as you move down the matrix.

Exercise 13: Construct F4 Matrix

For $N = 4$ , we have $ω = e^{i π / 2} = i$ . Write out the explicit $4 \times 4$ QFT matrix using powers of $i$ .
Verify that the first column corresponds to $Q F T | 0 ⟩$ .

Answer

Using $ω = i$ and $F_{N} = \frac{1}{\sqrt{N}} [ω^{j k}]$ :

F_{4} = \frac{1}{2} (\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & i & i^{2} & i^{3} \\ 1 & i^{2} & i^{4} & i^{6} \\ 1 & i^{3} & i^{6} & i^{9} \end{matrix}) = \frac{1}{2} (\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & i & - 1 & - i \\ 1 & - 1 & 1 & - 1 \\ 1 & - i & - 1 & i \end{matrix})

The first column is $(1, 1, 1, 1)^{T} / 2$ , which equals $| s ⟩ = \frac{1}{2} \sum_{k = 0}^{3} | k ⟩ = Q F T | 0 ⟩$ $✓$

Key Properties

For the sake of completeness, we briefly mention some key properties of the QFT.
The QFT is a unitary transformation: $F_{N} F_{N}^{†} = I$ .
This means it preserves probability (as all quantum operations must) and is reversible.
The inverse QFT is simply $F_{N}^{†}$ , obtained by taking the complex conjugate of all entries.

Exercise 14: Roots of Unity Properties

Let $ω_{8} = e^{2 π i / 8}$ be the primitive 8th root of unity.
Calculate $ω_{8}^{4}$ and $ω_{8}^{8}$ .
Show that $\sum_{k = 0}^{7} ω_{8}^{j k} = 0$ for $j \neq 0 (\mod 8)$ .

Answer

$ω_{8}^{4} = e^{2 π i \cdot 4 / 8} = e^{π i} = - 1$ $ω_{8}^{8} = e^{2 π i \cdot 8 / 8} = e^{2 π i} = 1$

For the sum with $j \neq 0$ : this is a geometric series with ratio $r = ω_{8}^{j} \neq 1$ :

\sum_{k = 0}^{7} ω_{8}^{j k} = \frac{1 - ω_{8}^{8 j}}{1 - ω_{8}^{j}} = \frac{1 - (ω_{8}^{8})^{j}}{1 - ω_{8}^{j}} = \frac{1 - 1^{j}}{1 - ω_{8}^{j}} = 0

This orthogonality property is crucial for the QFT.

Building Intuition: How the QFT Encodes Information

Before diving into the circuit representation of the QFT, we build some intuition by looking at an example of QFT on basis states.
A feature of the QFT is that when applied to a basis state $| j ⟩$ (eg $| j ⟩ = | 5 ⟩ = | 101 ⟩$ ), the output can be written as a tensor product of single-qubit states.
As we will see explicitly by taking the example of 3 and 4 qubits, each output qubit is in a superposition $| 0 ⟩ + e^{i ϕ_{k}} | 1 ⟩$ , where the phase $ϕ_{k}$ depends on the input $j$ :

Q F T_{N} | j ⟩ = \frac{1}{\sqrt{N}} ⨂_{k = 1}^{n} (| 0 ⟩ + e^{2 π i j / 2^{k}} | 1 ⟩)

In binary notation, where $j = j_{1} j_{2} \dots j_{n}$ and $0. j_{j} j_{j + 1} \dots j_{n}$ denotes the binary fraction, this becomes:

Q F T_{N} | j ⟩ = \frac{1}{\sqrt{N}} (| 0 ⟩ + e^{2 π i [0. j_{n}]} | 1 ⟩) \otimes (| 0 ⟩ + e^{2 π i [0. j_{n - 1} j_{n}]} | 1 ⟩) \otimes \dots \otimes (| 0 ⟩ + e^{2 π i [0. j_{1} \dots j_{n}]} | 1 ⟩)

The different qubits rotate at different "frequencies":

The least significant output qubit rotates by $π$ between consecutive inputs (fastest oscillation) hence oscillates between $| + ⟩$ and $| - ⟩$ ,
The next qubit rotates by $π / 2$ (half the frequency), hence oscillates between $| + ⟩, | i ⟩, | - ⟩, | - i ⟩$
Each subsequent qubit rotates half as fast

This hierarchical structure of phases is what makes the QFT useful for detecting periodicity (on what relies Shor's algorithm): if the input state has period $r$ , the output will have peaks at multiples of $N / r$ .

Bloch sphere representation of QFT output states showing phase relationships|300 — Bloch sphere representation of QFT output states showing phase relationships

As shown in Fig., the QFT output states can be visualized on the Bloch sphere with their phase relationships. Let's map explicitly input number states to their binary representations and QFT outputs in the following table:

Input state	Binary $\| q_{1} q_{2} q_{3} ⟩$	QFT Output $\| {\tilde{q}}_{1}, {\tilde{q}}_{2}, {\tilde{q}}_{3} ⟩$
$\| 0 ⟩$	$\| 000 ⟩$	$\| +, +, + ⟩$
$\| 1 ⟩$	$\| 001 ⟩$	$\| -, i, T_{1} ⟩$
$\| 2 ⟩$	$\| 010 ⟩$	$\| +, -, i ⟩$
$\| 3 ⟩$	$\| 011 ⟩$	$\| -, - i, T_{2} ⟩$
$\| 4 ⟩$	$\| 100 ⟩$	$\| +, +, - ⟩$
$\| 5 ⟩$	$\| 101 ⟩$	$\| -, i, T_{3} ⟩$
$\| 6 ⟩$	$\| 110 ⟩$	$\| +, -, - i ⟩$
$\| 7 ⟩$	$\| 111 ⟩$	$\| -, - i, T_{4} ⟩$

The first qubit oscillates from $| + ⟩, | T_{1} ⟩, \dots, | - i ⟩, | T_{4} ⟩$ .
In Fig. and Fig., we can see this even more clearly on 4 qubits, where the first qubit oscillates between $2^{4} = 16$ states:

Representation of basis states on 4 qubits.|600 — Representation of basis states on 4 qubits.

Representation of Fourier basis states on 4 qubits.|600 — Representation of Fourier basis states on 4 qubits.

The pattern reveals how the input number $j$ gets encoded into the relative phases of the output qubits.

Worked Example: QFT of state 5

For the 3-qubit state $| 5 ⟩ = | 101 ⟩$ , the QFT produces phases that increase with qubit position:

The fastest qubit ( ${\tilde{q}}_{3}$ ) rotates by $5 \times \frac{2 π}{8} = \frac{5 π}{4}$
The middle qubit ( ${\tilde{q}}_{2}$ ) rotates twice as slowly: $\frac{5 π}{2} \mod 2 π = \frac{π}{2}$
The slowest qubit ( ${\tilde{q}}_{1}$ ) rotates at half that rate: $5 π \mod 2 π = π$

Let's write the full explicit tensor product form for QFT on $| 5 ⟩$ :

Q F T | 5 ⟩ = \frac{1}{\sqrt{2}} (| 0 ⟩ + e^{i π} | 1 ⟩) \otimes \frac{1}{\sqrt{2}} (| 0 ⟩ + e^{i π / 2} | 1 ⟩) \otimes \frac{1}{\sqrt{2}} (| 0 ⟩ + e^{i 5 π / 4} | 1 ⟩)

Exercise 15: Product State for Specific Input

Calculate the explicit product state representation for $Q F T | 3 ⟩$ on a 3-qubit system ( $N = 8$ ).
Express your answer as a tensor product of single-qubit states $\frac{1}{\sqrt{2}} (| 0 ⟩ + e^{i ϕ_{k}} | 1 ⟩)$ .

Answer

For $| 3 ⟩ = | 011 ⟩$ in binary and $N = 8$ :
The phases for each output qubit are:

Qubit 1 (slowest): $ϕ_{1} = 2 π \cdot 3 / 2 \mod 2 π = 3 π \mod 2 π = π$
Qubit 2 (middle): $ϕ_{2} = 2 π \cdot 3 / 4 = 3 π / 2$
Qubit 3 (fastest): $ϕ_{3} = 2 π \cdot 3 / 8 = 3 π / 4$

Therefore:

Q F T | 3 ⟩ = \frac{1}{\sqrt{2}} (| 0 ⟩ + e^{i π} | 1 ⟩) \otimes \frac{1}{\sqrt{2}} (| 0 ⟩ + e^{i 3 π / 2} | 1 ⟩) \otimes \frac{1}{\sqrt{2}} (| 0 ⟩ + e^{i 3 π / 4} | 1 ⟩)

= \frac{1}{\sqrt{2}} (| 0 ⟩ - | 1 ⟩) \otimes \frac{1}{\sqrt{2}} (| 0 ⟩ - i | 1 ⟩) \otimes \frac{1}{\sqrt{2}} (| 0 ⟩ + e^{i 3 π / 4} | 1 ⟩)

Now that we built some intuition on what the QFT does in practice, let's look at its circuit implementation.

QFT Circuit Implementation

The feature of the QFT is that it can be implemented with only $O (n^{2})$ gates (even $O (n \log n)$ using approximate techniques) for $n$ qubits (where $N = 2^{n}$ ), compared to the $O (N \log N)$ operations of the classical FFT.
The circuit uses two types of gates:

Hadamard gate: Creates equal superposition of $| 0 ⟩$ and $| 1 ⟩$ :

H = \frac{1}{\sqrt{2}} (\begin{matrix} 1 & 1 \\ 1 & - 1 \end{matrix})

Controlled phase rotation: Adds a phase depending on the control qubit:

{UROT}_{k} = R_{k} = (\begin{matrix} 1 & 0 \\ 0 & e^{2 π i / 2^{k}} \end{matrix})

The circuit of Fig. illustrates the complete circuit structure and applies Hadamards and controlled rotations in a specific pattern, followed by SWAP gates to reverse the qubit order.
The algorithm proceeds as follows:

apply a Hadamard to the first qubit, then controlled- $R_{2}$ (controlled by qubit 2), controlled- $R_{3}$ (controlled by qubit 3), and so on.
Then move to the second qubit and repeat with smaller rotations.

QFT circuit showing Hadamard gates and controlled rotation gates|800 — QFT circuit showing Hadamard gates and controlled rotation gates

Exercise 16: QFT Circuit Depth

For an $n$ -qubit QFT, count the total number of gates required.
How many Hadamard gates, controlled rotation gates, and SWAP gates are needed?
What is the total circuit depth (assuming all gates are sequential)?

Answer

For each of the $n$ qubits:

Qubit 1: 1 Hadamard + $(n - 1)$ controlled rotations
Qubit 2: 1 Hadamard + $(n - 2)$ controlled rotations
Qubit $k$ : 1 Hadamard + $(n - k)$ controlled rotations

Total gates:

Hadamards: $n$
Controlled rotations: $\sum_{k = 1}^{n} (n - k) = \sum_{j = 0}^{n - 1} j = \frac{n (n - 1)}{2} = O (n^{2})$
SWAPs: $⌊ n / 2 ⌋$ (to reverse qubit order)

Circuit depth: $O (n^{2})$ (dominated by the controlled rotations).
With approximations, this can be reduced to $O (n \log n)$ .

A nice and interactive display of the QFT circuit on 8 qubits can be found on Quirk.

The Inverse QFT

Since the QFT is unitary, we briefly mention that its inverse $Q F T^{†}$ exists and undoes the transformation.
In terms of circuit implementation, the inverse QFT circuit is simply the original circuit run backwards: reverse the order of all gates and replace each $R_{k}$ with $R_{k}^{†} = R_{k}^{- 1}$ (which has the opposite phase rotation).
The inverse QFT is important for extracting classical information from quantum phase information, we now focus on this in action in Quantum Phase Estimation use case.

Exercise 17: Inverse QFT Circuit

Describe the circuit for the inverse QFT on 3 qubits.
What are the gate types, and in what order do they appear?
How does it differ from the forward QFT circuit?

Answer

The inverse QFT circuit reverses the forward QFT:

Forward QFT (3 qubits):

SWAPs (if bit-reversal needed)
$H$ on qubit 3, then $C R_{2}, C R_{3}$ controlled by qubits 2,1
$H$ on qubit 2, then $C R_{2}$ controlled by qubit 1
$H$ on qubit 1

Inverse QFT (3 qubits):

$H$ on qubit 1
$C R_{2}^{†}$ controlled by qubit 1 on target qubit 2, then $H$ on qubit 2
$C R_{3}^{†}$ controlled by qubit 1, then $C R_{2}^{†}$ controlled by qubit 2, then $H$ on qubit 3
SWAPs (reversed)

The inverse uses $R_{k}^{†} = R_{- k}$ (negative phase) and reverses the gate order.

Do it Yourself!

Just like for Grover's algorithm, let's look at examples that you can run yourself.
Go again to Colab, a colaborative jupyter notebook website.
Then click on "Open Notebook" for a new Colab notebook and paste the following Github url: qiskit-demo/presentations/quantum-fourier-transform.ipynb at main · francois-marie/qiskit-demo.

Application: Quantum Phase Estimation (QPE)

The QFT finds one of its most important application in Quantum Phase Estimation, a subroutine used by Shor's factoring algorithm and many quantum chemistry algorithms.
Let's start by quickly explaining the problem at stake.

The Problem

You are given a unitary operator $U$ and one of its eigenvectors $| v ⟩$ satisfying $U | v ⟩ = e^{2 π i θ} | v ⟩$ , the goal is to estimate the eigenvalue phase $θ$ .
Looks pretty straightforward, right?
But how would you do this in practice?

The Key Insight: the return of the Phase Kickback

QPE uses controlled- $U$ operations to transfer the phase information from the eigenvector into an auxiliary register, just like what we used in Grover's Algorithm!
If we apply $C - U^{2^{j}}$ with the control qubit in superposition and the target in eigenstate $| v ⟩$ :

C - U^{2^{j}} (\frac{| 0 ⟩ + | 1 ⟩}{\sqrt{2}} \otimes | v ⟩) = \frac{| 0 ⟩ + e^{2 π i \cdot 2^{j} θ} | 1 ⟩}{\sqrt{2}} \otimes | v ⟩

We see that by using multiple control qubits with different powers of $U$ , we encode $θ$ into the phases of a multi-qubit register, the format that the inverse QFT can decode.
The circuit implementation is shown in Fig.:

Quantum Phase Estimation circuit with controlled-U gates and inverse QFT|800 — Quantum Phase Estimation circuit with controlled-U gates and inverse QFT

Exercise 18: Controlled-U Power Implementation

Suppose $U$ is a unitary with eigenstate $| v ⟩$ and eigenvalue $e^{2 π i θ}$ where $θ = 3 / 16$ .
Calculate the phase acquired by the control qubit when applying $C - U^{2^{2}}$ to $\frac{1}{\sqrt{2}} (| 0 ⟩ + | 1 ⟩) \otimes | v ⟩$ .
Express your answer as $e^{i ϕ}$ and determine $ϕ$ .

Answer

Since $U | v ⟩ = e^{2 π i θ} | v ⟩$ with $θ = 3 / 16$ :

U^{2^{2}} | v ⟩ = U^{4} | v ⟩ = (e^{2 π i θ})^{4} | v ⟩ = e^{2 π i \cdot 4 θ} | v ⟩ = e^{2 π i \cdot 4 \cdot 3 / 16} | v ⟩ = e^{2 π i \cdot 3 / 4} | v ⟩

Applying the controlled operation:

C - U^{4} (\frac{| 0 ⟩ + | 1 ⟩}{\sqrt{2}} \otimes | v ⟩) = \frac{| 0 ⟩ + e^{2 π i \cdot 3 / 4} | 1 ⟩}{\sqrt{2}} \otimes | v ⟩

The phase is: $ϕ = 2 π \cdot 3 / 4 = 3 π / 2$ .
Therefore the control qubit acquires phase $e^{i 3 π / 2} = - i$ .

The depicted circuit above can be hard to understand so to make things hopefully clearer, let's take an example.
Consider the $T$ gate (also called $π / 8$ gate) that we have lentoined several times already throughout the previous lectures:

T = (\begin{matrix} 1 & 0 \\ 0 & e^{i π / 4} \end{matrix})

The state $| 1 ⟩$ is an eigenvector of $T$ with eigenvalue $e^{i π / 4} = e^{2 π i \cdot (1 / 8)}$ .
Thus $θ = 1 / 8$ is the value we are going to be looking for.
If we run QPE with 3 qubits in the auxiliary register, the algorithm will encode $θ = 1 / 8 = {0.001}_{2}$ in binary.
After the inverse QFT, we measure the auxiliary register and obtain the binary representation of $θ$ :

θ = \frac{measured value}{2^{n}} = \frac{1}{8}

This example illustrates the power of QPE: it converts quantum phase information, which is normally invisible to measurement, into a classical bitstring that we can read out directly.

Exercise 19: QPE for the T Gate

What would you measure if you used only 2 qubits in the auxiliary register instead of 3?
Would the result still be exact?

Answer

We can only represent $θ$ to 2 bits of precision: $θ \approx {0.00}_{2} = 0$ .
The exact value $1 / 8 = {0.001}_{2}$ is truncated to ${0.00}_{2}$ .
Measurement would give $| 00 ⟩$ , suggesting $θ = 0 / 4 = 0$ , which is incorrect.
We need at least $⌈ \log_{2} (1 / θ) ⌉ = 3$ qubits for exact estimation.

Exercise 20: Phase Estimation Accuracy

In QPE, the auxiliary register has $n$ qubits, allowing us to estimate phases to $n$ bits of precision.
If we want to estimate a phase $θ$ with accuracy $ϵ$ (i.e., within $\pm ϵ$ ), how many auxiliary qubits do we need?
What is the probability of success if we use exactly this many qubits?

Answer

To achieve accuracy $ϵ$ , we need to distinguish phases that differ by $2 ϵ$ .
Since each qubit adds one bit of precision (halving the uncertainty), we need:

\frac{1}{2^{n}} \leq ϵ ⟹ n \geq \log_{2} (1 / ϵ) = ⌈ \log_{2} (1 / ϵ) ⌉

Example: For $ϵ = 0.01$ (1% accuracy):

n \geq \log_{2} (100) \approx 6.64 ⟹ n = 7 qubits

Success probability: If $θ$ has an exact $n$ -bit representation, success probability is 1.
If not, the standard QPE algorithm has success probability $\geq 4 / π^{2} \approx 0.405$ without additional precision bits.
To achieve high success probability ( $\geq 1 - δ$ ), use $n = ⌈ \log_{2} (1 / ϵ) ⌉ + ⌈ \log_{2} (2 + 1 / (2 δ)) ⌉$ qubits.

Qiskit time

We end this section by running some code.
Just like for Grover's algorithm and QFT, let's look at examples that you can run yourself.
Go again to Colab, a colaborative jupyter notebook website.
Then click on "Open Notebook" for a new Colab notebook and paste the following Github url: qiskit-demo/presentations/quantum-phase-estimation.ipynb at main · francois-marie/qiskit-demo.

Resource Estimates

Between last week's lecture and today's, we've now seen two paradigms of quantum advantage: Grover's quadratic speedup for search and the exponential speedup of Shor, QFT and phase estimation.
But there's a crucial question we haven't addressed yet.
Can we actually run these algorithms on real hardware?
How many would would we need and how much time would it take to run a useful quantum application?
Answering these questions provides what's called space-time overheads and is crucial to understand the cost of quantum advantage.
The answer depends not just on the number of logical (aka perfect and noiseless) qubits required, but on error rates and coherence times of physical qubits, and the massive overhead of quantum error correction that we will discuss in the next two lectures.
Let's end this lecture by confronting the resource requirements and understand the gap between algorithmic promise and experimental reality.

(Partial) Landscape of Quantum Algorithms

We've said already that different problem classes admit different types of quantum speedups, see Fig.:

Problems like factoring, discrete logarithm, or HHL algorithm where quantum computers achieve exponential advantage (in blue),
Optimization and search problems where we see quadratic speedups (yellow),
Other types of problems, like finding the ground state of an Hamiltonian or combinatorial problems, with an unproven advantage (red)

To classify all these quantum applications, we first decompose them in terms of the algorithm they use (for instance, remember that Grover's algorithm is at the center of a wide variety of use cases) and the primitives or building blocks of the algorithm.
More detailed maps exist but they become quickly unreadable.
However, considering even such a partial landscape allows to take a step back on the many topics addressed during our two "Application" lectures and all of the many remaining ones that you will discover in your future careers.

Partial landscape of quantum algorithms showing primitives, algorithms, problem classes and speedup types|300 — Partial landscape of quantum algorithms showing primitives, algorithms, problem classes and speedup types

Resource Overheads: From Theory to Reality

Let's talk numbers.
When researchers publish quantum algorithms, they often focus on asymptotic complexity (the $O (. . .)$ notation that tells you how runtime scales).
But if you want to actually run these algorithms on hardware one day, you need concrete resource estimates beforehand.
How many qubits? How many gates? How long will it take? What error rates can we tolerate?

Take nitrogen fixation as an example.
For context, the FeMoCo (iron-molybdenum cofactor) is the active site in certain bacteria that can convert atmospheric nitrogen into ammonia at room temperature.
If we could understand this process well enough to replicate it industrially, we could revolutionize agriculture and reduce the massive energy costs of the Haber-Bosch process.
Quantum simulation of FeMoCo is often cited as a "killer app" for quantum computers because the electronic structure involves strongly correlated electrons that are extremely difficult for classical computers to handle accurately.

But here's the reality check.
Even for this relatively small molecule (Fe $_{7}$ MoS $_{9}$ C), you need somewhere between 100 to 1,000 logical qubits and around $10^{14}$ (!!) logical gates.
Besides, that's not 100-1000 physical qubits (that we can build that today), but error-corrected logical qubits, each potentially requiring thousands of physical qubits depending on your error correction code and the quality of your hardware (more on that next week).
The tradeoff between circuit depth (number of sequential gates) and width (number of qubits) depends on the specific problem and algorithm.
Financial optimization problems can require even more qubits (around 10,000 logical qubits) but fewer total gates.

For Shor's algorithm attacking RSA-2048 (the encryption standard used widely on the internet), you need about 1,000 logical qubits.
But when you account for quantum error correction, that balloons to roughly 1 million physical qubits for the Gidney estimate 9, or 20 million for more conservative assumptions or other platforms 10.
And you would need to run the computation for about a week, maintaining coherence and performing error correction continuously throughout.

App.	FeMoCo (Chemistry) 11	Financial opti. 12	RSA-2048 (Shor, SC) 9	RSA-2048 (Shor, NA) 10
Logical Qubits	$\sim 100 - 1000$	$\approx 10^{4}$	$\sim 1, 000$	$\sim 10, 000$
Logical Gates	$\sim 10^{14}$	$\sim 10^{10}$	$\sim 10^{10}$	$\sim 10^{10}$ (CCZ)
Physical Qubits	-	-	$\sim 1$ Million	$\sim 20$ Million
Runtime	Days to months		~1 week	~1 week

Bridging Theory and Practice

The gap between what we can build and what we need is real, but it's closing.
See Fig. from Quantum Landscape which shows historical progress in reducing resource estimates for problems related to cryptography.
Back in the early 2000s, estimates for factoring RSA-2048 were in the billions of qubits.
Today, through better algorithms and more efficient error correction schemes, we're down to the millions.
That's still far from the hundreds or thousands of qubits in current machines, but the trajectory is clear.
Will this be sustainable or will it plateau? Still an open question.

Historical progress in reducing resource estimates for cryptographically-relevant quantum algorithms|600 — Historical progress in reducing resource estimates for cryptographically-relevant quantum algorithms

Where We Stand Today

Also from Quantum Landscape, Fig. provides a snapshot of where we are right now.
Current quantum processors have somewhere between 100 and 1,000 physical qubits, depending on the platform.
Superconducting qubits (IBM, Google, Rigetti) are pushing toward the high hundreds.
Trapped ions (IonQ, Quantinuum) have fewer qubits but better connectivity and gate fidelities.
Neutral atoms (QuEra, Pasqal) can scale to larger numbers but are still working on gate fidelities.

Current quantum computing landscape showing the gap between available hardware and algorithm requirements|600 — Current quantum computing landscape showing the gap between available hardware and algorithm requirements

The vertical axis in this figure is linked to the number of logical operations you can perform reliably.
Right now, we're in the NISQ (Noisy Intermediate-Scale Quantum) era, meaning we have devices with somehow a decent number of qubits but not enough coherence or low enough error rates to implement reach what is needed for useful algorithm, on the order of $10^{- 10}$ .
We can run short circuits (maybe a few hundred to a thousand gates) before errors accumulate and destroy the computation.

But look at where the useful algorithms sit.
Shor's algorithm needs millions of gates.
Quantum chemistry simulations need even more.
The applications that would provide clear practical advantage are still out of reach compared to today's achieved performance.

Even if we can increase the number of qubits, useful computations would still be out of reach because of the error rates gap between what we have and what is required.
This is precisely why quantum error correction is the next major topic in this course: building one almost-noise-free qubit out of many noisy ones.
QEC is the bridge the gap between NISQ devices and fault-tolerant quantum computers that can run Shor, simulate molecules, and tackle the problems we've been discussing.
Without error correction, we're limited to small demonstrations and variational algorithms that try to squeeze utility out of noisy circuits.
With error correction, we unlock the full algorithmic power of quantum computing.

The Path Forward

The takeaway of this section isn't pessimism, it's realism.
Quantum algorithms provide genuine speedups (quadratic for Grover, exponential for Shor and quantum simulation).
But translating those asymptotic advantages into practical utility requires overcoming significant engineering challenges.
We need:

Better qubits: Lower error rates, longer coherence times
More qubits: Scaling from hundreds to millions
Error correction: Efficient codes that don't require prohibitive overhead
Compilation and optimization: Smarter ways to map algorithms onto hardware constraints

The next lectures on quantum error correction will show you exactly how we plan to get there.
We'll see how the surface code and other QEC schemes can protect quantum information from noise, how to perform fault-tolerant gates on encoded qubits (aka robust to physical error propagating), and what it will take to reach the thresholds where error-corrected quantum computing becomes viable.
The resource estimates we've discussed today aren't fixed targets but they're moving goals that improve and get closer to reality as we develop better codes, better compilers, and better hardware.
But they give you a realistic picture of where the field stands and what needs to happen next.

Conclusion

In this lecture, we've explored three pillars of the quantum algorithm landscape.
Grover's algorithm showed us how quantum interference and geometric rotations in Hilbert space can achieve a quadratic speedup for unstructured search, with the insight that two reflections compose into a rotation.
The Quantum Fourier Transform revealed how quantum computers can extract periodicities exponentially faster than classical FFTs, by encoding information into relative phases of product states.
And Quantum Phase Estimation demonstrated how phase kickback and the inverse QFT can convert hidden eigenvalue information into measurable classical bits, powering applications from factoring to quantum chemistry.

But we also confronted a sobering reality.
The gap between theoretical speedups and practical implementations remains vast.
Shor's algorithm can break RSA-2048 in polynomial time, but doing so requires millions of physical qubits and weeks of coherent operation.
This gap isn't a fundamental limitation, it's an engineering challenge.

In the upcoming lectures, we'll see how quantum error correction works to bridge this gap, starting with simple repetition codes and building up to the surface code and other codes that are the leading candidates for fault-tolerant quantum computing.
We'll understand the threshold theorem that proves error correction can work even with imperfect gates, and we'll see what it takes to perform logical operations on encoded qubits without destroying the error protection.

References

1.: Chen, Z; Satzinger, K; Atalaya, J; Korotkov, A; Dunsworth, A; Sank, D; Quintana, C; McEwen, M; Barends, R; Klimov, P; Hong, S; Jones, C; Petukhov, A; Kafri, D; Demura, S; Burkett, B; Gidney, C; Fowler, A; Paler, A; Putterman, H; Aleiner, I; Arute, F; Arya, K; Babbush, R; Bardin, J; Bengtsson, A; Bourassa, A; Broughton, M; Buckley, B; Buell, D; Bushnell, N; Chiaro, B; Collins, R; Courtney, W; Derk, A; Eppens, D; Erickson, C; Farhi, E; Foxen, B; Giustina, M; Greene, A; Gross, J; Harrigan, M; Harrington, S; Hilton, J; Ho, A; Huang, T; Huggins, W; Ioffe, L; Isakov, S; Jeffrey, E; Jiang, Z; Kechedzhi, K; Kim, S; Kitaev, A; Kostritsa, F; Landhuis, D; Laptev, P; Lucero, E; Martin, O; McClean, J; McCourt, T; Mi, X; Miao, K; Mohseni, M; Montazeri, S; Mruczkiewicz, W; Mutus, J; Naaman, O; Neeley, M; Neill, C; Newman, M; Niu, M; O’Brien, T; Opremcak, A; Ostby, E; Pató, B; Redd, N; Roushan, P; Rubin, N; Shvarts, V; Strain, D; Szalay, M; Trevithick, M; Villalonga, B; White, T; Yao, Z; Yeh, P; Yoo, J; Zalcman, A; Neven, H; Boixo, S; Smelyanskiy, V; Chen, Y; Megrant, A; Kelly, J . Exponential suppression of bit or phase errors with cyclic error correction. Nature 595 (7867) , 383–387 (2021) . doi:10.1038/s41586-021-03588-y
2.: Chuang, I; Gershenfeld, N; Kubinec, M . Experimental Implementation of Fast Quantum Searching. Physical Review Letters 80 (15) , 3408–3411 (1998) . doi:10.1103/PhysRevLett.80.3408
3.: Grover, L . A fast quantum mechanical algorithm for database search. Proceedings of the twenty-eighth annual ACM symposium on Theory of Computing , 212–219 (1996) . doi:10.1145/237814.237866
4.: AbuGhanem, M . Characterizing Grover search algorithm on large-scale superconducting quantum computers. Scientific Reports 15 (1) , 1281 (2025) . doi:10.1038/s41598-024-80188-6
5.: Chu, J; He, X; Zhou, Y; Yuan, J; Zhang, L; Guo, Q; Hai, Y; Han, Z; Hu, C; Huang, W; Jia, H; Jiao, D; Li, S; Liu, Y; Ni, Z; Nie, L; Pan, X; Qiu, J; Wei, W; Nuerbolati, W; Yang, Z; Zhang, J; Zhang, Z; Zou, W; Chen, Y; Deng, X; Deng, X; Hu, L; Li, J; Liu, S; Lu, Y; Niu, J; Tan, D; Xu, Y; Yan, T; Zhong, Y; Yan, F; Sun, X; Yu, D . Scalable algorithm simplification using quantum AND logic. Nature Physics 19 (1) , 126–131 (2023) . doi:10.1038/s41567-022-01813-7
6.: Figgatt, C; Maslov, D; Landsman, K; Linke, N; Debnath, S; Monroe, C . Complete 3-Qubit Grover search on a programmable quantum computer. Nature Communications 8 (1) , 1918 (2017) . doi:10.1038/s41467-017-01904-7
7.: Main, D; Drmota, P; Nadlinger, D; Ainley, E; Agrawal, A; Nichol, B; Srinivas, R; Araneda, G; Lucas, D . Distributed quantum computing across an optical network link. Nature 638 (8050) , 383–388 (2025) . doi:10.1038/s41586-024-08404-x
8.: Thorvaldson, I; Poulos, D; Moehle, C; Misha, S; Edlbauer, H; Reiner, J; Geng, H; Voisin, B; Jones, M; Donnelly, M; Peña, L; Hill, C; Myers, C; Keizer, J; Chung, Y; Gorman, S; Kranz, L; Simmons, M . Grover’s algorithm in a four-qubit silicon processor above the fault-tolerant threshold. Nature Nanotechnology 20 (4) , 472–477 (2025) . doi:10.1038/s41565-024-01853-5
9.: Gidney, C . How to factor 2048 bit RSA integers with less than a million noisy qubits. (2025) . doi:10.48550/arXiv.2505.15917
10.: Zhou, H; Duckering, C; Zhao, C; Bluvstein, D; Cain, M; Kubica, A; Wang, S; Lukin, M . Resource Analysis of Low-Overhead Transversal Architectures for Reconfigurable Atom Arrays. (2025) . doi:10.48550/arXiv.2505.15907
11.: Reiher, M; Wiebe, N; Svore, K; Wecker, D; Troyer, M . Elucidating Reaction Mechanisms on Quantum Computers. Proceedings of the National Academy of Sciences 114 (29) , 7555–7560 (2017) . doi:10.1073/pnas.1619152114
12.: Chakrabarti, S; Krishnakumar, R; Mazzola, G; Stamatopoulos, N; Woerner, S; Zeng, W . A Threshold for Quantum Advantage in Derivative Pricing. (2021) . doi:10.48550/arXiv.2012.03819

Input state	Binary $\| q_{1} q_{2} q_{3} ⟩$	QFT Output $\| {\tilde{q}}_{1}, {\tilde{q}}_{2}, {\tilde{q}}_{3} ⟩$
$\| 0 ⟩$	$\| 000 ⟩$	$\| +, +, + ⟩$
$\| 1 ⟩$	$\| 001 ⟩$	$\| -, i, T_{1} ⟩$
$\| 2 ⟩$	$\| 010 ⟩$	$\| +, -, i ⟩$
$\| 3 ⟩$	$\| 011 ⟩$	$\| -, - i, T_{2} ⟩$
$\| 4 ⟩$	$\| 100 ⟩$	$\| +, +, - ⟩$
$\| 5 ⟩$	$\| 101 ⟩$	$\| -, i, T_{3} ⟩$
$\| 6 ⟩$	$\| 110 ⟩$	$\| +, -, - i ⟩$
$\| 7 ⟩$	$\| 111 ⟩$	$\| -, - i, T_{4} ⟩$

Lecture: Grover's Algorithm, Quantum Fourier Transform, Quantum Phase Estimation and Resource Estimates

Introduction

Grover's Algorithm

Quiz

Introduction

Algorithm Overview

The Initial Superposition

The Oracle or the Yes/No Verifier Blackbox Function

Mathematical Definition of the Oracle

Building the Oracle from a Classical Function

Phase Kickback: The Underlying Oracle Mechanism

Geometric Interpretation of Grover's Algorithm

Setting Up the 2D Subspace

Step 1: the Oracle as a Reflection

Step 2: The Diffusion Operator as a Second Reflection

The Diffusion Operator: Amplifying the Marked State

Iteration and Optimal Stopping

Generalization: Multiple Winners

Understanding the Amplitude Dynamics: Inversion About the Mean

The Physical Origin of Interference

Example: 2-Qubit Search (N=4)

Initial Setup

The Oracle

Implementing the Diffusion Operator

Qiskit Examples

Examples with 2 And 3 Qubits

Application: Sudoku

Experimental Implementations

Quantum Fourier Transform and Phase Estimation

Classical Fourier Transform

Discrete Fourier Transform (DFT)

The Quantum Fourier Transform

Matrix Representation

Key Properties

Building Intuition: How the QFT Encodes Information

QFT Circuit Implementation

The Inverse QFT

Do it Yourself!

Application: Quantum Phase Estimation (QPE)

The Problem

The Key Insight: the return of the Phase Kickback

Qiskit time

Resource Estimates

(Partial) Landscape of Quantum Algorithms

Resource Overheads: From Theory to Reality

Bridging Theory and Practice

Where We Stand Today

The Path Forward

Conclusion

References

Example: 2-Qubit Search ( $N = 4$ )