IBM Quantum Challenge 2020 - Part 5

This post is part of a serie of posts on my experience with the IBM Quantum Challenge 2020.

The serie is organised as follow:

  1. Introduction to the problem.
  2. Mathematical reformulation of the problem.
  3. Unoptimised implementation of the solution.
  4. How to optimise a quantum circuit?.
  5. Optimising multi-controlled X gates (this post).
  6. Optimising the QRAM (Quantum Random Access Memory).
  7. Optimising the oracle

In the last blog post. I showed how to benchmark a quantum circuit to isolate the optimisation cadidates.

In this post I will use the call-graph generated in the previous blog post (re-included below) to perform the first optimisation.

Profiling of the non-optimised solution. If you have hard time understanding the meaning of this graph, have a look at the [gprof2dot documentation]().
Profiling of the non-optimised solution. If you have hard time understanding the meaning of this graph, have a look at the gprof2dot documentation.

65k - Improving multi-controlled X gates

The call-graph obtained in the last section is very useful in understanding the first optimisation target: the mcx_gray procedure accounts for more than 95% of the total cost!

This basically means that if I am able to halve the cost of implementing a multi-controlled X gate, I will nearly halve the total cost!

The question is now “how to reduce the multi-controlled X gate cost?”.

The first step to answer this question is to understand that a multi-controlled X gate should be decomposed with only 1- and 2- qubit gates. Within the context of the challenge, this restriction comes from the fact that the cost of a circuit can only be computed if the circuit is exclusively composed of cx and u3 gates.

The decomposition step can be performed by following different algorithms that are more or less efficient.

If you are interested in multi-controlled X gate construction you can have a look at the excellent blog post from Craig Gidney: Constructing Large Controlled Nots.

Benchmarking mcx algorithms

It is easy to see from the documentation of QuantumCircuit.mcx that qiskit implements several construction algorithms that have different behaviours:

  1. “noancilla”, the default construction algorithm, implements a multi-controlled X gate without needing any additional ancilla qubits.
  2. “recursion” needs 1 ancilla qubit if we need more than 4 control qubits.
  3. “v-chain” needs \(c - 2\) ancilla qubits, \(c\) being the number of control qubits.

The benchmark of these different methods is quite straightforward:

import matplotlib.pyplot as plt

from qiskit import QuantumCircuit, QuantumRegister

# Warning: the "noancilla" method generates circuits with an exponential
#          number of gates. Setting this value to something higher will
#          likely increase by a lot the computational time.
max_qubits = 15

for method in ["noancilla", "recursion", "v-chain"]:
    costs = list()
    for ctrl_number in range(3, max_qubits):
        qctrl = QuantumRegister(ctrl_number)
        qtarget = QuantumRegister(1)
        qancilla = QuantumRegister(ctrl_number - 2)

        circuit = QuantumCircuit(qctrl, qtarget, qancilla)

        circuit.mcx(qctrl, qtarget, ancilla_qubits=qancilla, mode=method)
        print(ctrl_number, method, costs[-1])
    plt.semilogy(range(3, max_qubits), costs, label=method)

plt.title("Evolution of mcx cost for different construction algorithms")
plt.xlabel("Number of controls")
plt.ylabel("Cost of the multi-controlled X gate")

And here is the graph produced by the previous code snippet:

Costs of the different multi-controlled X gate construction methods with respect to the number of controls needed.
Costs of the different multi-controlled X gate construction methods with respect to the number of controls needed.

The “v-chain” method is a clear winner here, even if it requires more ancilla qubits. So let’s change the non-optimised code to use the “v-chain” method instead of the default “noancilla” one.

Changing the code to use the best mcx implementation

The code resulting from applying this optimisation is available in this Gist. With this small change the overall circuit cost went down from 144 423 to 65 429!

Re-using qprof, the following call-graph is generated:

Profiling of the solution after the `mcx` optimisation.
Profiling of the solution after the mcx optimisation.

The overall number of gates droped significantly but the most costly procedure is still the mcx gate. This is good hint for what should be the next optimisation: as I already optimised each mcx gate, the only optimisation left is to try to lower down the number of mcx gates used.

According to the call-graph, the mcx gates are called by the 2 main quantum circuits:

  1. The QRAM (qram and qram_dg) that accounts for more than 60% of the overall cost.
  2. The oracle (check_if_match_any_permutation) that represents 37% of the overall cost.

Because both the QRAM and the oracle takes a large portion of the overall cost, I will eventually need to optimise both of them.

The next blog post will be about QRAM optimisation. The oracle optimisation will be discussed in another blog post.

PhD student in Quantum Computing

I am a doctoral student currently involved in a PhD on quantum computing algorithms and how they can be applied to speed-up scientific computing.