Benchmarking IQM Spark#

This notebook allows you to run some useful benchmarks for the Spark system. Before starting, make sure you have installed all the necessary packages:

!pip install iqm-benchmarks
!pip install ipykernel

Connect to the backend#

import os
from iqm.qiskit_iqm import IQMProvider
import random

os.environ["IQM_TOKENS_FILE"]="YOUR TOKEN HERE"
iqm_url =  'YOUR URL HERE'
provider = IQMProvider(iqm_url)
backend = provider.get_backend()

We can access the Spark backend and plot its connectivity graph to check that everything is working properly.

import networkx as nx
import matplotlib.pyplot as plt

coupling_map = backend.coupling_map

G = nx.Graph()
G.add_edges_from(coupling_map) 
pos = nx.spring_layout(G, seed=42) 
nx.draw(G, pos, with_labels=True, node_color='lightblue', edge_color='gray', 
        node_size=1000, font_size=10, linewidths=1.5, width=2)
plt.show()

We run the cell below to ignore those warnings that are not critical for the correct run of the benchmarks.

import warnings
warnings.filterwarnings(action="ignore")  

GHZ state fidelity#

The GHZ (Greenberger-Horne-Zeilinger) state is a maximally entangled quantum state that involves three or more qubits, $n$ . It is an equal superposition of all qubits being in state 0 and all qubits being in state 1, i.e., $| G H Z ⟩ = \frac{1}{\sqrt{2}} (| 0 ⟩^{\otimes n} + | 1 ⟩^{\otimes n})$ .

The GHZ state fidelity acts as a witness for genuine multi-qubit entanglement if found to be above $0.5$ . This means that the measurement results cannot be explained without entanglement involving all qubits, so it is a great way to evaluate the “quantumness” of the computer.

The state $ρ_{ideal} = | G H Z ⟩ ⟨ G H Z |$ is a pure state, so in this case the fidelity can be computed as:

F (ideal, measured) = ⟨ G H Z | ρ_{measured} | G H Z ⟩,

where $ρ_{measured}$ is the density matrix given by the actual results of the quantum computer. The ideal GHZ state density matrix entries can be written as $ρ_{i, j} = ⟨ i | ρ_{ideal} | j ⟩$ where $i, j$ are the $n$ basis states ${| 00. .0 ⟩, . . ., | 11. .1 ⟩}$ ; only the corner entries $ρ_{0, 0}, ρ_{0, n}, ρ_{n, 0}$ and $ρ_{n, n}$ are non-zero. This simplifies the process since we only need to measure these four components. In the fidelity formula, all other entries are effectively nullified by the zero entries in the ideal state matrix. To measure the coherences (off-diagonal entries) we use the method of multiple quantum coherences Mooney, 2021.

from iqm.benchmarks.entanglement.ghz import GHZConfiguration, GHZBenchmark

GHZ = GHZConfiguration(
    state_generation_routine="tree",
    custom_qubits_array=[[0,1,2,3,4]],
    shots=1000,
    qiskit_optim_level=3,
    optimize_sqg=True,
    fidelity_routine="coherences", 
    rem=True,
    mit_shots=1000,
)

If you want to modify the settings above, please refer to the documentation here.

Before running the benchmark analysis, we can visualize the histogram of counts obtained from measuring a GHZ state on 5 qubits:

from iqm.benchmarks.entanglement.ghz import generate_ghz_spanning_tree, get_edges
from qiskit import transpile
from qiskit.visualization import plot_histogram

qubit_layout = [0,1,2,3,4]
graph = get_edges(coupling_map=backend.coupling_map, qubit_layout=qubit_layout)
ghz_circuit = generate_ghz_spanning_tree(graph, qubit_layout, n_state=5)[0]
qc_transp = transpile(ghz_circuit, backend=backend, optimization_level=3)
res = backend.run(qc_transp, shots=10000).result() 
counts=res.get_counts()

plot_histogram(counts)

benchmark_ghz = GHZBenchmark(backend, GHZ)
run_ghz = benchmark_ghz.run()

result_ghz = benchmark_ghz.analyze()
for observation in result_ghz.observations:
    if observation.identifier.string_identifier == str(qubit_layout):
        print(f"{observation.name}: {observation.value}") 

Quantum Volume#

Quantum volume is a single-number metric that was introduced in Cross, 2019. It evaluates the quality of a quantum processor via the largest random square circuit, i.e., with the same number of layers of parallel random 2-qubit unitaries as number of qubits, that it can run successfully.

The success of a run is based on the heavy output probability, which corresponds to the probability of observing heavy outputs, i.e. the measurement outputs that occcur with a probability greater than the median of the distribution. The heavy output generation problem asks if the generated distribution of the random circuit we run contains heavy outputs at least 2/3 of the time (on average) with a high confidence level, typically higher than 97.5%. It can be shown that the heavy output probability for an ideal device is at around 0.85 asymptotically. The quantum volume is then defined as

\log_{2} V_{q} = \underset{n}{argmax} min (n, d (n))

where $n \leq N$ is a number of qubits and $d (n)$ is the achievable depth, i.e. the largest depth such that we are confident the probability of observing a heavy output is greater than 2/3.

from iqm.benchmarks.quantum_volume.quantum_volume import QuantumVolumeConfiguration, QuantumVolumeBenchmark

We define a combination of qubits to test quantum volume on. Due to the star topology, the combinations must contain at least qubit #2 (see topmost graph).

QV = QuantumVolumeConfiguration(
    num_circuits=500, 
    shots=2**8,
    calset_id=None,
    num_sigmas=2,
    choose_qubits_routine="custom",
    custom_qubits_array=[[0,1,2], [0,2,3], [0,2,4], [1,2,3], [1,2,4]], 
    qiskit_optim_level=3,
    optimize_sqg=True,
    max_gates_per_batch=60_000,
    rem=True,
    mit_shots=1_000,
)

If you want to modify the settings above, please refer to the documentation here.

Warning: The following code cell may take few minutes to run since it will compute the benchmark on all the qubit layouts specified above.

benchmark_qv = QuantumVolumeBenchmark(backend, QV)
run_qv = benchmark_qv.run()

result_qv = benchmark_qv.analyze()
for v in result_qv.plots.values():
    display(v)

Circuit Layer Operations Per Second (CLOPS)#

CLOPS is a metric that estimates the speed at which a quantum computer can execute Quantum Volume (QV) layers of a quantum circuit. That is, the circuits to calculate this benchmark have the same structure as the ones used for QV. Here we follow the definition introduced in (Wack, 2021), but other versions of this benchmark exist.

CLOPS is measured by means of a quantum variational-like protocol, where templates of parametrized QV circuits are assigned random parameters, executed, and outcomes are used as a seed to assign new parameters and repeat the process. The ratio of number of templates ( $M$ ), parameter updates ( $K$ ), measurement shots ( $S$ ) and QV layers ( $\log_{2} QV$ ) with the time taken to run all, constitutes the CLOPS value:

CLOPS = M \times K \times S \times \log_{2} QV / total_time .

Notice that the total CLOPS time includes that of assignment of parameters, submission of circuits and retrieval of results.

from iqm.benchmarks.quantum_volume.clops import CLOPSConfiguration, CLOPSBenchmark, plot_times

CLOPS = CLOPSConfiguration(
    qubits=[0,1,2],
    num_circuits=100,
    num_updates=10, 
    num_shots=100, 
    calset_id=None,
    qiskit_optim_level=3,
    optimize_sqg=True,
    routing_method="sabre",
    physical_layout="fixed",
)

If you want to modify the settings above, please refer to the documentation here.

benchmark_clops = CLOPSBenchmark(backend, CLOPS)
run_clops = benchmark_clops.run()

result_clops = benchmark_clops.analyze()
fig_clops = plot_times(result_clops.dataset, result_clops.observations)
display(fig_clops[1])

Clifford Randomized Benchmarking#

The idea behind Clifford Randomized Benchmarking (CRB) is that under certain (simplified) types of noise, the average survival probability of an initial state $| 0 ⟩$ under random sequences of Clifford gates and a final sequence inverse will decay exponentially in the length of the sequences. This can be written as

⟨ 0 | C_{inv} C_{m} \dots C_{2} C_{1} | 0 ⟩ \sim A p^{m} + B,

where $C_{1}, C_{2}, \dots, C_{m}$ is the random sequences of Clifford gates, $C_{inv} = (C_{1} C_{2} \dots C_{m})^{- 1}$ , $0 \leq p \leq 1$ and $0 \leq A, B \leq 1$ are constants isolating the effects of state-preparation and measurement (SPAM) errors (Magesan,2012). From such decay, one can in turn infer the average fidelity of the corresponding Clifford group.

The main assumption we will make here is that the noise can be modeled as Markovian, time-stationary and gate-independent.

The theory of CRB under these approximations, and the fact that the multi-qubit Clifford group is a unitary 2-design (i.e., uniformly averaging with two pairs of $C$ , $C^{†}$ Clifford operators gives the same result as using fully random unitaries), ensures that the average fidelity of our gate set is given by

{\overset{―}{F}}_{CRB} = p + 2^{- n} (1 - p) .

CRB is not generally intended to work for $n > 2$ , both because of the scaling of the size of the $n$ -qubit Clifford group in $n$ , and because such gates have to eventually be transpiled to a native basis of 1Q and 2Q gates!

It is important to mention that the average Clifford fidelity is related to the average fidelity of IQM’s native gate set for single-qubit gates as (Barends, 2014)

{\overset{―}{F}}_{GATE} \approx 1 - \frac{1 - {\overset{―}{F}}_{CRB}}{1.875} .

This is because all the single-qubit Clifford gates can be decomposed using on average 1.875 gates from IQM’s native set. This formula shows that the value of ${\overset{―}{F}}_{GATE}$ will always be slightly higher than ${\overset{―}{F}}_{CRB}$ , so one must be careful when comparing with average fidelities reported in the specs of a QPU.

from iqm.benchmarks.randomized_benchmarking.clifford_rb.clifford_rb import CliffordRBConfiguration,CliffordRandomizedBenchmarking

CRB = CliffordRBConfiguration(
    qubits_array=[[0],[1],[2],[3],[4]],
    sequence_lengths=[2**(m+1)-1 for m in range(9)],
    num_circuit_samples=25,
    shots=2**8,
    calset_id=None,
    parallel_execution=False,
)

If you want to modify the settings above, please refer to the documentation here.

Warning: The following code cell may take few minutes to run since it will compute the average fidelities for all the qubits in the QPU (and we set parallel_execution=False).

benchmark_clifford_rb = CliffordRandomizedBenchmarking(backend, CRB)
run_clifford_rb = benchmark_clifford_rb.run()

result_clifford_rb = benchmark_clifford_rb.analyze()
for plot in result_clifford_rb.plots.values():
    display(plot)

Interleaved Randomized Benchmarking (IRB)#

Differently from the previous protocol, this benchmark aims at estimating the average fidelity of an individual quantum gate. This can be achieved interleaving random Clifford gates between the gate of interest. This method was introduced in Magesan, 2012, and just as CRB, it is robust with respect to SPAM errors.

The protocol runs two sets of sequences, one solely made up of random Clifford gates, as in CRB, and one made up of random Clifford sequences but interleaving the gate of interest among these (and compiling the corresponding sequence inverse). IRB then extracts the corresponding decay parameters (where we expect the decay rate for IRB to be smaller than the CRB one, because the sequence is longer), and the average fidelity of the gate we wish to characterize is then calculated with a simple formula using the two decay parameters.

from iqm.benchmarks.randomized_benchmarking.interleaved_rb.interleaved_rb import InterleavedRBConfiguration, InterleavedRandomizedBenchmarking

IRB_CZ = InterleavedRBConfiguration(
    qubits_array=[[0,2],[1,2],[2,3],[2,4]],
    sequence_lengths=[2**(m+1)-1 for m in range(7)],
    num_circuit_samples=25,
    shots=2**8,
    calset_id=None,
    parallel_execution=False,
    interleaved_gate = "CZGate",
    interleaved_gate_params = None,
    simultaneous_fit = ["amplitude", "offset"],
)

If you want to modify the settings above, please refer to the documentation here.

NB: Clifford RB is executed by default when running Interleaved RB!

Warning: The following code cells may take several minutes to run.

benchmark_irb_CZ = InterleavedRandomizedBenchmarking(backend, IRB_CZ)
run_irb_CZ = benchmark_irb_CZ.run()

result_irb_CZ = benchmark_irb_CZ.analyze()
for plot in result_irb_CZ.plots.values():
    display(plot)

Q-Score#

The Q-score measures the maximum number of qubits that can be used effectively to solve the MaxCut combinatorial optimization problem with the Quantum Approximate Optimization Algorithm - Martiel,2021

The graphs chosen for the benchmark are random Erdős-Rényi graphs with 50% edge-probability between nodes. The obtained cost of the solution, i.e. the average number of cut edges, must be above a certain threshold. Specifically, one has to find the cost of a graph to be above $β \geq 0.2$ on a scale where $β = 0$ corresponds to a random solution and $β = 1$ to an ideal solution.

from iqm.benchmarks.optimization.qscore import QScoreConfiguration, QScoreBenchmark

QSCORE = QScoreConfiguration(
    num_instances = 100,
    num_qaoa_layers= 1,
    shots = 1000,
    calset_id=None, 
    min_num_nodes = 2,
    max_num_nodes = 5,
    use_virtual_node = True,
    use_classically_optimized_angles = True,
    choose_qubits_routine = "custom",
    custom_qubits_array=[[2],
                    [2, 0],
                    [2, 0, 1],
                    [2, 0, 1, 3],
                    [2, 0, 1, 3, 4]],
    seed = random.randint(1, 999999),
    REM = True,
    mit_shots = 1000,
    qpu_topology = "crystal",
    )

If you want to modify the settings above, please refer to the documentation here.

Warning: The following code cell may take several minutes to run.

benchmark_qscore = QScoreBenchmark(backend, QSCORE)
run_qscore = benchmark_qscore.run()

result_qscore = benchmark_qscore.analyze()

result_qscore.plot_all()

Summary#

Typical performance for IQM Spark is summarized in the table below and compared to the values obtained with your device. The typical single- and two-qubit gate fidelities reported below refer to the median over the 5 qubits and 4 couplings of the system, respectively.

import numpy as np

### GHZ
obs_ghz = result_ghz.observations
fidelity = round(min([obs_ghz[i].value for i in range(len(obs_ghz)) if obs_ghz[i].name=='fidelity']),2)

### QV
obs_qv = result_qv.observations
qv = max([obs_qv[i].value for i in range(len(obs_qv)) if obs_qv[i].name=='QV_result'])

### CLOPS
obs_clops = result_clops.observations
clops = max([obs_clops[item]['clops_v']['value'] for item in obs_clops])

### CRB
obs_crb = result_clifford_rb.observations
f_crb = round(np.median([obs_crb[i].value for i in range(len(obs_crb))]),3)

### IRB
obs_irb = result_irb_CZ.observations
f_irb = round(np.median([obs_irb[i].value for i in range(len(obs_irb)) if obs_irb[i].name=='avg_gate_fidelity_interleaved']),3)

### QS 
obs_qs = result_qscore.observations
qs = np.argmin([obs_qs[i].value-0.2 for i in range(len(obs_qs)) if obs_qs[i].name == 'mean_approximation_ratio' and obs_qs[i].value-0.2>0])+2


summary = {'5-qubit GHZ state fidelity': ['≥ 0.5', fidelity],
    'Quantum Volume': ['≥ 8', qv], 
    'CLOPS': ['3000', clops], 
    'Single-qubit gate fidelity': ['≥ 0.999', f_crb],
    'Two-qubit gate (CZ) fidelity': ['≥ 0.98', f_irb], 
    'Q-Score': ['≥ 5', qs] 
}

print("{:<30} {:<15} {:<15}".format('Benchmark', 'Typical', 'Your device'))
for k, v in summary.items():
    label, num = v
    print("{:<30} {:<15} {:<15}".format(k, label, num))

Benchmarking IQM Spark

Contents

Benchmarking IQM Spark#

Connect to the backend#

GHZ state fidelity#

Quantum Volume#

Circuit Layer Operations Per Second (CLOPS)#

Clifford Randomized Benchmarking#

Interleaved Randomized Benchmarking (IRB)#

Q-Score#

Summary#