frontier.cliffordvolumebenchmark

Clifford volumetric benchmark implementation.

class CliffordVolumeBenchmark(number_of_qubits: int, sample_size: int = 10, **kwargs: Any)

Bases: Benchmark

Volumetric benchmark based on random Clifford operators.

For each sample, this benchmark:

  • Draws a random Clifford tableau on number_of_qubits qubits.

  • Converts the tableau into a base QuantumCircuit.

  • Randomly selects stabilizers and destabilizers.

  • Builds measurement circuits for each selected stabilizer/destabilizer.

  • Exports each measurement circuit as QASM plus an observable string.

The per-sample object returned by _create_single_sample() matches the JSON schema expected by Benchmark, for example:

{
  "sample_id": int,
  "sample_metadata": {...},
  "circuits": [
    {
      "circuit_id": str,
      "observable": str,    # e.g. "+XZI..."
      "qasm": str,
      "metadata": {...}
    },
    ...
  ]
}
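As an illustration of the schema above, a minimal sample object can be assembled like this. This is a hypothetical sketch: the helper name `make_sample`, the metadata contents, and the QASM payload are placeholders, not output of the real `_create_single_sample()`.

```python
# Hypothetical sketch: build a sample dict matching the schema above.
# Field names mirror the schema; observable/QASM values are placeholders.
def make_sample(sample_id: int, circuits):
    """circuits: iterable of (observable_str, qasm_str) pairs."""
    return {
        "sample_id": sample_id,
        "sample_metadata": {"number_of_qubits": 3},  # assumed metadata key
        "circuits": [
            {
                "circuit_id": f"{sample_id}-{i}",  # assumed ID convention
                "observable": obs,
                "qasm": qasm,
                "metadata": {},
            }
            for i, (obs, qasm) in enumerate(circuits)
        ],
    }
```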
number_of_measurements

Number of stabilizer/destabilizer measurement circuits per sample, computed by _compute_number_of_measurements().

compute_expectation_values() → Dict[str, float]

Compute expectation values for all circuits using experimental results.

This scans all circuits in the benchmark and computes expectation values for those with a non-None observable field, using Benchmark.expected_value().

Returns:

Mapping from circuit_id to expectation value.

Return type:

Dict[str, float]

Raises:

ValueError – If experimental results are missing, malformed, or inconsistent with the stored samples.
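The underlying estimate is standard: each measurement circuit rotates the observable into the computational basis, so the expectation value follows from the parity of each outcome on the observable's support. A minimal sketch (not the library's `Benchmark.expected_value()` implementation, and assuming `observable[i]` indexes the same qubit as `bitstring[i]`; frameworks with little-endian bit ordering would need the index reversed):

```python
from typing import Dict

def expectation_from_counts(counts: Dict[str, int], observable: str) -> float:
    """Estimate <P> from counts; observable like '+XZI' (sign prefix optional)."""
    sign = -1.0 if observable.startswith("-") else 1.0
    paulis = observable.lstrip("+-")
    # Qubits where the Pauli acts nontrivially contribute to the outcome parity.
    support = [i for i, p in enumerate(paulis) if p != "I"]
    shots = sum(counts.values())
    total = 0.0
    for bitstring, n in counts.items():
        parity = sum(int(bitstring[i]) for i in support) % 2
        total += n * (1.0 if parity == 0 else -1.0)  # even parity -> +1 eigenvalue
    return sign * total / shots
```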

evaluate_benchmark(*, auto_save: bool | None = None, save_to: str | Path | None = None) → Dict[str, Any]

Evaluate the Clifford benchmark using experimental results.

Implements the acceptance criteria from the manuscript.

Per-observable (worst-case):

(I) \(\langle S \rangle - 2\sigma \ge \tau_S\) and \(|\langle D \rangle| + 2\sigma \le \tau_D\)

Per-Clifford-instance averages:

(II) \(\mathrm{mean}(\langle S \rangle) - 5\bar{\sigma} \ge \tau_S\) and \(|\mathrm{mean}(\langle D \rangle)| + 5\bar{\sigma} \le \tau_D\)

Returns:

Structured dictionary containing computed statistics and pass/fail flags.

Return type:

dict
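Criteria (I) and (II) above can be sketched as plain predicates. This is a hypothetical illustration, not the library's implementation: the threshold names `tau_s` and `tau_d` and their default values are assumptions (the actual values come from the manuscript / benchmark configuration).

```python
def passes_per_observable(s_vals, d_vals, s_errs, d_errs,
                          tau_s=0.9, tau_d=0.1):
    """Criterion (I): worst case over individual observables."""
    ok_s = all(s - 2 * e >= tau_s for s, e in zip(s_vals, s_errs))
    ok_d = all(abs(d) + 2 * e <= tau_d for d, e in zip(d_vals, d_errs))
    return ok_s and ok_d

def passes_per_instance(s_vals, d_vals, s_errs, d_errs,
                        tau_s=0.9, tau_d=0.1):
    """Criterion (II): averages over one Clifford instance."""
    mean = lambda xs: sum(xs) / len(xs)
    ok_s = mean(s_vals) - 5 * mean(s_errs) >= tau_s
    ok_d = abs(mean(d_vals)) + 5 * mean(d_errs) <= tau_d
    return ok_s and ok_d
```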

get_all_expectation_value() → Dict[int, Dict[str, Dict[str, Tuple[float, float]]]]

Return expectation values and errors grouped by sample and kind.

The output is grouped first by sample_id, then by measurement kind ("stabilizer" or "destabilizer"), and finally by the Pauli observable string.

Uses stored "expectation_value" / "std_error" in experimental_results["results"] if available; otherwise computes them on the fly from counts and shots, and caches them.

Returns:

Nested mapping of the form:

{
  sample_id: {
    "stabilizer": {
        observable_str: (expectation_value, std_error),
        ...
    },
    "destabilizer": {
        observable_str: (expectation_value, std_error),
        ...
    },
  },
  ...
}

Return type:

Dict[int, Dict[str, Dict[str, Tuple[float, float]]]]

Raises:

ValueError – If experimental results or samples are missing, if required count entries are missing, or if shots is not a positive integer.
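The on-the-fly computation described above amounts to deriving the estimate and its standard error from counts and shots; for a ±1-valued estimator the standard error of the mean is \(\sigma = \sqrt{(1 - \langle P \rangle^2)/\text{shots}}\). A hedged sketch (helper name and signature are assumptions, not the library's API):

```python
import math

def expectation_and_std_error(plus_counts: int, shots: int):
    """Derive (<P>, sigma) from the number of +1-eigenvalue outcomes."""
    if not isinstance(shots, int) or shots <= 0:
        raise ValueError("shots must be a positive integer")
    e = (2 * plus_counts - shots) / shots          # <P> in [-1, 1]
    sigma = math.sqrt(max(0.0, 1.0 - e * e) / shots)  # std error of the mean
    return e, sigma
```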

plot_all_expectation_values() → None

Plot all stabilizer and destabilizer expectation values.

Plots expectation values (with standard error bars) across the entire benchmark, with separate markers for stabilizers and destabilizers and threshold guide lines for pass/fail criteria.

Requires evaluate_benchmark() to have been run so that self.experimental_results["evaluation"] is populated.

Raises:

ValueError – If experimental results or evaluation entries are missing, or if shots is not a positive integer.

plot_expectation_histograms(bins: int = 20) → None

Plot histograms of stabilizer and destabilizer expectation values.

This is useful for understanding the distribution / quality of the measured expectation values across the entire benchmark.

Requires evaluate_benchmark() to have been run so that self.experimental_results["evaluation"] is populated.

Parameters:

bins – Number of histogram bins to use.

Raises:

ValueError – If experimental results or evaluation entries are missing.

plot_expected_values(sample_id: int) → None

Plot stabilizer and destabilizer expectation values for a sample.

Generates two error-bar plots: one for stabilizer expectation values and one for destabilizer expectation values, for a given sample index.

Values are taken from get_all_expectation_value(), which computes or reuses cached expectation values and standard errors.

Parameters:

sample_id – Index of the benchmark sample to visualize.

Raises:

ValueError – If experimental results are missing, the benchmark has no samples, or the given sample ID is invalid or has missing data.