### Model and simulations

Our quantum spins occupy the nodes of a square lattice of side *L*, endowed with PBCs. The coupling matrix *J*_{x,y} in equation (1) is non-vanishing only for nearest lattice neighbours. A problem instance, or sample, is defined by the choice of the *J*_{x,y} matrix. The non-vanishing matrix elements, *J*_{x,y} = *J*_{y,x} are random, independent variables in our simulations. Specifically, we chose *J*_{x,y} = ± *J*_{0} with 50% probability. We chose energy units such that *J*_{0} = 1.
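As an illustration, here is a minimal sketch (in Python; the function names are ours, not taken from the authors' code) of how one such ±*J*_{0} instance can be drawn and how the coupling energy of a spin configuration is evaluated on the periodic lattice:

```python
import numpy as np

def make_instance(L, rng):
    """Draw one +/-J0 sample on an L x L lattice with PBCs (J0 = 1).
    Each nearest-neighbour bond gets an independent random sign; storing
    each bond once (on its 'origin' site) keeps J_{x,y} = J_{y,x} implicit."""
    Jx = rng.choice([-1, 1], size=(L, L))  # bond (i, j) -> ((i+1) % L, j)
    Jy = rng.choice([-1, 1], size=(L, L))  # bond (i, j) -> (i, (j+1) % L)
    return Jx, Jy

def bond_energy(S, Jx, Jy):
    """Coupling energy -(1/2) sum_{x,y} J_{x,y} S_x S_y for one
    configuration S (entries +/-1); each bond is counted once."""
    return -(np.sum(Jx * S * np.roll(S, -1, axis=0)) +
             np.sum(Jy * S * np.roll(S, -1, axis=1)))

rng = np.random.default_rng(0)
Jx, Jy = make_instance(4, rng)
S = np.ones((4, 4))
E_ferro = bond_energy(S, Jx, Jy)  # equals minus the sum of all bond couplings
```

Storing one sign per bond (rather than a full *L*^{2} × *L*^{2} matrix) is the natural representation for a nearest-neighbour coupling matrix.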

Given an observable \({\mathcal{A}}\), we shall refer to its thermal expectation value in a given sample as \(\langle {\mathcal{A}}\rangle \) (see equation (8) below; the temperature is as low as possible, ideally *T* = 0). Thermal expectation values are averaged over the choice of couplings (quenched disorder; see, for example, ref. ^{25}). We shall denote the second average (over disorder) as \(\overline{\langle {\mathcal{A}}\rangle }\).

#### Crucial symmetries

The most prominent symmetries in this problem are the gauge and the parity symmetries. Both symmetries are exact for the Hamiltonian (equation (1)) and for its Trotter–Suzuki approximation (‘The Trotter–Suzuki formula’).

The parity symmetry \(P=\prod _{x}{\sigma }_{x}^{X}\) is a self-adjoint, unitary operator that commutes with the Hamiltonian (equation (1)), as well as with the exact (equation (7)) and approximate (equation (8)) transfer matrices. The Hilbert space is divided into two subspaces of the same dimension according to the parity eigenvalues, which are either +1 (even states) or −1 (odd states). We also classify operators as either even (\(P\,{\mathcal{A}}P={\mathcal{A}}\)) or odd (\(P\,{\mathcal{A}}P=-{\mathcal{A}}\)). Matrix elements of even operators are non-vanishing only if the two states have the same parity (in contrast, the parity of the states should differ for odd operators). An oversimplified but enlightening cartoon of the spectra in our problem is represented in Extended Data Fig. 1 (see below some exact-diagonalization results that support this view).

Parity symmetry is just a particular type of gauge transformation. Let us arbitrarily choose, for each site, *n*_{x} = 0 or 1. The corresponding gauge operator \({G}_{\{{n}_{x}\}}=\prod _{x}{({\sigma }_{x}^{X})}^{{n}_{x}}\) is self-adjoint and unitary. It transforms the Hamiltonian in equation (1) into a Hamiltonian of the same type but with modified couplings^{61}: \({J}_{x,y}\to {J}_{x,y}{(-1)}^{{n}_{x}+{n}_{y}}.\) The gauge symmetry is enforced by the process of taking the disorder average. Indeed, the gauge-transformed coupling matrix has the same probability as the original one. Hence, meaningful observables should be invariant under an arbitrary gauge transformation. The parity operator is obtained by setting *n*_{x} = 1 for all sites, which does not modify *J*_{x,y} (hence, parity is a symmetry for a given problem instance, not just a symmetry induced by the disorder average).

#### The Trotter–Suzuki formula

We follow the Trotter–Suzuki approximation^{62,63}, which replaces the original quantum spins on an *L* × *L* lattice by classical spins on an *L* × *L* × *L*_{τ} lattice, *S*_{x,τ} = ±1. The extra dimension *τ* is named Euclidean time. We shall write **S** as a shorthand for the *L*^{D}*L*_{τ} spins in the system (*D* = 2, here). Instead, **S**_{τ} will refer to the *L*^{D} spins at time *τ*. We take equal-strength couplings along the time and space directions (see, for example, ref. ^{5}). The probability of **S** is given by

$$p({\bf{S}})=\frac{{{\rm{e}}}^{-k{\mathcal{E}}({\bf{S}})}}{Z},\quad Z=\sum _{\{{\bf{S}}\}}{{\rm{e}}}^{-k{\mathcal{E}}({\bf{S}})},$$

(5)

where *Z* is the partition function, and

$$\begin{array}{l}{\mathcal{E}}({\bf{S}})\,=\,-\mathop{\sum }\limits_{\tau =0}^{{L}_{\tau }-1}\left[\frac{1}{2}\sum _{x,y}\,{J}_{x,y}{S}_{x,\tau }{S}_{y,\tau }+\sum _{x}{S}_{x,\tau }{S}_{x,\tau +1}\right],\\ \,\,\,\,\varGamma \,=\,-\frac{1}{2k}\log \,\tanh \,k.\end{array}$$

(6)

Although we have not done so, note that one could use a different coupling over Euclidean time, *k*_{τ}, which would relate *k* and *Γ* as \(\varGamma =-\frac{1}{2k}\log \,\tanh \,{k}_{\tau }\) or, equivalently, \({k}_{\tau }=-\frac{1}{2}\log \,\tanh (k\varGamma )\) (the continuum-time limit at *T* = 0 would be reached by taking *L*_{τ} → *∞* first and, afterwards, *k* → 0). PBCs are assumed over Euclidean time. Below, we shall find it useful to consider as well APBCs along only the *τ* direction. Besides, as the reader may check, *k* is a monotonically decreasing function of *Γ*.
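These relations are easy to verify numerically. The sketch below (ours, for illustration) checks that \({k}_{\tau }=-\frac{1}{2}\log \,\tanh (k\varGamma )\) reduces to *k*_{τ} = *k* when *Γ* takes the equal-coupling value of equation (6), and that *Γ* is indeed monotonic in *k*:

```python
import numpy as np

def gamma_of_k(k):
    """Transverse field from equation (6): Gamma = -(1/(2k)) log tanh k."""
    return -np.log(np.tanh(k)) / (2.0 * k)

def k_tau_of(k, gamma):
    """General (asymmetric-coupling) relation: k_tau = -(1/2) log tanh(k * Gamma)."""
    return -0.5 * np.log(np.tanh(k * gamma))

ks = np.linspace(0.05, 2.0, 200)
gammas = gamma_of_k(ks)              # monotonically decreasing in k
k = 0.31
k_tau = k_tau_of(k, gamma_of_k(k))   # recovers k_tau = k for equal couplings
```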

Possibly the most direct connection between the *D* + 1 classical spin system and the original quantum problem is provided by the transfer matrix \(\widetilde{{\mathcal{T}}}\) (refs. ^{64,65}). Let us define \({{\mathcal{H}}}_{0}=-\frac{1}{2}\sum _{x,y}{J}_{x,y}{\sigma }_{x}^{Z}{\sigma }_{y}^{Z}\) and \({{\mathcal{H}}}_{1}=-\varGamma \sum _{x}{\sigma }_{x}^{X}\). The quantum thermal expectation value at temperature *T* = 1/(*k**L*_{τ}) is

$$\langle \langle {\mathcal{A}}\rangle \rangle =\frac{{\rm{Tr}}{\mathcal{A}}\,{\widetilde{{\mathcal{T}}}}^{{L}_{\tau }}}{{\rm{Tr}}{\widetilde{{\mathcal{T}}}}^{{L}_{\tau }}},\quad \widetilde{{\mathcal{T}}}={e}^{-k({{\mathcal{H}}}_{0}+{{\mathcal{H}}}_{1})}.$$

(7)

Now, for \({\mathcal{A}}={A}_{{\rm{cl}}}(\{{\sigma }_{{\bf{x}}}^{Z}\})\), \({A}_{{\rm{cl}}}\) being an arbitrary function, the Trotter–Suzuki approximation amounts to substituting the true transfer matrix in equation (7) by its proxy \({\mathcal{T}}\) (\({\mathcal{T}}=\widetilde{{\mathcal{T}}}+{\mathcal{O}}({k}^{3})\)):

$$\langle {\mathcal{A}}\rangle =\frac{{\rm{Tr}}{\mathcal{A}}{{\mathcal{T}}}^{{L}_{\tau }}}{{\rm{Tr}}{{\mathcal{T}}}^{{L}_{\tau }}},\quad {\mathcal{T}}={e}^{-k{{\mathcal{H}}}_{0}/2}{e}^{-k{{\mathcal{H}}}_{1}}{e}^{-k{{\mathcal{H}}}_{0}/2}.$$

(8)

\(\langle {\mathcal{A}}\rangle \) can be computed as well by averaging *A*_{cl}(*S*_{τ}), evaluated over configurations distributed according to equation (5). (The value of *τ* is arbitrary; hence, one may gain statistics by averaging over *τ*).
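The same trick applies to equal-time correlations. A minimal sketch (ours; we store the configuration as an array of shape (*L*_{τ}, *N*) with *N* = *L*^{D} spins per time slice) of estimating \(\langle {\sigma }_{x}^{Z}{\sigma }_{y}^{Z}\rangle \) from a single configuration by averaging over *τ*:

```python
import numpy as np

def equal_time_corr(S):
    """Estimate <sigma_x^Z sigma_y^Z> from one configuration S of shape
    (L_tau, N) by averaging S[tau, x] * S[tau, y] over Euclidean time."""
    L_tau = S.shape[0]
    return S.T @ S / L_tau  # (N, N) matrix of time-averaged products

rng = np.random.default_rng(1)
S = rng.choice([-1.0, 1.0], size=(64, 16))  # L_tau = 64, N = 4 x 4 spins
M_est = equal_time_corr(S)
```

Note that unbiased estimates of *products* of thermal expectation values (as in equation (10) below) require statistically independent replicas, as discussed in 'Simulation details'; the single-configuration estimate above is only the basic building block.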

Finally, let us emphasize that both \({\mathcal{T}}\) and \(\widetilde{{\mathcal{T}}}\) are self-adjoint, positive-definite, transfer matrices that share the crucial symmetries discussed in ‘Crucial symmetries’.

### Observables

The quantities defined in ‘One-time observables’ were aimed at probing the ground state as *k* (and, hence, *Γ* (equation (6))) varies. These quantities will always be averaged over disorder before we proceed with the analysis.

Instead, the time correlations in ‘Two-times observables’ will probe the excitations. These time correlations will be analysed individually for each sample (sample-to-sample fluctuations are considered in ‘Fitting process and estimating the Euclidean correlation length’).

#### One-time observables

We consider the *L*^{D} × *L*^{D} correlation matrices *M* and \(\widehat{M}\) (refs. ^{45,46}) with **p** = (2π/*L*, 0) or (0, 2π/*L*):

$${M}_{x,y}=\langle {\sigma }_{x}^{Z}{\sigma }_{y}^{Z}\rangle ,\quad {[\widehat{M}]}_{x,y}={M}_{x,y}{e}^{\text{i}{\bf{p}}\cdot (x-y)}.$$

(9)

The *n*-body spin-glass susceptibilities at both zero and minimal momentum are

$${\chi }^{(n)}=\frac{\overline{{\rm{Tr}}\left[{M}^{n}\right]}}{{L}^{D}},\quad {F}^{(n)}=\frac{1}{{L}^{D}}\overline{{\rm{Tr}}\left[\widehat{M}{M}^{n-1}\right]}.$$

(10)

*χ*^{(n)} and *F*^{(n)} give us access to the second-moment correlation length (see, for example, ref. ^{48}):

$${\xi }^{(n)}=\frac{1}{2\sin ({\rm{\pi }}/L)}\sqrt{\frac{{\chi }^{(n)}}{{F}^{(n)}}-1}.$$

(11)

As *L* grows, *χ*^{(n)} and *ξ*^{(n)} remain of order 1 in the paramagnetic phase, whereas, in the critical region, they diverge as \({\chi }^{(n)}\propto {L}^{{\gamma }^{(n)}/\nu }\) and *ξ*^{(n)} ∝ *L*. In the spin-glass phase, *χ*^{(n)} ∝ *L*^{D(n−1)} (*ξ*^{(n)} ∝ *L*^{a} with some unknown exponent *a* > 1).

Our *χ*^{(n=2)} and *ξ*^{(n=2)} are just the standard quantities in the spin-glass literature^{66,67}. In fact, in the simplest approximation (see ref. ^{47} for a more detailed exposition), at criticality and for large separations *r* between *x* and *y*, *M*_{x,y} ∝ *v*_{x}*v*_{y}/*r*^{a} with *v*_{x}, *v*_{y} of order 1 (so, *γ*^{(n)}/*ν* = (*n* − 1)*D* − *n**a* in this approximation). Hence, if *D* > *a*, *γ*^{(n)} grows with *n*. Indeed, *n* = 3 turns out to be a good compromise between statistical errors, which grow with *n*, and a strong enough critical divergence of *χ*^{(n)} (*χ*^{(n=2)} barely diverges^{5}).
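Equations (9)–(11) translate directly into code. Below is a single-sample sketch (ours; in the text, \({\rm{Tr}}[{M}^{n}]\) and \({\rm{Tr}}[\widehat{M}{M}^{n-1}]\) are disorder-averaged before taking the ratio):

```python
import numpy as np

def second_moment_xi(M, L, n=3, D=2):
    """chi^(n), F^(n) and xi^(n) (equations (10) and (11)) for one sample,
    from its L**D x L**D correlation matrix M, at momentum p = (2*pi/L, 0)."""
    N = L**D
    coords = np.indices((L,) * D).reshape(D, N).T   # site coordinates x
    p = np.zeros(D)
    p[0] = 2.0 * np.pi / L
    phase = np.exp(1j * coords @ p)                 # e^{i p . x}
    M_hat = M * np.outer(phase, phase.conj())       # M_{x,y} e^{i p.(x-y)}
    chi = np.trace(np.linalg.matrix_power(M, n)).real / N
    F = np.trace(M_hat @ np.linalg.matrix_power(M, n - 1)).real / N
    # max(..., 0) guards against tiny negative round-off under the sqrt
    xi = np.sqrt(max(chi / F - 1.0, 0.0)) / (2.0 * np.sin(np.pi / L))
    return chi, F, xi
```

As a sanity check, an uncorrelated sample (*M* equal to the identity) gives *χ*^{(n)} = *F*^{(n)} = 1 and *ξ*^{(n)} = 0.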

Besides, we computed the Binder cumulant using *Q*_{2} = *L*^{D}*χ*^{(n=2)} as

$$B=\frac{{Q}_{4}}{{Q}_{2}^{2}},\quad {Q}_{4}=\sum _{x,y,z,u}\overline{{\langle {\sigma }_{x}^{Z}{\sigma }_{y}^{Z}{\sigma }_{z}^{Z}{\sigma }_{u}^{Z}\rangle }^{2}}.$$

(12)

The Gaussian nature of the fluctuations in the paramagnetic phase causes *B* to approach 3 as *L* grows for fixed *k* < *k*_{c}. *B* reaches different large-*L* limits for fixed *k* ≥ *k*_{c} (for *k* > *k*_{c}, different behaviours are possible, depending on the degree of replica symmetry breaking^{68}).

#### Two-times observables

Let us start by defining the time-correlation function of an observable \({\mathcal{A}}\) (for simplicity, consider a product of *σ*^{Z} operators at some sites):

$${C}_{{\mathcal{A}}}(\tau )=\frac{{\rm{Tr}}\,{\mathcal{A}}\,{{\mathcal{T}}}^{\tau }\,{\mathcal{A}}\,{{\mathcal{T}}}^{{L}_{\tau }-\tau }}{{\rm{Tr}}\,{{\mathcal{T}}}^{{L}_{\tau }}}.$$

(13)

\({C}_{{\mathcal{A}}}(\tau )\) can be computed from our spin configurations distributed according to the classical weight (equation (5)) by averaging \({\sum }_{{\tau }_{1}=0}^{{L}_{\tau }-1}{A}_{{\rm{cl}}}({{\bf{S}}}_{{\tau }_{1}}){A}_{{\rm{cl}}}({{\bf{S}}}_{{\tau }_{1}+\tau })/{L}_{\tau }\) (notation as in ‘The Trotter–Suzuki formula’; the time argument *τ*_{1} + *τ* is taken modulo *L*_{τ}).
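A minimal sketch of this translation-averaged estimator (ours), for a scalar series *a*[*τ*] = *A*_{cl}(**S**_{τ}) on the periodic time ring:

```python
import numpy as np

def time_corr(a):
    """Translation-averaged estimator of C_A(tau):
    (1/L_tau) * sum_{tau1} a[tau1] * a[(tau1 + tau) mod L_tau]."""
    L_tau = len(a)
    return np.array([np.mean(a * np.roll(a, -t)) for t in range(L_tau)])
```

By construction, the estimate is periodic and symmetric, *C*(*τ*) = *C*(*L*_{τ} − *τ*), which matches the \([{{\rm{e}}}^{-\tau /\eta }+{{\rm{e}}}^{-({L}_{\tau }-\tau )/\eta }]\) structure of each spectral contribution.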

Specifically, we have considered

$$C(\tau )=\frac{\sum _{x}{C}_{{\sigma }_{x}^{Z}}(\tau )}{{L}^{D}},\quad {Q}_{2}(\tau )=\frac{\sum _{x,y}{C}_{{\sigma }_{x}^{Z}{\sigma }_{y}^{Z}}(\tau )}{{L}^{2D}}.$$

(14)

Let us briefly recall some general results^{64,65} for \({C}_{{\mathcal{A}}}(\tau )\) that follow from the spectral decomposition of the transfer matrix (to simplify the notation, let us first disregard the parity symmetry and consider PBCs).

The *τ*-dependence is given by the additive contribution of every pair of states *E*_{n} < *E*_{m} (*n* = 0 is the ground state). Each pair generates an exponentially decaying term \({B}_{n,m}[{{\rm{e}}}^{-\tau /{\eta }_{n,m}}+{{\rm{e}}}^{-({L}_{\tau }-\tau )/{\eta }_{n,m}}]\) with correlation length *η*_{n,m} = 1/(*kΔ*_{n,m}), where *Δ*_{n,m} = *E*_{m} − *E*_{n}. The amplitude is \({B}_{n,m}={{\rm{e}}}^{-{L}_{\tau }/{\eta }_{0,n}}\,| \langle n| A| m\rangle {| }^{2}/\widehat{Z}\), with \(\widehat{Z}=1+\sum _{n\ > \ 0}{{\rm{e}}}^{-{L}_{\tau }/{\eta }_{0,n}}\). Hence, if *L*_{τ} ≫ *η*_{0,n}, the contribution of this pair of states can be neglected. Besides, in the presence of parity symmetry, for even \({\mathcal{A}}\) we find *B*_{n,m} = 0 if the parities of \(\left|n\right\rangle \) and \(\left|m\right\rangle \) differ (for odd operators, *B*_{n,m} = 0 if both parities are equal). This is why the largest correlation length for *Q*_{2}(*τ*) is the maximum of *η*_{e} and *η*_{o}, whereas the relevant correlation length for *C*(*τ*) is *η* (Extended Data Fig. 1).

Moreover, for even operators, every state \(\left|n\right\rangle \) provides an additive contribution to a *τ*-independent term (namely, the plateau in Extended Data Fig. 2): \({{\rm{e}}}^{-{L}_{\tau }/{\eta }_{0,n}}\,| \langle n| A| n\rangle {| }^{2}/\widehat{Z}\). Instead, for odd operators, \(| \langle n| A| n\rangle | =0\) (hence, odd operators lack a plateau). To manage a situation with APBCs, one just needs to add an extra parity operator as a final factor in both the numerator and the denominator of both equations (8) and (13). If parity is a symmetry, as is the case for our problem, \(\widehat{Z}\) is modified as \(\widehat{Z}=1+\sum _{n > 0}{p}_{n}{{\rm{e}}}^{-{L}_{\tau }/{\eta }_{0,n}}\) (*p*_{n} = ±1 is the parity of the state) and the contribution to the APBC plateau gets an extra factor *p*_{n}, as well.
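These spectral formulas can be checked against a direct evaluation of equation (13) on toy matrices (the sketch below uses a random symmetric 'Hamiltonian' and observable, not the actual transfer matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
dim, L_tau, k = 6, 12, 0.3
H = rng.normal(size=(dim, dim)); H = (H + H.T) / 2   # toy Hamiltonian
A = rng.normal(size=(dim, dim)); A = (A + A.T) / 2   # toy (real symmetric) observable
E, V = np.linalg.eigh(H)
E -= E[0]                                            # ground state at E = 0

def power_T(t):
    """T^t with T = e^{-kH}, built from the spectral decomposition."""
    return V @ np.diag(np.exp(-k * E * t)) @ V.T

def C_direct(tau):
    """Equation (13): Tr[A T^tau A T^(L_tau - tau)] / Tr[T^L_tau]."""
    return np.trace(A @ power_T(tau) @ A @ power_T(L_tau - tau)) \
        / np.trace(power_T(L_tau))

def C_spectral(tau):
    """Sum over pairs (n, m) of exp(-k E_n (L_tau - tau) - k E_m tau)
    |<n|A|m>|^2 / Z, i.e. decaying terms with eta_{n,m} = 1/(k Delta_{n,m})."""
    Amn = V.T @ A @ V                                # matrix elements <n|A|m>
    Z = np.sum(np.exp(-k * E * L_tau))
    boltz = np.exp(-k * np.add.outer(E * (L_tau - tau), E * tau))
    return np.sum(boltz * Amn**2) / Z
```

Both evaluations agree for every *τ*, confirming that each pair of states contributes one exponentially decaying term with the amplitude quoted above.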

One may wish to average \({C}_{{\mathcal{A}}}(\tau )\) over samples. The dominant time decay for a given sample will be approximately *B*e^{−τ/η}. Hence, the time decay of the averaged \(\overline{{C}_{{\mathcal{A}}}(\tau )}\) is an integral ∫d*B*d*η* *ρ*(*B*, *η*)*B*e^{−τ/η}, where *ρ*(*B*, *η*) is the corresponding probability density (over the samples). For simplicity, let us assume that fluctuations of the amplitude *B* are mild. Then, the scaling in equation (4) implies that, for large *L*_{τ}, the asymptotic time decay of \(\overline{{C}_{{\mathcal{A}}}(\tau )}\) is a function of the scaled time *τ*/*L*^{z}, where *z* is the dynamic exponent appropriate to the parity of *A*. One just needs to change the integration variable to *u* = *η*/*L*^{z} and recall the scaling form *ρ*(*η*) ≈ *f*(*η*/*L*^{z})/*L*^{z}, where *f* is a suitable scaling function.

#### The limit of zero temperature

We shall assume that we can reach *L*_{τ} large enough so that \({{\rm{e}}}^{-{L}_{\tau }/{\eta }_{{\rm{e}}}},\;{{\rm{e}}}^{-{L}_{\tau }/{\eta }_{{\rm{o}}}}\ll 1\) (the notation is explained in Extended Data Fig. 1). Moreover, we shall not assume that \({\epsilon }\equiv {{\rm{e}}}^{-{L}_{\tau }/\eta }\) is small (in fact, for some samples, one could even have *ϵ* ≈ 1).

Now, consider an even operator \({\mathcal{A}}\), and let us define \({{\mathcal{A}}}_{{\rm{e}}}=\langle {0}_{{\rm{e}}}| {\mathcal{A}}| {0}_{{\rm{e}}}\rangle \) and \({{\mathcal{A}}}_{{\rm{o}}}=\langle {0}_{{\rm{o}}}| {\mathcal{A}}| {0}_{{\rm{o}}}\rangle \) (the thermal expectation value at exactly *T* = 0 is \({{\mathcal{A}}}_{{\rm{e}}}\)). The plateau at *τ* ≫ *η*_{e}, *η*_{o} (Extended Data Fig. 2) is

$${C}_{{\mathcal{A}}}(\tau \gg {\eta }_{{\rm{e}}},{\eta }_{{\rm{o}}})={{\mathcal{A}}}_{{\rm{e}}}^{2}+[{{\mathcal{A}}}_{{\rm{o}}}^{2}-{{\mathcal{A}}}_{{\rm{e}}}^{2}]\frac{\zeta {\epsilon }}{1+\zeta {\epsilon }},$$

(15)

where *ζ* = 1 for PBCs and *ζ* = −1 for APBCs. Thus, we get for the plateau of *Q*_{2}(*τ*)

$${Q}_{2}(\tau \gg {\eta }_{{\rm{e}}},{\eta }_{{\rm{o}}})={Q}_{2,{\rm{e}}}+[{Q}_{2,{\rm{o}}}-{Q}_{2,{\rm{e}}}]\frac{\zeta {\epsilon }}{1+\zeta {\epsilon }},$$

(16)

where *Q*_{2,e} and *Q*_{2,o} are, respectively, the average over all pairs (*x*, *y*) of \({{\mathcal{A}}}_{{\rm{e}}}\) and \({{\mathcal{A}}}_{{\rm{o}}}\) (\({\mathcal{A}}={\sigma }_{x}^{Z}{\sigma }_{y}^{Z}\); recall equation (14)). Let us give a few hints about the derivation of equations (15) and (16). The contribution of state \(\left|n\right\rangle \) to the plateau is \({{\rm{e}}}^{-{L}_{\tau }/{\eta }_{0,n}}\,| \langle n| A| n\rangle {| }^{2}{p}_{n}/\widehat{Z}\) where *p*_{n} = 1 for PBCs whereas, for APBCs, *p*_{n} = 1 for even states and *p*_{n} = −1 for odd states. As explained before, we just keep the states \(\left|{0}_{{\rm{e}}}\right\rangle \) and \(\left|{0}_{{\rm{o}}}\right\rangle \) when estimating the plateau.

To excellent numerical accuracy, the left-hand side of equation (16) is also the value one gets for \({\rm{Tr}}{M}^{2}/{L}^{2D}\) (Extended Data Fig. 2). In fact, the difference between \({\langle {\mathcal{A}}\rangle }^{2}\) and its plateau is \(\zeta {\epsilon }{({{\mathcal{A}}}_{{\rm{e}}}-{{\mathcal{A}}}_{{\rm{o}}})}^{2}/{(1+\zeta {\epsilon })}^{2}\) (hence, quadratic in \(({{\mathcal{A}}}_{{\rm{e}}}-{{\mathcal{A}}}_{{\rm{o}}})\) rather than linear, as in equation (15)).

Now, despite their simplicity, two important consequences follow from equations (15) and (16).

First, the limit *T* → 0 (or *L*_{τ} → *∞*) is approached monotonically. Furthermore, the systems with PBCs and APBCs (Extended Data Fig. 2) approach the limit from opposite sides. We have explicitly checked all our samples, finding no instance where the APBC plateau lies above the PBC one (it is intuitively natural to expect that the PBC system will be more ordered than the APBC one). Hence, we conclude that the samples with PBCs converge to *T* → 0 from above, whereas the APBC ones converge from below.

Second, as 0 < *Q*_{2}(*τ*) < 1 also for APBCs, one has \(0 < {Q}_{2}^{{\rm{APBC}}}(\tau \gg {\eta }_{{\rm{e}}},{\eta }_{{\rm{o}}})\)\( < {Q}_{2,{\rm{e}}} < 1\). Thus, \(| {Q}_{2}^{{\rm{APBC}}}(\tau \gg {\eta }_{{\rm{e}}},{\eta }_{{\rm{o}}})-{Q}_{2,{\rm{e}}}| < 1\), and we conclude that ∣*Q*_{2,o} − *Q*_{2,e}∣ < (1 − *ϵ*)/*ϵ*. Hence, quite paradoxically, the particularly difficult samples with *ϵ* ≈ 1 generate a very small finite-temperature bias in the PBC estimator (compare the *L*_{τ} dependence of the PBC and the APBC plateaus in Extended Data Fig. 2). This is why we are able to reach the *T* → 0 limit for the even operators, even if a fraction of our samples suffers from a large value of *ϵ*.
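The algebra behind equations (15) and (16) is short enough to verify numerically. Keeping only \(\left|{0}_{{\rm{e}}}\right\rangle \) and \(\left|{0}_{{\rm{o}}}\right\rangle \), the plateau is \(({{\mathcal{A}}}_{{\rm{e}}}^{2}+\zeta {\epsilon }{{\mathcal{A}}}_{{\rm{o}}}^{2})/(1+\zeta {\epsilon })\); the sketch below (ours) checks this against the form quoted in equation (15), together with the opposite-sides property of the PBC and APBC plateaus:

```python
import numpy as np

def plateau_two_states(Ae, Ao, eps, zeta):
    """Plateau keeping only the two lowest states: weight 1 for |0_e> and
    zeta * eps for |0_o> (zeta = +1 for PBCs, -1 for APBCs)."""
    return (Ae**2 + zeta * eps * Ao**2) / (1.0 + zeta * eps)

def plateau_eq15(Ae, Ao, eps, zeta):
    """The same quantity, rearranged as in equation (15)."""
    return Ae**2 + (Ao**2 - Ae**2) * zeta * eps / (1.0 + zeta * eps)

Ae, Ao, eps = 0.6, 0.9, 0.3
pbc = plateau_two_states(Ae, Ao, eps, +1)
apbc = plateau_two_states(Ae, Ao, eps, -1)
# pbc - Ae**2 and apbc - Ae**2 have opposite signs: the two boundary
# conditions approach the T -> 0 value Ae**2 from opposite sides.
```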

#### Simulation details

We followed two approaches: exact diagonalization of the transfer matrix (equation (8)) and Markov chain Monte Carlo simulations of the classical weight (equation (5)). GPUs were crucial for both. We provide here only the main details (the interested reader is referred to ref. ^{69}).

Exact diagonalization is limited to small systems (up to *L* = 6 in our case). Indeed, the transfer matrix has a size \({2}^{{L}^{2}}\times {2}^{{L}^{2}}\). Parity symmetry has allowed us to represent \({\mathcal{T}}\) as a direct sum of two submatrices of half that size^{69}. Specifically, we computed the eigenvalues \({{\rm{e}}}^{-k{E}_{0,{\rm{e}}}}\), \({{\rm{e}}}^{-k{E}_{1,{\rm{e}}}}\), \({{\rm{e}}}^{-k{E}_{0,{\rm{o}}}}\) and \({{\rm{e}}}^{-k{E}_{1,{\rm{o}}}}\), as well as the corresponding eigenvectors \(| {0}_{{\rm{e}}}\rangle \), \(| {0}_{{\rm{o}}}\rangle \), \(\left|{1}_{{\rm{e}}}\right\rangle \) and \(\left|{1}_{{\rm{o}}}\right\rangle \), for 1,280 samples of *L* = 6 at *k* = 0.31 and 0.305 (the same samples at both *k* values). We repeated the calculations for a subset of 350 samples at *k* = 0.3 and 0.295. We managed to keep the computing time within an acceptable time frame of 20 min per diagonalization using 256 GPUs, thanks to a highly tuned custom matrix-vector product^{69}. These computations have proven to be invaluable in the process of taking the limit *L*_{τ} → *∞* (‘More about exact diagonalization’ in Methods).
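A tiny-scale illustration of the construction (ours; *N* = 4 spins on a toy bond set, nothing like the authors' optimized GPU code): build \({{\mathcal{H}}}_{0}\) and \({{\mathcal{H}}}_{1}\) from Pauli matrices, form the symmetric Trotter transfer matrix of equation (8) and check that it commutes with the parity operator \(P=\prod _{x}{\sigma }_{x}^{X}\), so that it block-diagonalizes into the two parity sectors:

```python
import numpy as np
from functools import reduce

sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])

def site_op(single, site, N):
    """Operator acting as `single` on `site` and as identity elsewhere."""
    mats = [np.eye(2)] * N
    mats[site] = single
    return reduce(np.kron, mats)

def expm_sym(M):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.exp(w)) @ V.T

N, k = 4, 0.31
rng = np.random.default_rng(3)
bonds = [(i, (i + 1) % N) for i in range(N)]          # toy +/-1 bond set
J = {b: rng.choice([-1.0, 1.0]) for b in bonds}
H0 = -sum(J[b] * site_op(sz, b[0], N) @ site_op(sz, b[1], N) for b in bonds)
Gamma = -np.log(np.tanh(k)) / (2 * k)                 # equation (6)
H1 = -Gamma * sum(site_op(sx, i, N) for i in range(N))

# Equation (8): T = e^{-k H0/2} e^{-k H1} e^{-k H0/2}
T = expm_sym(-k * H0 / 2) @ expm_sym(-k * H1) @ expm_sym(-k * H0 / 2)
P = reduce(np.kron, [sx] * N)                         # parity operator
```

Since [*T*, *P*] = 0 and *T* is symmetric and positive definite, the projectors (1 ± *P*)/2 split it into two blocks of half the dimension, which is the reduction exploited in ref. ^{69}.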

Our Monte Carlo simulations used the parallel tempering algorithm^{70}, implemented over the *k* parameter in equation (5), to ensure equilibration. We equilibrated 1,280 samples for each lattice size (Extended Data Table 1). As a rule, we estimated errors using the bootstrap method^{71}, as applied to the disorder average.

We simulated six real replicas of every sample (six statistically independent simulations of the system), for several reasons. Replicas allowed us to implement the equilibration tests based on the tempering dynamics^{56}. They also provided unbiased estimators of products of thermal averages (equation (10)). Finally, fluctuations between replicas allowed us to estimate errors for the time-correlation functions (equation (14)), as computed in a single sample (‘Fitting process and estimating the Euclidean correlation length’).

The Monte Carlo code exploits a three-level parallelization (multispin coding, domain decomposition and parallel tempering), which kept the spin-update time below 0.5 ps (ref. ^{69}), competitive with dedicated hardware^{22}.

### More about exact diagonalization

The schematic representation of the spectrum in Extended Data Fig. 1 is based on the distribution functions in Extended Data Fig. 3 (we typically compute the inverse distribution function; see ‘Fitting process and estimating the Euclidean correlation length’ for details).

Indeed, the correlation length *η* displays very large sample-to-sample fluctuations (to the point that a logarithmic representation is advisable) and a very strong *k*-dependence (Extended Data Fig. 3, left). In contrast, *η*_{e} is always a number of order one in our *L* = 6 samples (Extended Data Fig. 3, middle). Furthermore, *η*_{o}/*η*_{e} ≈ 1 in all cases (Extended Data Fig. 3, right).

In fact, the distribution for *η* is a Lévy flight (Fig. 4c; for large *η*, \(F(\eta )=1-B/{\eta }^{b}\)). The mechanism allowing exponent *b* to vary with *k* (hence, with transverse field (equation (6))) is sketched in Extended Data Fig. 4. Let us compare the value of *η* for the same sample at *k*_{1} and *k*_{2} (*k*_{1} < *k*_{2}). With great accuracy, \(\eta ({k}_{2})=\alpha {[\eta ({k}_{1})]}^{1+\beta }\), where *α* and *β* are constants (for fixed *k*_{1} and *k*_{2}) and *β* > 0. Thus, ordering samples according to their *η*(*k*_{1}) is the same as ordering by *η*(*k*_{2}), because one is a monotonically increasing function of the other. Hence, the same sample occupies percentile *F* in the distribution for both *k*_{1} and *k*_{2}. It follows that *b*(*k*_{2}) = *b*(*k*_{1})/(1 + *β*) for the exponent characterizing the Lévy flight. In other words, because *b*(*k*_{2}) < *b*(*k*_{1}), the tail at large *η* becomes heavier as *k* increases (see ‘On the magnetic susceptibilities’ for an extended discussion).
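The percentile-matching argument can be illustrated with synthetic samples (a sketch with made-up parameters *b* = 3 and *β* = 0.5; the Hill-type tail estimator is our choice, not the authors'):

```python
import numpy as np

rng = np.random.default_rng(4)
b1, beta, alpha = 3.0, 0.5, 1.0
# Pareto draw: F(eta) = 1 - eta^{-b1} for eta >= 1
eta_k1 = (1.0 - rng.random(200_000)) ** (-1.0 / b1)
# The same samples at a larger k: eta(k2) = alpha * eta(k1)^(1 + beta)
eta_k2 = alpha * eta_k1 ** (1.0 + beta)

def tail_exponent(eta, eta_min=10.0):
    """Hill estimator of b in F(eta) = 1 - B / eta^b, from the tail eta >= eta_min."""
    x = eta[eta >= eta_min]
    return x.size / np.sum(np.log(x / eta_min))

b2_est = tail_exponent(eta_k2)   # prediction: b1 / (1 + beta) = 2
```

The fitted tail exponent of the transformed samples is indeed *b*_{1}/(1 + *β*): the monotonic map leaves percentiles invariant while stretching the tail.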

### The critical point and critical exponents

After taking care of the *L*_{τ} → *∞* limit (within errors) in our study of the phase transition, we still need to cope with the finite spatial dimension *L*. We shall do so using finite-size scaling^{72,73,74,75} (Fig. 1c). The main questions we shall address are the computation of the critical exponents and the estimation of the critical point. Our main tool will be the quotients method^{48,60,76}, which, surprisingly, keeps our two questions somewhat separate.

The quotients method starts by comparing a dimensionless quantity at two sizes *L*_{a} < *L*_{b} (in our case, *ξ*^{(3)}/*L* as a function of *k*). First, we located a coupling *k*^{*}(*L*_{a}, *L*_{b}) such that the curves for *L*_{a} and *L*_{b} cross (Fig. 1b). Now, for dimensionful quantities *A*, scaling in the thermodynamic limit as \({\xi }^{{x}_{A}/\nu }\), we consider the quotient \({Q}_{A}={A}_{{L}_{a}}/{A}_{{L}_{b}}\) at *k*^{*}(*L*_{a}, *L*_{b}). Barring scaling corrections, \({Q}_{A}={({L}_{a}/{L}_{b})}^{{x}_{A}/\nu }\), which yields an effective estimate of *x*_{A}/*ν*. Indeed, considering only the leading correction to the scaling exponent *ω*, we have for the effective exponent:

$${\left.\frac{{x}_{A}}{\nu }\right|}_{{L}_{a},{L}_{b}}=\frac{{x}_{A}}{\nu }\,+\,\frac{1}{\log ({L}_{b}/{L}_{a})}\log \frac{1+{D}_{A}{L}_{b}^{-\omega }}{1+{D}_{A}{L}_{a}^{-\omega }},$$

(17)

where *D*_{A} is an amplitude. Our estimates for the effective exponents can be found in Extended Data Table 2. Yet, effective exponents need to be extrapolated to the thermodynamic limit through equation (17). Unfortunately, we have not been able to estimate the exponent *ω*, owing to two difficulties. First, the range of *L* values at our disposal was small. Second, the analytic background^{48} for the *n* = 2 observables and for the Binder parameter (‘One-time observables’) competes with the *L*^{−ω} corrections. Hence, we followed an alternative strategy. We fitted our effective exponents to equation (17) with fixed *ω* (the fitting parameters were the extrapolated *x*_{A}/*ν* and the amplitude *D*_{A}). To account for our ignorance about *ω*, we varied it in the wide range 0.5 ≤ *ω* ≤ 2. The central values in equations (2) and (3) were obtained with *ω* = 1: the first error estimate is the statistical error as computed for *ω* = 1, whereas the second error estimate accounts for the *ω*-dependence of *x*_{A}/*ν* (namely, the semi-difference between the extrapolations to infinite size obtained with *ω* = 2.0 and *ω* = 0.5). To take into account the data correlation, we employed a bootstrap method^{77}. We considered only the diagonal part of the covariance matrix in the fits and performed a new fit for every bootstrap realization. Errors were computed from the fluctuations of the fitting parameters. Fortunately, the systematic errors turned out to be comparable to the statistical ones (for 1/*ν*, smaller).
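The quotients step itself is a one-liner. The sketch below (synthetic data with made-up exponents) also checks the correction term of equation (17):

```python
import numpy as np

def effective_exponent(A_a, A_b, La, Lb):
    """Quotients method: Q_A = A(La)/A(Lb) = (La/Lb)^(x_A/nu) at the
    crossing point, barring scaling corrections."""
    return np.log(A_a / A_b) / np.log(La / Lb)

x_over_nu, DA, omega = 1.7, 0.4, 1.0
A = lambda L: 2.3 * L**x_over_nu * (1.0 + DA * L**(-omega))  # with corrections
La, Lb = 8, 16
est = effective_exponent(A(La), A(Lb), La, Lb)
# Equation (17): x/nu + log[(1 + DA Lb^-w)/(1 + DA La^-w)] / log(Lb/La)
pred = x_over_nu + np.log((1 + DA * Lb**-omega) / (1 + DA * La**-omega)) / np.log(Lb / La)
```

Without the *D*_{A}*L*^{−ω} correction, the effective exponent recovers *x*_{A}/*ν* exactly; with it, the bias is exactly the logarithmic term in equation (17).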

As for the critical point, we expected scaling corrections of the form *k**(*L*_{a}, *L*_{b}) = *k*_{c} + *D*_{k}*F*(*L*_{a}, *L*_{b}), where *D*_{k} is an amplitude^{78}:

$$F({L}_{a},{L}_{b})={L}_{a}^{-\left(\omega +\frac{1}{\nu }\right)}\frac{1-{s}^{-\omega }}{{s}^{1/\nu }-1},\quad s=\frac{{L}_{b}}{{L}_{a}}\,.$$

(18)

Unfortunately, this result is not of much use without an estimate of *ω*. Fortunately, as Extended Data Table 2 and Extended Data Fig. 5 show, the values of *k**(*L*_{a}, *L*_{b}) obtained from *ξ*^{(3)}/*L* seem not to depend on size. In fact, our estimate for *k*_{c} in equation (2) is an interval that encompasses all our results (the shaded area in Extended Data Fig. 5). Furthermore, the crossing points for *B* and *ξ*^{(2)}/*L* (Extended Data Fig. 5) also seem reasonably well represented by equation (18).

### Fitting process and estimating the Euclidean correlation length

Our aim in this section is to determine the relevant correlation lengths for *C*(*τ*) and *Q*_{2}(*τ*) at a fixed *k*, for our *N*_{S} = 1,280 samples. The results are characterized through their empirical distribution function (Figs. 3 and 4). Given that *N*_{S} is large, we needed an automated approach.

The first step was estimating, for a given sample, *C*(*τ*) and *Q*_{2}(*τ*), as well as their standard errors, by using our six replicas. Now, the analysis of a noisy correlation function (such as *C*(*τ*) and *Q*_{2}(*τ*); see, for example, Extended Data Fig. 2) needs a fitting window^{79,80}. We chose the window upper limit as \({\tau }_{{\rm{w}},f}\equiv \mathop{\min }\limits_{\tau }\{\tau | \,f(\tau )=3.5{\sigma }_{f(\tau )}\}\), with *f*(*τ*) either *C*(*τ*) or *Q*_{2,s}(*τ*) = *Q*_{2}(*τ*) − *Q*_{2,pl}, where *Q*_{2,pl} is the plateau (Extended Data Fig. 2) and *σ*_{f(τ)} is the corresponding standard error. We faced two problems. First, for the odd correlator *C*(*τ*), some samples have *τ*_{w,C} ≥ *L*_{τ}/2. For these samples, *η* > *L*_{τ} and, hence, *η* could not be estimated (Fig. 4b). *Q*_{2,s}(*τ*) was not affected by this problem (Fig. 3). Second, we needed to estimate the plateau *Q*_{2,pl}. To do so, we fitted *Q*_{2}(*τ*) for *τ* ∈ [*L*_{τ}/4, *L*_{τ}/2] to a constant *Q*_{2,pl}. In the few exceptions where this fit was not acceptable (as determined by its figure of merit, *χ*^{2} per degree of freedom (dof), computed with the diagonal part of the covariance matrix), we proceeded as explained below (we used \({\tau }_{{\rm{w}},{Q}_{2}}={L}_{\tau }/2\) in those cases).

We determined the correlation lengths through fits to \(C(\tau )\,=\)\(B[{\text{e}}^{-\tau /\eta }+{\text{e}}^{(\tau -{L}_{\tau })/\eta }]\) and \({Q}_{2}(\tau )={Q}_{2,\text{pl}}+{B}_{\text{e}}{e}^{-\tau /{\eta }_{\text{e}}}+{B}_{\text{o}}{e}^{-\tau /{\eta }_{\text{o}}}\). The fitting parameters were the amplitudes and the correlation lengths (and, for the above-mentioned exceptional samples, also *Q*_{2,pl}). Actually, for *Q*_{2}(*τ*) we considered fits with one and with two exponential terms, keeping the fit with the smallest *χ*^{2}/dof, because we could not determine which of the two correlation lengths obtained in the fit corresponded to the even gap (Extended Data Fig. 1). Hereafter, *η*_{e} is the larger of the two. As for the lower limit of the fitting window, we started from \({\tau }_{\min ,{Q}_{2}}=1\) and \({\tau }_{\min ,C}={\tau }_{{\rm{w}},C}/10\), and we kept increasing the corresponding \({\tau }_{\min }\) until *χ*^{2}/dof fell below 0.5 for *Q*_{2}(*τ*) (below 1 for *C*(*τ*)).
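A sketch of the fitting step on synthetic data (ours; it assumes SciPy's `curve_fit` is available):

```python
import numpy as np
from scipy.optimize import curve_fit

L_tau = 128

def c_model(tau, B, eta):
    """C(tau) = B [exp(-tau/eta) + exp((tau - L_tau)/eta)], PBCs in time."""
    return B * (np.exp(-tau / eta) + np.exp((tau - L_tau) / eta))

rng = np.random.default_rng(5)
tau = np.arange(1.0, L_tau / 2)
y = c_model(tau, 0.8, 9.0) + 1e-4 * rng.normal(size=tau.size)  # mock data
popt, pcov = curve_fit(c_model, tau, y, p0=(1.0, 5.0))
B_fit, eta_fit = popt   # should recover B ~ 0.8, eta ~ 9
```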

Finally, we determined the empirical distribution function for the correlation lengths. Let *X* be either \(\log \eta \) or *η*_{e} (see below for some subtleties regarding *η*). We actually computed the inverse function *X*[*F*] by sorting the *N*_{S} values of *X* in increasing order and setting \(X[F=i/{N}_{{\rm{S}}}]\) as the *i*th item in the ordered list. We obtained *X*[*F*] at the value of *k* of interest through linear interpolation of *X*[*F*] computed at the two nearest values of *k* in the parallel tempering grid. To estimate the errors in *X*[*F*], we employed a bootstrap method with 10,000 resamplings. In each resampling, we randomly picked *N*_{S} samples with replacement and, for each chosen sample, drew *X* from a normal distribution centred on the fitted value, with standard deviation given by the fitting error for *X*.

For \(X=\log \eta \), we needed to cope with the problem that we could determine *X* for only *N*_{OK} of our *N*_{S} samples. We decided to determine *X*[*F*] only up to \({F}_{{\rm{safe}}}\equiv ({N}_{{\rm{OK}}}-4\sqrt{({N}_{{\rm{S}}}-{N}_{{\rm{OK}}}){N}_{{\rm{OK}}}/{N}_{{\rm{S}}}})/{N}_{{\rm{S}}}\) (the maximum possible *F* minus four standard deviations). We imposed for every bootstrap resampling that *X* could be obtained in at least *F*_{safe}*N*_{S} samples (this limitation was irrelevant in practice).
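The empirical inverse distribution function and its bootstrap error can be sketched as follows (our own minimal version; the per-sample fit errors *σ*_{X} are fed in as an array):

```python
import numpy as np

def inverse_edf(X):
    """X[F = i/N_S] is the i-th smallest of the N_S values (i = 1 .. N_S)."""
    return np.sort(np.asarray(X))

def quantile_error(X, sigma_X, F, n_boot=1000, seed=6):
    """Bootstrap error on X[F]: resample N_S values with replacement and
    jitter each by a normal draw with its own fitting error sigma_X."""
    rng = np.random.default_rng(seed)
    X, sigma_X = np.asarray(X), np.asarray(sigma_X)
    N_S = X.size
    i = int(F * N_S) - 1                  # index of the F-quantile
    boot = np.empty(n_boot)
    for b in range(n_boot):
        pick = rng.integers(0, N_S, size=N_S)
        boot[b] = np.sort(rng.normal(X[pick], sigma_X[pick]))[i]
    return boot.std()
```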

Let us conclude by mentioning that the estimates in equation (4) were obtained through a joint fit for *η*_{e}[*F*], with *F* = 0.5, 0.6, 0.7, 0.8 or 0.9. Errors were estimated as explained in ‘The critical point and critical exponents’.

### On the magnetic susceptibilities

The sample-averaged linear susceptibility to an external magnetic field at *T* = 0, \({\chi }_{{\rm{lin}}}^{(h)}\), can diverge only if \(\overline{C(\tau )}\) decays slowly for large *τ* (because \({\chi }_{{\rm{lin}}}^{(h)}=1+2{\sum }_{\tau =1}^{\infty }\overline{C(\tau )}\); Extended Data Fig. 6). Yet, the periodicity induced by the PBCs (Extended Data Fig. 6a) made it difficult to study the behaviour at large *τ*. Fortunately, representing \(\overline{C(\tau )}\) as a function of \(\widetilde{\tau }=\frac{{L}_{\tau }}{{\rm{\pi }}}\sin ({\rm{\pi }}\tau /{L}_{\tau })=\tau [1+{\mathcal{O}}({\tau }^{2}/{L}_{\tau }^{2})]\) greatly alleviated this problem (Extended Data Fig. 6b). Thus armed, we could study the long-time decay \(\overline{C(\tau )}\propto 1/{\widetilde{\tau }}^{\widetilde{b}}\) as a function of *k* (Extended Data Fig. 6c). Indeed, \(\widetilde{b}\) decreased as *k* increased. As *C*(*τ*) ≈ *B*e^{−τ/η} for any single sample, the mechanism discussed in ‘More about exact diagonalization’ in Methods is clearly at play: the heavy tail of *F*(*η*) became heavier as *k* increased, which resulted in a decreasing exponent \(\widetilde{b}\). In fact, the critical value \(\widetilde{b}=1\) was reached at *k* ≈ 0.285, well inside the paramagnetic phase (\({\chi }_{{\rm{lin}}}^{(h)}=\infty \) if \(\widetilde{b}\le 1\)).
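The variable change is a one-liner. The sketch below (synthetic power-law data, ours) shows how a log-log fit in \(\widetilde{\tau }\) recovers the decay exponent \(\widetilde{b}\) on a periodic time ring:

```python
import numpy as np

def tau_tilde(tau, L_tau):
    """tilde-tau = (L_tau/pi) sin(pi tau / L_tau) = tau [1 + O(tau^2/L_tau^2)]."""
    return (L_tau / np.pi) * np.sin(np.pi * tau / L_tau)

L_tau, b_true = 256, 1.4
tau = np.arange(1, L_tau)
C_bar = tau_tilde(tau, L_tau) ** (-b_true)   # synthetic C(tau) ~ 1/tilde-tau^b
# A straight line in log-log coordinates versus tilde-tau, over the whole ring:
slope, _ = np.polyfit(np.log(tau_tilde(tau, L_tau)), np.log(C_bar), 1)
b_est = -slope
```

Note that \(\widetilde{\tau }\) is symmetric under *τ* → *L*_{τ} − *τ*, which is precisely what folds the PBC-induced periodicity back onto a single decaying branch.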

The Lévy-flight perspective provides a simple explanation for the results in refs. ^{6,41}. In a single sample, the different susceptibilities to a magnetic field (linear, third-order and so on) are proportional to increasing powers of *η*. Hence, the existence of the disorder average of a given (generalized) susceptibility boils down to the existence of the corresponding moment of the distribution *F*(*η*). As soon as *F*(*η*) decays for large *η* as a power law, some disorder-averaged susceptibility will diverge (probably a higher-order one). Lower-order susceptibilities diverge at larger values of *k*. Hence, it is not advisable to use this approach to locate the critical point.