\documentclass[11pt]{article}

\usepackage{indentfirst}
\usepackage[utf8]{inputenc}
\usepackage{amsmath}

\DeclareMathOperator{\var}{var}

\begin{document}

\title{Stat 8931 Spin Glass Homework}
\author{Charles J. Geyer}
\maketitle

\section{Introduction}

\subsection{Model}

The \emph{Edwards-Anderson spin glass model} is a spatial lattice process
on a square lattice with nearest-neighbor edges.  It is like the Ising
model explained in the notes but with haphazard coupling constants.  Its
unnormalized density is
\begin{equation} \label{eq:spin-unnor-den}
   h(x) = \exp\left( \frac{1}{2 \tau} \sum_{(i, j) \in E}
   \beta_{i j} x_i x_j \right)
\end{equation}
where, as in the Ising model, the random variables $x_i$ take values in
$\{-1, +1\}$ and the edges $(i, j) \in E$ are nearest neighbors in the
lattice (each unordered pair appears twice in $E$, once in each order,
whence the factor $1 / 2$).

In the Edwards-Anderson model the $\beta_{i j}$ are themselves random
variables, taken to be IID standard normal.  However, we are uninterested
in the joint distribution of the $\beta_{i j}$ and the $x_i$.  We are
only interested in the conditional distribution of the $x_i$ given the
$\beta_{i j}$.  Thus we will always consider the $\beta_{i j}$ as fixed
known numbers.  (The only point of the Gaussian distribution here is to
give the $\beta_{i j}$ haphazard values.  Gaussianity itself is
uninteresting here.)

To finish the model specification we must specify boundary conditions,
which we take to be periodic.  The parameter $\tau$ plays the role of
temperature.

\subsection{Coupling Coefficients}

We need two matrices of coupling coefficients (coupling to the neighbor
to the right and coupling to the neighbor below).  We generate these as
follows.
<<generate-betas>>=
n <- 6
set.seed(42)
br <- matrix(rnorm(n^2), n, n)
bd <- matrix(rnorm(n^2), n, n)
@

So as not to rely on the R random number stream never changing, we write
these out to a file and read them back in.
<<read-betas>>=
foo <- try(scan("betas.txt"))
if (inherits(foo, "try-error")) {
    write(c(br, bd), file = "betas.txt")
    foo <- scan("betas.txt")
}
n <- sqrt(length(foo) / 2)
n
br <- matrix(foo[1:n^2], n, n)
bd <- matrix(foo[n^2 + 1:n^2], n, n)
br
bd
@
The tricky code using the R \verb@try@ function is how one does something
that may cause an error and respond to the error.  Here we write the file
only if it does not already exist (more precisely, if it does not already
exist or cannot be read without error).  If the first \verb@scan@ command
works, then we just use its result.

We will consider these betas fixed throughout the exercise.  The
temperature parameter $\tau$ will be variable.  We start with
<<set-tau>>=
tau <- 1
@

\subsection{Coding Sets}

Because we have negative coupling coefficients, the Swendsen-Wang
algorithm does not apply, and the only algorithms we know for updating
$x$ while preserving this equilibrium distribution are
variable-at-a-time Metropolis or Gibbs.

It is a well-known trick in lattice processes to update using
\emph{coding sets} (introduced by Julian Besag).  If we think of our
square lattice as colored like a chess board, we see that all neighbors
of white nodes are black and vice versa.  As mentioned in the notes, the
conditional distribution of one variable given the rest depends only on
the values of its neighbors.  Thus the white nodes are conditionally
independent given the black nodes and vice versa.  Hence a block Gibbs
(or block Metropolis) sampler that uses the coding sets as blocks can
make the block updates very efficient.

With periodic boundary conditions, we must have $n$, and hence $n^2$,
even in order for the coding sets to work properly.
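For later reference, the conditional distribution of one spin given the
rest, which is what the coding set updates described below simulate, can
be read off \eqref{eq:spin-unnor-den}.  Collecting the terms that involve
$x_i$, and writing $N(i)$ for the set of neighbors of node $i$, the log
unnormalized conditional density is $x_i s_i / \tau$, where
\begin{equation*}
   s_i = \sum_{j \in N(i)} \beta_{i j} x_j ,
\end{equation*}
so that
\begin{equation*}
   \Pr(X_i = 1 \mid \text{rest})
   = \frac{e^{s_i / \tau}}{e^{s_i / \tau} + e^{- s_i / \tau}}
   = \frac{1}{1 + e^{- 2 s_i / \tau}} .
\end{equation*}
This is the formula implemented in the block Gibbs update below.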
\subsection{Data}

We make up data on an $n \times n$ lattice.  This means we have $n^2$
nodes and $2 n^2$ edges in the graph.  In order to do block updates
efficiently in R we precalculate lots of things.
<<precompute>>=
i <- matrix(seq(1, n^2), n, n)
i
ir <- cbind(i[ , -1], i[ , 1])
ir
il <- cbind(i[ , n], i[ , -n])
il
id <- rbind(i[-1, ], i[1, ])
id
iu <- rbind(i[n, ], i[-n, ])
iu
x <- matrix(1, n, n)
@

If \verb@i@ is used as an index into the matrix \verb@x@ of spin
variables (so \verb@x[i]@ is the same as \verb@x@ laid out as a vector),
then
\begin{itemize}
\item \verb@x[ir]@ is the vector of neighbors to the right,
\item \verb@x[il]@ is the vector of neighbors to the left,
\item \verb@x[iu]@ is the vector of neighbors upwards, and
\item \verb@x[id]@ is the vector of neighbors downwards,
\end{itemize}
so the unnormalized log density is
\begin{verbatim}
sum((x * x[ir] * br + x * x[id] * bd) / tau)
\end{verbatim}
and the unnormalized log conditional density of $x_i$ given the rest is
\begin{verbatim}
function(x) x * (x[ir] * br + x[id] * bd + x[il] * br[il]
    + x[iu] * bd[iu]) / tau
\end{verbatim}
(note that the coupling to the left neighbor is \verb@br@ indexed at the
left neighbor, and the coupling to the up neighbor is \verb@bd@ indexed
at the up neighbor).  Or, more precisely, this is the vector of such
unnormalized log conditional densities, but such a vector can be used for
simultaneous updates only over a coding set, which is given by
<<coding-sets>>=
co <- (i + (1 - n %% 2) * col(i)) %% 2
co
ico0 <- i[co == 0]
ico1 <- i[co == 1]
@

\section{Sampler}

\subsection{Elementary Block Update}

Thus we can do two block Gibbs updates, one for each coding set, as
follows.
<<block-gibbs>>=
block.gibbs <- function(x, tau) {
    ## half the conditional log odds of x[i] == 1, for all nodes
    foo <- x[ir] * br + x[id] * bd + x[il] * br[il] + x[iu] * bd[iu]
    foo <- foo / tau
    p <- 1 / (1 + exp(- 2 * foo))
    ## update all of coding set 0 simultaneously
    x[ico0] <- as.numeric(runif(n^2 / 2) < p[ico0]) * 2 - 1
    ## recompute, because the neighbors of coding set 1 have changed
    foo <- x[ir] * br + x[id] * bd + x[il] * br[il] + x[iu] * bd[iu]
    foo <- foo / tau
    p <- 1 / (1 + exp(- 2 * foo))
    x[ico1] <- as.numeric(runif(n^2 / 2) < p[ico1]) * 2 - 1
    return(x)
}
@
<<show-x>>=
print(x)
@

\subsection{Doing a Run}

By symmetry every $X_i$ has mean zero, and also variance one, because the
variance is then $E(X_i^2)$ and $X_i^2 = 1$ always.  Since our process is
non-homogeneous in the coupling constants, every neighbor pair
$(X_i, X_j)$ is different.  Of the five first and second order moments of
the pair, we know four (the means and variances), but each pair has a
(possibly) different correlation.

Let us take as the functional of the state of interest the whole state
vector (all the $X_i$) and the products $X_i X_j$ for neighbor pairs.  We
know the $X_i$ have mean zero, but estimating them anyway provides a
useful convergence check.  (All useful convergence checks ``cheat'' in
this way.  They use a known property of the equilibrium distribution.)
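Before doing a run, a quick optional sanity check (a minimal sketch
using only the objects defined above) verifies that the neighbor index
matrices are mutually consistent and that no node has a neighbor in its
own coding set, which is the property that makes the simultaneous
updates in \verb@block.gibbs@ valid.
<<sanity-check, eval=FALSE>>=
## each comparison should yield TRUE
all(ir[il] == c(i))   # right neighbor of left neighbor is the node itself
all(id[iu] == c(i))   # down neighbor of up neighbor is the node itself
all(co[ir] != c(co))  # right neighbors are always in the other coding set
all(co[id] != c(co))  # down neighbors are always in the other coding set
@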
<<batch-params>>=
nbatch <- 100
blen <- 100
@
<<doit>>=
batch <- matrix(NA, nbatch, 3 * n^2)
for (ibatch in 1:nbatch) {
    xbatch <- rep(0, 3 * n^2)
    for (iiter in 1:blen) {
        x <- block.gibbs(x, tau)
        xbatch <- xbatch + as.numeric(c(x, x * x[ir], x * x[id]))
    }
    batch[ibatch, ] <- xbatch / blen
}
mu <- apply(batch, 2, mean)
mcse <- apply(batch, 2, sd) / sqrt(nbatch)
xmu <- matrix(mu[1:n^2], n, n)
xmcse <- matrix(mcse[1:n^2], n, n)
xxrmu <- matrix(mu[n^2 + 1:n^2], n, n)
xxrmcse <- matrix(mcse[n^2 + 1:n^2], n, n)
xxdmu <- matrix(mu[2 * n^2 + 1:n^2], n, n)
xxdmcse <- matrix(mcse[2 * n^2 + 1:n^2], n, n)
@
<<look>>=
round(xmu, 3)
round(xmcse, 3)
round(xxrmu, 3)
round(xxrmcse, 3)
round(xxdmu, 3)
round(xxdmcse, 3)
@

An examination of all \Sexpr{3 * n^2} autocorrelation plots (one for each
coordinate of the vector of batch means) shows that perhaps doubling the
batch length is necessary for the mean coordinates (those contributing to
\verb@xmu@).  We would also like the standard errors to be smaller.
Let's redo.
<<redo>>=
nbatch <- 200
blen <- 1000
<<doit>>
<<look>>
@

\section{Lower Temperature}

So far, so good.  But what happens if we lower the temperature?
<<lower-temp>>=
tau <- 0.2
@
<<redo-low>>=
nbatch <- 200
blen <- 1000
<<doit>>
<<look>>
@

Now our ``convergence diagnostic'' (available only because of a known
symmetry in the problem) diagnoses complete failure.  The \verb@xmu@
results are completely wrong.  The standard errors for them are also
completely wrong (we know the right answer is exactly zero, while the
reported estimates are near $\pm 1$, thus the \emph{correct} MCSE are
about one in this case).

\section{The Assigned Homework}

Finally we get to the homework assignment.
\begin{enumerate}
\item Figure out what sort of \verb@nbatch@ and \verb@blen@ this sampler
with \verb@tau ==@ \Sexpr{tau} needs to get (a) working, with all of the
reported MC estimates and MCSE apparently correct, and (b) all of the
MCSE less than 0.001.

Note that you do not actually need to get the MCSE below 0.001.  You
only need to run long enough that you are getting apparently correct
results and can thus infer, using the ``square root law,'' how long you
would need to run to get all MCSE below 0.001.
\item Implement a parallel tempering sampler with one ``helper'' chain,
which uses the same model except for a different \verb@tau@.  Make the
helper \verb@tau@ such that (a) the helper chain mixes better (is that a
higher temperature or a lower one?) and (b) the acceptance rate of the
Metropolis rejection for the swap steps is in the range 20--30\%.  Then
proceed as in part~1, figuring out what sort of \verb@nbatch@ and
\verb@blen@ this sampler needs to get all of the MCSE for the
distribution of interest (with \verb@tau ==@ \Sexpr{tau}) less than
0.001.
\end{enumerate}

\end{document}