Improved Estimation and Interpretation of Correlations in Neural Circuits

Correlations in neural activity measured in multineuronal recordings can reveal important aspects of neural circuit organization. However, estimating and interpreting large correlation matrices is statistically challenging. Estimation can be improved by regularization, i.e. by imposing a structure on the estimate. The amount of improvement depends on how closely the assumed structure represents dependencies in the data. Therefore, the most efficient correlation matrix estimator for a given neural circuit must be determined empirically. Importantly, the identity and structure of the most efficient estimator inform about the types of dominant dependencies governing the system. We sought statistically efficient estimators of neural correlation matrices in recordings from large, dense groups of cortical neurons. Using fast 3D random-access laser scanning microscopy of calcium signals, we recorded the activity of nearly every neuron in volumes 200 μm wide and 100 μm deep (150–350 cells) in mouse visual cortex. We hypothesized that in these densely sampled recordings, the correlation matrix should be best modeled as the combination of a sparse graph of pairwise partial correlations representing local interactions and a low-rank component representing common fluctuations and external inputs. Indeed, in cross-validation tests, the covariance matrix estimator with this structure consistently outperformed other regularized estimators. The sparse component of the estimate defined a graph of interactions. These interactions reflected the physical distances and orientation tuning properties of cells: the density of positive ‘excitatory’ interactions decreased rapidly with geometric distance and with differences in orientation preference, whereas negative ‘inhibitory’ interactions were less selective.
Because of its superior performance, this ‘sparse+latent’ estimator likely provides a more physiologically relevant representation of the functional connectivity in densely sampled recordings than the sample correlation matrix.

A meaningful statistical description of the collective activity of these neural populations—their ‘functional connectivity’—is a forefront challenge in neuroscience. We addressed this problem by identifying statistically efficient estimators of correlation matrices of the spiking activity of neural populations. Various underlying processes may reflect differently on the structure of the correlation matrix: Correlations due to common network fluctuations or external inputs are well estimated by low-rank representations, whereas correlations arising from linear interactions between pairs of neurons are well approximated by their pairwise partial correlations. In our data obtained from fast 3D two-photon imaging of calcium signals of large and dense groups of neurons in mouse visual cortex, the best estimation performance was attained by decomposing the correlation matrix into a sparse network of partial correlations (‘interactions’) combined with a low-rank component. The inferred interactions were both positive (‘excitatory’) and negative (‘inhibitory’) and reflected the spatial organization and orientation preferences of the interacting cells. We propose that the most efficient among many estimators provides a more informative picture of the functional connectivity than previous analyses of neural correlations.

Functional connectivity reflects local synaptic connections, shared inputs from other regions, and endogenous network activity. Although functional connectivity is a phenomenological description without a strict mechanistic interpretation, it can be used to generate hypotheses about the anatomical architecture of the neural circuit and to test hypotheses about the processing of information at the population level.

In particular, noise correlations, i.e. the correlations of trial-to-trial response variability between pairs of neurons, have a profound impact on stimulus coding [1, 2, 6–11]. In addition, noise correlations and correlations in spontaneous activity have been hypothesized to reflect aspects of synaptic connectivity [12]. Interest in neural correlations has been sustained by a series of discoveries of their nontrivial relationships to various aspects of circuit organization such as the physical distances between the neurons [13, 14], their synaptic connectivity [15], stimulus response similarity [3–5, 15–22], cell types [23], cortical layer specificity [24, 25], progressive changes in development and in learning [26–28], and changes due to sensory stimulation and global brain states [21, 29–33].

Correlations can arise from monosynaptic or polysynaptic interactions, common or correlated inputs, oscillations, top-down modulation, background network fluctuations, and other mechanisms [34–39]. But multineuronal recordings do provide more information than an equivalent number of separately recorded pairs of cells. For example, the eigenvalue decomposition of the covariance matrix expresses shared correlated activity components across the population; common fluctuations of population activity may be accurately represented by only a few eigenvectors that affect all correlation coefficients. On the other hand, a correlation matrix can be specified using the partial correlations between pairs of the recorded neurons. The partial correlation coefficient between two neurons reflects their linear association conditioned on the activity of all the other recorded cells [40]. Under some assumptions, partial correlations measure conditional independence between variables and may more directly approximate causal effects between components of complex systems than correlations [40]. For this reason, partial correlations have been used to describe interactions between genes in functional genomics [41, 42] and between brain regions in imaging studies [43, 44]. These opportunities have not yet been explored in neurophysiological studies, where most analyses have only considered the distributions of pairwise correlations [2, 4, 5, 13].

The amount of recorded data grows only linearly with population size whereas the number of estimated coefficients increases quadratically. This mismatch leads to an increase in spurious correlations, overestimation of common activity (i.e. overestimation of the largest eigenvalues) [45], and poorly conditioned partial correlations [41]. The sample correlation matrix is an unbiased estimate of the true correlations but its many free parameters make it sensitive to sampling noise. As a result, on average, the sample correlation matrix is farther from the true correlation matrix than some structured estimates.

To ‘impose a structure’ on an estimate means to bias (‘shrink’) it toward a reduced representation with fewer free parameters, the target estimate. The optimal target estimate and the optimal amount of shrinkage can be obtained from the data sample either analytically [41, 45, 47] or by cross-validation [48]. An estimator that produces estimates that are, on average, closer to the truth for a given sample size is said to be more efficient than other estimators. Although regularized covariance matrix estimation is commonplace in finance [47], functional genomics [41], and brain imaging [44], surprisingly little work has been done to identify optimal regularization of neural correlation matrices.

For example, improved estimates can be used to optimize decoding of the population activity [48, 49]. But reduced estimation error is not the only benefit of regularization. Finding the most efficient among many regularized estimators leads to insights about the system itself: the structure of the most efficient estimator is a parsimonious representation of the regularities in the data.

With the advent of big neural data [50], the search for optimal regularization schemes will become increasingly relevant in any model of population activity. Since optimal regularization schemes are specific to systems under investigation, the inference of functional connectivity in large-scale neural data will entail the search for optimal regularization schemes. Such schemes may involve combinations of heuristic rules and numerical techniques specially designed for given types of neural circuits. What structures of correlation matrices best describe the multineuronal activity in specific circuits and in specific brain states? More specifically, are correlations in the visual cortex during visual stimulation best explained by common fluctuations or by local interactions within the recorded microcircuit? To address these questions, we evaluated four regularized covariance matrix estimators that imposed different structures on the estimate and compared them with the sample covariance matrix. The estimators are designated as follows. Csample: the sample covariance matrix, the unbiased estimator. Cdiag: linear shrinkage of the sample covariance matrix toward a diagonal matrix of variances. Cfactor: a low-rank approximation of the sample covariance matrix, representing inputs from unobserved shared factors (latent units). Csparse: sparse partial correlations, i.e. a large fraction of the partial correlations between pairs of neurons are set to zero. Csparse+latent: sparse partial correlations between the recorded neurons and linear interactions with a number of latent units. First, we used simulated data to demonstrate that the selection of the optimal estimator indeed pointed to the true structure of the dependencies in the data.

In our data, the sample correlation coefficients were largely positive and low. We found that the most efficient estimator of the correlation matrix in these data was Csparse+latent. This estimator revealed a sparse network of partial correlations (‘interactions’) between the observed neurons; it also inferred a number of latent units interacting with the observed neurons. We analyzed these networks of partial correlations and found the following: whereas significant noise correlations were predominantly positive, the inferred interactions had a large fraction of negative values, possibly reflecting inhibitory circuitry. Moreover, the inferred positive interactions exhibited a substantially stronger relationship to the physical distances and to the differences in preferred orientations than noise correlations. In contrast, the inferred negative interactions were less selective.

The covariance matrix is defined as

$$\Sigma = \mathrm{E}\left[(x - \mu)(x - \mu)^{\mathsf T}\right],$$

where the p × 1 vector x is a single observation of the firing rates of p neurons in a time bin of some duration, E[·] denotes expectation, and μ is the vector of expected firing rates. Given n observations x(t), where x(t) is the p × 1 vector of firing rates in time bin t, and an independent estimate of the mean firing rates x̄, the sample covariance matrix,

$$C_{\text{sample}} = \frac{1}{n}\sum_{t=1}^{n} \left(x(t) - \bar{x}\right)\left(x(t) - \bar{x}\right)^{\mathsf T},$$

is an unbiased estimate of the true covariance matrix, i.e. E[Csample] = Σ. In all cases when the unbiasedness of the sample covariance matrix matters in this paper, the mean activity is estimated independently from a separate sample. Given any covariance matrix estimate C, the corresponding correlation matrix R is calculated by normalizing the rows and columns of C by the square roots of its diagonal elements to produce unit entries on the diagonal:

$$R = \left(\operatorname{diag}(C)\right)^{-1/2} C \left(\operatorname{diag}(C)\right)^{-1/2},$$

where diag(C) denotes the diagonal matrix with the diagonal elements from C. The partial correlation between a pair of variables is the Pearson correlation coefficient of the residuals of the linear least-squares predictor of their activity based on all the other variables, excluding the pair [40, 51]. Partial correlations figure prominently in probabilistic graphical modeling, wherein the joint distribution is explained by sets of pairwise interactions [40].
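As an illustration only (not part of the original analysis), the sample covariance with an independently supplied mean, and its normalization to a correlation matrix, can be sketched in Python with NumPy. The function names and the simulated Poisson counts are hypothetical stand-ins for binned firing rates.

```python
import numpy as np

def sample_covariance(x, mean):
    """Sample covariance of x (n bins by p cells) around a mean that was
    estimated independently of x; dividing by n then keeps it unbiased."""
    d = x - mean
    return d.T @ d / x.shape[0]

def correlation_from_covariance(c):
    """R = diag(C)^(-1/2) C diag(C)^(-1/2): unit entries on the diagonal."""
    s = 1.0 / np.sqrt(np.diag(c))
    return c * np.outer(s, s)

# Hypothetical data: 500 time bins, 5 cells, independent Poisson counts.
rng = np.random.default_rng(0)
x = rng.poisson(2.0, size=(500, 5)).astype(float)
# The known rate of 2.0 plays the role of the independent mean estimate.
c = sample_covariance(x, mean=np.full(5, 2.0))
r = correlation_from_covariance(c)
```

The off-diagonal entries of `r` are small but nonzero here even though the simulated cells are independent, which is the sampling noise that regularization is meant to suppress.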

More generally, partial correlations can serve as a measure of conditional independence under the assumption that dependencies in the system are close to linear effects [40, 53]. As neural recordings become increasingly dense, partial correlations may prove useful as indicators of conditional independence (lack of functional connectivity) between pairs of neurons.

Zero elements in the precision matrix signify zero partial correlations between the corresponding pairs of variables. Given the covariance estimate C, the matrix of partial correlations P is computed by normalizing the rows and columns of the precision matrix C⁻¹ to produce negative unit entries on the diagonal:

$$P = -\left(\operatorname{diag}(C^{-1})\right)^{-1/2} C^{-1} \left(\operatorname{diag}(C^{-1})\right)^{-1/2}. \qquad (4)$$

Increasing the number of recorded neurons results in a higher condition number of the sample covariance matrix [45], making the partial correlation estimates more ill-conditioned: small errors in the covariance estimates translate into greater errors in the estimates of the partial correlations. With massively multineuronal recordings, partial correlations cannot be estimated without regularization [41, 45].
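A minimal sketch of this computation, assuming NumPy and a hypothetical three-neuron chain in which neurons 1 and 3 interact only through neuron 2, shows how a zero in the precision matrix becomes a zero partial correlation even though the ordinary covariance between the two end neurons is not zero.

```python
import numpy as np

def partial_correlations(c):
    """P = -diag(K)^(-1/2) K diag(K)^(-1/2) with K = C^{-1};
    the diagonal of P equals -1 by construction."""
    k = np.linalg.inv(c)
    s = 1.0 / np.sqrt(np.diag(k))
    return -k * np.outer(s, s)

# Hypothetical precision matrix of a chain 1 - 2 - 3 (no direct 1 - 3 link).
k_true = np.array([[2.0, -1.0, 0.0],
                   [-1.0, 2.0, -1.0],
                   [0.0, -1.0, 2.0]])
c = np.linalg.inv(k_true)      # covariance: every pair is correlated
p = partial_correlations(c)    # partial correlations: pair (1, 3) vanishes
```

In this toy case the covariance between the end neurons is 0.25 while their partial correlation is zero up to floating-point error, and the partial correlation of each directly linked pair is 0.5.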

In probabilistic models with exclusively linear dependencies, the target estimates of these estimators correspond to distinct families of graphical models (Fig. 1 Row 1). The target estimate of estimator Cdiag is the diagonal matrix D containing estimates of neurons’ variances. Regularization is achieved by linear shrinkage of the sample covariance matrix Csample toward D as controlled by the scalar shrinkage intensity parameter λ ∈ [0, 1] (Fig. 1 Row 1, A):

$$C_{\text{diag}} = (1 - \lambda)\, C_{\text{sample}} + \lambda D.$$

If sample correlations are largely spurious, Cdiag is expected to be more efficient than other estimators. Estimator Cfactor approximates the covariance matrix by the factor model,

$$C_{\text{factor}} = L + D.$$

This approximation is the basis for factor analysis [51], where matrix L represents covariances arising from latent factors. The rank of L corresponds to the number of latent factors. Matrix D contains the variances of the cells’ activity that is independent of the latent factors. The estimator is regularized by selecting the rank of L and by shrinking the independent variances in D toward their mean. The structure imposed by Cfactor describes a population whose activity is linearly driven by a number of latent factors that affect many cells while direct interactions between the recorded cells are insignificant (Fig. 1 Row 1, B). Estimator Csparse is produced by approximating the sample covariance matrix by the inverse of a sparse matrix S:

$$C_{\text{sparse}} = S^{-1}.$$

The estimator is regularized by adjusting the sparsity (fraction of off-diagonal zeros) of S. The problem of finding the optimal set of nonzero elements in S is known as covariance selection [52]. The structure imposed by Csparse describes conditions in which neural correlations arise from direct linear effects (‘interactions’) between some pairs of neurons (Fig. 1 Row 1, C). Estimator Csparse+latent is obtained by approximating the sample covariance matrix by a matrix whose inverse is the difference of a sparse component and a low-rank component:

$$C_{\text{sparse+latent}} = (S - L)^{-1}.$$

The estimator is regularized by adjusting the sparsity of S and the rank of L. See Methods for more detailed explanations. The structure imposed by Csparse+latent favors conditions in which the activity of neurons is determined by linear effects between some observed pairs of neurons and linear effects from several latent units (Fig. 1 Row 1, D) [54, 55]. We refer to the sparse partial correlations in estimators Csparse and Csparse+latent as ‘interactions’.
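Of the four regularized estimators, linear shrinkage toward the diagonal is the simplest to write down. The sketch below assumes NumPy, a hypothetical function name, and an arbitrary shrinkage intensity of 0.5 (in the study the intensity is optimized from the data); it illustrates that shrinkage leaves the variances untouched while improving the conditioning of the estimate.

```python
import numpy as np

def shrink_to_diagonal(c_sample, lam):
    """C_diag = (1 - lam) * C_sample + lam * D, with D the diagonal
    matrix of sample variances and lam in [0, 1]."""
    d = np.diag(np.diag(c_sample))
    return (1.0 - lam) * c_sample + lam * d

# Hypothetical example: 40 independent cells observed in only 60 time bins.
rng = np.random.default_rng(1)
p, n = 40, 60
x = rng.standard_normal((n, p))
c_sample = x.T @ x / n                 # means are known to be zero here
c_diag = shrink_to_diagonal(c_sample, 0.5)

cond_before = np.linalg.cond(c_sample)  # large: few samples per parameter
cond_after = np.linalg.cond(c_diag)     # smaller after shrinkage
```

A lower condition number matters because the partial correlations are computed from the inverse of the covariance estimate.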

We constructed four families of 50 × 50 covariance matrices, each with structure that matched one of the four regularized estimators (Fig. 1 Row 2, A–D and Methods). We used these covariance matrices as the ground truth in multivariate Gaussian distributions with zero means and drew samples of various sizes. The sample correlation matrices from finite samples (e.g. n = 500 in Fig. 1 Row 3) were contaminated with sampling noise and their underlying structures were difficult to discern.
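One way to build a ground truth of the sparse+latent kind is sketched below, assuming NumPy. This is an illustration under stated assumptions, not the construction described in Methods: the 10% interaction density, the coupling strengths, the three latent units, and the use of diagonal dominance to guarantee positive definiteness are all choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(2)
p, k, n = 50, 3, 500   # observed neurons, latent units, sample size

# Sparse symmetric interactions among observed neurons (about 10% nonzero).
mask = np.triu(rng.random((p, p)) < 0.10, 1)
off = np.where(mask, rng.normal(0.0, 0.3, (p, p)), 0.0)
off = off + off.T
# Linear couplings between observed neurons and latent units.
b = rng.normal(0.0, 0.3, (p, k))

# Joint precision matrix over observed and latent units, made strictly
# diagonally dominant so that it is positive definite.
joint = np.block([[off, b], [b.T, np.zeros((k, k))]])
joint += np.diag(np.abs(joint).sum(axis=1) + 1.0)

s = joint[:p, :p]                              # sparse component S
low = b @ np.linalg.inv(joint[p:, p:]) @ b.T   # low-rank component L
sigma = np.linalg.inv(s - low)                 # ground truth (S - L)^{-1}

# Zero-mean Gaussian sample and its sample covariance (mean known to be 0).
x = rng.multivariate_normal(np.zeros(p), sigma, size=n)
c_sample = x.T @ x / n
```

Marginalizing the latent units out of the joint Gaussian yields exactly the observed-neuron precision S − L, which is why the rank of L equals the number of latent units.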

The loss function is chosen to attain its minimum when C = Σ. Here, in the role of the loss function we adopted the Kullback-Leibler divergence between multivariate normal distributions with equal means, scaled by 1/p to make its values comparable across different population sizes:

$$\mathcal{L}(C, \Sigma) = \frac{1}{2p}\left[\operatorname{tr}\left(C^{-1}\Sigma\right) + \ln\det C - \ln\det\Sigma - p\right].$$
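The standard Kullback-Leibler divergence between two zero-mean multivariate normal distributions, divided by the number of cells p, can be sketched as follows (assuming NumPy; the function name and the 2 × 2 example are hypothetical).

```python
import numpy as np

def kl_loss(c, sigma):
    """(1 / 2p) * [tr(C^{-1} Sigma) + ln det C - ln det Sigma - p],
    in nats per cell; zero if and only if C equals Sigma."""
    p = sigma.shape[0]
    trace = np.trace(np.linalg.solve(c, sigma))
    logdet_c = np.linalg.slogdet(c)[1]
    logdet_sigma = np.linalg.slogdet(sigma)[1]
    return (trace + logdet_c - logdet_sigma - p) / (2 * p)

sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
loss_exact = kl_loss(sigma, sigma)          # estimate equals the truth
loss_identity = kl_loss(np.eye(2), sigma)   # estimate ignores the correlation
```

For this example the loss of the identity estimate is −ln(0.75)/4, roughly 0.072 nats per cell, while the exact estimate has zero loss.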

When the ground truth is not accessible, the loss cannot be computed directly but may be estimated from data through validation. In a validation procedure, a validation sample covariance matrix C′sample is computed from a testing data set that is independent from the data used for computing C. Then the validation loss ℒ(C, C′sample) measures the discrepancy of C from C′sample. Here, in the role of validation loss, we adopted the negative multivariate normal log likelihood of C given C′sample, also scaled by 1/p and omitting the constant term:

$$\mathcal{L}(C, C'_{\text{sample}}) = \frac{1}{2p}\left[\operatorname{tr}\left(C^{-1} C'_{\text{sample}}\right) + \ln\det C\right].$$

Since ℒ(·,·) is additive in its second argument and C′sample is an unbiased estimate of Σ, then, for given C and Σ, the validation loss is an unbiased estimate of the true loss, up to a constant that does not depend on C:

$$\mathrm{E}\left[\mathcal{L}(C, C'_{\text{sample}})\right] = \mathcal{L}(C, \Sigma) + \frac{\ln\det\Sigma + p}{2p}.$$

Therefore, the validation procedure allows comparing the relative values of the loss attained by different covariance matrix estimators even without access to the ground truth.
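The claim that the validation loss tracks the true loss up to a constant can be checked numerically. In the sketch below (NumPy assumed, all names hypothetical) the test covariance is replaced by its expectation Σ, so the gap between the two losses is exactly the constant (ln det Σ + p) / (2p) for every candidate estimate.

```python
import numpy as np

def true_loss(c, sigma):
    """Scaled KL divergence between zero-mean normals N(0, sigma) and N(0, c)."""
    p = sigma.shape[0]
    trace = np.trace(np.linalg.solve(c, sigma))
    return (trace + np.linalg.slogdet(c)[1]
            - np.linalg.slogdet(sigma)[1] - p) / (2 * p)

def validation_loss(c, c_test):
    """Scaled negative normal log likelihood of c given the test covariance,
    with the constant term omitted; linear in c_test."""
    p = c.shape[0]
    return (np.trace(np.linalg.solve(c, c_test))
            + np.linalg.slogdet(c)[1]) / (2 * p)

sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
candidates = [np.eye(2), sigma, np.array([[1.0, 0.2], [0.2, 1.0]])]
# With c_test set to its expectation, the gap is the same for all candidates.
gaps = [validation_loss(c, sigma) - true_loss(c, sigma) for c in candidates]
```

Because the gap does not depend on the candidate, ranking estimators by validation loss reproduces, on average, their ranking by true loss.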

The hyperparameters of the regularized estimators were optimized by nested cross-validation using only the data in the sample. All the regularized estimators produced better estimates (lower loss) than the sample covariance matrix. However, estimators whose structure matched the true model outperformed the other estimators (Fig. 1 Rows 4 and 5). The validation loss computed by tenfold cross-validation (see Methods) accurately reproduced the relative values of the true loss as well as the rankings of the estimators even without access to the ground truth (Fig. 1 Row 6).
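A simplified, single-level version of this procedure is sketched below for the shrinkage intensity of the diagonal-shrinkage estimator (NumPy assumed). It is not the nested scheme of the study: the fold assignment, the grid of intensities, and the simulated data of truly independent cells are assumptions made for illustration.

```python
import numpy as np

def validation_loss(c, c_test):
    """Scaled negative normal log likelihood, constant term omitted."""
    p = c.shape[0]
    return (np.trace(np.linalg.solve(c, c_test))
            + np.linalg.slogdet(c)[1]) / (2 * p)

def shrink_to_diagonal(c, lam):
    return (1.0 - lam) * c + lam * np.diag(np.diag(c))

def cv_loss(x, lam, folds=10):
    """Mean validation loss over folds; the data are assumed zero-mean."""
    idx = np.arange(x.shape[0])
    losses = []
    for f in range(folds):
        test = idx % folds == f
        c_train = x[~test].T @ x[~test] / (~test).sum()
        c_test = x[test].T @ x[test] / test.sum()
        losses.append(validation_loss(shrink_to_diagonal(c_train, lam), c_test))
    return float(np.mean(losses))

rng = np.random.default_rng(3)
x = rng.standard_normal((120, 30))      # 120 bins, 30 independent cells
grid = np.linspace(0.0, 1.0, 11)
scores = [cv_loss(x, lam) for lam in grid]
best_lam = grid[int(np.argmin(scores))]
```

Because the simulated cells are independent, the unregularized estimate (intensity 0) is expected to score worse than heavily shrunk estimates. In a nested scheme, this selection would itself run inside each training fold of an outer evaluation loop.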

Similarly, when the number of latent units was zero (Column C), Csparse+latent performed nearly as well as Csparse because it correctly inferred zero latent units. With increasing sample sizes, all estimators converged to the ground truth (zero loss) but the estimators with correct structure outperformed the others even for large samples.

To demonstrate that estimator rankings were robust to deviations from Gaussian models, we repeated the same cross-validated evaluation using pairwise Ising models to generate the data. Ising models have been used to infer functional connectivity from neuronal spike trains [56]. Conveniently, the Ising model has the same mathematical form as the Gaussian distribution, but the Ising model is defined on the multivariate binary domain rather than the continuous domain. Both models are maximum-entropy models constrained to match the mean firing rates and the covariance matrix [57]. The partition function Z(J, h) normalizes the distributions on the models’ respective domains. In the Gaussian model, the matrix −J⁻¹ is the covariance matrix and the mean values are μ = J⁻¹h. For the Ising model, J is the matrix of pairwise interactions and h is the vector of the cells’ individual activity drives, although they do not have a simple relationship to the means and the covariance matrix. Both distributions have the same structure of pairwise conditional dependencies: zeros in the matrix J indicate conditional independence between the corresponding pair of neurons.

Identical interaction matrices J of the joint distributions over the observable and latent variables were used for both the Gaussian and the Ising models (Fig. 2). This simulation study demonstrated that cross-validated evaluation of regularized estimators of the covariance matrices of population activity can discriminate between structures of dependencies in the population. The selection of the most efficient covariance estimators for particular neural circuits is therefore an empirical finding characteristic of the nature of circuit interactions.

We recorded somatic calcium signals using fast 3D random-access two-photon microscopy (Fig. 3 A–B) [58–60]. This technique allowed fast sampling (100–150 Hz) from large numbers (150–350) of cells in 200 × 200 × 100 μm³ volumes of cortical tissue (Fig. 3 C and D). The instantaneous firing rates were inferred using sparse nonnegative deconvolution [61] (Fig. 3 C). Only cells that produced detectable calcium activity were included in the analysis (see Methods). First, 30 repetitions of full-field drifting gratings of 16 directions were presented in random order. Each grating was played for 500 ms, without intervening blanks. This stimulus was used to compute the orientation tuning of the recorded cells (Fig. 3 D). To estimate the noise correlation matrix, we presented only two distinct directions in some experiments or five directions in others with 100–300 repetitions of each condition. Each grating lasted 1 second and was followed by a 1-second blank. The traces were then binned into 150 ms intervals aligned on the stimulus onset for the estimation of the correlation matrix. The sample correlation coefficients were largely positive and low (Fig. 3 E and F). The average value of the correlation coefficient across sites ranged from 0.0065 to 0.051 with the mean across sites of 0.018. In these densely sampled populations, direct interactions between cells are likely to influence the patterns of population activity. We therefore hypothesized that covariance matrix estimators that explicitly model interactions between pairs of cells, Csparse and Csparse+latent, would have an advantage.

Fig. 3. Acquisition of neural signals for the estimation of noise correlations. Visual stimuli comprising full-field drifting gratings interleaved with blank screens (A) presented during two-photon recordings of somatic calcium signals using fast 3D random-access microscopy (B). C–F. Calcium activity data from an example site. C. Representative calcium signals of seven cells, downsampled to 20 Hz, out of the 292 total recorded cells. Spiking activity inferred by nonnegative deconvolution is shown by red ticks below the trace. D. The spatial arrangement and orientation tuning of the 292 cells from the imaged site. The cells’ colors indicate their orientation preferences. The gray cells were not significantly tuned. E. The sample noise correlation matrix of the activity of the neural population. F. Histogram of noise correlation coefficients in one site. The red line indicates the mean correlation coefficient of 0.020.

However, the observed neurons must also be strongly influenced by global activity fluctuations and by unobserved common inputs, to the advantage of estimators that explicitly model common fluctuations of the entire population: Cfactor and Csparse+latent. If both types of effects are significant, then Csparse+latent should outperform the other estimators.

The hyperparameters of each estimator were optimized by nested cross-validation (see S1 Fig. and Methods). Indeed, the sparse+latent estimator outperformed the other estimators (Fig. 4). The respective median differences of the validation loss were 0.039, 0.0016, 0.0029, and 0.0059 nats/cell/bin, significantly greater than zero (p < 0.01 in each comparison, Wilcoxon signed rank test).

The Csparse+latent estimates are illustrated for an example site (Fig. 5) and summarized across all imaged sites (Fig. 6). Although the regularized estimates were similar to the sample correlation matrix (Fig. 5 A and B), the corresponding partial correlation matrices differed substantially (Fig. 5 C and D). The estimates separated two sources of correlations: a network of linear interactions expressed by the sparse component of the inverse and latent units expressed by the low-rank component of the inverse (Fig. 5 E). The sparse partial correlations revealed a network that differed substantially from the network composed of the greatest coefficients in the sample correlation matrix (Fig. 5 F, G, H, and I).

In the example site (Fig. 5), the sparse component had 92.8% sparsity (or conversely, 7.2% connectivity: connectivity = 1 − sparsity) with average node degree of 20.9 (Fig. 5 G). The average node degree, i.e. the average number of interactions linking each neuron, is related to connectivity as degree = connectivity · (p − 1), where p is the number of neurons. The low-rank component had rank 72, denoting 72 inferred latent units. The number of latent units increased with population size (Fig. 6 A) but the connectivity was highly variable (Fig. 6 B): several sites, despite their large population sizes, were driven by latent units and had few pairwise interactions. This variability may be explained by differences in brain states and recording quality and warrants further investigation.
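The relation between connectivity and average node degree can be checked with a short sketch (NumPy assumed; the function name and the four-node example are hypothetical). The last lines plug in the rounded values reported for the example site, p = 292 and 92.8% sparsity.

```python
import numpy as np

def connectivity_and_degree(s):
    """Fraction of nonzero off-diagonal entries of the sparse component
    and the average number of interactions per neuron."""
    p = s.shape[0]
    nonzero = (s != 0) & ~np.eye(p, dtype=bool)
    connectivity = nonzero.sum() / (p * (p - 1))
    degree = nonzero.sum(axis=1).mean()
    return connectivity, degree

# Toy network of 4 neurons with interactions 1 - 2 and 2 - 3.
s = np.eye(4)
s[0, 1] = s[1, 0] = -0.3
s[1, 2] = s[2, 1] = 0.2
conn, deg = connectivity_and_degree(s)   # deg equals conn * (4 - 1)

# Rounded values reported for the example site.
p_site, sparsity = 292, 0.928
degree_site = (1.0 - sparsity) * (p_site - 1)
```

With the rounded sparsity the computed degree is about 21, in line with the reported 20.9; the small difference is presumably due to rounding of the reported sparsity.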

Fig. 6. Properties of Csparse+latent estimates from all imaged sites. Each point represents an imaged site with its color indicating the population size as shown in panels A and B. The example site from Figs. 3 and 5 is circled in blue. A. The number of inferred latent units vs. population size. B. The connectivity of the sparse component of partial correlations as a function of population size. C. The average sample correlations vs. the average partial correlations (Eq. 4) of the Csparse+latent estimate. D. The percentage of negative interactions vs. connectivity in the Csparse+latent estimates.

The average partial correlations were smaller than the average sample correlations (Fig. 6 C). This suggests that correlations between neurons build up from multiple chains of smaller interactions. Furthermore, the average partial correlations were less variable (p = 0.002, Brown-Forsythe test): the coefficient of variation of the average sample correlations across sites was 0.45 whereas that of the average partial correlations was 0.29.

The inferred interactions included a large fraction of negative values (Fig. 5 F). The fraction of negative interactions increased with the inferred connectivity (Fig. 6 D), suggesting that negative interactions can be inferred only after a sufficient density of positive interactions has been uncovered. Thresholded sample correlations have been used in several studies to infer pairwise interactions [26, 62–64]. We therefore compared the interactions in the sparse component of Csparse+latent with the network obtained by thresholding the sample correlations.

The networks revealed by the two methods differed substantially. In the example site with 7.2% connectivity in Csparse+latent, only 27.7% of the connections coincided with the above-threshold sample correlations (Fig. 5 F, H, and I). In particular, most of the inferred negative interactions corresponded to low sample correlations (Fig. 5 F) where high correlations are expected given the rest of the correlation matrix.
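A comparison of this kind can be sketched as follows (NumPy assumed). The sketch assumes that the threshold on the sample correlations is chosen so that both networks contain the same number of connections, and that the greatest signed coefficients are retained; the function name and the three-cell example are hypothetical.

```python
import numpy as np

def overlap_with_thresholded(r_sample, s):
    """Fraction of the interactions in the sparse component s that are also
    among the equally many greatest sample correlations."""
    iu = np.triu_indices(s.shape[0], 1)      # each pair counted once
    edges = s[iu] != 0
    m = int(edges.sum())
    order = np.argsort(r_sample[iu])[::-1]   # pairs sorted by correlation
    top = np.zeros(edges.shape, dtype=bool)
    top[order[:m]] = True                    # the m greatest correlations
    return (edges & top).sum() / m

r_sample = np.array([[1.0, 0.5, 0.1],
                     [0.5, 1.0, 0.3],
                     [0.1, 0.3, 1.0]])
s = np.array([[1.0, -0.4, 0.2],
              [-0.4, 1.0, 0.0],
              [0.2, 0.0, 1.0]])
overlap = overlap_with_thresholded(r_sample, s)
```

In this toy case the sparse component links pairs (1, 2) and (1, 3) while thresholding selects pairs (1, 2) and (2, 3), so the overlap is one half.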

We examined how the partial correlations and the inferred connectivity depended on the differences in preferred orientations and on the physical distances between cells (Fig. 7). Five sites with the highest pairwise connectivities were included in the analysis. Partial correlations were computed using Eq. 4 based on the regularized estimate, including both the sparse and the latent component. Connectivity was computed as the fraction of pairs of cells connected by nonzero elements (interactions) in the sparse component of the estimate, segregated into positive and negative connectivities.

The partial correlations decayed more rapidly with the difference in preferred orientations, Δori, than did sample correlations (Fig. 7 A and D; p < 10⁻⁹ in each of the five sites, two-sample t-test of the difference of the linear regression coefficients in normalized data). Positive connectivity decreased with Δori (p < 0.005 in each of the five sites, t-test on the logistic regression coefficient) whereas negative connectivity did not decrease (Fig. 7 G): the slope in the logistic model of connectivity with respect to Δori was significantly higher for positive than for negative interactions (p < 0.04 in each of the five sites, two-sample t-test of the difference of the logistic regression coefficients).

We distinguished between the lateral distance, Δx, in the plane parallel to the pia, and the vertical distance, Δz, orthogonal to the pia. When considering the dependence on Δx, the analysis was limited to cell pairs located at the same depth with Δz < 30 μm; conversely, when considering the dependence on Δz, only vertically aligned cell pairs with Δx < 30 μm were included. Again, the partial correlations decayed more rapidly both laterally and vertically than sample correlations (Fig. 7 B, C, E, F; p < 10⁻⁶ in each of the five sites, for both lateral and vertical distances, two-sample t-test of the difference of the linear regression coefficients in normalized data). Positive connectivity decayed with distance (p < 10⁻⁶ in each of the five sites for positive interactions, t-test on the logistic regression coefficient in normalized data) (Fig. 7 E, H, I), so that cells separated laterally by less than 25 μm were 3.2 times more likely to be connected than cells separated laterally by more than 150 μm. Although the positive connectivity appeared to decay faster with vertical than with lateral distance, the differences in slopes of the respective logistic regression models were not significant with available data. The negative connectivity decayed more slowly with distance (Fig. 7 H and I): the slope in the respective logistic models with respect to the lateral distance was significantly higher for positive than for negative connectivities (p < 0.05 in each of the five sites, two-sample t-test of the difference of the logistic regression coefficients).

The goal of many studies of functional connectivity has been to estimate anatomical connectivity from observed multineuronal spiking activity. For example, characteristic peaks and troughs in the pairwise cross-correlograms of recorded spike trains contain statistical signatures of monosynaptic connections and shared synaptic inputs [12, 14, 34, 35, 65]. Such signatures are ambiguous as they can arise from network effects other than direct synaptic connections [66]. With simultaneous recordings from more neurons, ambiguities can be resolved by inferring the conditional dependencies between pairs of neurons. Direct causal interactions between neurons produce statistical dependency between them even after conditioning on the state of the remainder of the network and external input. Therefore, conditional independence shown statistically can signify the absence of a direct causal influence.

For example, generalized linear models (GLMs) have been constructed to include biophysically plausible synaptic integration, membrane kinetics, and individual neurons’ stimulus drive [67]. Maximum entropy models constrained by observed pairwise correlations are among other models with pairwise coupling between cells [68–72]. Assuming that the population response follows a multivariate normal distribution, the conditional dependencies between pairs of neurons are expressed by the partial correlations between them. Each probabilistic model, fitted to the same data, may reveal a completely different network of ‘interactions’, i.e. conditional dependencies between pairs of cells.

Little experimental evidence is available to answer this question. The connectivity graphs inferred by various statistical methods are commonly reported without examining their relation to anatomy. Topological properties of such graphs have been interpreted as principles of circuit organization (e.g. small-world organization) [62–64, 70]. However, the topological properties of functional connectivity graphs can depend on the method of inference [73]. Until a physiological interpretation of functional connectivity is established, the physiological relevance of such analyses remains in question, and we did not attempt to apply graph-theoretical analyses to our results.

Unobserved portions of the circuit may manifest as conditional dependencies between observed neurons that do not directly interact. For this reason, statistical models of population activity have been most successfully applied to in vitro preparations of the retina or cell cultures where high-quality recordings from the complete populations were available [67]. In cortical tissue, electrode arrays record from a small fraction of cells in a given volume, limiting the validity of inference of the pairwise conditional dependencies. Perhaps for this reason, partial correlations have not, until now, been used to describe the functional connectivity in cortical populations.

While the temporal resolution of calcium signals is limited by the calcium dye kinetics, fast imaging techniques combined with spike inference algorithms provide millisecond-scale temporal resolution of single action potentials [74]. However, such high temporal precision comes at the cost of lower accuracy of inferred spike rates. Better accuracy is achieved when calcium signals are analyzed on scales of tens of milliseconds [60, 75]. The major advantage of calcium imaging is its ability to characterize the spatial arrangement and types of recorded cells. Recently, advanced imaging techniques have allowed recording from nearly every cell in a volume of cortical tissue in vivo [59, 60] and even from entire nervous systems [76, 77]. These techniques may provide more incisive measurements of functional connectivity than electrophysiological recordings.

Hence, most studies of functional connectivity have relied on instantaneous sample correlations [23, 26, 29, 63]. Although some investigators have interpreted such correlations as indicators of (chemical or electrical) synaptic connectivity, most used them as more general indicators of functional connectivity without relating them to underlying mechanisms.

We hypothesized that partial correlations correspond more closely to underlying mechanisms than sample correlations when recordings are sufficiently dense. Since neurons form synaptic connections mostly locally and sparsely [78], we a priori favored solutions with sparse partial correlations. Under the assumptions that the recorded population is sufficiently complete and that the model correctly represents the nature of interactions, the network of partial correlations can better represent the functional dependencies in the circuit than correlations.
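To make the distinction between correlations and partial correlations concrete, the following minimal sketch computes partial correlations from a covariance matrix through its inverse (the precision matrix), using the standard relation between the two. It is an illustration only, not the study's MATLAB code; the function name `partial_correlations` and the three-unit chain are hypothetical.

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlation matrix from a covariance matrix.

    Uses the standard relation rho_ij = -P_ij / sqrt(P_ii * P_jj),
    where P is the precision matrix (inverse covariance).
    """
    precision = np.linalg.inv(cov)
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

# Hypothetical chain of three units: 0 <-> 1 <-> 2, with no direct 0 <-> 2 link.
precision = np.array([[2.0, -1.0, 0.0],
                      [-1.0, 2.0, -1.0],
                      [0.0, -1.0, 2.0]])
cov = np.linalg.inv(precision)

sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)        # ordinary (Pearson) correlations
pcorr = partial_correlations(cov)    # correlations conditioned on all other units
```

In this toy chain the two end units are correlated through the middle unit, yet their partial correlation vanishes, which is the sense in which partial correlations can separate direct from indirect dependencies when the population is fully observed.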

Depending on the method of their extraction, coactivation patterns may be referred to as assemblies, factor loadings, principal components, independent components, activity modes, eigenvectors, or coactivation maps [79–84]. Coactivation patterns could be interpreted as signatures of Hebbian cell assemblies, i.e. groups of tightly interconnected cells involved in a common computation [79, 82]. Coactivation patterns could also result from shared input from unobserved parts of the circuit, or global network fluctuations modulating the activity of the local circuit [32, 85].

However, an analysis of coactivation shifts the focus from detailed interactions to collective behavior. In our study, functional connectivity expressed solely through modes of coactivation was represented by the factor analysis-based estimator Cfactor.

In an effort to account for the joint activity patterns that are poorly explained by pairwise interactions, investigators have augmented models of pairwise interactions with additional factors such as latent variables, higher-order correlations, or global network fluctuations. In our study, we combined pairwise interactions with collective coactivations by applying recently developed numerical techniques for the inference of the partial correlation structure in systems with latent variables [54, 55]. The resulting estimator, Csparse+latent, effectively decomposed the functional connectivity into a sparse network of pairwise interactions and coactivation mode vectors.

5 C and D). The problem becomes worse as the number of recorded neurons increases until such models lose their statistical validity [90]. As techniques have improved to allow recording from larger neuronal populations, experimental neuroscientists have addressed this problem by extending the recording durations to keep sampling noise in check and verified that existing models are not overfitted [87]. However, ambitious projects already underway, such as the BRAIN initiative [50], aim to record from significantly larger populations. Simply increasing recording duration will be neither practical nor sufficient, and the problem must be addressed by using regularized estimators. Regularization biases the solution toward a small subspace in order to counteract the effects of sampling noise in the empirical data. However, biasing the solution toward an inappropriate subspace does not allow significant estimation improvement and hinders interpretation.

For example, Ganmor et al. [86] developed a heuristic rule to identify the most significant features that must be fitted by a maximum entropy model for improved performance in the retina. As another example of regularization, generalized linear models typically employ L1 penalty terms to constrain the solution space and to effectively reduce the dimensionality of the solution [67]. Our study demonstrates regularization schemes empirically optimized for specific types of neural data.

Various model selection criteria have been devised to select between families of models and the optimal subsets of variables in a given model family based on observed data. Despite its high computational demands, cross-validation is among the most popular model selection approaches due to its minimal assumptions about the data-generating process [91].

However, this does not limit the applicability of its conclusions to normal distributions. Other probabilistic models, fitted to the same data, could also serve as estimators of the covariance matrix. If a different model yields better estimation of the covariance matrix than the estimator proposed here, we believe that its structure deserves consideration as the better representation of the functional connectivity.

As we demonstrated by simulation, even models with incorrect forms of dependencies can substantially improve estimates (Fig. 1). Therefore, showing that a more constrained model has better cross-validated performance than a more complex model does not necessarily support the conclusion that it reveals a better representation of dependencies in the data. This caveat is related to Stein's paradox [92]: biasing an estimate toward an arbitrary low-dimensional target can consistently outperform a less constrained estimate.

We showed that, among several models, a sparse network of linear interactions with several latent inputs yielded the best estimates of the noise covariance matrix for cortical microcircuits. This finding is valuable in itself: improved estimates of the noise covariance matrix for large datasets are important for understanding the role of noise correlations in population coding [1, 6, 7, 9, 11].

Importantly, the inferred functional interactions were substantially different from the network of the highest sample correlations. For example, the Csparse+latent estimator reveals a large number of negative interactions that were not present in the sample correlation matrix (Fig. 5 F) and may reflect inhibitory circuitry.

7 A–F). These differences support the idea that correlations are built up from partial correlations in chains of intermediate cells positioned closer and tuned more similarly to one another, with potentially closer correspondence to anatomical connectivity. These differences may also be at least partially explained by a trivial effect of regularization: the L1 penalty applied by the estimator (Eq. 18) suppresses small partial correlations to a greater extent than large partial correlations, enhancing the apparent effect of distance and tuning. Still, the distinct positive and negative connectivity patterns (Fig. 7 G–I) may reflect geometric and graphical features of local excitatory and inhibitory networks. Indeed, the relationships between patterns of positive and negative connectivities inferred by the estimator resembled the properties of excitatory and inhibitory synaptic connectivities with respect to distance, cortical layers, and feature tuning [23, 78, 93–98]. For example, while excitatory neurons form synapses within highly specific local cliques [78], inhibitory interneurons form synapses with nearly all excitatory cells within local microcircuits [23, 96, 99]. To further investigate the link between synaptic connectivity and inferred functional connectivity, in future experiments we will use molecular markers for various cell types with follow-up multiple whole-cell in vitro recordings [23, 28] to directly compare the inferred functional connectivity graphs to the underlying anatomical circuitry. Finally, the latent units inferred by the estimator can be analyzed for their physiological functions. For example, these latent units may be modulated under different brain states.

Ethics statement. All procedures were conducted in accordance with the ethical guidelines of the National Institutes of Health and were approved by the Baylor College of Medicine IACUC.

For surgery, animals were initially anesthetized with isoflurane (3%). During the experiments, animals were sedated with a mixture of fentanyl (0.05 mg/kg), midazolam (5 mg/kg), and medetomidine (0.5 mg/kg), with boosts of half the initial dose every 3 hours. A craniotomy was performed over the right primary visual cortex. Membrane-permeant calcium indicator Oregon Green 488 BAPTA-1 AM (OGB-1, Invitrogen) was loaded by bolus injection. The craniotomy was sealed using a glass coverslip secured with dental cement.

All imaging was performed using 3D-RAMP two-photon microscopy [60]. First, a 3D stack was acquired and cells were manually segmented. Then calcium signals were collected by sampling in the center of each cell at rates of 100 Hz or higher, depending on the number of cells.

Two types of stimuli were presented for each imaging site. First, directional tuning was mapped using a pseudo-random sequence of drifting gratings at sixteen directions of motion, 500 ms per direction, without blanks, with 12–30 trials for each direction of motion. Second, to measure correlations, the stimulus was modified to include only two directions of motion (in 9 datasets) or five directions (in 22 datasets); the gratings were presented for 1 second and were separated by 1-second blanks, with 100–300 trials for each direction of motion.

All data were processed in MATLAB using the DataJoint data processing chain toolbox (http://datajoint.github.com).

The resulting traces were high-pass filtered above 0.1 Hz and downsampled to 20 Hz (Fig. 3 C). Then, the firing rates were estimated by nonnegative deconvolution [61]. Orientation tuning was computed by fitting the mean firing rates for each direction of motion $\phi$ with a two-peaked von Mises tuning function

$$f(\phi) = a + b \exp\left[\tfrac{1}{w}\left(\cos(\phi - \theta) - 1\right)\right] + c \exp\left[\tfrac{1}{w}\left(\cos(\phi - \theta + \pi) - 1\right)\right]$$

where $b \ge c$ are the amplitudes of the two respective peaks, $w$ is the tuning width, and $\theta$ is the preferred direction. The significance of the fit was determined by a permutation test: the direction labels were randomly permuted 10,000 times, and the p-value of the fit was computed as the fraction of permutations that yielded $R^2$ equal to or higher than that of the original data. Cells were considered tuned with p < 0.05.

For covariance estimation, the analysis was limited to the period with two or five stimulus conditions, which lasted between 14 and 27 minutes (mean 22 minutes). Cells that did not have substantial spiking activity (those whose variance was less than 1% of the median across the site) or whose activity was unstable (those whose variance in the least active quarter of the recording did not exceed 1% of the variance in the most active quarter) were excluded from the analysis.
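The tuning model and the permutation p-value described above can be sketched as follows. This is a minimal illustration, assuming a two-peaked von Mises form with a baseline term `a` (the baseline is an assumption, as is every function and variable name here); the fitting step itself is omitted, and the helper only turns a set of permutation $R^2$ values into a p-value.

```python
import numpy as np

def two_peak_von_mises(phi, a, b, c, w, theta):
    """Assumed tuning curve: baseline a plus peaks at theta and theta + pi.

    b and c are the peak amplitudes (b >= c), w the tuning width,
    theta the preferred direction; angles are in radians.
    """
    peak1 = b * np.exp((np.cos(phi - theta) - 1.0) / w)
    peak2 = c * np.exp((np.cos(phi - theta + np.pi) - 1.0) / w)
    return a + peak1 + peak2

def permutation_p_value(r2_observed, r2_permuted):
    """Fraction of permutations whose R^2 is equal to or higher than the observed R^2."""
    r2_permuted = np.asarray(r2_permuted, dtype=float)
    return float(np.mean(r2_permuted >= r2_observed))

# Sixteen directions of motion, as in the tuning stimulus.
directions = np.arange(16) * 2.0 * np.pi / 16
curve = two_peak_von_mises(directions, a=0.1, b=1.0, c=0.4, w=0.5, theta=0.0)

# Toy example: 2 of 4 permuted fits match or exceed the observed R^2.
p_value = permutation_p_value(0.8, [0.1, 0.9, 0.8, 0.2])
```

In practice the permuted $R^2$ values would come from refitting the curve after shuffling the direction labels, 10,000 times in the procedure described above.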

Each subset was then used as the testing sample with the rest of the data used as the training sample for estimating the covariance matrix. The average validation loss over the 10 folds was reported.

Hyperparameters were optimized by a two-phase search algorithm: a random search to find a good starting point, followed by a pattern search to find the global minimum. The inner cross-validation loop subdivided the training dataset from the outer loop to perform 10-fold cross-validation in order to evaluate each choice of the hyperparameter values. Thus the training dataset within the inner loop comprised 81% of the entire recording. S1 Fig. illustrates the dependence of the validation loss on the hyperparameters of the Csparse+latent estimator for the example site shown in Figs. 3 and 5 and the optimal value found by the pattern search algorithm.
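The 81% figure follows from nesting two 10-fold splits (0.9 × 0.9 = 0.81). The sketch below illustrates this bookkeeping only; it assumes contiguous folds over a hypothetical recording of 1000 time bins, which may differ from how the folds were actually drawn, and `kfold_indices` is a hypothetical helper.

```python
import numpy as np

def kfold_indices(indices, k):
    """Split an index array into k contiguous, nearly equal folds."""
    return np.array_split(np.asarray(indices), k)

n_bins = 1000                      # hypothetical number of time bins
outer_folds = kfold_indices(np.arange(n_bins), 10)

# Outer loop: hold out one fold for testing, train on the remaining nine.
outer_test = outer_folds[0]
outer_train = np.concatenate(outer_folds[1:])

# Inner loop: split the outer training set again to evaluate hyperparameters.
inner_folds = kfold_indices(outer_train, 10)
inner_validation = inner_folds[0]
inner_train = np.concatenate(inner_folds[1:])

inner_fraction = len(inner_train) / n_bins
```

Each hyperparameter setting is therefore fitted on about 81% of the recording and scored on about 9%, while the outer test fold stays untouched until the final evaluation.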

This approach was used to compute the covariance matrix estimates and their true loss in the simulation study (Fig. 1 Rows 4 and 5) and to analyze the partial correlation structure of the Csparse+latent estimator (Figs. 5–7).

Within the inner loop of cross-validation, regularized covariance matrix estimation required only the sample covariance matrix Csample of the training dataset and the hyperparameter values provided by the outer loop. In estimator Cdiag, the variances were shrunk linearly toward their mean value $\frac{1}{p}\,\mathrm{tr}(C_{\text{sample}})$.
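Linear shrinkage of variances toward their mean can be sketched in a few lines. This is an illustration under stated assumptions rather than the published implementation: the function name `shrink_variances` and the shrinkage intensity `alpha` in [0, 1] are hypothetical, and only the diagonal is modified here.

```python
import numpy as np

def shrink_variances(cov, alpha):
    """Shrink the diagonal of cov linearly toward its mean value tr(cov) / p.

    alpha = 0 leaves the variances unchanged; alpha = 1 makes them all equal.
    Off-diagonal elements are left as they are in this sketch.
    """
    cov = np.asarray(cov, dtype=float)
    p = cov.shape[0]
    mean_variance = np.trace(cov) / p
    shrunk = cov.copy()
    np.fill_diagonal(shrunk, (1.0 - alpha) * np.diag(cov) + alpha * mean_variance)
    return shrunk

c_sample = np.array([[4.0, 1.0],
                     [1.0, 2.0]])
c_half = shrink_variances(c_sample, 0.5)   # variances move halfway to their mean
c_full = shrink_variances(c_sample, 1.0)   # variances all equal to the mean
```

Because the target is the mean of the diagonal, the trace (total variance) is preserved for any value of `alpha`.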

In estimator Cfactor (Eq. 6), the low-rank matrix L and the diagonal matrix D were found by solving the minimization problem using an expectation-maximization (EM) algorithm for a specified rank of L. After that, the diagonal of D was linearly shrunk toward its mean diagonal value, similar to Eq. 13.

In estimator Csparse (Eq. 7), the sparse precision matrix S was found by minimizing the L1-penalized loss with regularization parameter $\lambda$, subject to the constraint $S \succ 0$ that S be a positive-definite matrix; the penalty term is $\lambda \lVert S \rVert_1$, where $\lVert S \rVert_1$ is the element-wise L1 norm of the matrix S. This problem formulation is known as graphical lasso [102, 103]. To solve this minimization problem, we adapted the alternating direction method of multipliers (ADMM) [55]. Unlike Cdiag and Cfactor, this estimator does not include linear shrinkage: the selection of the sparsity level provides sufficient flexibility to fine-tune the regularization level.

Estimator Csparse+latent (Eq. 8) estimates a larger sparse precision matrix $S^*$ of the joint distribution of the p observed neurons and d latent units,

$$S^* = \begin{pmatrix} S & S_{12} \\ S_{12}^{T} & S_{22} \end{pmatrix}$$

where the p × p partition S corresponds to the visible units. Then the covariance matrix of the observed population is

$$C = \left(S - S_{12} S_{22}^{-1} S_{12}^{T}\right)^{-1} = (S - L)^{-1}$$

Rather than finding $S_{12}$ and $S_{22}$ separately, L can be estimated as a low-rank positive semidefinite matrix. To simultaneously optimize the sparse component S and the low-rank component L, we adapted the loss function with an L1 penalty on S and another penalty on the trace of L [54, 55].
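For orientation, the penalized objective described in words above can be written out as follows. This is a sketch in assumed notation: $\mathcal{L}(\cdot,\cdot)$ stands for the loss function of Eq. 10, and the assignment of the hyperparameter $\alpha$ to the L1 term and $\beta$ to the trace term is an assumption based on the usual formulation in [54, 55], not a transcription of Eq. 18.

```latex
(\hat{S}, \hat{L}) \;=\; \arg\min_{S - L \,\succ\, 0,\; L \,\succeq\, 0}\;
  \mathcal{L}\!\left((S - L)^{-1},\, C_{\text{sample}}\right)
  \;+\; \alpha \,\lVert S \rVert_1
  \;+\; \beta \,\mathrm{tr}(L)
```

Under this reading, larger $\alpha$ yields a sparser S (fewer pairwise interactions) and larger $\beta$ yields a lower-rank L (fewer latent units).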

Since L is positive semidefinite, its trace equals the sum of its singular values, i.e. its nuclear norm; the penalty on tr(L) therefore favors solutions with few nonzero eigenvalues or, equivalently, low-rank solutions, while keeping the overall optimization problem convex [104, 105]. This allows convex optimization algorithms such as ADMM to be applied with great computational efficiency [55].
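The role of the low-rank component can be checked numerically. The sketch below builds a small joint precision matrix with hypothetical values (six visible units and two latent units, each latent unit coupled to three visible units), marginalizes out the latent units, and confirms that the visible covariance equals the inverse of S − L, where L is the Schur-complement term formed from the visible-latent and latent-latent blocks. All names and numbers are illustrative assumptions.

```python
import numpy as np

p, d = 6, 2                                  # visible and latent unit counts (toy sizes)
S = 2.0 * np.eye(p)                          # visible-visible precision block (diagonal here)
S12 = np.array([[0.5, 0.0],                  # visible-latent couplings: each latent unit
                [0.5, 0.0],                  # is coupled to three visible units
                [0.5, 0.5],
                [0.0, 0.5],
                [0.0, 0.5],
                [0.0, 0.0]])
S22 = np.eye(d)                              # latent-latent precision block

joint_precision = np.block([[S, S12],
                            [S12.T, S22]])

# Marginal covariance of the visible units: top-left block of the joint covariance.
cov_visible = np.linalg.inv(joint_precision)[:p, :p]

# Low-rank term absorbing the effect of the latent units.
L = S12 @ np.linalg.inv(S22) @ S12.T
cov_from_decomposition = np.linalg.inv(S - L)

rank_L = np.linalg.matrix_rank(L)
nuclear_norm_L = np.linalg.norm(L, "nuc")
```

Here L has rank d (one dimension per latent unit), and its trace coincides with its nuclear norm because L is positive semidefinite, which is why a trace penalty acts as a convex surrogate for a rank penalty.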

4) computed from Csparse+latent includes interactions between the visible and latent units and was used in Fig. 5 C and D, Fig. 6 C, and Fig. 7 D–F. The partial correlation matrix computed from S alone expresses the strengths of pairwise interactions and was used in Fig. 5 F, G, H. The MATLAB code for these computations is available online at http://github.com/atlab/cov-est.

All evaluations and optimization in this study were defined with respect to the covariance matrices. However, neuroscientists often estimate a common correlation matrix across multiple stimulus conditions when the variances of responses are conditioned on the stimulus [106, 107]. In this study, we too conditioned the variances on the stimulus but estimated a single correlation matrix across all conditions. Here we describe the computation of the validation loss (Eq. 10) when the variances were allowed to vary with the stimulus condition.

Let $T_c$ and $T'_c$ denote the sets of time bin indices for the training and testing samples, respectively, limited to condition c. Similar to Eq. 2, the training and testing sample covariance matrices for condition c are

$$C_{c,\text{sample}} = \frac{1}{n_c} \sum_{t \in T_c} \left(x(t) - \bar{x}_c\right)\left(x(t) - \bar{x}_c\right)^{T}$$

and

$$C'_{c,\text{sample}} = \frac{1}{n'_c} \sum_{t \in T'_c} \left(x(t) - \bar{x}_c\right)\left(x(t) - \bar{x}_c\right)^{T}$$

Here $n_c$ and $n'_c$ denote the sizes of $T_c$ and $T'_c$, respectively. Note that $\bar{x}_c = \frac{1}{n_c}\sum_{t \in T_c} x(t)$ is estimated from the training sample but used in both estimates, making $C'_{c,\text{sample}}$ an unbiased estimate of the true covariance matrix, $\Sigma$. As such, $C'_{c,\text{sample}}$ can be used for validation.

The common correlation matrix Rsample is estimated by averaging across conditions after normalizing by the condition-specific sample variances. Then Rsample is simply the covariance matrix of the z-score signal z(t), in which each response is centered and scaled using the mean and variance of its stimulus condition. For consistency with prior work, we applied regularization to covariance matrices rather than to correlation matrices. The common covariance matrix Csample was estimated by scaling Rsample by the variances averaged across conditions. Note that Csample differs from the sample covariance matrix computed without conditioning the variances on c, and this computation helps avoid any biases that would be introduced by ignoring changes in variance.

The covariance matrix estimators Cdiag, Cfactor, Csparse, or Csparse+latent convert Csample into its regularized counterpart, denoted here as Creg. To evaluate the estimators, we regularized the conditioned variances by linear shrinkage toward their mean value across all conditions. This was done by scaling Creg by the conditioned variance adjustment matrix $Q_c = \delta I + (1 - \delta) V^{-1} V_c$, where $V_c$ is the diagonal matrix of sample variances for condition c and $V$ is the diagonal matrix of their mean across conditions, to produce the conditioned regularized covariance matrix estimate $C_{c,\text{reg}} = Q_c^{1/2}\, C_{\text{reg}}\, Q_c^{1/2}$. The variance regularization parameter $\delta \in [0, 1]$ was optimized in the inner loop of cross-validation along with the other hyperparameters. The overall validation loss is obtained by averaging the validation losses across all conditions.
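The variance adjustment step can be illustrated with a short sketch. It assumes that the adjustment matrix is diagonal, with entries that interpolate between 1 (full shrinkage toward the mean variance) and the ratio of condition-specific to mean variance (no shrinkage), and that it is applied symmetrically through its square root; the function name and all values are hypothetical.

```python
import numpy as np

def condition_adjusted_covariance(c_reg, var_mean, var_cond, delta):
    """Rescale a regularized covariance so its variances follow one stimulus condition.

    c_reg    : regularized common covariance matrix (variances equal var_mean)
    var_mean : vector of variances averaged across conditions
    var_cond : vector of sample variances for the condition of interest
    delta    : shrinkage of conditioned variances toward var_mean, in [0, 1]
    """
    q = delta + (1.0 - delta) * np.asarray(var_cond) / np.asarray(var_mean)
    scale = np.sqrt(q)
    return c_reg * np.outer(scale, scale)    # equals Q^(1/2) C_reg Q^(1/2) for diagonal Q

c_reg = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
var_mean = np.diag(c_reg)
var_cond = np.array([3.0, 0.5])

c_no_shrink = condition_adjusted_covariance(c_reg, var_mean, var_cond, delta=0.0)
c_full_shrink = condition_adjusted_covariance(c_reg, var_mean, var_cond, delta=1.0)
```

With `delta = 0` the variances of the result match the condition-specific variances, with `delta = 1` the common estimate is returned unchanged, and in both cases the correlation structure is preserved.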

10) and the unbiased validation covariance matrix $C'_{c,\text{sample}}$, the loss function in Eq. 25 is an unbiased estimate of the true loss. Hence, it was used for evaluations reported in Fig. 4.

The covariance matrices were then subjected to the respective regularizations to produce the ground truth matrices for the simulation studies (Fig. 1 Row 2). Samples were then drawn from multivariate normal distribution models with the respective true covariance matrices to be estimated by each of the estimators. For Ising models, the negative inverse of the true covariance matrix was used as the matrix of coupling coefficients and the sampling was performed by the Metropolis-Hastings algorithm.
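A minimal sketch of Metropolis-Hastings sampling from a pairwise Ising model is given below for illustration. It assumes binary states in {0, 1}, an unnormalized log-probability of one half of the quadratic form x'Jx with J the coupling matrix, and single-unit flip proposals; these choices, the function name, and the toy covariance are assumptions and may differ from the simulation actually used.

```python
import numpy as np

def sample_ising(coupling, n_samples, n_sweeps=5, seed=0):
    """Draw binary samples with log-probability 0.5 * x' J x (up to a constant).

    Single-site flips are proposed symmetrically and accepted with the
    Metropolis probability min(1, p(proposal) / p(current)).
    """
    rng = np.random.default_rng(seed)
    p = coupling.shape[0]
    x = rng.integers(0, 2, size=p).astype(float)
    samples = np.empty((n_samples, p))
    for k in range(n_samples):
        for _ in range(n_sweeps * p):        # several sweeps between stored samples
            i = rng.integers(p)
            proposal = x.copy()
            proposal[i] = 1.0 - proposal[i]  # flip one unit
            log_ratio = 0.5 * (proposal @ coupling @ proposal - x @ coupling @ x)
            if np.log(rng.random()) < log_ratio:
                x = proposal
        samples[k] = x
    return samples

# Toy ground-truth covariance; couplings are its negative inverse, as described above.
true_cov = np.array([[1.0, 0.3, 0.0],
                     [0.3, 1.0, 0.3],
                     [0.0, 0.3, 1.0]])
coupling = -np.linalg.inv(true_cov)
samples = sample_ising(coupling, n_samples=200)
```

The covariance of such binary samples is not expected to equal `true_cov`; the point of the exercise is that the dependency structure is inherited from the same matrix while the data depart from the Gaussian assumption.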

A. Validation loss (Eq. 25) for the example site in Figs. 3 and 5 as a function of the hyperparameters α and β of the Csparse+latent estimator (Eq. 8 and Eq. 18). In all panels, the red cross marks the optimal value found by the pattern search algorithm described in Methods. B. The connectivity (1 − sparsity) of the sparse component S as a function of α and β for the example site. C. The number of latent units, i.e. the rank of the low-rank component L, as a function of the hyperparameters α and β. D. The loss function as a function of the connectivity and the number of latent units.

We thank Genevera Allen for a helpful discussion, and Eftychios Pnevmatikakis for helpful suggestions and feedback on the manuscript.

Performed the experiments: DY EF RJC. Analyzed the data: DY RJC. Wrote the paper: DY KJ ASE AST.


See all papers in *March 2015* that mention functional connectivity.

See all papers in *PLOS Comp. Biol.* that mention functional connectivity.

Back to top.

Appears in 9 sentences as: synaptic connections (3) synaptic connectivities (1) synaptic connectivity (5)

In *Improved Estimation and Interpretation of Correlations in Neural Circuits*

- Functional connectivity reflects local synaptic connections , shared inputs from other regions, and endogenous network activity.Page 2, “Introduction”
- In addition, noise correlations and correlations in spontaneous activity have been hypothesized to reflect aspects of synaptic connectivity [12].Page 2, “Introduction”
- Interest in neural correlations has been sustained by a series of discoveries of their nontrivial relationships to various aspects of circuit organization such as the physical distances between the neurons [13, 14], their synaptic connectivity [15], stimulus response similarity [3—5, 15—22] , cell types [23], cortical layer specificity [24, 25], progressive changes in development and in learning [26—28], changes due to sensory stimulation and global brain states [21, 29—33].Page 2, “Introduction”
- Such signatures are ambiguous as they can arise from network effects other than direct synaptic connections [66].Page 15, “Functional connectivity as a network of pairwise interactions”
- Although some investigators have interpreted such correlations as indicators of (chemical or electrical) synaptic connectivity , most used them as more general indicators of functional connectivity without relating them to underlying mechanisms.Page 16, “Functional connectivity as a network of pairwise interactions”
- Since neurons form synaptic connections mostly locally and sparsely [78] , we a priori favored solutions with sparse partial correlations.Page 16, “Functional connectivity as a network of pairwise interactions”
- Coactivation patterns and pairwise connectivity are not mutually exclusive since assemblies arise from patterns of synaptic connectivity .Page 17, “Functional connectivity as coactivations”
- Indeed, the relationships between patterns of positive and negative connectivities inferred by the estimator resembled the properties of excitatory and inhibitory synaptic connectivities with respect to distance, cortical layers, and feature tuning [23, 78, 93—98].Page 18, “Physiological interpretation and future directions”
- To further investigate the link between synaptic connectivity and inferred functional connectivity, in future experiments, we will use molecular markers for various cell types with followup multiple whole-cell in vitro recordings [23, 28] to directly compare the inferred functional connectivity graphs to the underlying anatomical circuitry.Page 18, “Physiological interpretation and future directions”

See all papers in *March 2015* that mention synaptic connectivity.

See all papers in *PLOS Comp. Biol.* that mention synaptic connectivity.

Back to top.

Appears in 8 sentences as: correlation coefficient (4) correlation coefficients (4)

In *Improved Estimation and Interpretation of Correlations in Neural Circuits*

- For example, the eigenvalue decomposition of the covariance matrix expresses shared correlated activity components across the population; common fluctuations of population activity may be accurately represented by only a few eigenvec-tors that affect all correlation coefficients .Page 2, “Introduction”
- The partial correlation coefficient between two neurons reflects their linear association conditioned on the activity of all the other recorded cells [40].Page 2, “Introduction”
- In our data, the sample correlation coefficients were largely positive and low.Page 4, “Introduction”
- Where diag(C) denotes the diagonal matrix With the diagonal elements from C. The partial correlation between a pair of variables is the Pearson correlation coefficient of the residuals of the linear least-squares predictor of their activity based on all the other variables, excluding the pair [40, 51].Page 4, “Covariance estimation”
- The sample correlation coefficients were largely positive and low (Fig.Page 9, “The Csparse+latent estimator is most efficient in neural data”
- The average value of the correlation coefficient across sites ranged from 0.0065 to 0.051 with the mean across sites of 0.018.Page 9, “The Csparse+latent estimator is most efficient in neural data”
- F. Histogram of noise correlation coefficients in one site.Page 10, “The Csparse+latent estimator is most efficient in neural data”
- The red line indicates the mean correlation coefficient of 0.020.Page 10, “The Csparse+latent estimator is most efficient in neural data”

See all papers in *March 2015* that mention correlation coefficient.

See all papers in *PLOS Comp. Biol.* that mention correlation coefficient.

Back to top.

Appears in 7 sentences as: firing rates (9)

In *Improved Estimation and Interpretation of Correlations in Neural Circuits*

- Where the p X 1 vector x is a single observation of the firing rates of p neurons in a time bin of some duration, denotes expectation, and [,4 is the vector of expected firing rates .Page 4, “Covariance estimation”
- firing rates in time bin t, and an independent estimate of the mean firing rates 5c, the sample covariance matrix,Page 4, “Covariance estimation”
- Both models are maximum-entropy models constrained to match the mean firing rates and the covariance matriX [57].Page 8, “an”
- The instantaneous firing rates were inferred using sparse nonnegative deconvolution [61] (Fig.Page 9, “The Csparse+latent estimator is most efficient in neural data”
- The measured fluorescent traces were deconvolved to reconstruct the firing rates for each neuron: First, the first principal component was subtracted from the raw traces in order to reduce common mode noise related to small cardiovascular movements [60].Page 19, “Data processing”
- Then, the firing rates were estimated using by nonnegative deconvolution [61].Page 19, “Data processing”
- Orientation tuning was computed by fitting the mean firing rates for each direction of mo-1 c eXp (cos(¢ — 9 —|— 7t) — 1)] where b 2 c are the amplitudes of the two respective peaks, w is the tuning width, and 9 is the preferred direction.Page 19, “Data processing”

See all papers in *March 2015* that mention firing rates.

See all papers in *PLOS Comp. Biol.* that mention firing rates.

Back to top.

Appears in 6 sentences as: normal distribution (2) normal distributions (3) normally distributed (1)

In *Improved Estimation and Interpretation of Correlations in Neural Circuits*

- Here, in the role of the loss function we adopted the Kullback-Leibler divergence between multivariate normal distributions with equal means, scaled by; to make its values comparable across different population sizes:Page 7, “Simulation”
- Assuming that the population response follows a multivariate normal distribution , the conditional dependencies between pairs of neurons are expressed by the partial correlations between them.Page 15, “Functional connectivity as a network of pairwise interactions”
- We evaluated the covariance matriX estimators using a loss function derived from the normal distribution .Page 18, “Model selection”
- However, this does not limit the applicability of its conclusions to normal distributions .Page 18, “Model selection”
- For simulation, ground truth covariance matrices were produced by taking 150 independent samples from an artificial population of 50 independent, identically normally distributed units.Page 23, “Simulation”
- Samples were then drawn from multivariate normal distributions models with the respective true covariance matrices to be estimated by each of the estimators.Page 23, “Simulation”

See all papers in *March 2015* that mention normal distributions.

See all papers in *PLOS Comp. Biol.* that mention normal distributions.

Back to top.

Appears in 6 sentences as: t-test (6)

In *Improved Estimation and Interpretation of Correlations in Neural Circuits*

- 7 A and D. p < 10'9 in each of the five sites, two-sample t-test of the difference of the linear regression coefficients in normalized data).Page 14, “Relationship of Csparse+latent to orientation tuning and physical distances”
- Positive connectivity decreased with Aori (p < 0.005 in each of the five sites, t-test on the logistic regression coefficient) whereas negative connectivity did not decrease (Fig.Page 15, “Relationship of Csparse+latent to orientation tuning and physical distances”
- 7 G): The slope in the logistic model of connectivity with respect to Aori was significantly higher for positive than for negative interactions (p < 0.04 in each of the five sites, two-sample t-test of the difference of the logistic regression coefficient).Page 15, “Relationship of Csparse+latent to orientation tuning and physical distances”
- 7 B, C, E, F. p < 10—6 in each of the five sites, for both lateral and vertical distances, two-sample t-test of the difference of the linear regression coefficients in normalized data).Page 15, “Relationship of Csparse+latent to orientation tuning and physical distances”
- Positive connectivity decayed with distance (p < 10—6 in each of the five sites for positive interactions, t-test on the logistic regression coefficient in normalized data) (Fig.Page 15, “Relationship of Csparse+latent to orientation tuning and physical distances”
- 7 H and I): The slope in the respective logistic models with respect to the lateral distance was significantly higher for positive than for negative connectivities (p < 0.05 in each of the five sites, two-sample t-test of the difference of the logistic regression coefficients).Page 15, “Relationship of Csparse+latent to orientation tuning and physical distances”

See all papers in *March 2015* that mention t-test.

See all papers in *PLOS Comp. Biol.* that mention t-test.

Back to top.

Appears in 5 sentences as: logistic regression (5)

In *Improved Estimation and Interpretation of Correlations in Neural Circuits*

- Positive connectivity decreased with Aori (p < 0.005 in each of the five sites, t-test on the logistic regression coefficient) whereas negative connectivity did not decrease (Fig.Page 15, “Relationship of Csparse+latent to orientation tuning and physical distances”
- 7 G): The slope in the logistic model of connectivity with respect to Aori was significantly higher for positive than for negative interactions (p < 0.04 in each of the five sites, two-sample t-test of the difference of the logistic regression coefficient).Page 15, “Relationship of Csparse+latent to orientation tuning and physical distances”
- Positive connectivity decayed with distance (p < 10—6 in each of the five sites for positive interactions, t-test on the logistic regression coefficient in normalized data) (Fig.Page 15, “Relationship of Csparse+latent to orientation tuning and physical distances”
- Although the positive connectivity appeared to decay faster with vertical than with lateral distance, the differences in slopes of the respective logistic regression models were not significant with available data.Page 15, “Relationship of Csparse+latent to orientation tuning and physical distances”
- 7 H and I): The slope in the respective logistic models with respect to the lateral distance was significantly higher for positive than for negative connectivities (p < 0.05 in each of the five sites, two-sample t-test of the difference of the logistic regression coefficients).Page 15, “Relationship of Csparse+latent to orientation tuning and physical distances”

See all papers in *March 2015* that mention logistic regression.

See all papers in *PLOS Comp. Biol.* that mention logistic regression.

Back to top.

Appears in 4 sentences as: cross-validated (4)

In *Improved Estimation and Interpretation of Correlations in Neural Circuits*

- We then performed a cross-validated evaluation to establish which of the four regularized estimators was most efficient for representing the population activity of dense groups of neurons in mouse primary visual cortex recorded with high-speed 3D random-access two-photon imaging of calcium signals. Page 4, “Introduction”
- To demonstrate that estimator rankings were robust to deviations from Gaussian models, we repeated the same cross-validated evaluation using pairwise Ising models to generate the data. Page 8, “an”
- This simulation study demonstrated that cross-validated evaluation of regularized estimators of the covariance matrices of population activity can discriminate between structures of dependencies in the population. Page 9, “an”
- Therefore, showing that a more constrained model has better cross-validated performance than a more complex model does not necessarily support the conclusion that it reveals a better representation of dependencies in the data. Page 18, “Model selection”
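
To make the idea of cross-validated estimator comparison concrete, here is a minimal sketch assuming a held-out Gaussian negative log-likelihood as the validation loss and K-fold splitting. It is not the paper's procedure. It compares only the sample covariance with a simple diagonal-shrinkage estimator, and the ground-truth matrix, the shrinkage weight `alpha`, and all function names are hypothetical.

```python
import numpy as np

def heldout_loss(c_hat, x_test, mu_train):
    """Average Gaussian negative log-likelihood on held-out data (constant dropped)."""
    d = x_test - mu_train
    s_test = d.T @ d / d.shape[0]
    _, logdet = np.linalg.slogdet(c_hat)
    return 0.5 * (logdet + np.trace(np.linalg.solve(c_hat, s_test)))

def sample_cov(x):
    return np.cov(x, rowvar=False)

def shrunk_cov(x, alpha=0.3):
    # Shrink the off-diagonal terms toward zero; alpha is fixed here for brevity.
    c = np.cov(x, rowvar=False)
    return (1.0 - alpha) * c + alpha * np.diag(np.diag(c))

# Synthetic population (assumption): 40 "neurons", 60 time bins.
rng = np.random.default_rng(2)
p, n, k = 40, 60, 5
a = rng.normal(size=(p, 2))
sigma = a @ a.T + np.eye(p)          # hypothetical ground-truth covariance
x = rng.multivariate_normal(np.zeros(p), sigma, size=n)

folds = np.array_split(rng.permutation(n), k)
cv_loss = {}
for name, estimator in (("sample", sample_cov), ("shrunk", shrunk_cov)):
    fold_losses = []
    for test_idx in folds:
        train = np.delete(x, test_idx, axis=0)
        c_hat = estimator(train)
        fold_losses.append(heldout_loss(c_hat, x[test_idx], train.mean(axis=0)))
    cv_loss[name] = float(np.mean(fold_losses))

print(cv_loss)
```

With few samples relative to the number of variables, the regularized estimate typically validates better than the sample covariance. As the last sentence in the list cautions, a better validation score alone does not show that the estimator's structure reflects the true dependencies.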


Appears in 3 sentences as: Gaussian distribution (1) Gaussian distributions (2)

In *Improved Estimation and Interpretation of Correlations in Neural Circuits*

- For multivariate Gaussian distributions, zero partial correlations indicate conditional independence of the pair, implying a lack of direct interaction [40, 52]. Page 5, “Covariance estimation”
- We used these covariance matrices as the ground truth in multivariate Gaussian distributions with zero means and drew samples of various sizes. Page 7, “Simulation”
- Conveniently, the Ising model has equivalent mathematical form to the Gaussian distribution, but the Ising model is defined on the multivariate binary domain rather than the continuous domain. Page 8, “an”


Appears in 3 sentences as: sample size (1) sample sizes (2)

In *Improved Estimation and Interpretation of Correlations in Neural Circuits*

- An estimator that produces estimates that are, on average, closer to the truth for a given sample size is said to be more efficient than other estimators. Page 3, “Introduction”
- We drew 30 independent samples with sample sizes n = 250, 500, 1000, 2000, and 4000 from each model and computed the loss ℓ(Ĉ, Σ) for each of the five estimators. Page 8, “an”
- With increasing sample sizes, all estimators converged to the ground truth (zero loss) but the estimators with correct structure outperformed the others even for large samples. Page 8, “an”
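
The simulation design quoted above (30 draws at each of five sample sizes, with a loss measured against a known ground truth) can be sketched as follows. The loss used here is the Kullback-Leibler divergence between zero-mean Gaussians, which is an assumption and may differ from the paper's loss ℓ(Ĉ, Σ). The ground-truth matrix and the name `kl_loss` are hypothetical, and only the sample covariance estimator is evaluated.

```python
import numpy as np

def kl_loss(c_hat, sigma):
    """KL divergence from N(0, sigma) to N(0, c_hat); zero when c_hat equals sigma."""
    p = sigma.shape[0]
    _, logdet_c = np.linalg.slogdet(c_hat)
    _, logdet_s = np.linalg.slogdet(sigma)
    return 0.5 * (np.trace(np.linalg.solve(c_hat, sigma)) - p + logdet_c - logdet_s)

rng = np.random.default_rng(1)
p = 20
a = rng.normal(size=(p, 3))
sigma = a @ a.T + np.eye(p)   # hypothetical ground truth: low-rank plus identity

mean_loss = {}
for n in (250, 500, 1000, 2000, 4000):
    losses = []
    for _ in range(30):                      # 30 independent samples per size
        x = rng.multivariate_normal(np.zeros(p), sigma, size=n)
        c_hat = np.cov(x, rowvar=False)      # sample covariance estimate
        losses.append(kl_loss(c_hat, sigma))
    mean_loss[n] = float(np.mean(losses))

for n, loss in mean_loss.items():
    print(n, round(loss, 4))
```

The mean loss shrinks toward zero as n grows, which is the convergence behavior the last sentence describes. Ranking estimators would repeat the inner loop once per estimator on the same draws.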


Appears in 3 sentences as: simulation studies (1) simulation study (2)

In *Improved Estimation and Interpretation of Correlations in Neural Circuits*

- This simulation study demonstrated that cross-validated evaluation of regularized estimators of the covariance matrices of population activity can discriminate between structures of dependencies in the population. Page 9, “an”
- This approach was used to compute the covariance matrix estimates and their true loss in the simulation study (Fig. Page 20, “Cross-validation”
- The covariance matrices were then subjected to the respective regularizations to produce the ground truth matrices for the simulation studies (Fig. Page 23, “Simulation”
