# corrcoef

Correlation coefficients

## Syntax

`R = corrcoef(X)`

`R = corrcoef(x,y)`

`[R,P]=corrcoef(...)`

`[R,P,RLO,RUP]=corrcoef(...)`

`[...]=corrcoef(...,'param1',val1,'param2',val2,...)`

## Description

`R = corrcoef(X)` returns a matrix `R` of correlation coefficients calculated from an input matrix `X` whose rows are observations and whose columns are variables. The matrix `R = corrcoef(X)` is related to the covariance matrix `C = cov(X)` by

$R\left(i,j\right)=\frac{C\left(i,j\right)}{\sqrt{C\left(i,i\right)C\left(j,j\right)}}.$
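The relation above can be checked numerically. The following is a minimal sketch on made-up random data; the variable names `s`, `Rmanual`, and `maxDiff` are illustrative and not part of `corrcoef`.

```matlab
% Hypothetical example data: 50 observations (rows) of 3 variables (columns).
X = randn(50,3);
X(:,3) = X(:,1) + 0.5*X(:,2);    % make column 3 depend on columns 1 and 2

C = cov(X);                      % covariance matrix
s = sqrt(diag(C));               % column vector of sqrt(C(i,i))
Rmanual = C ./ (s * s');         % R(i,j) = C(i,j) / sqrt(C(i,i)*C(j,j))

R = corrcoef(X);
maxDiff = max(abs(R(:) - Rmanual(:)));   % expected to be near machine precision
```

The outer product `s * s'` builds the matrix of denominators in one step, so no loop over `i` and `j` is needed.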

`corrcoef(X)` is the zeroth lag of the normalized covariance function, that is, the zeroth lag of `xcov(x,'coeff')` packed into a square array.

`R = corrcoef(x,y)`, where `x` and `y` are column vectors, is the same as `corrcoef([x y])`. If `x` and `y` are not column vectors, `corrcoef` converts them to column vectors, so `R=corrcoef(x,y)` is equivalent to `R=corrcoef([x(:) y(:)])`.
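As an illustration of the conversion to column vectors, the sketch below passes two matrices of the same size and compares the result with the explicit `(:)` form. The data and the names `R1`, `R2`, and `sameResult` are hypothetical.

```matlab
% Hypothetical 4-by-5 inputs; neither is a column vector.
x = rand(4,5);
y = x + 0.1*randn(4,5);

R1 = corrcoef(x,y);              % inputs are treated as x(:) and y(:)
R2 = corrcoef([x(:) y(:)]);      % explicit two-column form

% Both calls should give the same 2-by-2 matrix.
sameResult = isequal(size(R1),[2 2]) && max(abs(R1(:) - R2(:))) < 1e-12;
```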

`[R,P]=corrcoef(...)` also returns `P`, a matrix of p-values for testing the hypothesis of no correlation. Each p-value is the probability of getting a correlation at least as large in magnitude as the observed value by random chance, when the true correlation is zero. If `P(i,j)` is small, say less than `0.05`, then the correlation `R(i,j)` is significantly different from zero.

`[R,P,RLO,RUP]=corrcoef(...)` also returns matrices `RLO` and `RUP`, of the same size as `R`, containing lower and upper bounds for a 95% confidence interval for each coefficient.
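A short sketch of the four-output form on made-up data follows. It only checks that the bounds have the same size as `R` and that each off-diagonal coefficient lies inside its interval; the names `sameSize`, `offDiag`, and `bracketed` are illustrative.

```matlab
% Hypothetical data: 100 observations of 3 variables.
X = randn(100,3);
X(:,2) = X(:,1) + randn(100,1);   % correlate columns 1 and 2

[R,P,RLO,RUP] = corrcoef(X);

% The bound matrices have the same size as R.
sameSize = isequal(size(RLO),size(R)) && isequal(size(RUP),size(R));

% Each off-diagonal coefficient should lie between its lower and upper bound.
offDiag = ~eye(3);
bracketed = all(RLO(offDiag) <= R(offDiag) & R(offDiag) <= RUP(offDiag));
```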

`[...]=corrcoef(...,'param1',val1,'param2',val2,...)` specifies additional parameters and their values. Valid parameters are the following.

| Parameter | Value |
| --- | --- |
| `'alpha'` | A number between 0 and 1 to specify a confidence level of 100*(1 - `alpha`)%. Default is 0.05 for 95% confidence intervals. |
| `'rows'` | Either `'all'` (default) to use all rows, `'complete'` to use rows with no `NaN` values, or `'pairwise'` to compute `R(i,j)` using rows with no `NaN` values in either column `i` or `j`. |
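The sketch below shows one way these parameters might be used on made-up data with a few missing values. The data and all variable names other than the `corrcoef` parameters are hypothetical.

```matlab
% Hypothetical data with two missing values.
X = randn(40,3);
X(:,3) = X(:,1) + randn(40,1);
X(5,2) = NaN;
X(12,3) = NaN;

% 'complete' drops rows 5 and 12 for every coefficient.
[Rc,Pc,RLO95,RUP95] = corrcoef(X,'rows','complete');

% 'pairwise' drops a row only for the coefficients that involve its NaN.
Rp = corrcoef(X,'rows','pairwise');

% Rp(1,2) should match a direct computation on the rows valid for columns 1 and 2.
ok12 = ~any(isnan(X(:,[1 2])),2);
R12 = corrcoef(X(ok12,[1 2]));
pairDiff = abs(Rp(1,2) - R12(1,2));

% 'alpha',0.01 requests 99% intervals, which are wider than the default 95% ones.
[~,~,RLO99,RUP99] = corrcoef(X,'rows','complete','alpha',0.01);
width95 = RUP95(1,3) - RLO95(1,3);
width99 = RUP99(1,3) - RLO99(1,3);
```

With the default `'rows','all'`, the coefficients that involve a column containing `NaN` would be expected to come out as `NaN`, which is why one of the other two options is usually preferable for data with gaps.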

The p-value is computed by transforming the correlation to create a t statistic having `n`-2 degrees of freedom, where `n` is the number of rows of `X`. The confidence bounds are based on an asymptotic normal distribution of `0.5*log((1+R)/(1-R))`, with an approximate variance equal to `1/(n-3)`. These bounds are accurate for large samples when `X` has a multivariate normal distribution. The `'pairwise'` option can produce an `R` matrix that is not positive definite.
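The calculation described above can be sketched by hand for a single pair of variables. This is an illustration under stated assumptions, not the function's actual source: it assumes the usual t statistic `r*sqrt((n-2)/(1-r^2))` and a two-sided test, and it uses only base MATLAB functions (`betainc` for the t distribution, `erfcinv` for the normal quantile). The data and names such as `pManual` and `rloManual` are hypothetical.

```matlab
% Hypothetical data: two correlated column vectors.
n = 30;
x = randn(n,1);
y = 0.6*x + randn(n,1);

[R,P,RLO,RUP] = corrcoef(x,y);
r = R(1,2);

% t statistic with n-2 degrees of freedom (assumed form).
df = n - 2;
t = r*sqrt(df/(1 - r^2));

% Two-sided p-value of the t distribution via the regularized incomplete beta function.
pManual = betainc(df/(df + t^2), df/2, 0.5);

% Confidence bounds from 0.5*log((1+r)/(1-r)) with variance 1/(n-3).
alpha = 0.05;
z = 0.5*log((1 + r)/(1 - r));
zcrit = sqrt(2)*erfcinv(alpha);        % standard normal quantile at 1 - alpha/2
halfWidth = zcrit/sqrt(n - 3);
rloManual = tanh(z - halfWidth);       % transform back to the correlation scale
rupManual = tanh(z + halfWidth);
```

If the assumptions hold, `pManual`, `rloManual`, and `rupManual` should agree with `P(1,2)`, `RLO(1,2)`, and `RUP(1,2)` up to rounding error.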

## Examples

Generate random data having correlation between column 4 and the other columns.

```
x = randn(30,4);       % Uncorrelated data
x(:,4) = sum(x,2);     % Introduce correlation.
[r,p] = corrcoef(x)    % Compute sample correlation and p-values.
[i,j] = find(p<0.05);  % Find significant correlations.
[i,j]                  % Display their (row,col) indices.

r =
    1.0000   -0.3566    0.1929    0.3457
   -0.3566    1.0000   -0.1429    0.4461
    0.1929   -0.1429    1.0000    0.5183
    0.3457    0.4461    0.5183    1.0000

p =
    1.0000    0.0531    0.3072    0.0613
    0.0531    1.0000    0.4511    0.0135
    0.3072    0.4511    1.0000    0.0033
    0.0613    0.0135    0.0033    1.0000

ans =
     4     2
     4     3
     2     4
     3     4
```