Covariance in Python
Covariance is a measure of how two variables change together. A positive covariance means the variables tend to increase or decrease together, while a negative covariance means one tends to increase as the other decreases.
In Python, the covariance between two NumPy arrays can be calculated with the numpy.cov(a1, a2) function.
Here, a1 expresses the set of values of the first variable, and a2 expresses the set of values of the second variable.
The numpy.cov() function returns a 2D array. The value at index [0][0] is the covariance between a1 and a1 (that is, the variance of a1), and the value at index [0][1] is the covariance between a1 and a2.
The value at index [1][0] is the covariance between a2 and a1, and the value at index [1][1] is the covariance between a2 and a2 (the variance of a2).
Example:
import numpy as np
array1 = np.array([3,1,4])
array2 = np.array([5,1,3])
covariance = np.cov(array1, array2)[0][1]
print(covariance)
Output:
2.0
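To see where this value comes from, the sketch below recomputes it by hand, assuming the usual sample covariance formula (sum of the products of deviations from the mean, divided by n - 1), which is the normalization numpy.cov() uses by default. The names dev1, dev2, and manual_cov are illustrative only.

```python
import numpy as np

array1 = np.array([3, 1, 4])
array2 = np.array([5, 1, 3])

# Deviation of each value from the mean of its own variable
dev1 = array1 - array1.mean()
dev2 = array2 - array2.mean()

# Sample covariance: sum of the products of deviations, divided by (n - 1)
manual_cov = (dev1 * dev2).sum() / (len(array1) - 1)

print(manual_cov)
# Compare against the off-diagonal entry returned by numpy.cov()
print(np.isclose(manual_cov, np.cov(array1, array2)[0][1]))
```

The comparison uses np.isclose() rather than == because the two results may differ by floating-point rounding.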
Next, we take two vectors of equal length, where one tends to decrease as the other increases.
The cov() function returns the square covariance matrix, and the covariance between the two variables is accessed at index [0,1].
Example:
import numpy as np
from numpy import cov
x = np.array([2,4,1,6,2,1,7,5,8])
print(x)
y = np.array([7,2,3,1,8,6,2,3,4])
print(y)
Sigma = cov(x,y)[0,1]
print(Sigma)
Output:
[2 4 1 6 2 1 7 5 8]
[7 2 3 1 8 6 2 3 4]
-3.75
Covariance Matrix
A covariance matrix is a square, symmetric matrix that describes the covariance between each pair of two or more variables.
The covariance matrix element Cij is the covariance of xi and xj. The element Cii is the variance of xi.
- If COV(xi, xj) = 0, the variables are uncorrelated.
- If COV(xi, xj) > 0, the variables are positively correlated.
- If COV(xi, xj) < 0, the variables are negatively correlated.
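The three cases above can be illustrated with small hand-made arrays. This is only a sketch: the arrays rising, falling, and squared are made-up data chosen so that the sign of the covariance is easy to predict. Note that a covariance of zero only rules out a linear relationship; squared depends entirely on t, yet its covariance with t is zero because the relationship is symmetric around the mean.

```python
import numpy as np

t = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

rising = 2.0 * t + 1.0    # increases as t increases
falling = -3.0 * t + 5.0  # decreases as t increases
squared = t ** 2          # depends on t, but not linearly

cov_pos = np.cov(t, rising)[0, 1]
cov_neg = np.cov(t, falling)[0, 1]
cov_zero = np.cov(t, squared)[0, 1]

print("positive:", cov_pos)
print("negative:", cov_neg)
print("zero:", cov_zero)
```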
Syntax:
numpy.cov(m, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None)
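By default, numpy.cov() divides by N - 1 (the sample covariance). The ddof argument changes the divisor to N - ddof, so ddof=0 gives the population covariance. A minimal sketch using the two vectors from the earlier example; the variable names sample_cov and population_cov are illustrative:

```python
import numpy as np

x = np.array([2, 4, 1, 6, 2, 1, 7, 5, 8])
y = np.array([7, 2, 3, 1, 8, 6, 2, 3, 4])

# Default normalization: divide by N - 1
sample_cov = np.cov(x, y)[0, 1]

# ddof=0: divide by N instead
population_cov = np.cov(x, y, ddof=0)[0, 1]

print(sample_cov)
print(population_cov)
```

With 9 observations, the population value is the sample value scaled by 8/9, so it is slightly smaller in magnitude.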
Example 1:
import numpy as np
x = np.array([[3, 1, 4], [7, 3, 8], [1, 5, 7]])
print("Array shape:\n", np.shape(x))
print("Covariance matrix of x:\n", np.cov(x))
Output:
Array shape:
 (3, 3)
Covariance matrix of x:
 [[2.33333333 4.         0.66666667]
 [4.         7.         0.        ]
 [0.66666667 0.         9.33333333]]
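Two properties described earlier can be checked on this matrix: the diagonal elements Cii are the variances of the individual variables (here, the rows of x), and the matrix is symmetric. The sketch below assumes the default numpy.cov() behavior, in which each row is a variable and the divisor is N - 1, so np.var() is called with ddof=1 to match.

```python
import numpy as np

x = np.array([[3, 1, 4], [7, 3, 8], [1, 5, 7]])
cov_matrix = np.cov(x)

# Sample variance of each row; ddof=1 matches the default divisor of np.cov
row_variances = np.var(x, axis=1, ddof=1)

# Diagonal entries should equal the per-row variances
print(np.allclose(np.diag(cov_matrix), row_variances))

# The matrix should equal its own transpose
print(np.allclose(cov_matrix, cov_matrix.T))
```

Both checks should print True.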
Example 2:
import numpy as np
x = [1.21, 2.33, 3.21, 4.56]
y = [2.87, 2.78, 3.43, 3.65]
cov_mat = np.stack((x, y), axis = 0)
print(np.cov(cov_mat))
Output:
[[2.00389167 0.536775  ]
 [0.536775   0.179825  ]]
Example 3:
import numpy as np
x = [1.21, 2.33, 3.21, 4.56]
y = [2.87, 2.78, 3.43, 3.65]
cov_mat = np.stack((x, y), axis = 1)
print("matrix x and y:", np.shape(cov_mat))
print("covariance matrix:", np.shape(np.cov(cov_mat)))
print(np.cov(cov_mat))
Output:
matrix x and y: (4, 2)
covariance matrix: (4, 4)
[[ 1.3778   0.3735   0.1826  -0.7553 ]
 [ 0.3735   0.10125  0.0495  -0.20475]
 [ 0.1826   0.0495   0.0242  -0.1001 ]
 [-0.7553  -0.20475 -0.1001   0.41405]]
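The 4x4 result in Example 3 appears because numpy.cov() treats each row as a variable by default, so stacking with axis=1 turns the four observations into four "variables" of two values each. If the data is laid out with one observation per row, passing rowvar=False tells numpy.cov() to treat the columns as variables instead. A short sketch (the names observations, cov_by_columns, and cov_by_rows are illustrative):

```python
import numpy as np

x = [1.21, 2.33, 3.21, 4.56]
y = [2.87, 2.78, 3.43, 3.65]

# Shape (4, 2): each row is one observation, each column is one variable
observations = np.stack((x, y), axis=1)

# rowvar=False: columns are the variables, so the result is 2x2
cov_by_columns = np.cov(observations, rowvar=False)

# Same data with one variable per row, as in Example 2
cov_by_rows = np.cov(np.stack((x, y), axis=0))

print(cov_by_columns.shape)
print(np.allclose(cov_by_columns, cov_by_rows))
```

This gives the same 2x2 matrix as Example 2 without having to transpose the data first.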
Straight Covariance Matrix
The covariance matrix provides a tool for separating the structured relationships in a matrix of random variables. It can be used to decorrelate variables or applied as a transform to other variables. It is a key element of PCA (principal component analysis).
Now, we take two 9-element vectors as an example and calculate their full covariance matrix.
Example:
import numpy as np
from numpy import cov
x = np.array([2,4,1,6,2,1,7,5,8])
print(x)
y = np.array([7,2,3,1,8,6,2,3,4])
print(y)
Sigma = cov(x,y)
print(Sigma)
Output:
[2 4 1 6 2 1 7 5 8]
[7 2 3 1 8 6 2 3 4]
[[ 7.   -3.75]
 [-3.75  6.  ]]
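The decorrelation use mentioned at the start of this section can be sketched with the same two vectors. This is one possible approach, not a full PCA implementation: it assumes decorrelation is done by projecting the centered data onto the eigenvectors of the covariance matrix, computed here with np.linalg.eigh(). The names data, centered, and projected are illustrative.

```python
import numpy as np

x = np.array([2, 4, 1, 6, 2, 1, 7, 5, 8], dtype=float)
y = np.array([7, 2, 3, 1, 8, 6, 2, 3, 4], dtype=float)

# Shape (2, 9): one row per variable
data = np.stack((x, y), axis=0)
sigma = np.cov(data)

# eigh is intended for symmetric matrices such as a covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(sigma)

# Center each variable, then project onto the eigenvectors
centered = data - data.mean(axis=1, keepdims=True)
projected = eigenvectors.T @ centered

# The off-diagonal entries of the new covariance matrix should be close to 0
projected_cov = np.cov(projected)
print(projected_cov)
```

After the projection, the off-diagonal covariance is approximately zero and the diagonal holds the eigenvalues of the original covariance matrix, which is the sense in which the transformed variables are decorrelated.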
Conclusion
In this article, we learned how to calculate covariance in Python, which measures how two or more variables vary together.