
How to calculate the singular values in collintest to detect multicollinearity

84 views (last 30 days)
NAFISA SIDDIQUI on 29 Jun 2024 at 1:59
Commented: Paul on 11 Jul 2024 at 15:07
I am a bit confused about how to find the singular values, and therefore the condition indices, used to detect multicollinearity in multiple linear regression analysis. Some mathematicians say the singular values are the square roots of the eigenvalues of the correlation matrix of the model's predictors, while others say to use the covariance matrix instead. Still other publications say the singular values are the square roots of the eigenvalues of the square matrix X'X of the multiple linear regression model. I tried all three methods, but when I cross-checked against the MATLAB results from collintest, they did not match any of my calculations, and it does not explain how the output is obtained. Can someone explain it to me?
  5 comments
Umar on 30 Jun 2024 at 17:25
Hi Nafisa, glad to help. To answer your question about the Belsley collinearity diagnostics: when you compare the singular values obtained from SVD (Singular Value Decomposition) with those from the Belsley collinearity diagnostics in MATLAB, differences may arise from the nature of the two methods. SVD directly computes the singular values of a matrix, while collinearity diagnostics such as collintest in MATLAB focus on assessing multicollinearity in regression models rather than on computing singular values for their own sake. It is essential to understand the purpose and methodology of each approach to interpret the results correctly. If you only need singular values, SVD is the appropriate method; collinearity diagnostics are better suited to assessing multicollinearity in regression analysis.
Hope this helps resolve your problem.
NAFISA SIDDIQUI on 30 Jun 2024 at 20:51
Below I have attached a dataset of Boston house prices. I tried calculating the condition index for each singular value using your code, dividing the largest singular value by each of the singular values individually. However, my answer differs from the real one. I have also attached the example page: https://stataiml.com/posts/42_condition_index_r/#input-dataset


Accepted Answer

Umar on 30 Jun 2024 at 21:29
Hi Nafisa,
To help you further, can you share the MATLAB code that is causing the error? I need to review it before I can share my detailed thoughts. Hope that is not a problem.
  1 comment
NAFISA SIDDIQUI on 30 Jun 2024 at 21:44
Absolutely,
data = readtable('boston house prices.xlsx', 'VariableNamingRule','preserve');
x1 = data.CRIM;
x2 = data.ZN;
x3 = data.INDUS;
x4 = data.CHAS;
x5 = data.NOX;
x6 = data.RM;
x7 = data.AGE;
x8 = data.DIS;
x9 = data.RAD;
x10 = data.TAX;
x11 = data.PTRATIO;
x12 = data.B;
x13 = data.LSTAT;
Predictors = [x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13];
[U, S, V] = svd(Predictors); % Singular value decomposition
singular_values = diag(S) % Extract singular values
condition_index = (max(singular_values) ./ singular_values)


More Answers (4)

Umar on 30 Jun 2024 at 21:58
Hi Nafisa,
The issue may lie in how the condition indices are calculated. Each condition index is the ratio of the largest singular value to one of the singular values; the ratio to the smallest singular value is the condition number of the matrix. Since svd returns the singular values in descending order, the largest one is the first element, so the calculation can be written as follows:
condition_indices = singular_values(1) ./ singular_values;
This divides the largest singular value by each singular value individually, which gives one condition index per singular value.
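As a quick check of this formula, here is a minimal sketch on a small made-up matrix (X below is hypothetical and is not the Boston data). It only assumes that svd returns the singular values sorted from largest to smallest, in which case taking the first element and taking max() should give identical indices:

```matlab
% Hypothetical 6x3 predictor matrix (illustration only, not the Boston data)
X = [1 2 3; 2 4.1 5; 3 6 7.2; 4 8.1 9; 5 10 11.1; 6 12.2 13];

s = svd(X);                     % singular values as a column, largest first

idxFirst = s(1)   ./ s;         % largest value taken as the first element
idxMax   = max(s) ./ s;         % largest value taken with max()

sameResult = isequal(idxFirst, idxMax);   % do the two forms agree?
largestIdx = idxFirst(end);               % ratio to the smallest singular value
```

If the two forms also agree on your data, a remaining mismatch with collintest would have to come from somewhere other than this line, for example from how the columns are scaled before the SVD is taken.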
Hope this will help resolve your problem.

Paul on 30 Jun 2024 at 22:14
Edited: Paul on 30 Jun 2024 at 22:17
Hi NAFISA,
Using the example from the collintest doc page:
load Data_Canada
Output from collintest
[sValues,condInx] = collintest(Data);
Variance Decomposition

 sValue   condIdx    Var1     Var2     Var3     Var4     Var5
---------------------------------------------------------
 2.1748    1        0.0012   0.0018   0.0003   0.0000   0.0001
 0.4789    4.5413   0.0261   0.0806   0.0035   0.0006   0.0012
 0.1602   13.5795   0.3386   0.3802   0.0811   0.0011   0.0137
 0.1211   17.9617   0.6138   0.5276   0.1918   0.0004   0.0193
 0.0248   87.8245   0.0202   0.0099   0.7233   0.9979   0.9658
The same sValue and condIdx output can be reproduced as follows (not including various error checks):
Scale Data so that each column has unit magnitude
sData = Data./vecnorm(Data,2,1);
Take the svd of the scaled data.
[~,S,V] = svd(sData);
S = diag(S);
Compute the indices
idx = max(S)./S;
table(S,idx)
ans = 5x2 table
       S         idx
    ________    ______
      2.1748         1
     0.47889    4.5413
     0.16015    13.579
     0.12108    17.962
    0.024763    87.825
Hopefully that makes sense in the context of what collintest is supposed to do. I have zero knowledge of that function.
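The V output of the svd call is not needed for the indices, but it may be what feeds the Var1 to Var5 columns. Below is a rough sketch of Belsley-style variance-decomposition proportions. The formula phi(k,j) = V(k,j)^2 / s(j)^2 is my assumption about what collintest does and has not been checked against its source; the data is a synthetic stand-in so the snippet runs without Data_Canada.

```matlab
% Synthetic stand-in for Data (assumption: 50 observations, 3 predictors)
rng(0);
x1 = (1:50)';
x2 = 2*x1 + randn(50,1);        % nearly collinear with x1
x3 = randn(50,1);
Data = [x1, x2, x3];

sData = Data ./ vecnorm(Data, 2, 1);   % unit-length columns, as above
[~, S, V] = svd(sData, 'econ');
s = diag(S);

% phi(k,j): contribution of singular value j to the variance of coefficient k
phi = (V.^2) ./ (s.^2)';

% Normalize over singular values, then transpose so that
% row j = singular value j and column k = variable k
props = (phi ./ sum(phi, 2))';

disp(table(s, s(1)./s, props, 'VariableNames', {'sValue','condIdx','VarProps'}))
```

Under this definition each variable's column sums to 1, which is consistent with the Var1 to Var5 columns in the collintest output above.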
  7 comments
NAFISA SIDDIQUI on 11 Jul 2024 at 14:12
Hello Paul, I need to understand theoretically how it is calculated. I did not quite understand when you said doc age
Paul on 11 Jul 2024 at 15:07
I meant the doc page. Here is the link



Umar on 3 Jul 2024 at 3:00
Hi Nafisa,
You asked how to calculate, theoretically, the singular values from the collinearity test and from the svd function.
For the collinearity test, the singular values are summarized by the condition number of the matrix, which is the ratio of the largest to the smallest singular value. Higher condition numbers indicate a higher degree of collinearity. With the Singular Value Decomposition (SVD) function in MATLAB, you can compute the singular values of a matrix directly.
  1 comment
NAFISA SIDDIQUI on 3 Jul 2024 at 17:22
Hello Umar, I did not quite understand what you meant by examining the condition number of a matrix. It would be better if you could provide a formula or something. You also said that with the Singular Value Decomposition (SVD) function in MATLAB I can directly compute the singular values of a matrix. To compute the singular values I need a square matrix, so I tried finding the eigenvalues of X'X, where X is the design matrix, and when I did that I got different eigenvalues.



Umar on 3 Jul 2024 at 19:02
Hi Nafisa,
Glad to hear back from you. Please see my answers to your comments below.
Comment #1: I did not quite understand when you said by examining the condition number of a matrix. It would be better if you can provide a formula or something.
Answer: The condition number of a matrix measures how sensitive computations with that matrix are to changes in the input; a high condition number indicates potential numerical instability. In MATLAB, you can compute the condition number of a matrix A with the cond() function: cond(A). The example below creates a 2x2 matrix A, uses cond() to calculate its condition number, and displays the result with disp().
>> % Define a matrix A
>> A = [1, 2; 3, 4];
>> % Calculate the condition number of matrix A
>> condition_number = cond(A);
>> disp(['The condition number of matrix A is: ', num2str(condition_number)]);
This information is valuable in assessing the stability and accuracy of numerical computations involving the matrix A.
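To connect cond() with the singular values, here is a small sketch for the same 2x2 matrix. It relies on cond(A) returning the 2-norm condition number by default, so the ratio of the largest to the smallest singular value should give the same number:

```matlab
A = [1, 2; 3, 4];

s = svd(A);                      % singular values of A, largest first
kappaFromSvd  = s(1) / s(end);   % largest divided by smallest
kappaFromCond = cond(A);         % 2-norm condition number

disp([kappaFromSvd, kappaFromCond])   % both are about 14.93
```

Written as a formula, this is cond(A) = s_max / s_min, where s_max and s_min are the largest and smallest singular values of A.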
Comment #2: Also you said that while using the Singular Value Decomposition (SVD) function in MATLAB, I can directly compute the singular values of a matrix.
Answer: Yes, you can directly compute the singular values of a matrix A using the svd() function: [U, S, V] = svd(A). A does not need to be square; svd() accepts any m-by-n matrix, so it can be applied to the design matrix itself.
Comment #3: To compute the singular values I need to have a square matrix, so I tried finding the eigenvalues of X'X, where X is the design matrix, and when I did that I am getting different eigenvalues.
Answer: X'X is always square, and you can compute its eigenvalues with the eig() function: eigenvalues = eig(X' * X). These eigenvalues are the squares of the singular values of X, so take their square roots before comparing them with the output of svd(X). Also make sure X is the same matrix in both computations, including any scaling of its columns.
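The sketch below may help show why the three approaches from the original question give three different sets of values. It uses synthetic data (the matrix X, the sample size n, and the helper desc are all made up for illustration) and compares svd with eig for the raw X'X, for columns scaled to unit length as in Paul's answer, and for the correlation matrix:

```matlab
rng(1);                                   % reproducible synthetic data
n  = 40;
x1 = (1:n)' + randn(n,1);
x2 = 3*x1 + randn(n,1);                   % nearly collinear with x1
x3 = 5 + randn(n,1);
X  = [x1, x2, x3];                        % n-by-3, deliberately not square

desc = @(v) sort(v, 'descend');           % eig output is not sorted descending

% (a) Raw X: eigenvalues of X'X are the squared singular values of X
sRaw    = svd(X);
sRawEig = sqrt(desc(eig(X' * X)));

% (b) Columns scaled to unit length, no centering
Xs       = X ./ vecnorm(X, 2, 1);
sUnit    = svd(Xs);
sUnitEig = sqrt(desc(eig(Xs' * Xs)));

% (c) Centered and standardized columns: Z'Z/(n-1) is the correlation matrix
Z        = (X - mean(X, 1)) ./ std(X, 0, 1);
sCorr    = svd(Z) / sqrt(n - 1);
R        = corrcoef(X);
R        = (R + R') / 2;                  % enforce exact symmetry before eig
sCorrEig = sqrt(desc(eig(R)));

condIdxUnit = sUnit(1) ./ sUnit;          % condition indices for scaling (b)

disp(table(sRaw, sUnit, sCorr))           % three different sets of values
```

Within each scaling, svd and sqrt(eig(...)) agree, but the values differ across (a), (b), and (c). Comparing a result computed under one scaling against a tool that uses another would be expected to produce a mismatch.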
