MATLAB returns zeros and NaN after a regression

Hello all,
I want to run a fitlm regression on my dataset. The dataset contains 80,000 rows and 30-35 columns. Running the regression on the first 2750 rows returns a value for all the variables, but running it on 2900 rows or more returns a '0' for all the variables and a NaN for the tStat and the pValue. Does anyone have a clue what I am doing wrong?
Thanks in advance
This is the outcome for the first 2700 rows:
Linear regression model:
rel_spr ~ 1 + post_dumm + isXBRL + size + leverage + Earnings_per_share + turnover
Estimated Coefficients:
                      Estimate       SE            tStat      pValue
                      ___________    __________    _______    __________
(Intercept)           0.026918       0.0020461     13.155     4.0497e-38
post_dumm             -0.0016953     0.00094519    -1.7936    0.073016
isXBRL                0.0025597      0.0011885     2.1537     0.03137
size                  -0.0017311     0.0001673     -10.348    1.5309e-24
leverage              -0.0037944     0.0025795     -1.471     0.14144
Earnings_per_share    6.3255e-09     1.2627e-08    0.50095    0.61645
turnover              -0.00026829    0.00022523    -1.1912    0.23372
Number of observations: 2230, Error degrees of freedom: 2223
Root Mean Squared Error: 0.0216
R-squared: 0.0596, Adjusted R-Squared: 0.0571
F-statistic vs. constant model: 23.5, p-value = 4.93e-27
This is the outcome for 2900 rows or more:
Linear regression model:
rel_spr ~ 1 + post_dumm + isXBRL + size + leverage + Earnings_per_share + turnover
Estimated Coefficients:
                      Estimate    SE    tStat    pValue
                      ________    __    _____    ______
(Intercept)           0           0     NaN      NaN
post_dumm             0           0     NaN      NaN
isXBRL                0           0     NaN      NaN
size                  0           0     NaN      NaN
leverage              0           0     NaN      NaN
Earnings_per_share    0           0     NaN      NaN
turnover              0           0     NaN      NaN
Number of observations: 2278, Error degrees of freedom: 2278
Root Mean Squared Error: 0.0247
R-squared: NaN, Adjusted R-Squared: NaN
F-statistic vs. constant model: NaN, p-value = NaN
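A note on reading the two outputs: in the first fit the error degrees of freedom are 2230 - 7 = 2223, but in the second they equal the number of observations (2278), so no coefficient was actually estimated. One possible cause, which this thread never confirms, is a non-finite or extreme value in one of the rows between 2750 and 2900. The sketch below shows how such a row could be located and excluded before fitting. The table `tbl`, its contents, and the injected `Inf` are hypothetical stand-ins for the real data, and `fitlm` needs the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical stand-in for the poster's table; the real data were never posted.
rng(0);
n = 200;
tbl = table(randn(n,1), randn(n,1), randn(n,1), ...
    'VariableNames', {'size', 'leverage', 'rel_spr'});
tbl.leverage(150) = Inf;                  % assumed culprit: one non-finite value

% Count NaN and Inf entries per variable.
vars = tbl.Properties.VariableNames;
X = tbl{:, vars};                         % numeric matrix of all columns
nNaN = sum(isnan(X), 1);
nInf = sum(isinf(X), 1);
disp(table(vars', nNaN', nInf', 'VariableNames', {'Variable', 'NaNs', 'Infs'}));

% Index of the first row that contains an Inf in any variable.
firstBad = find(any(isinf(X), 2), 1);

% Keep only rows where every variable is finite, then fit.
clean = tbl(all(isfinite(X), 2), :);
mdl = fitlm(clean, 'rel_spr ~ 1 + size + leverage');
```

If `firstBad` falls in the range where the regression starts to fail, that row is a good place to start looking. If no non-finite values turn up, checking `min` and `max` of each column for badly scaled values would be a reasonable next step.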

12 comments

Rik on 3 Oct 2019
Your fit is already horrible in the first output. Can you attach the data (or a subsection) and the code you're using?
Thomas Van Gorkom on 3 Oct 2019
Edited: Rik on 3 Oct 2019
Please find the files in the new comment.
Edit by Rik: attached files and text below moved from an answer posted as a comment:
Hi Rik,
Thanks for your reply. Attached are the first 1000 rows of the file and the MATLAB code that I am using.
Rik on 3 Oct 2019
Your code needs two csv files, but you included a single Excel file. Try to make a MWE so we can run your code without any other dependencies and can reproduce your issue.
Sorry, I forgot those files. I have attached an MWE for the CAPM file, again containing the first 1000 rows.
Would you suggest that I include more rows, so that the code reproduces the failure?
Rik on 3 Oct 2019
If you want us to track down the issue, that would be better, yes.
Rik on 3 Oct 2019
Again, xlsx instead of csv. Details probably matter.
Thomas, I uploaded the files, and I get the error:
Undefined operator '/' for input arguments of type 'cell'.
when MATLAB tries to execute the line
big_tableX.PRC = big_tableX.PRC/100000;
because big_tableX.PRC is apparently a cell array. (I think it is possible that we have different default settings for reading in the table?) Simple conversion didn't work for me.
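For anyone hitting the same error, a minimal sketch of one way to inspect and convert such a column follows. It assumes the cells hold numbers stored as text, which the thread does not confirm, and the sample values are made up. If the cells hold something else, this conversion will not apply.

```matlab
% Hypothetical stand-in for big_tableX: PRC read in as a cell array of text.
big_tableX = table({'1250000'; '980000'; 'N/A'}, 'VariableNames', {'PRC'});

disp(class(big_tableX.PRC));              % shows 'cell'

if iscellstr(big_tableX.PRC)
    % str2double returns NaN for entries that are not valid numbers.
    big_tableX.PRC = str2double(big_tableX.PRC);
end

big_tableX.PRC = big_tableX.PRC / 100000; % the division now works on doubles
```

Entries that cannot be parsed become NaN, and fitlm treats NaN as missing, so it is worth counting them with `sum(isnan(big_tableX.PRC))` afterwards.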
Possibly the easiest thing, which would save us from worrying about the preprocessing steps, would be for you to upload a MAT file of your workspace as it exists just before fitlm is executed. Then all we need to do is load that MAT file and run fitlm.
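A rough sketch of that workflow, using a small made-up table in place of the real data. The variable name `tbl`, the file name `pre_fitlm.mat`, and the formula are hypothetical. This version saves only the table passed to fitlm; calling `save(matFile)` with no variable names would save the whole workspace instead.

```matlab
% Hypothetical variable and file names; substitute the real ones.
rng(1);
tbl = table(randn(50,1), randn(50,1), 'VariableNames', {'x', 'rel_spr'});

% Poster's side: save what fitlm needs, right before the call.
matFile = fullfile(tempdir, 'pre_fitlm.mat');
save(matFile, 'tbl');

% Helper's side: load the file and run the fit, no preprocessing required.
S = load(matFile);
mdl = fitlm(S.tbl, 'rel_spr ~ 1 + x');
disp(mdl)
```

Because the MAT file stores the variables exactly as they are in memory, differences in how each release reads the csv or xlsx files no longer matter.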
Rik on 3 Oct 2019
Agreed. Maybe even the release is causing a difference that makes it difficult to reproduce the issue.
Nick on 18 Oct 2021
Edited: Nick on 18 Oct 2021
Hi Thomas, did you ever find the reason for this outcome? I'm having the same issue too. Regression works with a smaller (clean) dataset but not with the (clean) entire set.
@Nick, I'd guess so, but he never came back here with the answer, so you'll have better luck if you post your question and your data in a new question.


Answers (0)
