Table performance very slow

Question

Byron on 19 Nov 2015

https://fr.mathworks.com/matlabcentral/answers/256482-table-performance-very-slow

Edited: Victor on 26 Jun 2017

I have used tables within a physics model that is solved by ode23. The performance is very slow, and in troubleshooting (using the profiler) I found that the majority of the time is spent in various table functions.

The three functions table.subsasgnDot, table.subsref, table.subsref alone take approximately 30% of the execution time. Within those functions it seems to be variable name checking that takes the majority of the time.

The variable names in every table are all known at the start of the program and don't change. It seems like it would be far more efficient to check them once rather than on every pass through the loop.

This is for a simulation that takes several minutes to run each case, and would be used to call many cases. So, the slow performance is a significant problem.

I understand performance is better when the problem is vectorized. One of the subroutines can calculate 15,000 points in 10 sec if called as a vector, but takes 1 hr if called in a loop.

However, since this problem is solved with ode23, the model is unavoidably called in a loop, and unfortunately I used tables everywhere before discovering how slow they are.

Is there any way to improve performance without major rewriting to remove all use of the table class?

2 comments

dpb on 19 Nov 2015

I'm guessing likely not for the question as asked, but might there be a relatively simple way you could do the queries first and return the necessary data as an ordinary array or arrays for the solver to crunch on?
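One way to read this suggestion, as a minimal sketch: pull the needed columns out of the table once, before calling ode23, and let the right-hand-side function capture plain arrays. The table `params`, its variables `k` and `c`, and the damped-oscillator equations are made up for illustration; they are not from the original model.

```matlab
% Hypothetical parameter table; the names k and c are assumptions.
params = table(4.0, 0.5, 'VariableNames', {'k', 'c'});

% Query the table once, outside the solver loop.
k = params.k;   % plain double
c = params.c;   % plain double

% The anonymous function captures k and c by value, so ode23 never
% touches the table while it integrates.
rhs = @(t, y) [y(2); -k*y(1) - c*y(2)];

[tOut, yOut] = ode23(rhs, [0 10], [1; 0]);
```

With this layout the table indexing cost is paid once per run rather than once per solver step, which is the part the profiler flagged in the question.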

Victor on 26 Jun 2017

Edited: Victor on 26 Jun 2017

I added a similar issue on Stack Overflow that may be helpful: Matlab Table / Dataset type optimization

Answer 1

Oleg Komarov on 28 Nov 2016

https://fr.mathworks.com/matlabcentral/answers/256482-table-performance-very-slow#answer_244998

Edited: Oleg Komarov on 28 Nov 2016

I was using table() long before it was introduced into the core package, since it is de facto the ported version of the dataset() class from the Statistics Toolbox. I also noticed many limitations in performance and functionality a long time ago, and have logged enhancement requests with TMW.

To address the limitations of table() while waiting for the official implementation of my enhancement requests, I created tableutils(). Among the problems, you would be astonished to know that the disp() of a big table can literally freeze your PC until the next ice age (and I am not talking about the movies...). This is something that I fixed with a buffered disp method.

While my tableutils() does not directly address the problems in subsref/subsasgn, anyone is welcome to contribute to this effort to make the table() class better by submitting an issue or a pull request on GitHub.

Answer 2

Daniel Petrini on 5 Oct 2016

https://fr.mathworks.com/matlabcentral/answers/256482-table-performance-very-slow#answer_237457

Edited: dpb on 5 Oct 2016

In my view, tables are very erratic in performance, ranging from quick to very slow. For example, do a clear and then just >> table(). On my 2016b that can take many seconds. :-S I had to rewrite a (large) class based on tables to use multiple vectors of the same types instead. Performance is now much more linear and trustworthy. It seems that the JIT does not know what to do with tables? I wish MathWorks would post more about performance on these new data structures... In addition, the call

t = tic; my_class.insert_new_entry(...); toc(t)

reported excellent times. The problem is that MATLAB stays "busy", and the output of toc(t) (0.12 s) could take 2 s to display... What am I missing? I'm guessing it is some overhead in creating tables, i.e., table_1(1:5,my_col) creates a new table, and freezes...? Disclaimer: I am on an 8 GB Core i7 machine.

/Daniel Petrini, Stardots AB

2 comments

Daniel Petrini on 6 Oct 2016

My answer is probably not an answer, but rather a comment. Sorry. My first contribution to Matlab Answers.

Oleg Komarov on 28 Nov 2016

The native table.disp() has a huge problem, and can freeze your PC for a long time. I implemented a buffered disp that avoids this issue. See my answer.

Answer 3

jbpritts on 24 Nov 2016

https://fr.mathworks.com/matlabcentral/answers/256482-table-performance-very-slow#answer_244600

I have Matlab 2016b. I can confirm that tables are terribly slow. Unless you really need them for heterogeneous data, avoid them in any performance-critical code. I will have to rewrite a fairly complicated section of code using legacy data structures. MathWorks should address this extreme performance deficiency.

2 comments

Image Analyst on 24 Nov 2016

They are tremendously better and faster than cell arrays though, and use far less memory.

Oleg Komarov on 28 Nov 2016

Internally, table() stores data in a cell array, where each column is a cell. So your statement about speed and memory cannot be true, since there is additional overhead linked to VariableNames and the MATLAB-coded subsref/subsasgn.

I do agree that tables are more convenient.

Answer 4

Peter Perkins on 7 Oct 2016

https://fr.mathworks.com/matlabcentral/answers/256482-table-performance-very-slow#answer_237965

Byron, it's hard to make specific suggestions without knowing exactly what you're doing, but here are some thoughts.

Tables are best at managing data and doing vectorized operations. Based on your description, it sounds like you are probably doing scalar operations such as

t.Var(i) = x

in a loop. You've described your alternative as a complete rewrite to not use tables at all. But there is often a middle ground where you can find a localised scope in which you can pull some of the variables in a table out as, say, ordinary double vectors, do all the non-vectorized calculations on them in a scalar loop, and then assign back into the table. Sometimes you can even convert the table to a scalar struct and use the exact same syntax. Of course, separate variables or a scalar struct will not enforce equal number of rows, or provide a simple syntax for arbitrary rectangular selections, or the other things that tables are designed to do.
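The two middle-ground options described above could look roughly like the following sketch. The table `t`, its variables `x` and `y`, and the running-sum loop are invented for illustration and are not taken from Byron's model.

```matlab
% Hypothetical table; the variable names x and y are assumptions.
n = 1000;
t = table((1:n)', zeros(n, 1), 'VariableNames', {'x', 'y'});

% Option 1: pull the columns out, loop on plain vectors, assign back once.
x = t.x;
y = t.y;
for i = 2:n
    y(i) = y(i-1) + x(i);   % scalar work without any table indexing
end
t.y = y;                    % single assignment back into the table

% Option 2: convert to a scalar struct and keep the same dot syntax.
s = table2struct(t, 'ToScalar', true);
s.y(:) = 0;                 % reset so the loop below redoes the work
for i = 2:n
    s.y(i) = s.y(i-1) + s.x(i);   % struct field indexing, not table subsasgn
end
t2 = struct2table(s);       % back to a table when the loop is done
```

As noted in the answer, neither the separate vectors nor the scalar struct enforces an equal number of rows while the loop runs, so the conversion back to a table is the point where a mismatch would surface.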

Hope this helps.
