Are these expressions equivalent? (seeing greater errors in vectorized expression)

Question

0 votes

After converting a loop-based set of expressions to what I thought was a equivalent vectorized one I see a much more rapid accumulation of errors in a simple euler style numerical integration I am performing. This is confusing as the expressions appear to me to be equivalent. Does MATLAB use some reduced precision when computing vectorized expressions (float vs double maybe?). I can't think of any other reasons these two might not be equivalent.

Well, except for the fact that unlike other vectorized expressions, each element of B may be updated more than once here. I shouldn't think this should matter however since we are doing simple addition and the order that elements are updated does not matter.

Details of what the program is doing I don't think are relevant but follow the code snippet below

    % loop-based expression
    for j=1:size(L,1)
        B(L(j,1),3:4)=B(L(j,1),3:4)+F1(j,:)./m.*dt;
        B(L(j,2),3:4)=B(L(j,2),3:4)+F2(j,:)./m.*dt;
    end
    % Vectorized expression
    B(L(:,1),3:4)=B(L(:,1),3:4)+F1./m.*dt;
    B(L(:,2),3:4)=B(L(:,2),3:4)+F2./m.*dt;

F1 (note: F2=-F1) has two columns and the same number of rows as L.

Notes on what the script does:

This is from a simple toy script that performs a numerical integration of a mass-spring collection. These lines simply update the velocities of the blocks (B(:,3:4) are vx and vy). The spring links (L) columns 1 and 2 indicated which blocks are connected and tie back to that row in B. The force on each block at the end of each link (F1 and F2) is just above this in the script.

I can provide the entire script if needed.

6 commentaires
Afficher 4 commentaires plus anciens Masquer 4 commentaires plus anciens

dpb le 14 Déc 2015

I'd wager if you take the code snippets above out of the overall script and provide a set of static elements for the variables that you'll find for the single evaluation the results are identical. My guess is that it's owing to the fact it's in an integration routine and it's related to the path by which you get there; hence the comment regarding order possibly being significant (as well as the comment re: possible number of evaluations).

Adam V Steele le 14 Déc 2015

Ouvrir dans MATLAB Online

I have distilled the error into a very simple form so it is easy to see. Code attached below

Put simply: When performing a vectorized operation performed using a secondary array indexing, into the first MATLAB loses precision. This is true even if all numbers in the sum are similar in magnitude.

The results for the addition vectorized operation are always smaller than (or equal to) the non-vectorized version, never larger. (note: the results are nearly but not quite inverted for subtraction, the vectorized result is always larger or equal)
The larger the number of indices into the first array, the greater the error becomes.

I am having a hard time understanding how this can be intended behavior.

Code follows

clearvars;
clc;
% rng(1234);  %use whatever seed if desired. generally, no impact on
% result if enough trials run
els=10; % number of elemnts in accumulator array
trials=500;
%  !!! error rate increases with this number !!!
els2=20; % number of elemnts in indexing array.
A=zeros(els,1);
vec_is_smaller=0;
vec_is_larger=0;
vec_is_equal=0;
for t=1:trials
      B=rand(els2,1);
      idx=ceil(rand(els2,1)*size(A,1));
      A_nv=A;  % vectorized accumulator
      A_v=A;  % non-vectorized accumulator
      for j=1:els2
          A_nv(idx(j))= A_nv(idx(j)) + B(j);
      end
      A_v(idx(:))= A_v(idx(:)) + B(:);
      %logical array, ==1 if A_v is less than and not equal to A_nv
      element_is_smaller= (A_v ~= A_nv) & (A_v < A_nv);
      element_is_larger = (A_v ~= A_nv) & (A_v > A_nv);
      element_is_equal = (A_v == A_nv);
      vec_is_smaller = vec_is_smaller + sum(element_is_smaller)/size(A_v,1)/trials;
      vec_is_larger =  vec_is_larger + sum(element_is_larger)/size(A_v,1)/trials;
      vec_is_equal =  vec_is_equal + sum(element_is_equal)/size(A_v,1)/trials;
  end
['Vectorized accumulation produced a smaller result '...
    ,num2str(vec_is_smaller*100,2),'% of the time']
['Vectorized accumulation produced a larger result '...
    ,num2str(vec_is_larger*100,2),'% of the time']
['Vectorized accumulation produced an equal result '...
    ,num2str(vec_is_equal*100,2),'% of the time']

Connectez-vous pour commenter.

Connectez-vous pour répondre à cette question.

Follow Question

Answer 1

dpb le 14 Déc 2015

Modifié(e) : dpb le 15 Déc 2015

Ouvrir dans MATLAB Online

0 votes

Well, it isn't so easy to see; but the two expressions are not the same thing...I was wrong in my conjecture earlier in not having the details of the indexing to be able to tell it is a "many to one" addressing, not just a reordering.

Consider a sample incarnation of the test code; where

>> idx.'
 4   4   9   9   2   7   5   7  10   3   6   8   5   7   6  10   4   3   6   4
>>

NB: there are duplicated row indices

>> n=histc(idx,[1:10]).'
n =
   0     1     2     4     2     3     3     1     2     2
>>

Consequently, when you run the loop, there are going to be four sequential entries summed into location 4, none at 1, etc., and the magic number of 1 for the 2nd and 8th. Consequently only the those latter two plus the first location (with a trivial zero result) are going to match. You could compute the percentage in error at this point for each trial without anything else as 0.3 correct/0.7 wrong. The alternate construction is identically the same as writing

A_v(idx) = B;

there's only a mapping of the elements of B to the indexed locations in A; since

length(idx)==length(B)

there's not a mismatch of number of elements to be assigned, simply an overwriting of the various given locations multiple times leaving the last one of B in that location specified by idx.

OTOH, sequentially you are building a summation which is why the locations with a discrepancy are always larger for a B uniformly >0. If B were all negative then the observation regarding using plus vis a vis minus as the operation would swap directions; if it were centered with zero mean then it would be random based on the sample.

The "vectorized" solution for this case would be the use of accumarray with the indices vector as the locations.

4 commentaires
Afficher 2 commentaires plus anciens Masquer 2 commentaires plus anciens

dpb le 15 Déc 2015

"...the two objects on the RHS and LHS of the equation with the same name do not reference the same object in memory"

What makes you think they don't? The problem is that the target for the LHS is multiply-defined. There's nothing in a simple assignment about summing and I don't think there's any other language that would do that for a similar assignment.

dpb le 15 Déc 2015

Modifié(e) : dpb le 15 Déc 2015

Ouvrir dans MATLAB Online

As noted before, what you're looking for here is accumarray...shortened example

>> A=zeros(10,1);
>> for i=1:20,A(idx(i))=A(idx(i))+B(i);end
>> V=accumarray(idx,B);
>> all(A==V)
ans =
   1
>>

See

doc accumarray % for details

Connectez-vous pour commenter.

Are these expressions equivalent? (seeing greater errors in vectorized expression)

6 commentaires
Afficher 4 commentaires plus anciens Masquer 4 commentaires plus anciens

Réponse acceptée

4 commentaires
Afficher 2 commentaires plus anciens Masquer 2 commentaires plus anciens

Plus de réponses (0)

Catégories

Tags

Community Treasure Hunt

Are these expressions equivalent? (seeing greater errors in vectorized expression)

6 commentaires Afficher 4 commentaires plus anciens Masquer 4 commentaires plus anciens

Réponse acceptée

4 commentaires Afficher 2 commentaires plus anciens Masquer 2 commentaires plus anciens

Plus de réponses (0)

Catégories

Tags

Voir également

Community Treasure Hunt

6 commentaires
Afficher 4 commentaires plus anciens Masquer 4 commentaires plus anciens

4 commentaires
Afficher 2 commentaires plus anciens Masquer 2 commentaires plus anciens