# Trying to make 2 data sets the same length

I have two datasets. One is 1x102437 and the other is 1x41716. I am trying to make them the same length so that I can perform a paired t-test on the data. How do I make the 1x41716 vector the same length as the 1x102437 one? I have tried using interp1 but keep running into trouble with it.

Thanks!

### Answers (3)

Voss
on 28 Feb 2023

Edited: Voss
on 28 Feb 2023

If you want to "expand" the shorter dataset to match the length of the longer one using interp1, here's one way:

dataset1 = rand(1,51); % random data

dataset2 = rand(1,21);

n1 = numel(dataset1);

n2 = numel(dataset2);

x1 = 1:n1;

x2 = linspace(1,n1,n2);

dataset2_interp = interp1(x2,dataset2,x1);

subplot(2,1,1)

hold on

plot(dataset1,'.-b')

plot(dataset2,'.-r')

legend

title('Original')

subplot(2,1,2)

hold on

plot(dataset1,'.-b')

plot(dataset2_interp,'.-r')

legend

title('Dataset 2 Expanded')
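The same idea can be checked on a tiny vector where the expected result is easy to see. This is only a sketch: `resampleTo`, `yShort`, and `yLong` are hypothetical names introduced here, not part of the answer above.

```matlab
% Hypothetical helper (an assumption for illustration): stretch vector y to
% n samples by linear interpolation over a common 1..n index axis.
resampleTo = @(y, n) interp1(linspace(1, n, numel(y)), y, 1:n);

yShort = [0 10 20 30 40];         % 5 samples lying on a straight line
yLong  = resampleTo(yShort, 9);   % expand to 9 samples

disp(numel(yLong))                % number of samples after expansion
disp(yLong)                       % endpoints match the original endpoints
```

Because the query points 1:n stay inside the range of the sample points, interp1 does not need to extrapolate and should not return NaN here.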

William Rose
on 28 Feb 2023

You can do it, but that does not mean you should do it.

On what basis do you justify pairing the samples from the long vector with the interpolated samples from the short vector?

As for generating an equal number of samples (which I don't recommend unless you have a good justification):

Let's call the vectors y1 (long) and y2 (short). I'll illustrate with example vectors that are about 1000 times shorter than yours:

y1=rand(1,102); y2=rand(1,42);

To interpolate y2 to be as long as y1, you need associated x values. Create a vector x2:

x2=1:length(y2);

Create a query vector, xq:

xq=linspace(1,length(y2),length(y1));

Interpolate:

y2int=interp1(x2,y2,xq);

disp([length(y1),length(y2int)])

y2int has the same length as y1. But this does not mean each value in y2 is paired with a certain element in y1.
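As a sanity check, the following sketch compares this approach with the one in Voss's answer, which stretches the sample points of the short vector instead of compressing the query points. The variable names `yA`, `yB`, and `maxDiff` are assumptions made for this illustration.

```matlab
rng(0);                              % fixed seed so the run is repeatable
y1 = rand(1,102);  y2 = rand(1,42);
n1 = numel(y1);    n2 = numel(y2);

% Variant A: place y2 on the long index axis and query at 1:n1
yA = interp1(linspace(1,n1,n2), y2, 1:n1);

% Variant B: keep y2 on its own index axis and query at n1 evenly spaced points
yB = interp1(1:n2, y2, linspace(1,n2,n1));

% The two variants should agree up to floating-point rounding
maxDiff = max(abs(yA - yB));
fprintf('max |A - B| = %g, any NaN: %d\n', maxDiff, any(isnan([yA yB])));
```

Either way, the 102 interpolated values are derived from only 42 measurements, so they are not 102 independent observations.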

##### 3 Comments

William Rose
on 28 Feb 2023

@Matthew, you mentioned "The only problem with this [ttest2()] is that it assumes that the data sets come from two independent samples." The paired t test (ttest()) also assumes that the individual samples are independent of one another, and it relies on a built-in pairing of the data. That second assumption is questionable when the raw data are not paired point by point, as in this case. Both t tests also assume normality: ttest2 assumes both samples are normally distributed and, by default, have equal variance, while ttest assumes the paired differences are normally distributed. If you don't want to make those assumptions, use the Wilcoxon rank-sum test, also known as the Mann-Whitney U test, without interpolation, on the unequal-size samples.

[p,h]=ranksum(y1,y2);

If you want to interpolate and do a paired test without the normality assumption inherent in a t test, use the Wilcoxon signed-rank test, which is the paired counterpart of the rank-sum test:

[p,h]=signrank(y1,y2int);
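MATLAB's Statistics and Machine Learning Toolbox has two paired nonparametric tests with similar names: signrank (Wilcoxon signed-rank test) and signtest (sign test). The sketch below runs both on random example data; the variable names are assumptions made for illustration, and the caveat about pairing interpolated samples still applies.

```matlab
rng(1);                              % fixed seed so the run is repeatable
y1 = rand(1,102);  y2 = rand(1,42);

% Interpolate y2 to the length of y1, as in the steps above
y2int = interp1(1:numel(y2), y2, linspace(1,numel(y2),numel(y1)));

% Wilcoxon signed-rank test: uses the ranks of the paired differences
[pSR, hSR] = signrank(y1, y2int);

% Sign test: uses only the signs of the paired differences
[pST, hST] = signtest(y1, y2int);

fprintf('signrank: p = %.3f, h = %d\n', pSR, hSR);
fprintf('signtest: p = %.3f, h = %d\n', pST, hST);
```

Both functions return the p-value first and the test decision second, the reverse of the output order used by ttest and ttest2.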

Image Analyst
on 28 Feb 2023

help ttest2

set1 = rand(1, 100);

set2 = rand(1, 50); % Second set has a different number of observations.

[h,p] = ttest2(set1, set2)
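By default ttest2 pools the variances of the two samples. If equal variance is doubtful, the 'Vartype' option requests the unequal-variance (Welch) version instead. A minimal sketch on random example data, with variable names chosen here for illustration:

```matlab
rng(2);                              % fixed seed so the run is repeatable
set1 = rand(1, 100);
set2 = rand(1, 50);                  % different number of observations

% Default: two-sample t test assuming equal variances
[hEq, pEq] = ttest2(set1, set2);

% Welch's t test: does not assume equal variances
[hW, pW] = ttest2(set1, set2, 'Vartype', 'unequal');

fprintf('pooled: h = %d, p = %.3f\n', hEq, pEq);
fprintf('Welch:  h = %d, p = %.3f\n', hW, pW);
```

Neither call needs the two vectors to be the same length, so no interpolation is required.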

##### 2 Comments

William Rose
on 28 Feb 2023
