How can I identify a pattern of occurrences over multiple days?

Peter on 12 Apr 2012
Hello all,
I am attempting to write a script that looks for a pattern of event occurrences over multiple days of data. It seems like it should be simple enough, yet I am scratching my head.
I want to identify spans of time of a minimum of five days where the event occurred on at least 5/7 of the days. For example, if the event occurred on all five weekdays, then did not occur over the weekend, then occurred again on the next 3 weekdays, I would want to return an index of all 10 of those days. A week later (perhaps after some random occurrences in between) if the event occurred for 3 days in a row, skipped a day, then occurred on the 5th day, then I would want a separate set of indices for this pattern.
The input: an array containing the date of each event as a whole-number datenum, e.g.:
dates = [734841 734842 734843 734844 734845 734848 734849 734850 734859 734860 734861 734863];
The output: a structure containing the indices of the members of each separate pattern, e.g.:
patternStructure(1).index = [1 2 3 4 5 6 7 8 9 10]
patternStructure(2).index = [20 21 22 23 24]
Thanks,
Peter

Answers (1)

Geoff on 13 Apr 2012
Well, what you could say is that an event is in the required set if you subtract the date 4 events ago from the date of the current event and that difference is less than 7 days, because then 5 events fall inside a single 7-day window. That is:
in = (dates(5:end) - dates(1:end-4)) < 7;
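As a quick check, here is a small sketch that evaluates this test on the dates from the question. The intermediate variable `span` is my own name for illustration and is not part of the original answer.

```matlab
% Dates from the question (whole-number datenums, sorted ascending).
dates = [734841 734842 734843 734844 734845 734848 734849 734850 ...
         734859 734860 734861 734863];

% span(k) is the number of days between event k and event k+4.
span = dates(5:end) - dates(1:end-4);

% in(k) is true when events k..k+4 all fall within a 7-day window.
in = span < 7;

disp(span)   % values: 4 6 6 6 14 12 12 13
disp(in)     % values: 1 1 1 1 0 0 0 0
```

Only the first four windows qualify, so the mask is one run of four ones followed by zeros.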
Here, too, you can exploit regexp to find the start and end indices of each sequence:
[s,e] = regexp( char(in+'0'), '1+', 'start', 'end' );
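To make the regexp trick easier to follow, here is a minimal sketch on a small, made-up mask (it is hypothetical and not taken from the question's data). The variable `txt` is also just an illustrative name.

```matlab
% A small hypothetical logical mask.
in = logical([1 1 1 0 0 1 0 1 1]);

% Adding '0' (character code 48) maps false/true to codes 48/49,
% and char() turns those codes into the text '111001011'.
txt = char(in + '0');

% '1+' matches each maximal run of ones; 'start' and 'end' return the
% first and last position of every run, in that order.
[s, e] = regexp(txt, '1+', 'start', 'end');

disp(s)   % values: 1 6 8
disp(e)   % values: 3 6 9
```

Each pair s(n), e(n) brackets one run of consecutive true values in the mask.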
And then, since in(k) covers events k through k+4, each run's end index has to be pushed out by 4. With that, you can construct an array of indices:
patternStructure = arrayfun( @(n) struct('index', s(n):e(n)+4), 1:numel(s) );
But now, these are indices into dates, and not actual date ranges. Your question is a little strange, given your data and your result.
See, the indices for the first detected pattern in your dates array are 1:8, not 1:10, but dates(8)-dates(1)+1 is indeed 10. This is the only range in your supplied data that fits the requirement. For testing, I added:
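The difference between the number of events and the number of calendar days can be seen with a short sketch like the one below. The names `idx`, `nEvents`, `spanDays`, `firstDay` and `lastDay` are my own, and `datestr` is used here only as one possible way to display the datenums as readable dates.

```matlab
% Dates from the question (whole-number datenums, sorted ascending).
dates = [734841 734842 734843 734844 734845 734848 734849 734850 ...
         734859 734860 734861 734863];

% Index range of the first detected pattern.
idx = 1:8;

% Number of events in the pattern versus calendar days it spans.
nEvents  = numel(idx);                            % 8 events
spanDays = dates(idx(end)) - dates(idx(1)) + 1;   % 10 calendar days

% Human-readable first and last day of the pattern.
firstDay = datestr(dates(idx(1)));
lastDay  = datestr(dates(idx(end)));
fprintf('%d events over %d days (%s to %s)\n', ...
        nEvents, spanDays, firstDay, lastDay);
```

So an index vector of length 8 corresponds to a 10-day stretch, which may be where the 1:10 in the question came from.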
dates(end+1) = 734864;
This gave a 5-out-of-6 pattern at indices 9:13.
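Putting the pieces together, the whole thing can be run end to end on the question's data plus the extra test date. This is just the three lines above assembled into one script, with comments added.

```matlab
% Dates from the question, plus the extra test date.
dates = [734841 734842 734843 734844 734845 734848 734849 734850 ...
         734859 734860 734861 734863];
dates(end+1) = 734864;

% in(k) is true when events k..k+4 fall within a 7-day window.
in = (dates(5:end) - dates(1:end-4)) < 7;

% Start and end positions of each run of consecutive true values.
[s, e] = regexp(char(in + '0'), '1+', 'start', 'end');

% Each run k..m in the mask covers events k..m+4 in dates.
patternStructure = arrayfun(@(n) struct('index', s(n):e(n)+4), 1:numel(s));

disp(patternStructure(1).index)   % values: 1 2 3 4 5 6 7 8
disp(patternStructure(2).index)   % values: 9 10 11 12 13
```

Note that this assumes dates is sorted in ascending order with at most one entry per day, as in the question.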
Anyway, this code will detect your patterns, and it's up to you what you want to do with the indices after that =)
