Mathematical formulation of algorithms

#1
Hello,

I am working with ecological time series data (15-minute intervals = 96 measurements per day), i.e. measurements of tree stem dimension changes during the vegetation period. The data are rather complex, as they contain an irreversible component (actual wood growth) and a reversible component (swelling and shrinking of non-woody tissues in the bark).

A problem that concerns me is defining growth onset and cessation. I have tried several different approaches, some more and some less satisfying. The most convincing results were produced by algorithms that I wrote myself in R. The algorithms probably seem rather contrived; nevertheless, for all my datasets they produce a very broad consensus of biologically plausible results.

I am wondering whether I can translate my algorithms into mathematical formulations. However, I have basically no experience with that and hope to find some ideas here.

The algorithm for growth onset expressed in words is:
1. Find the day whose daily mean (one value) is significantly bigger than the average of all previous daily maxima. For this I used a Wilcoxon signed-rank test (one-tailed, P < 0.01). Once such a day (t) is identified in the data, the second precondition is that no day (t+x) following day (t) violates the same criterion (a violation being that the daily mean of t+x is not significantly bigger than the average of all its previous daily maxima).

2. Once the criteria above are fulfilled, the gradient from the maximum of day (t) to all subsequent daily maxima is calculated. The day belonging to the gradient that irreversibly crosses the predefined threshold of 3 micrometers is finally defined as growth onset.
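
As a first attempt, and only under one possible reading of the two steps, the onset rule might be written down as follows. All notation is introduced here for illustration: d = 1, ..., D indexes the days, M_d is the daily maximum, A_d is the average of all previous daily maxima, p_d is the one-tailed p-value of the Wilcoxon signed-rank test for day d against A_d, and theta is the 3 micrometer threshold. The "gradient" is read as the difference between daily maxima (since the threshold is given in micrometers), and "irreversibly" as "for all later days"; both readings are assumptions.

```latex
\begin{align*}
A_d &= \frac{1}{d-1} \sum_{j=1}^{d-1} M_j, \\
t &= \min \bigl\{\, d : p_e < 0.01 \ \text{for all } e \in \{d, \dots, D\} \,\bigr\}, \\
t_{\text{onset}} &= \min \bigl\{\, d > t : M_e - M_t \ge \theta \ \text{for all } e \in \{d, \dots, D\} \,\bigr\},
\qquad \theta = 3\ \text{micrometers}.
\end{align*}
```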

The algorithm for growth cessation expressed in words is:

Find the maximum value in the time series data and draw a horizontal line to the end of the data set (to the right). Starting from the horizontal position, increase the slope of the line towards the past (left) until it touches the next maximum point. Once the slope between two such maxima crosses the predefined threshold of 3 micrometers, the day belonging to the later of the two maxima is considered the point in time at which growth ends.

The threshold of 3 µm is a consensus value, empirically based on a comparative method for studying seasonal growth dynamics.

Please see the following figure, which shows the data structure and how my algorithms work:
https://www.dropbox.com/s/6m9wma75x4mwu2h/data_example.pdf

Thank you for your interest and support! Any help is appreciated.

Best regards
Florian
 
#2
This reminds me of the time, some 300 or 400 years ago, when two writers from the same country would write to each other in Latin just because it had such high prestige. They could just as well have written to each other in their own language.

If you can write to your fellow biologists and be understood by them, I think that is great. Think of the person who said: "It must have been very advanced. I didn't understand anything". Then what is the point of communicating? The thing with mathematics is that sometimes you can derive properties that were not immediately obvious. But otherwise I can't see any point in writing it down as mathematical formulas.

1) Imagine a person who has a biological vision. 2) She writes down some equations to describe that vision. 3) She formulates estimators and algorithms to compute what needs to be computed. 4) She checks whether the results fit the data. If not, she goes back to point 2).

But then the only use of the math is as an intermediate step in developing an algorithm. Since you already have the algorithm, what use is there for the mathematical formula? Is it just for decoration? Maybe you can get plus points from some people because it looks advanced, but then you must choose carefully what to present to which audience.

I believe that your algorithm will hold up quite well in comparison to many other methods.

I believe that if you simply search for "isotonic regression" and "monotone regression" you will find many good models. Then you will also find many mathematical formulations of the model.
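
For orientation, the basic least-squares form that you will typically find under these search terms looks like this. The notation is chosen here for illustration: y_1, ..., y_n are the observations in time order and m_1, ..., m_n are the fitted values, which are forced to be non-decreasing. For your data, the fitted values would presumably play the role of the irreversible growth component, but that interpretation is my assumption.

```latex
\[
\hat{m} \;=\; \operatorname*{arg\,min}_{m_1 \le m_2 \le \dots \le m_n} \; \sum_{i=1}^{n} (y_i - m_i)^2
\]
```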

If you also add the assumption that the first and second order derivatives exist, so that the model is "smooth", then I guess that you can find many useful sources.

Also, since you write programs in R, I know that there are many applications in this area written in R.
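
To give a rough idea of how compact such a monotone fit can be, here is a minimal sketch of the pool-adjacent-violators algorithm, the standard way to compute a least-squares isotonic (non-decreasing) fit. It is written in Python purely for illustration; in R, the base function isoreg computes this kind of fit. The function name isotonic_fit and the example numbers are made up and are not taken from your data.

```python
def isotonic_fit(y):
    """Least-squares non-decreasing fit via pool-adjacent-violators.

    Illustrative helper (hypothetical name), not a library function.
    """
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last one's,
        # i.e. while the sequence of block means is not non-decreasing.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand every block back to one fitted value (the block mean) per observation.
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)
    return fit


# Made-up daily maxima in micrometers, for illustration only.
daily_maxima = [10.0, 12.0, 11.0, 15.0, 14.0, 14.5]
print(isotonic_fit(daily_maxima))
# prints [10.0, 11.5, 11.5, 14.5, 14.5, 14.5]
```

Each dip in the series is averaged with its neighbours until the fitted curve never decreases, so reversible shrinkage shows up as flat stretches rather than as drops.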

Please tell us about your findings in the future.