University of Kassel WES MScThesis Sargin - MCP Methodology for a Digital Wind Buoy  
 
 
Master's Thesis in MSc. Wind Energy Systems

Analysis and Method Selection of a Measure-Correlate-Predict Methodology for a Digital Wind Buoy
 
 
January 2022 
 
University of Kassel 
Department: Mechanics and Dynamics¹
Degree programme: ONLINE M.SC. WIND ENERGY SYSTEMS – WES.ONLINE 
 
First examiner: Dr. Julia Gottschall
Second examiner: Prof. Dr.-Ing. Detlef Kuhl 
Starting date: 13 July 2021 
Date of submission: 13 January 2022 
 
 
Ahmet Okan Sargin 
 
 
Status of confidentiality: [x] Public, [ ] Internal, [ ] Confidential
 
¹ Department (Fachgebiet): FB 14 Bauingenieur- und Umweltingenieurwesen, Institut für Baustatik und Baudynamik, Baumechanik / Baudynamik
Abstract 
The determination of site-specific wind conditions has a significant influence on the
development and use of offshore wind energy. Lower uncertainty in the wind potential leads to
more cost-effective project financing. Floating lidar systems (FLS), or wind lidar buoys, have
become increasingly common in recent years as a measuring technology for determining the
offshore wind resource. However, due to harsh offshore environmental conditions, offshore
measurements with FLS are prone to reliability issues, which may result in lower data
availability than required by industry guidelines. FLS are hard to reach in winter, when high
winds coincide with high waves. It is not unusual for several months of FLS data to be
unavailable for an MCP process.
Motivated by this problem, this work used a measure-correlate-predict (MCP) method to
determine whether an interim gap-filling step is required as part of a long-term correction
procedure. The performance of a data-filling algorithm based on omnidirectional linear least
squares (LLS) was analyzed in depth at an hourly temporal resolution. Deviations were
introduced by sliding gaps with lengths from 1 day to 60 days, and key performance indicators
(KPIs), including the mean bias error (MBE), mean absolute error (MAE) and root mean square
error (RMSE) of the mean wind speeds over the concurrent periods, were summarized for each
gap scenario. The model performance was assessed for both the training (SelfDF) and validation
(ValDF) periods. The long-term wind speeds were derived for each iteration with and without
the data-filling algorithm.
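The gap experiment described in the preceding paragraph can be sketched in a few lines. The following is a minimal illustration on synthetic hourly data, not the code used in this work: the function names (fit_lls, kpis, gap_scenario), the synthetic reference and measurement series, and the sign convention (predicted minus observed) are assumptions, and the KPIs are computed from hourly errors in their textbook form, which may differ from the definitions based on mean wind speeds used in the thesis.

```python
import numpy as np

def fit_lls(x, y):
    """Ordinary least-squares line y = slope * x + offset (all sectors pooled)."""
    slope, offset = np.polyfit(x, y, deg=1)
    return slope, offset

def kpis(observed, predicted):
    """Mean bias, mean absolute and root mean square error of hourly values."""
    err = predicted - observed  # assumed sign convention
    return {"MBE": float(np.mean(err)),
            "MAE": float(np.mean(np.abs(err))),
            "RMSE": float(np.sqrt(np.mean(err ** 2)))}

def gap_scenario(ref, meas, start, gap_hours):
    """Remove one artificial gap, train on the rest, predict into the gap."""
    gap = np.zeros(ref.size, dtype=bool)
    gap[start:start + gap_hours] = True
    slope, offset = fit_lls(ref[~gap], meas[~gap])
    pred = slope * ref + offset
    self_kpi = kpis(meas[~gap], pred[~gap])   # training period (SelfDF)
    val_kpi = kpis(meas[gap], pred[gap])      # withheld gap (ValDF)
    filled = np.where(gap, pred, meas)        # data-filled series
    return self_kpi, val_kpi, filled

# Synthetic stand-ins for one year of hourly reference and measured wind speeds.
rng = np.random.default_rng(0)
ref = 10.0 * rng.weibull(2.0, size=24 * 365)
meas = 1.05 * ref + 0.3 + rng.normal(0.0, 0.8, size=ref.size)

# Slide a 60-day gap through the year in 30-day steps.
for start in range(0, ref.size - 60 * 24, 30 * 24):
    self_kpi, val_kpi, filled = gap_scenario(ref, meas, start, 60 * 24)
    print(start // 24, round(self_kpi["RMSE"], 3), round(val_kpi["RMSE"], 3))
```

Each printed row gives the gap start day followed by the training and validation RMSE, which is the kind of paired result from which a relationship between self-prediction and validation errors could be examined.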
A strong negative association between the SelfDF and ValDF RMSE of the mean wind speed was
identified for all gap scenarios. This novel relationship (ISPE method) was used to determine
the uncertainties in data-filling. The jackknife algorithm was used to assess the
uncertainties in the long-term correction for both scenarios.
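The jackknife step mentioned in the preceding paragraph can be illustrated as follows. This is a sketch under stated assumptions rather than the thesis implementation: it uses the textbook delete-one-block jackknife, twelve equally sized blocks, a pooled LLS fit and synthetic data, and the names long_term_mean and jackknife_ltws are hypothetical.

```python
import numpy as np

def long_term_mean(ref_conc, meas_conc, ref_long):
    """Fit LLS on the concurrent period and average the long-term prediction."""
    slope, offset = np.polyfit(ref_conc, meas_conc, deg=1)
    return float(np.mean(slope * ref_long + offset))

def jackknife_ltws(ref_conc, meas_conc, ref_long, n_blocks=12):
    """Delete-one-block jackknife of the long-term mean wind speed."""
    blocks = np.array_split(np.arange(ref_conc.size), n_blocks)
    estimates = []
    for block in blocks:
        keep = np.ones(ref_conc.size, dtype=bool)
        keep[block] = False  # leave this block out of the fit
        estimates.append(long_term_mean(ref_conc[keep], meas_conc[keep], ref_long))
    estimates = np.asarray(estimates)
    # Textbook jackknife standard error for k leave-one-out estimates.
    std_err = np.sqrt((n_blocks - 1) / n_blocks
                      * np.sum((estimates - estimates.mean()) ** 2))
    return float(estimates.mean()), float(std_err)

# Synthetic stand-ins: five years of hourly reference data, last year concurrent.
rng = np.random.default_rng(1)
ref_long = 10.0 * rng.weibull(2.0, size=24 * 365 * 5)
ref_conc = ref_long[-24 * 365:]
meas_conc = 1.05 * ref_conc + 0.3 + rng.normal(0.0, 0.8, size=ref_conc.size)

jk_mean, jk_se = jackknife_ltws(ref_conc, meas_conc, ref_long)
print(f"long-term mean wind speed: {jk_mean:.3f} m/s, jackknife std. error: {jk_se:.4f} m/s")
```

Blocks of consecutive hours are left out here instead of single samples so that each subset drops a contiguous part of the concurrent period; the number of blocks is a free choice in this sketch.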
One of the study's main questions was whether a short-term data-filling phase was required
before applying the long-term correction. Both scenarios showed identical long-term wind
speed predictions, negating this requirement for the considered MCP method. This was
primarily due to the omnidirectional regression parameters and the limited impact of the
proportion of gaps on the model fit.
The study reaffirmed the industry recommendation of 80% minimum availability for 
measurement campaign data as a reliable threshold since the mean deviation during 60-day 
gap periods was not more than 0.3% throughout the investigated iterations. 
Acknowledgements 
I received much help, support, understanding, and empathy while working on this master's 
thesis.  
I'd like to sincerely thank my supervisor, Dr Julia Gottschall, for her invaluable advice in 
developing the research questions and methodology. I really appreciated the insightful 
feedback, which always reminded me of the big picture while correctly identifying 
potential issues in the assessment and keeping me headed in the right direction.
I want to express my appreciation to Dr Andre Bisevic, Mr Oppermann from Unikims and the 
rest of the WES team for always being available to answer questions about the online wind 
energy systems (WES) programme over the years. My heartfelt gratitude also goes to Prof. 
Kuhl, who established such a magnificent concept that allowed us to develop and keep up with
the new world without compromising our personal and professional responsibilities. 
I am grateful and honoured to have had the opportunity to work with such outstanding 
colleagues throughout my professional career. Thank you, Bungo, for introducing me to the 
wind industry, and thank you, Michael, Iain, for being there when I was learning the subtle 
nuances. Further, I would like to express my gratitude to Wilhelm, Wolfgang, Anna, and OWC
colleagues for their patience and support throughout the master's thesis process. My warmest
gratitude goes to the valuable wind resource analysts and specialists who took part in the 
questionnaire and provided helpful feedback. I hope that this research will be beneficial to 
them.  
I am indebted for the opportunity and possibilities to live in a free social democracy dedicated 
to human rights, peace, and justice built on scientific principles. Society and our broader 
network are fundamental components of who and what we are, no matter where or when we 
live.  
My dearest friends may have endured even more than I have, and they deserve to be 
appreciated generously. Thank you for being there, Nicole and Friedrich, even while I was 
swamped with work and studies. My soul-brother Kivanc was my mentor and brain trust during 
the whole process. Many others, alongside myself, genuinely value his remarkable positivity, 
productivity, and heartfelt compassion. My gratitude also goes out to pg.lost for making 
the study time enjoyable. 
My mother and father, whose unconditional support and love have sustained me
throughout my life, deserve enormous appreciation. I am privileged and inexpressibly thankful
for their presence. Asuka, you invested even more energy in this study than me. Thank you
for your love, understanding, patience and support. Without you, nothing would be possible.
    
Table of contents 
Abstract
Acknowledgements
Table of contents
List of figures
List of tables
Acronyms and abbreviations
List of notation
1 Introduction
1.1 Research questions
1.2 Literature review and questionnaire
1.3 Methodology overview
2 Methods and materials
2.1 Wind resource assessment
2.2 Statistical methods
2.2.1 Definition of uncertainty
2.2.2 Definition of type A and type B uncertainties
2.2.3 The mean
2.2.4 Variance and standard deviation
2.2.5 Covariance and correlation coefficient
2.2.6 Coefficient of determination
2.2.7 Mean bias, absolute bias and root mean square errors
2.2.8 Standard error
2.2.9 Kolmogorov-Smirnov statistic
2.2.10 Normal distribution
2.2.11 Weibull distribution
2.3 Review of MCP methods in wind resource assessments
2.3.1 Linear regression methods
2.3.2 Bin methods
2.3.3 Matrix methods
2.3.4 Novel computational methods
2.3.5 Quantile mapping methods
2.3.6 Empirical methods
2.4 Definition of the measure-correlate-predict (MCP) algorithms
2.4.1 Type classification of MCP
2.4.2 Definition of an algorithm
2.4.3 Classification of MCP methods
2.5 Questionnaire results
2.6 Definition of the key performance indicators and uncertainties
2.6.1 The interface of the KPIs to the uncertainty method
2.6.2 Uncertainties in the long-term correction
2.6.3 MCP method uncertainty
2.7 Selection of the base-algorithm
2.8 Design of the code for iterative analysis
2.9 Datasets
2.9.1 Selection of the measurement dataset
2.9.2 Selection of the long-term reference dataset
2.9.3 Measurement campaign overview
2.9.4 Pre-processing and data preparation
3 Results
3.1 Evaluation of the MCP algorithms
3.2 Evaluation of the base-case algorithm results
3.2.1 Key performance indicators during the process
3.2.2 Data filling results
3.2.3 Long term correction results
3.3 Evaluation of the DF and LTC uncertainties
3.4 Proposed combined MCP uncertainty method
4 Discussion and conclusions
5 References
Annex A Questionnaire
Annex B PreDF - Sectorwise exemplary results of the concurrent period
Annex C KPI Results
Annex D Evolution of self-prediction RMSE of MWS results
Annex E Evolution of validation MBE of MWS results
Annex F Evolution of validation MAE of MWS results
Annex G Evolution of validation RMSE of MWS results
Annex H Regression plots of self-prediction and validation RMSE
Annex I MMIJ transfer functions to obtain data-filling uncertainties in a representative location
Annex J Evolution of DFWS
Annex K Evolution of LTWS
Annex L Evolution of DF uncertainties
Annex M Evolution of JK uncertainties
Annex N Evolution of final uncertainties in LTWS
 
    
List of figures 
Figure 1-1. Scheme of the measure-correlate-predict (MCP) procedure
Figure 1-2. Flow chart of the methodology
Figure 2-1. Illustration of KS test
Figure 2-2. Block diagram of a typical MCP
Figure 2-3. Illustration of the TLS method
Figure 2-4. Minimization of errors in LLS (left) and TLS (right) with respect to model fit
Figure 2-5. Model fits [57] based on measurements (left) and representative algorithm points (right)
Figure 2-6. VS method (left) and NL-MoM (right) with bins and resulting piecewise linear fits
Figure 2-7. Sample data and first-order model for the wind speed-up
Figure 2-8. Flowchart of the matrix method
Figure 2-9. Polynomial model used within matrix method, samples (left) and polynomial model (right)
Figure 2-10. Schematic diagram of an ANN with 2N wind speed and wind direction input signals of N reference stations and two wind data output signals of the target station
Figure 2-11. Definition of an MCP algorithm at the example of linear regression
Figure 2-12. Mind map of energy production uncertainty according to the draft IEC 61400-15
Figure 2-13. Selection of number of subsets based on concurrent period
Figure 2-14. Sketch of the difference between JK and bootstrap resampling
Figure 2-15. Mapping of sub-uncertainty components
Figure 2-16. Flowchart of evaluation of MCP uncertainty sub-uncertainty components at the example of industry practice [18] and TG6 technical guideline [35], with "correlation" uncertainty shown in amber as target of this study
Figure 2-17. Flow chart of the code
Figure 2-18. Picture of the MMIJ station
Figure 2-19. Weibull fit and histogram of MMIJ measurements in 2013 (left: ECN analysis, right: Fraunhofer IWES dataset)
Figure 2-20. Time synchronisation
Figure 2-21. Annual trend analysis and comparison of reference datasets for the selected long-term period 2000-2018
Figure 3-1. MBE, MAE and DE results of the investigated MCP methods
Figure 3-2. Comparison of LTWS with different MCP methods
Figure 3-3. PreDF – Heatmap of measured Weibull scale and shape factors for 1-day (left) and 60-days gap scenarios (right) in each column, respectively
Figure 3-4. PreDF – Heatmap of R² values of sectorwise hourly wind speeds correlation for 1-day (left) and 60-days gap scenarios (right)
Figure 3-5. PreDF – Box plot of R² values of sectorwise hourly wind speeds correlation for 1-day
Figure 3-6. PreDF – Box plot of R² values of sectorwise hourly wind speeds correlation for 60-days gap
Figure 3-7. PreDF – Heatmap of MBE of Weibull shape (left) and scale (right) factors for all iterations and gap scenarios – weighted from sectorwise analysis
Figure 3-8. SelfDF – Boxplot of R² values of sectorwise hourly wind speeds correlation for 1-day scenario
Figure 3-9. SelfDF – Boxplot of R² values of sectorwise hourly wind speeds correlation for 60-day scenario
Figure 3-10. SelfDF - 3D evolution of RMSE of MWS for all sectors and gaps
Figure 3-11. SelfDF – Heatmap of MBE of Weibull shape (left) and scale (right) factors for all iterations and gap scenarios – weighted from sectorwise analysis
Figure 3-12. SelfDF – Evolution of RMSE of MWS for 1-day (top) and 60-days (bottom) scenarios – omnidirectional analysis
Figure 3-13. SelfDF – Heatmap of RMSE of MWS for all iterations and gap scenarios – omnidirectional analysis
Figure 3-14. ValDF – Evolution of MBE of MWS for 1-day (top) and 60-days (bottom) scenarios
Figure 3-15. ValDF – Evolution of MAE of MWS for 1-day (top) and 60-days (bottom) scenarios
Figure 3-16. ValDF – Evolution of RMSE of MWS for 1-day (top) and 60-days (bottom) scenarios
Figure 3-17. Regression plots of self-prediction and validation RMSE for 1-day (top) and 60-days (bottom) scenarios
Figure 3-18. Evolution of MBE of observed vs predicted wind speeds for 60-days gap period
Figure 3-19. Time series of observed vs predicted wind speeds for 60-days gap period starting on 01.07.2012
Figure 3-20. Scatter plot of observed vs predicted wind speeds for 60-days gap period starting on 01.07.2012
Figure 3-21. Comparison of wind direction frequency of observed vs predicted wind speeds for 60-days gap period starting on 01.07.2012
Figure 3-22. Evolution of STDF-WS for 1-day (top) and 60-days (bottom) gap scenarios
Figure 3-23. Evolution of LTWS without DF and LTWS with DF for 1-day (top) and 60-days (bottom) gap scenarios
Figure 3-24. Scatter plot of DF predicted vs LTC predicted wind speeds for 60-days gap period starting on 01.07.2012
Figure 3-25. Concurrent measured and referenced monthly wind speeds during short-term period
Figure 3-26. Monthly windiness comparison of the short and long-term period
Figure 3-27. Measured wind frequency roses, measurement period 2013 (top left), measurement period 2014 (top right), measurement period 2015 (bottom left), long-term reference period (bottom right)
Figure 3-28. Evolution of DF uncertainties for 1-day and 60 days gap scenarios
Figure 3-29. Evolution of JK uncertainties in LT correction for 1-day and 60 days gap scenarios
Figure 3-30. Evolution of combined uncertainties in LT correction for 1-day and 60 days gap scenarios
Figure 3-31. Comparison of empirical and calculated uncertainties in wind speeds for 60 days gap period starting on 01.07.2012
Figure 3-32. Comparison of bootstrap and calculated uncertainties in wind speeds for 60-days gap period starting on 01.07.2012
 
    
List of tables 
Table 2-1. Statistical characteristics of the wind [25]
Table 2-2. ANN settings at the example of regression methodologies used for the MCP methodology
Table 2-3. SVR settings at the example of regression methodologies used for the MCP methodology
Table 2-4. Classification of MCP methods according to Hanslian
Table 2-5. MCP method 1: Properties of linear regression methods
Table 2-6. MCP method 2: Properties of bin methods
Table 2-7. MCP method 2: Properties of matrix methods
Table 2-8. MCP method 3: Properties of novel computational methods
Table 2-9. MCP Method 4: Properties of quantile mapping methods
Table 2-10. MCP Method 5: Properties of empirical methods
Table 2-11. Summarized survey response to the question regarding KPI metrics
Table 2-12. Definition of KPIs
Table 2-13. PreDF KPI
Table 2-14. SelfDF KPI
Table 2-15. ValDF KPI for the gap
Table 2-16. Uncertainty estimators in the area of long-term corrections method uncertainty
Table 2-17. MCP algorithms for implementation of linear regression (LinReg)
Table 2-18. MCP algorithms for implementation of other methods
Table 2-19. Relationships and datasets for data filling at the example of data segments
Table 2-20. Reference relationships for the KPI classification
Table 2-21. MMIJ Instrumentation
Table 2-22. MMIJ short-term statistics
Table 2-23. Reference dataset statistics for the concurrent and long-term periods
Table 3-1. Coefficients of variation of considered MCP methods
Table 3-2. PreDF - Summary statistics of RMSE of MWS for 1-day and 60-days gap scenarios
Table 3-3. PreDF - Summary statistics of KS of MWS for 1-day and 60-days gap scenarios
Table 3-4. SelfDF - Summary statistics of RMSE of MWS for 1-day and 60-days gap scenarios
Table 3-5. Summary statistics of ValDF for all gap periods
Table 3-6. LLS model parameter of validation period for 60-days gap period (start at 01.07.2012)
Table 3-7. LLS model parameter of LTC for 1-day, 20-days and 60-days scenarios
Table 3-8. Sectorwise LLS model parameter – full measurement period
 
 
Acronyms and abbreviations 
ANN: Artificial neural networks
CAPEX: Capital expenditure
CDF: Cumulative distribution function
DE: Distribution error
DF: Data-filling
DFWS: Data-filled short-term wind speeds
DT: Decision trees
ECMWF: European Centre for Medium-Range Weather Forecasts
ERA5: ECMWF Reanalysis 5th Generation
FLS: Floating lidar systems
ISo1: Industry software 1 - Windographer
ISo2: Industry software 2 - Windfarmer
ISo3: Industry software 3 - WindPRO
JK: Jack-knife
JPD: Joint probability distribution
KPI: Key performance indicator
KS: Kolmogorov–Smirnov
LLS: Linear least squares
LTMOMM: Long-term mean of monthly means
LTWS: Long-term wind speed
MCP: Measure-Correlate-Predict
MLPs: Multilayer perceptrons
MEASNET: Measuring Network of Wind Energy Institutes
ML: Machine learning
MTS: Matrix time series
NL-MoM: Non-Linear method of moments
OLS: Ordinary least squares
PCA: Principal component analysis
PreDF: Prerequisites for data filling
SelfDF: Self-predictions for data-filling
SVR: Support vector regression
TLS: Total least squares
ValDF: Validation for data-filling
VM: Variance method
VS: Vertical slice
WAsP: Wind Atlas Analysis and Application Program
WD: Wind direction
WDD: Wind direction deviation
WPD: Wind power density
WRA: Wind resource assessment
WS: Wind speed
WTG: Wind turbine generator
List of notation 
 : predicted mean
$x$: independent variable
$y_i$: observed (measured) value
$\hat{y}_i$: predicted value
$\bar{x}$: sample mean
$s_x^2$: sample variance
$s_x$: standard deviation
e: the random variable from the triangular distribution corresponding to the standard deviation of the ratios
r: the average of wind speed ratios at the target site
1 Introduction 
The recent coalition agreement of the German federal government laid out specific targets for
renewables, aiming to meet 80% of electricity demand with renewable energy. Part of this plan
envisions substantial investment in offshore wind, targeting 30 GW by 2030 and ramping up to
70 GW by 2045 [1].
Because wind is the fuel of wind energy projects, the assessment of the wind resource for
offshore projects plays a fundamental role in project financing. The stakeholders in the industry
have therefore established best practices and standards. One of the key institutions is the
Measuring Network of Wind Energy Institutes (MEASNET), a group of commercial institutes
that aims to standardise wind energy measuring processes so that findings can be mutually
recognised and used interchangeably.
The MEASNET guideline "Evaluation of Site-Specific Wind Conditions" establishes the
methodology and standards for a site assessment approach that yields well-founded outcomes
using state-of-the-art techniques and procedures [2]. The guideline prescribes a clear
requirement for site-specific wind measurements as input to wind resource and energy yield
assessments.
Floating lidar technology was first launched in 2009 as an offshore wind measuring technology
aimed at the wind industry's particular demands for wind resource assessment applications.
Since then, floating lidar systems (FLS), or wind lidar buoys, have become increasingly common
as a measuring technology for determining the offshore wind resource. They replace wind
measuring masts with comparable accuracy at significantly lower cost and with shorter
mobilization times, saving a large portion of the project's initial capital expenditure (CAPEX)
[3].
As FLS technology matured and several commercial deployments were made, it became
apparent that the post-processed data of FLS measurement campaigns exhibited gaps [4].
The typical way to handle data gaps in an onshore campaign would be to synthesise the
missing data from another anemometer on the same mast, or to use a correlation analysis
with nearby measurement masts. In the offshore environment, an FLS failure results in the
unit's complete downtime, making an intra-FLS correlation impossible. Therefore, other
methods for dealing with data gaps within the measurement period are being
investigated [4].
This study aims to investigate the impact of the data gaps on the long-term wind speeds as 
part of the "Digital Wind Buoy" project, which has been started by Fraunhofer IWES to develop 
procedures to address the limitation of the above-mentioned data gaps. Within the scope of 
this project, methods will be analyzed, evaluated and developed to synthesize and extend 
measurement data to long-term periods by means of numerical models [5]. 
Further, as the stakeholders in the offshore wind industry are looking for ways to minimise the 
uncertainties in the energy yield prediction, the impact on the uncertainty has been 
investigated. 
In the subsequent subsections, a literature review, research questions, the feedback from 
stakeholders and the design of the particular analysis are presented. 
1.1 Research questions 
FLS are the de-facto standard measurement technology for offshore wind resource
assessment. However, the lower post-processed system availabilities of FLS
compared to offshore platforms require that the uncertainty and bias introduced
by a data gap be understood in a quantifiable manner [3].
Gap filling of meteorological time series is required for various applications requiring 
continuous data series, such as time series analysis, meteorological and climatological 
modelling [6]. Motivated by the industry problem stated in the previous paragraph, Fraunhofer 
IWES looked at the effect of data gaps in terms of bias in estimating siting parameters and 
how to mitigate it by correlating and filling in the gaps with data from mesoscale models [4]. 
The authors of [4] have shown that the influence of gaps grows steadily with gap length during 
the measurement periods. 
On both short and extended time periods, wind speed (WS) is subject to irregular variation with
a wide variety of time scales superimposed on each other [7]. Current procedures require
at least one year (defined as short-term) of measured wind data at the location of interest to
provide a meaningful wind resource evaluation [2]. That, however, is not sufficient to capture
the variation of the wind characteristics from year to year.
Measure-Correlate-Predict (MCP) methods or another form of long-term scaling approach must be
used to estimate long-term wind conditions based on the short-term measurement at the
measurement location (target) [2]. The MCP methodology as defined by MEASNET is
summarized in Figure 1-1.
Figure 1-1. Scheme of the measure-correlate-predict (MCP) procedure  
 
Source: [2] 
In the same guideline, MEASNET characterises an MCP methodology as suitable for both
long-term correction and data filling of the wind speed time series. Similarly, Fraunhofer IWES
applied the MCP procedure to test the impact of data gaps in its recent study [4].
Therefore, as an initial step, this study investigated the MCP methods that are broadly used
within the industry, in order to understand why stakeholders use the state-of-the-art
techniques rather than alternatives. This is further extended by discussing whether the industry
can learn from this experience and by short-listing methods with good prospects.
By investigating the best method for data filling and its applicability for a wind resource
assessment, this study aims to identify a suitable method to fill the data gaps of an offshore
measurement. The impact of a data gap on the results is a key criterion for a robust
wind resource assessment. Hence, an indication of the maximum acceptable duration of a data
gap is considered a very valuable outcome of the study. The basic functions of the different
methods should be laid out briefly to ensure proper application, and an appropriate method
should be applied for the analysis.
In order to evaluate the uncertainty of the findings, users of an MCP method must have long-
term data at each location from which to draw conclusions. There is a resulting concern about 
whether it is possible to assess the MCP prediction uncertainty using just the long-term 
reference site data and the shorter-term concurrent data at the target site [8]. This problem 
could be investigated by recording key performance indicators (KPIs) for the different analysis 
steps. 
The definition of KPIs to identify the most appropriate data filling method is a challenge. What 
would be the minimum acceptable criteria (key parameter) to perform the operations of the 
selected method? Is the uncertainty calculation a proxy to define the "best method"? Or, should 
the selected uncertainty method be applicable to the data-filling (DF) process and not 
"long-term correction"? How should the uncertainties of step 1 (gap filling) and step 2
(long-term correction) be combined, given that they might depend on each other?
This study proceeds with the primary question of whether an interim step of data filling is 
necessary before the application of the long-term correction. A tangible method to define the 
uncertainty of the analysis has been investigated. 
Typically, the expected end results of a completed gap-filling and long-term correction operation
constrain the type of analysis. As the wind industry moves towards energy time series
as a key deliverable, the investigated method should ideally be suitable to deliver such output.
Other final deliverables are Weibull statistics or sector-wise wind frequency distributions.
Finally, the best combination (sequence) of methods to conduct a data-filling (sometimes also
referred to as data synthesis or gap-filling) and long-term correction exercise is investigated
as the final step.
The first research question has been approached in two steps. The first step consisted of a
literature review on existing MCP methods, followed by a stakeholder questionnaire as the
second step. The results of the literature review and questionnaire informed the decision about
the methodology.
A tangible outcome of this study is to inform the reader about the overall maximum acceptable 
gap duration in a year for an offshore measurement campaign for a robust wind resource 
assessment. 
It is noted that secondary investigations could be carried out to confirm the robustness of the
gap-filling process. Environmental variables could be investigated within such an analysis, which
could explore the relationship between reference and target data in order to account for
different weather conditions. There could be a situation where a certain "outlier" weather
condition is present within the concurrent data but is not representative of the expected
long-term conditions, so that certain weather events could skew the results. Further
consideration of specific environmental conditions might therefore be relevant for the
procedure.
And as the frequency of extreme weather events is expected to increase due to climate 
change, one should think about whether this assumption is likely to influence the investigated 
method beyond the fact that such extreme events are likely to increase the data gaps in 
commercial floating measurement campaigns. 
1.2 Literature review and questionnaire 
A review of the literature on MCP methods and the use of data-filling of gaps was conducted 
to inform and structure the rest of the master’s thesis. The literature review included industry 
publications, white papers, user manuals of industry-standard software, published books and 
peer-reviewed journal articles. Cross-references from well-known studies like Carta [9] were 
helpful to obtain more information on the research topics. Further, a stakeholder questionnaire 
was designed and distributed to key industry experts to collect feedback on the research 
questions and suitable methods. 
As the long-term wind climate properties are needed in wind resource and energy yield 
assessments, and as obtaining complete time series data over the whole historical period is 
typically not possible, the purpose of any long-term wind correction is to derive a statistical 
representation of the expected long-term climate or an equivalent time series. The initial MCP 
methods were introduced in the 1940s to estimate the long-term mean annual wind speed 
based on a single reference station [9]. 
The relationship between the reference and observed datasets can be mathematically defined 
as a transfer model. According to [10], there are at least four main kinds of transfer models: 
1. Models that represent the physics of the wind flow (e.g., CFD flow models) 
2. Statistical models 
3. Empirical models 
4. Other (combinations of the above, such as Wind Atlas Analysis and Application Program 
(WAsP)) 
MCP models may fall into any category or a combination of them, showing that MCP models
can be used in a broad range of situations [10]. According to Addison [11], MCP techniques,
in comparison to physical modelling methods, often give a better degree of accuracy,
particularly in complex terrain. Physical models like CFD or WAsP might also introduce
unquantifiable uncertainty into the prediction process. As a result of these advantages, MCP
techniques have become a frequently used tool for wind farm developers and have been
integrated into wind energy software packages [9].
The statistical MCP methods and corresponding correlation techniques introduced by Derrick 
[12], Mortimer [13], Taylor [14], Bechrakis [15], and Rogers [8] were investigated [16]. These 
methods are introduced and discussed in Section 2.3 alongside selected empirical methods. 
It is noted that MCP methods are typically used to estimate the magnitude of the wind speed
at the target location but not the wind direction (WD). The literature on MCP
approaches does not specifically address stand-alone wind direction prediction at a target
location [17].
Mifsud also mentions that MCP techniques predict the long-term wind speed at a location but
not the wind direction [17]. As discussed in Section 2.4, wind direction measurements
are typically used as a classifier to divide the wind speeds into different bins or sectors, which
are further processed in the respective algorithms.
1.3 Methodology overview 
Following the literature review and stakeholder questionnaire, the different MCP methods are 
discussed, followed by the preparation of the target and reference datasets for MCP to select 
a suitable method for a gap-iteration algorithm. The complete methodology applied within this 
study is presented in Figure 1-2. 
In line with the standard industry convention, the data-filling and long-term correction
methods used in this study do not replace the observed (measured) time series but instead
extend the existing observed dataset to the long term [18].
The long-term correction of the entire measured period was conducted repeatedly with the 
industry-standard engineering software Windographer and WindPRO, equating to a total of 43 
MCP runs. Further, a performance test algorithm has been run within the Windographer 
software to compare the available MCP methods in terms of their performance. Based on the 
sensitivity analysis of the final long-term wind speeds (LTWS) and results of the performance 
test, the omnidirectional linear regression method, with least-squares model fit with offset, was 
identified as a suitable solution for the iterative analysis. The sectorwise results were
analyzed in parallel during the concurrent period to gain confidence in the MCP
algorithm's performance and to collate the KPI metrics.
The KPIs were evaluated in three groups. The first group is "PreDF", the acronym for
"prerequisites for data filling". PreDF looks at the relationship between the reference and
observed datasets in order to judge whether the reference dataset is suitable
for the MCP application. The PreDF KPIs also provide a benchmark for the performance of the
MCP, as they do not involve any model and only compare the two independent datasets over
the concurrent short-term measurement period.
A common analysis option for MCP performance is to test the result of predictions against a
known result, which is referred to as "self-prediction" [19]. Typically, this method is used to
slice sufficiently long measured data into chunks and compare the prediction results for
these chunks with the measured data [20]. It is noted that the term "self-prediction" is
used in a slightly different context here, where the training period and the target period are
identical. In statistics, the difference between model output and observed value is sometimes
referred to as a residual, defining the accuracy of the model using the prediction error [21]. It
is noted that in this study, the term "residual" is reserved for a random error introduced by the
model, as discussed in [22]. The second group of KPIs was obtained by evaluating the
model performance over the concurrent periods. The linear fit obtained from the correlation
between the measured and reference datasets for the concurrent periods is used to obtain the
self-predictions for the same period. The results are compared with the measured data. The
KPIs derived from this comparison are referred to as the self-predictions for data-filling
(SelfDF).
The third KPI group is gathered from the analysis of the gap periods, as these provide the
"true" performance of the MCP data-filling procedure. The focus was mainly on the mean wind
speed. The relationship between the measured and gap periods was investigated as
well to gain an understanding of the related uncertainties. The label for this KPI group is
ValDF, standing for "validation for data-filling".
Different gap periods, from one day up to sixty days, were investigated to find a
quantifiable metric to forecast the performance of the data-filling and long-term correction
algorithm with European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis
5th Generation (ERA5) as the reference dataset. For each gap period, the gap was cut from the
combined dataset, producing a measured period with an artificial gap. A sectorwise linear
least-squares regression between the measured and reference hourly wind speed values was
first applied within this training period (the measured period with a gap) to confirm the
applicability of the linear regression model. In the subsequent run, an omnidirectional linear
regression model was used due to computational limitations. This model fit was used both to
obtain the self-prediction performance and to predict the wind speeds in the introduced
artificial gap.
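The gap-cutting and data-filling step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in this study: the function names `fit_linear` and `fill_gap`, the use of plain lists of hourly wind speeds and the index-based gap definition are all hypothetical.

```python
def fit_linear(x, y):
    """Ordinary least-squares fit y = slope * x + offset (omnidirectional)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fill_gap(target, reference, gap_start, gap_length):
    """Cut an artificial gap from the target series and fill it from the reference."""
    gap = range(gap_start, gap_start + gap_length)
    # Training period: all concurrent hours outside the artificial gap.
    train_ref = [r for i, r in enumerate(reference) if i not in gap]
    train_tgt = [t for i, t in enumerate(target) if i not in gap]
    slope, offset = fit_linear(train_ref, train_tgt)
    # Measured values are kept; only the gap hours are replaced by predictions.
    filled = list(target)
    for i in gap:
        filled[i] = slope * reference[i] + offset
    return filled, slope, offset


# Synthetic example (assumed data): 48 hourly values, 6-hour gap starting at hour 10.
reference = [5.0 + (i % 7) for i in range(48)]
target = [0.9 * r + 0.5 for r in reference]
filled, slope, offset = fill_gap(target, reference, gap_start=10, gap_length=6)
```

Because the synthetic target is an exact linear function of the reference, the filled values reproduce the removed ones. With real measurements, the difference between the two is what the ValDF KPIs quantify.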
The performance of a measure-correlate-predict (MCP) algorithm for data-filling with linear
least squares was analysed in detail using two years of the IJmuiden met mast (MMIJ)
measurements (see Section 2.9.3), both with and without a data-filling process. A temporal
resolution of one hour was selected for the correlations and the model.
The comparison of the root mean square errors (RMSE) of the mean wind speed (MWS) of
the self-prediction and validation periods shows a strong negative correlation for the
investigated periods, obtained from the metrics of the incremental gaps within the calculated
gap period. The function describing this relationship was used as a proxy to assess the quality
of the prediction (following the MCP approach). This process is used to calculate the
uncertainty in the data-filling of the gaps. The average of the data-filled short-term period is
referred to as the data-filled short-term wind speed (DFWS).
The analysis proceeded by obtaining the long-term wind speeds in a new set of iterations
in two loops. Within the inner loop, the gap was moved through the measurement
period by shifting the start time by one day. The outer loop increased the gap duration
incrementally by one day, from one day up to a total of 60 days.
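The two nested loops can be illustrated with a short Python generator. This is a sketch only: the names `gap_iterations` and `gap_hours` are hypothetical, and it is assumed here that each gap must lie entirely within the measurement period, which the text does not state explicitly.

```python
def gap_iterations(n_days, max_gap_days=60):
    """Yield (gap_days, start_day) pairs for the two nested loops."""
    for gap_days in range(1, max_gap_days + 1):         # outer loop: gap duration
        for start_day in range(n_days - gap_days + 1):  # inner loop: shift start by one day
            yield gap_days, start_day


def gap_hours(gap_days, start_day):
    """Convert a (gap_days, start_day) pair into a range of hourly indices."""
    return range(start_day * 24, (start_day + gap_days) * 24)


# Small example (assumed values): a 5-day period with gaps of up to 2 days.
iterations = list(gap_iterations(n_days=5, max_gap_days=2))
```

Each yielded pair defines one artificial gap, for which the data-filling and long-term correction steps are then repeated.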
The LTWS was calculated in two scenarios for each of the iterations mentioned above: with
and without a data-filling procedure. In the first scenario, the regression model was used to
fill the gap. Subsequently, the gap-filled dataset was used as if it were a measured
time series, and a new regression model was obtained for the long-term correction. This
relationship was used to calculate the final LTWS for the first scenario. The second scenario
was designed to obtain the LTWS without the data-filling procedure. Here, the regression model
was obtained from the relationship between the measured period, including the gap, and the
concurrent reference time series. The LTWS was then calculated using the same method.
The uncertainties in the long-term correction were calculated using a jack-knife (JK) algorithm 
[8] using four subsets for each iteration. The results of the LTWS and uncertainties are 
compared to derive the conclusions. Subsequently, the final uncertainty in the MCP method 
was obtained by combining the uncertainty in data-filling and long-term correction. The 
shortcomings and future work are also discussed. 
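A jack-knife estimate of the long-term correction uncertainty with four subsets could look as follows. The sketch assumes a delete-one-group jack-knife over contiguous blocks of the concurrent period and takes the mean of the regression-predicted long-term reference series as the LTWS; the function names and these choices are illustrative assumptions, and the exact subset definition and LTWS calculation used in this study may differ.

```python
import math


def fit_linear(x, y):
    """Ordinary least-squares fit y = slope * x + offset."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def jackknife_ltws(target, reference, long_term_reference, n_subsets=4):
    """Return (mean LTWS estimate, jack-knife standard uncertainty)."""
    n = len(target)
    estimates = []
    for k in range(n_subsets):
        lo, hi = k * n // n_subsets, (k + 1) * n // n_subsets
        # Leave out one contiguous subset and refit on the remaining data.
        x = reference[:lo] + reference[hi:]
        y = target[:lo] + target[hi:]
        slope, offset = fit_linear(x, y)
        predicted = [slope * r + offset for r in long_term_reference]
        estimates.append(sum(predicted) / len(predicted))
    mean_estimate = sum(estimates) / n_subsets
    variance = (n_subsets - 1) / n_subsets * sum(
        (e - mean_estimate) ** 2 for e in estimates
    )
    return mean_estimate, math.sqrt(variance)


# Synthetic example (assumed data): exactly linear relationship, so the spread is ~0.
reference = [5.0 + (i % 7) for i in range(48)]
target = [0.9 * r + 0.5 for r in reference]
long_term_reference = [4.0 + (i % 9) for i in range(200)]
ltws, uncertainty = jackknife_ltws(target, reference, long_term_reference)
```

With real data, the four leave-one-out fits differ, and the spread of the resulting LTWS estimates is interpreted as the uncertainty of the long-term correction.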
Figure 1-2. Flow chart of the methodology 
 
Source: Author’s own illustration 
2 Methods and materials 
The statistical methods commonly used in MCP procedures are introduced within this section, 
and a brief introductory paragraph about wind resource assessment is given. The review of 
MCP methods summarises the available MCP algorithms based on a literature review. After 
that, the reviewed methods are grouped into classes to gain an overview of their applicability 
for the purpose of the study. Key performance indicators are introduced and discussed, 
followed by uncertainty assessment methods. Finally, the selection of the base-case algorithm 
for the iterative gap analysis is introduced. The code design based on the base-case algorithm
and the datasets used are described in the last sections.
2.1 Wind resource assessment 
Wind resource assessment (WRA) is the discipline of determining the long-term wind climate 
and expected seasonal, diurnal, spatial and temporal variation at a proposed renewable 
energy project location. The outcome of a wind resource assessment typically includes long-
term representative wind conditions at a hub height of a wind turbine generator (WTG), and 
sometimes across the rotor plane. Following flow modelling based on the wind climate 
statistics, the energy yield is modelled at a project location using WTG specific power and 
thrust curves as well as project-specific loss and uncertainty estimations.  This information is 
used as input to a financial model to calculate the financial performance of the wind project. 
As a result, WRA is the most important activity in determining the feasibility of a wind energy 
project [23]. 
The purpose of this short subsection is to remind the reader of the link between the kinetic
energy of the wind, the rotor radius and the wind speed, and to emphasize that even a per-mille
improvement in accuracy, as well as any increase in the uncertainty estimates, has a high impact
on the financial model of offshore projects.
Wind energy is proportional to the cube of the wind speed. The wind power density (WPD), or 
the power per unit of area normal to the direction the wind is blowing, is a commonly used unit 
of measurement as shown in the below equation [24]; 
 $$p_w = \frac{1}{2}\,\rho\,v_w^3 \quad [\mathrm{W/m^2}]$$ [e1]
 where:
 $\rho$ = air density at standard atmosphere [kg/m³]
 $v_w$ = wind velocity [m/s]
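The cubic dependence in [e1] can be made tangible with a short numerical example. The air density of 1.225 kg/m³ (standard atmosphere at sea level) and the wind speeds used below are assumptions chosen for illustration only.

```python
def wind_power_density(wind_speed, air_density=1.225):
    """Wind power density in W/m^2 following [e1]: p_w = 0.5 * rho * v_w**3."""
    return 0.5 * air_density * wind_speed ** 3


base = wind_power_density(10.0)        # reference wind speed of 10 m/s (assumed)
raised = wind_power_density(10.1)      # wind speed 1 % higher
relative_gain = raised / base - 1.0    # about 3 %, since 1.01**3 is roughly 1.0303
```

A 1 % error in the wind speed therefore translates into roughly a 3 % error in the available power density, which is why small biases in the long-term wind speed matter for the financial model.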
The kinetic energy advected by an air stream is proportional to the wind speed to the third 
power. Emeis states, therefore, that the climatological mean wind speed is insufficient to 
determine the amount of wind energy available at a particular location since wind turbines may 
react to real wind speeds in seconds. Additionally, stresses and vibrations on structures such 
as wind turbines are highly dependent on the wind spectrum's high-frequency components. As 
a result, it is critical to quantify the wind speed's spatial structure and temporal oscillations. 
This may be accomplished by the computation of the wind speed distribution at a given location 
using representative long-term time series [25]. 
Data distributions are commonly approximated by mathematical functions with a small number 
of parameters. Emeis summarizes commonly used wind statistics parameters as shown in 
Table 2-1 [25]. 
Table 2-1. Statistical characteristics of the wind [25]

| Parameter | Description |
|---|---|
| Mean wind speed | Indicates the overall wind potential at a given site, expected wind speed for a given time interval (first central moment) |
| Wind speed fluctuation | Deviation of the momentary wind speed from the mean wind speed for a given time interval |
| Wind speed increment | Wind speed change for a given time span |
| Variance | Indicates the mean amplitude of temporal or spatial wind fluctuations, expected fluctuation in a given time interval (second central moment) |
| Standard deviation | Indicates the mean amplitude of temporal or spatial wind fluctuations (square root of the variance) |
| Turbulence intensity | Standard deviation normalized by the mean wind speed |
| Gust wind speed | Maximum wind speed in a given time interval |
| Skewness | Indicates the asymmetry of a wind speed distribution around the mean value (third central moment) |
| Kurtosis (flatness) | Indicates the width of the wind speed distribution around the mean value (fourth central moment) |
| Excess kurtosis | Kurtosis minus 3 |
| Frequency spectrum | Indicates the frequencies at which the fluctuations occur |
| Autocorrelation | Indicates the gross spatial scale of the wind speed fluctuations, Fourier transform of the spectrum |
| Structure function | Indicates the amplitude of wind speed fluctuations, computed from wind speed increments |
| Turbulent length scale | Indicates the size of the large energy-containing eddies in a turbulent flow |
| Turbulent time scale | Indicates the time within which wind fluctuations at one point are correlated |
| Probability density function | Indicates the probability with which the occurrence of a certain wind speed or wind speed fluctuation can be expected |

Source: [25]
2.2 Statistical methods 
This section discusses the fundamental statistical procedures that were used in the 
investigations for this study. When dealing with substantial statistical populations, in this 
specific case, wind measurements in the boundary layer, counting every object in the 
population is impossible. Hence the computation must be done on a sample of the population. 
Therefore a subset of the dataset is assumed to represent the statistical population subject to 
analysis [26]. 
The dataset used in this analysis is considered a sample of the available statistical population.
The following definitions are made with regard to the notation:
 $\hat{y}_i$ = sampled predicted value
 $y_i$ = sampled measured value [e2]
 $x_i$ = sampled reference value
2.2.1 Definition of uncertainty 
The formal definition of "uncertainty of measurement" used in this analysis is, as
defined in [27], the quantity associated with a measurement result that characterises the scatter
of values that may reasonably be attributed to the measured physical quantity. Standard
uncertainty is the uncertainty of a measurement result expressed as a standard deviation [27].
Annex E of IEC 61400-12-1 includes a comprehensive summary of the theoretical basis for 
determining the uncertainty using bin-wise calculations [28]. 
2.2.2 Definition of type A and type B uncertainties 
Type A evaluation assesses uncertainty by the statistical analysis of a series of observations,
whereas Type B evaluation does not rely on the statistical analysis of observations
[27]. The Type A and Type B classifications are intended to identify the two distinct
ways of assessing uncertainty components. It should be noted that both forms of evaluation 
are based on probability distributions, and the uncertainty components produced by either type 
are quantified using variances or standard deviations. [27] states that Type B uncertainties are 
obtained by scientific judgement based on the pool of available information. The uncertainty 
assessment conducted within this study is categorised as Type A uncertainty. 
2.2.3 The mean 
The summation of the observations divided by the count of observations gives the arithmetic
mean [29]. Time is considered as an independent variable for averaging. The sample mean,
$\bar{x}$, is given by the following equation:
 $$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$ [e3]
 
2.2.4 Variance and standard deviation 
The expectation of a random variable's squared difference from its population mean or sample 
mean is called variance. Variance is the measure of the spread, or how much a data group 
deviates from its average value [30]. The following equation gives the sample variance: 
 $$s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$ [e4]
The standard deviation is the positive square root of the variance. The quantity $s_x$ represents
the experimental standard deviation of the measurement dataset and is given by the following
formula for a series of n measurements of the same measurand [27]:
 $$s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$ [e5]
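Equations [e3] to [e5] translate directly into code. The following minimal sketch uses only the Python standard library; the function names and the example wind speeds are assumptions chosen for illustration.

```python
import math


def sample_mean(x):
    """Arithmetic mean according to [e3]."""
    return sum(x) / len(x)


def sample_variance(x):
    """Sample variance according to [e4], using the n - 1 denominator."""
    mean = sample_mean(x)
    return sum((xi - mean) ** 2 for xi in x) / (len(x) - 1)


def sample_std(x):
    """Experimental standard deviation according to [e5]."""
    return math.sqrt(sample_variance(x))


wind_speeds = [8.0, 9.0, 10.0, 11.0, 12.0]   # hourly wind speeds in m/s (assumed)
mean_ws = sample_mean(wind_speeds)
var_ws = sample_variance(wind_speeds)
std_ws = sample_std(wind_speeds)
```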
2.2.5 Covariance and correlation coefficient 
Covariance measures how two variables change together, whereas variance describes how a
single variable varies. Covariance can therefore be interpreted as a measure of this paired
co-movement. The expectation value is used to describe the covariance between two random
variates, x and y, each having a sample size of n [31]. The covariance is given in [e6]:

 $$\mathrm{cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$ [e6]
The Pearson correlation coefficient, or Pearson product-moment correlation coefficient (PMCC),
is a statistic that measures the linear correlation between two sets of data. The values of the
sample Pearson correlation coefficient vary between -1 and 1. The Pearson
correlation coefficient is referred to as the "correlation coefficient" in this study [32].
The correlation coefficient is a measure of how well two variables are linearly related and is
obtained by dividing the covariance by the product of the two variables' standard deviations;
the sign of the correlation coefficient shows the direction of the linear relationship
[32]. For the sample correlation, values of +1 or -1 correspond to data points
lying exactly on a line. The equation is given in [e7]:
 $$r_{xy} = \frac{s_{xy}}{s_x s_y}$$ [e7]
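A minimal sketch of the correlation coefficient in [e7] is given below. It relies on the fact that the normalisation factor (1/n or 1/(n-1)) cancels in the ratio as long as the covariance and both standard deviations use the same factor; the function name and example data are assumptions for illustration.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient r_xy = s_xy / (s_x * s_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares; the common normalisation factor cancels.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


reference = [1.0, 2.0, 3.0, 4.0]
r_positive = pearson_r(reference, [2.0, 4.0, 6.0, 8.0])   # points on a rising line
r_negative = pearson_r(reference, [8.0, 6.0, 4.0, 2.0])   # points on a falling line
```

The two example series lie exactly on a line, so the coefficient takes its extreme values with opposite signs.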
2.2.6 Coefficient of determination 
The square of the sample correlation coefficient is commonly abbreviated as R² and is a
special case of the coefficient of determination. For a simple linear regression with an
intercept, R² is simply the square of the sample correlation coefficient between the observed
outcomes and the observed predictor values [33].
The coefficient of determination can be expressed as a percentage, indicating the share of the
variance of the observed values that is explained by the regression: if R² is 0.80,
then the regression line explains 80 percent of the variance of the points in consideration [34].
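The equivalence between R² and the squared correlation coefficient for a straight-line fit with offset can be checked numerically. The sketch below fits a linear least-squares model and computes R² from the residual and total sums of squares; the function names and the four data pairs are assumptions for illustration.

```python
def fit_linear(x, y):
    """Least-squares fit y = slope * x + offset."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, offset):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (slope * a + offset)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


def squared_correlation(x, y):
    """Square of the sample correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


reference = [6.0, 8.0, 10.0, 12.0]   # reference wind speeds in m/s (assumed)
target = [6.5, 8.1, 10.4, 12.0]      # measured wind speeds in m/s (assumed)
slope, offset = fit_linear(reference, target)
r2 = r_squared(reference, target, slope, offset)
r2_from_r = squared_correlation(reference, target)
```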
2.2.7 Mean bias, absolute bias and root mean square errors 
Mean bias, absolute bias and root mean square errors are important metrics for the definition 
of the method uncertainty in data-filling, as discussed later in this study in Section 2.6.2 [35].  
Mean bias error, or statistical bias, occurs when the predicted value of the results differs
from the true underlying quantitative parameter being evaluated [36]. The mean bias error
(MBE) is, therefore, the metric that describes how closely, on average, a collection of predicted
values matches a set of observed values, and is given by the following equation:

 $$\mathrm{MBE} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)$$ [e8]
The mean absolute bias, or mean absolute error (MAE), is the arithmetic average of the absolute
errors and is a measure of the errors between paired observations expressing
the same phenomenon [37]. The MAE is defined in the following equation:

 $$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$ [e9]
The root-mean-square error (RMSE) is a metric for comparing the values predicted by a model
or estimate to the values observed. When computed on data outside the sample used for the
estimation, the differences are referred to as prediction errors [38].
The RMSE is derived by taking the square root of the average of the squared errors. It is a
measure of how spread out these prediction errors are, and it is often used to validate
experimental results in climatology, forecasting, and regression analysis [39]. The RMSE is
defined in the following equation:
 
 $$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2}$$ [e10]
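The three error metrics in [e8] to [e10] can be computed as in the following sketch. The function names and the example values are assumptions; the example is chosen so that positive and negative errors cancel in the MBE but not in the MAE or RMSE.

```python
import math


def mean_bias_error(predicted, measured):
    """MBE according to [e8]: signed errors may cancel each other."""
    return sum(p - m for p, m in zip(predicted, measured)) / len(measured)


def mean_absolute_error(predicted, measured):
    """MAE according to [e9]."""
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(measured)


def root_mean_square_error(predicted, measured):
    """RMSE according to [e10]."""
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared / len(measured))


predicted = [8.0, 9.5, 11.0, 12.5]   # model output in m/s (assumed)
measured = [8.5, 9.0, 11.5, 12.0]    # observations in m/s (assumed)
mbe = mean_bias_error(predicted, measured)
mae = mean_absolute_error(predicted, measured)
rmse = root_mean_square_error(predicted, measured)
```

In this example the MBE is zero although every single prediction is off by 0.5 m/s, which illustrates why the MBE should be read together with the MAE or RMSE.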
2.2.8 Standard error 
The standard error of a statistic is the standard deviation of its sampling distribution.
The standard error of the mean is the standard deviation of the sampling distribution of the
mean. Standard errors are crucial for the application of confidence intervals and significance
testing [40]. The accuracy of a statistic is commonly expressed in terms of its standard error,
which is a measure of the spread of the sampling distribution [40]. In other words, the standard
error is a measure of the uncertainty in the estimated model parameter values [41] and is given
by the following formula:
 $$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$ [e11]
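As a brief illustration of [e11], the standard error of the mean can be computed from the sample standard deviation. The function name and the example values are assumptions for illustration.

```python
import math


def standard_error_of_mean(x):
    """Standard error of the mean according to [e11]: s / sqrt(n)."""
    n = len(x)
    mean = sum(x) / n
    s = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))
    return s / math.sqrt(n)


wind_speeds = [8.0, 9.0, 10.0, 11.0, 12.0]   # m/s (assumed)
sem = standard_error_of_mean(wind_speeds)
```

Note that this formula assumes independent samples; hourly wind speeds are typically autocorrelated, so the value should be read as a lower bound in that case.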
2.2.9 Kolmogorov-Smirnov statistic 
The two-sample Kolmogorov-Smirnov (named after Andrey Kolmogorov and Nikolai 
Smirnov) test is used to determine how closely the distribution of a set of predicted values 
matches that of observed or true values. 
The Kolmogorov–Smirnov test (KS test) is a nonparametric test used to compare two samples 
(two-sample KS test) in statistics [42]. The test examines the cumulative distributions of two 
datasets and calculates the greatest vertical distance between their empirical distribution 
functions. The test is sensitive to changes in the location and shape of the samples [42]. A test 
statistic of zero will result from two datasets with identical cumulative distributions. 
Figure 2-1 illustrates the KS test statistic, where the black arrow represents the two-sample 
KS statistic, whereas the red and blue lines are empirical distribution functions.  
Figure 2-1. Illustration of KS test 
 
Source: [42] 
The KS test statistic is defined in the following equation: 
 $$D = \sup_x \left| F_0(x) - F_{data}(x) \right|$$  [e12] 
 where: 
 $F_0(x)$ = the cumulative distribution function (CDF) of the predicted distribution 
 $F_{data}(x)$ = the empirical distribution function of the observed dataset 
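The statistic in [e12] can be sketched for the two-sample case described above, in which both the predicted and the observed values are represented by their empirical distribution functions. This is an illustrative implementation only; the function name `ks_statistic` is an assumption, and statistical packages provide equivalent routines together with significance levels.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest vertical gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Both empirical CDFs are step functions that only change at data points,
    # so it is sufficient to evaluate the gap at every observed value.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect.bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical predicted and observed wind speeds in m/s
print(ks_statistic([1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]))
```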
 
In addition to the KS test, the distribution error (DE) introduced by UL [43] can be calculated 
from the predicted and observed frequency distributions, as defined in the Windographer 
manual [43], using the following equation: 
 
 $$DE = \sum_{i=1}^{N} \frac{\left(\hat{F}_i - F_i\right)^2}{F_i}$$  [e13] 
 where: 
 $\hat{F}_i$ = frequency of the ith bin of the true observed distribution 
 $F_i$ = frequency of the ith bin of the predicted distribution 
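A possible implementation of [e13] is sketched below. The bin width, the number of bins, and the decision to skip bins with a predicted frequency of zero (to avoid a division by zero) are assumptions made for this illustration and are not prescribed by the text above.

```python
def bin_frequencies(speeds, bin_width, n_bins):
    """Relative frequency of wind speeds per bin; speeds above the range go to the last bin."""
    counts = [0] * n_bins
    for v in speeds:
        idx = min(int(v // bin_width), n_bins - 1)
        counts[idx] += 1
    return [c / len(speeds) for c in counts]


def distribution_error(observed_freq, predicted_freq):
    """Sum over bins of (observed - predicted)^2 / predicted, following [e13]."""
    de = 0.0
    for f_obs, f_pred in zip(observed_freq, predicted_freq):
        if f_pred > 0.0:  # assumption: bins without predicted data are skipped
            de += (f_obs - f_pred) ** 2 / f_pred
    return de


# Hypothetical frequency distributions with two bins
print(distribution_error([0.5, 0.5], [0.25, 0.75]))
```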
 
2.2.10 Normal distribution 
The normal distribution is a continuous probability distribution for a real-valued random 
variable, which is the most important and extensively used distribution in statistics [44]. Normal 
distributions are broadly used in statistics, for example, to describe real-valued random 
variables whose distributions are unknown in the natural sciences. The probability density 
function (PDF) of the normal distribution is given with the following formula [44]: 
 $$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\bar{x}}{\sigma}\right)^2}$$  [e14] 
2.2.11 Weibull distribution 
The frequency distribution of wind speed is typically defined in a compact form by means of a 
Weibull distribution [45]. The two-parameter Weibull distribution is expressed mathematically 
in [45] as: 
 $$f(u) = \frac{k}{A}\left(\frac{u}{A}\right)^{k-1} e^{-\left(\frac{u}{A}\right)^{k}}$$  [e15] 
 where: 
 u = horizontal wind speed [m/s] 
 f(u) = frequency of occurrence of wind speed u 
 A = scale parameter [m/s] 
 k = shape parameter [-] 
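The Weibull density in [e15] can be evaluated with a few lines of code. The sketch below also uses the standard result that the mean of a Weibull distribution is A·Γ(1 + 1/k); the parameter values A = 10 m/s and k = 2 are hypothetical and chosen only for illustration.

```python
import math


def weibull_pdf(u, scale_a, shape_k):
    """Two-parameter Weibull density following [e15]; returns zero for negative speeds."""
    if u < 0.0:
        return 0.0
    ratio = u / scale_a
    # Note: for u == 0 this sketch assumes shape_k >= 1 (otherwise the density is unbounded).
    return (shape_k / scale_a) * ratio ** (shape_k - 1.0) * math.exp(-(ratio ** shape_k))


def weibull_mean(scale_a, shape_k):
    """Mean wind speed implied by the Weibull parameters."""
    return scale_a * math.gamma(1.0 + 1.0 / shape_k)


# Hypothetical parameters: A = 10 m/s, k = 2
print(weibull_pdf(10.0, 10.0, 2.0))
print(weibull_mean(10.0, 2.0))
```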
2.3 Review of MCP methods in wind resource assessments 
A common approach used within MCP methods is shown in the block diagram in Figure 2-2. 
The operation is divided into two steps by the authors of [9].  
The first part is to study the concurrent period to establish a link between the reference and 
observed datasets. The observed relationship is applied to the reference dataset to obtain the 
long-term site-specific time series in the second step. However, it is noted that this is not 
always identical in each MCP method, and sometimes the relationship might be applied to the 
short-term dataset [9]. Further, within the wind resource industry, it is common to apply the 
relationship only to the remaining period of the reference dataset and to combine the result 
with the measured dataset to obtain the long-term site-specific dataset. The resulting 
long-term time series is commonly referred to as an extended dataset. 
Reference data is defined as a consistent, sufficiently long and high-quality time series of 
the same measurement types (in this case, wind speed and wind direction) with a high temporal 
resolution, such as an hourly resolution. Long-term wind measurement data, reanalysis data, 
mesoscale analysis data, the long-term yield of wind turbines, or wind indexes derived from 
wind turbine yield data or wind data might be used as reference data, depending on the use 
case and application [35]. 
Figure 2-2. Block diagram of a typical MCP 
 
Source: As presented in [9]  
As stated by Addison [11], the main difficulty of MCP lies in the prediction model. 
Historically, MCP methods have focused on deriving the long-term wind speed as accurately as 
possible. Nevertheless, the MCP procedure also accounts for the wind direction deviation, as 
described in the next paragraph. 
When predicting the long-term site wind speed and direction distribution, systematic direction 
changes between reference and site observations may be taken into account, although it is 
typically expected that the long-term wind direction distributions remain the same. The 
direction shift between the time series is determined by binning the reference time series 
into direction sectors and obtaining the difference between the site direction and the 
reference direction for each time step. The mean of these differences within a direction 
sector is used as the offset for that sector [46]. The offset is then added to the reference 
wind direction measurements to obtain the long-term representative site wind direction time 
series. 
The main MCP methods are presented briefly in the subsequent sections to inform the rest of 
the study. 
2.3.1 Linear regression methods 
Linear regression models the relationship between two variables by fitting a linear equation to 
observed data. One variable is regarded as an independent variable, while the other is 
regarded as a dependent variable [47]. For visualisation of the relationship, a scatterplot is 
often deployed where the correlation coefficient (see Section 2.2.5) is used as a numerical 
measure of association between these two variables. The linear regression line has the 
following formula, where x is the independent variable, m the slope, b the offset and y the 
dependent variable. 
 y = mx + b  [e16] 
There are various methods for how the linear regression line can be fitted. The most common 
sub-methods (please refer to Section 2.4.2 regarding the taxonomy used in this analysis) are 
linear least squares (LLS) and total least squares (TLS). Further, the variance ratio method is 
discussed. 
There are three primary LLS formulas to choose from, as shown in [48]: 
• Ordinary least squares (OLS) 
• Weighted least squares  
• Generalized least squares 
As OLS is the variant primarily used within the wind industry and recommended practices, the 
OLS method is referred to as LLS within this analysis, in line with the broad use of the LLS 
acronym within the wind industry [49]. As shown in Figure 2-4, it minimizes the vertical 
distance (residual) between the data points and the model fit. Derrick [12] noted that the LLS 
fitting approach is the simplest and most often used method for obtaining a model from a 
collection of points in wind resource assessments [50]. The linear fit parameters of the LLS 
are calculated using the equations in [46], as shown below: 
 
 $$m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$  [e17] 
 $$b = \bar{y} - m\bar{x}$$  [e18] 
 where, $\bar{y}$ = predicted mean 
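Equations [e17] and [e18] translate directly into code. The following sketch fits an omnidirectional LLS line to a handful of hypothetical concurrent wind speed pairs, where x stands for the reference and y for the target wind speed; the function name and the values are illustrative assumptions.

```python
def lls_fit(x, y):
    """Slope m and offset b of the linear least squares fit, following [e17] and [e18]."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    m = s_xy / s_xx
    b = y_mean - m * x_mean
    return m, b


# Hypothetical concurrent wind speeds in m/s
reference = [2.0, 4.0, 6.0, 8.0]
target = [3.0, 4.0, 7.0, 8.0]
m, b = lls_fit(reference, target)
predicted = [m * xi + b for xi in reference]
print(m, b)
```

By construction, the fitted line passes through the point formed by the two means, so the mean of the predictions over the concurrent period equals the mean of the target values.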
On the other hand, the TLS sub-method minimizes the sum of squared errors (residuals) measured 
orthogonally to the line of best fit, as shown in Figure 2-4. It is also known as 'orthogonal 
least squares' and is sometimes referred to as the York method [51]. The industry-standard 
WindFarmer software refers to TLS as the principal component analysis (PCA) method [20]. PCA 
is the technique of calculating the principal components and using them to perform a change 
of basis on the data [52]. The WindFarmer theory manual notes that the principal components 
are the uncorrelated parameters of the dataset [46]. The TLS method is illustrated in 
Figure 2-3, with the orthogonal distance from the fit given by the equation below: 
 
 $$d_i = \frac{y_i - m x_i - b}{\sqrt{m^2 + 1}}$$  [e19] 
Figure 2-3. Illustration of the TLS method 
 
Source: Author’s illustration based on [46] 
The slope and offset values of the TLS fits are calculated as shown in the following equations 
in [46]; 
 $$m = -B + \sqrt{B^2 + 1}$$  [e20] 
 $$B = \frac{1}{2}\,\frac{\sum_i \left[(x_i - \bar{x})^2 - (y_i - \bar{y})^2\right]}{\sum_i (x_i - \bar{x})(y_i - \bar{y})}$$  [e21] 
 $$b = \bar{y} - m\bar{x}$$  [e22] 
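The TLS parameters in [e20] to [e22] can be computed as sketched below. The root with the positive sign in [e20] always yields a positive slope, so this sketch assumes positively correlated reference and target wind speeds and rejects other inputs; the function name is hypothetical.

```python
import math


def tls_fit(x, y):
    """Slope m and offset b of the total least squares fit, following [e20] to [e22]."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_yy = sum((yi - y_mean) ** 2 for yi in y)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    if s_xy <= 0.0:
        # Assumption: only the positive-slope case covered by [e20] is handled here.
        raise ValueError("sketch assumes positively correlated x and y")
    big_b = 0.5 * (s_xx - s_yy) / s_xy
    m = -big_b + math.sqrt(big_b ** 2 + 1.0)
    b = y_mean - m * x_mean
    return m, b


# Hypothetical concurrent wind speeds in m/s
print(tls_fit([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 7.0, 8.0]))
```

For the same hypothetical data, the TLS slope is slightly steeper than the LLS slope, because errors are attributed to both variables rather than to the target only.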
Figure 2-4. Minimization of errors in LLS (left) and TLS (right) with respect to model fit 
  
Source: [53] (left), [54] (right)  
In statistics and economics, orthogonal regression has a long-standing tradition [51]. In 
certain cases, it has been considered preferable to standard least squares. The primary reason 
is that conventional LLS minimizes only the vertical distance between the data and the fitted 
line. When there is no clear confidence in the independent (reference) dataset, and the 
dependent and independent variables are likely to have the same error margin, the conventional 
LLS might therefore fail [51]. On the other hand, if there is more confidence in the 
independent variable, the LLS might perform better. 
It is noted that higher-order polynomials may be used in modelling the relationship between 
the reference and measured (target) datasets. This was not further investigated within this 
analysis as linear fits were found to provide reasonable results for wind resource applications 
[10].  
According to [10], regression MCP techniques can be improved beyond typical linear 
regression methods if they contain a residual distribution model. WindPRO, for example, 
implemented this approach to capture the energy content of MCP adjusted site wind 
distributions better than regression models without this option [10]. The residual is defined as 
the random error in the model. In WindPRO, the residuals can be introduced to the linear 
regression model by assuming a zero mean Gaussian distribution or a model constrained on 
both wind direction and wind speed [22]. 
Rogers [8] developed the variance method (VM) in response to a limitation of linear regression, 
in which the wind resource may be underestimated for poorly correlated datasets [16]. It 
entails constraining the variance of the predicted wind speed at the target location to match 
the variance of the observed wind speed at the target site. This is presented in the following 
equation: 
 $$\frac{y_i - \bar{y}}{s_y} = \frac{x_i - \bar{x}}{s_x}$$  [e23] 
 Source: [16]  
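Rearranging [e23] for the predicted value gives a line with slope s_y / s_x that passes through the two means. The sketch below illustrates this under the assumption that s_x and s_y are sample standard deviations over the concurrent period; the function name and the data are hypothetical.

```python
import statistics


def variance_method_fit(x, y):
    """Slope and offset of the variance method, rearranged from [e23]."""
    m = statistics.stdev(y) / statistics.stdev(x)
    b = statistics.mean(y) - m * statistics.mean(x)
    return m, b


# Hypothetical concurrent wind speeds in m/s
reference = [2.0, 4.0, 6.0, 8.0]
target = [3.0, 4.0, 7.0, 8.0]
m, b = variance_method_fit(reference, target)
predicted = [m * xi + b for xi in reference]
# The predictions reproduce both the mean and the standard deviation of the target data.
print(statistics.stdev(predicted), statistics.stdev(target))
```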
Multiple linear regression is a type of regression model in which more than one regressor 
variable is included [17]. With the development of statistical computer packages, multiple 
linear regression has become one of the most frequently used statistical procedures [55]. In 
multiple linear regression, the regressors may be functions of the independent variables, such 
as quadratic or hyperbolic terms. The relationship is nevertheless still considered a linear 
regression, as the model is linear in its coefficients [56]. 
2.3.2 Bin methods 
The method of bins was introduced by Beltran as an alternative to linear regression. It is based 
on the method of bins of the power curve performance measurement standard [28]. It has been 
shown that, in addition to being employed in power curve measurements, this approach can be 
used to estimate wind speed data from nacelle anemometers [57]. 
To determine the wind speed, the dataset is separated into bins and sectors. The target wind 
speeds are binned against the reference wind speeds in steps of 0.5 m/s. In each bin with more 
than 10 data points, the means of the reference and target wind speeds are determined. Then, 
as given in the following equation, a linear interpolation between these points provides the 
target wind speed [57]. 
 
 $$\hat{W}_i^{tar} = W_b^{tar} + \left(W_i^{ref} - W_b^{ref}\right)\frac{W_{b+1}^{tar} - W_b^{tar}}{W_{b+1}^{ref} - W_b^{ref}}$$  [e24] 
 $\hat{W}_i^{tar}$ = predicted target wind speed 
 $W_i^{ref}$ = reference wind speed at time step i 
 $W_b^{tar}$ = bin average of the target measured wind speed in bin b 
 $W_b^{ref}$ = bin average of the reference measured wind speed in bin b 
 Source: [57]  
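The binning and interpolation steps described above can be sketched as follows for a single direction sector. The function names are hypothetical, and the linear extrapolation beyond the first and last bin averages is an assumption of this sketch rather than part of the cited method.

```python
import bisect


def bin_means(ref, tar, bin_width=0.5, min_count=10):
    """Bin averages (reference, target) for every bin holding more than min_count pairs."""
    bins = {}
    for r, t in zip(ref, tar):
        bins.setdefault(int(r // bin_width), []).append((r, t))
    points = []
    for idx in sorted(bins):
        pairs = bins[idx]
        if len(pairs) > min_count:
            ref_mean = sum(p[0] for p in pairs) / len(pairs)
            tar_mean = sum(p[1] for p in pairs) / len(pairs)
            points.append((ref_mean, tar_mean))
    return points


def predict_target(w_ref, points):
    """Linear interpolation between neighbouring bin averages, following [e24]."""
    if len(points) < 2:
        raise ValueError("at least two populated bins are required")
    refs = [p[0] for p in points]
    b = bisect.bisect_right(refs, w_ref) - 1
    b = max(0, min(b, len(points) - 2))  # assumption: extrapolate from the outermost segment
    (r0, t0), (r1, t1) = points[b], points[b + 1]
    return t0 + (w_ref - r0) * (t1 - t0) / (r1 - r0)


# Hypothetical bin averages (reference, target) in m/s
points = [(5.0, 5.5), (6.0, 6.9), (7.0, 8.0)]
print(predict_target(5.5, points), predict_target(6.5, points))
```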
The model fits based on measurements and the representative algorithm points are illustrated 
in Figure 2-5. 
Figure 2-5. Model fits [57] based on measurements (left) and representative algorithm 
points (right)  
  
Source: [57] 
The “Vertical Slice” (VS) MCP method fits a piecewise linear curve to a scatter plot of target 
wind speeds versus reference wind speeds [58]. The scatter plot is created from the wind speed 
at the target site and the concurrent wind speed at the reference location and is sectioned 
into equal-sized vertical stripes. For each stripe, the mean target site wind speed and the 
mean reference wind speed are calculated, and the resulting pair is shown on the diagram. The 
fit is then obtained by connecting the pairs linearly, where the initial line starts at the 
origin [59]. 
Leblanc further introduced a slightly revised version of the VS method similar to the LLS. This 
method is called the Non-Linear Method of Moments (NL-MoM), and is similar to the VS 
method in that it likewise splits the wind speed plot into bands or slices. However, as seen in 
Figure 2-6, the slices are perpendicular to the TLS linear fit of the data [58]. 
The VS method and NL-MoM are illustrated in Figure 2-6. 
Figure 2-6. VS method (left) and NL-MoM (right) with bins and resulting piecewise linear 
fits 
  
Source: [58]. 
2.3.3 Matrix methods 
Matrix methods are nonlinear models employing a joint probability distribution (JPD) instead 
of attempting to impose a linear connection between two variables [51]. The prerequisite of 
linear models having residuals with a normal distribution is not required [51] for this method. 
According to [60], “matrix methods” is the overarching term for MCP methods in which the wind 
speed and wind direction measurements are used to classify the data into bins of more than a 
single dimension. Hanslian [37] also notes that the term “matrix method” is not used 
consistently within the industry: it often refers to different methods, and identical methods 
are sometimes named differently. The classic matrix method introduced by [61] and Anderson 
[51], as applied within WindPRO, is discussed here, as it is a commonly used approach. 
The matrix methods are based on the notion that long-term site data can be described using 
simultaneous onsite and reference data measurements. A combined joint distribution between 
the two variables, wind speed-up and wind veer, is used to represent the relationship [62]. The 
wind speed-up and wind veer are calculated based on the differences between the site and 
the reference concurrent wind speed and wind directions. The outcome of the differences is 
then sorted according to the reference wind speed and wind direction in the form of two 
matrices, with each element corresponding to a user-inferred reference wind speed and 
reference wind direction bin [10]. An example of the wind speed model for three sectors is 
shown in Figure 2-7. 
Figure 2-7. Sample data and first-order model for the wind speed-up 
 
Source: [10] 
Thøgersen notes in [19] that the method for modelling the joint distribution matrix should be 
determined by the specific dataset. According to [19], a mix of binned sample distributions and 
modelled joint Gaussian distributions might provide reasonable results. 
As mentioned above, the matrix approach is based on the joint distribution of the measured 
wind speed-ups and wind veers [19]. Hence, for each measured sample, the following pair of 
quantities is calculated, as per [19]: 
 $$\Delta y = y_{observed} - y_{reference}$$  [e25] 
 $$\Delta\theta = \theta_{observed} - \theta_{reference}$$ 
 where, 
 $\Delta y$ = wind speed-up 
 $y_{observed}$ = observed wind speed 
 $y_{reference}$ = reference wind speed 
 $\Delta\theta$ = wind veer 
 $\theta_{observed}$ = observed wind direction 
 $\theta_{reference}$ = reference wind direction 
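The calculation of the speed-up and veer samples and their sorting into reference wind speed and direction bins can be sketched as follows. The wrapping of the veer into the interval [-180°, 180°), the 1 m/s speed bins, the 12 sectors, and the use of simple bin means are assumptions of this illustration; the function names are hypothetical.

```python
def wrap_angle(delta_deg):
    """Map an angular difference to the interval [-180, 180) degrees."""
    return (delta_deg + 180.0) % 360.0 - 180.0


def speedup_veer_matrices(obs_ws, obs_wd, ref_ws, ref_wd, ws_bin=1.0, n_sectors=12):
    """Mean wind speed-up and mean wind veer per (reference speed bin, reference sector)."""
    sector_width = 360.0 / n_sectors
    samples = {}
    for y_obs, th_obs, y_ref, th_ref in zip(obs_ws, obs_wd, ref_ws, ref_wd):
        ws_idx = int(y_ref // ws_bin)
        wd_idx = min(int((th_ref % 360.0) // sector_width), n_sectors - 1)
        delta_y = y_obs - y_ref                      # wind speed-up, [e25]
        delta_theta = wrap_angle(th_obs - th_ref)    # wind veer, wrapped (assumption)
        samples.setdefault((ws_idx, wd_idx), []).append((delta_y, delta_theta))
    speed_up = {}
    veer = {}
    for key, pairs in samples.items():
        speed_up[key] = sum(p[0] for p in pairs) / len(pairs)
        veer[key] = sum(p[1] for p in pairs) / len(pairs)
    return speed_up, veer


# Hypothetical concurrent samples: wind speeds in m/s, directions in degrees
speed_up, veer = speedup_veer_matrices(
    obs_ws=[8.5, 9.0], obs_wd=[5.0, 15.0], ref_ws=[8.0, 8.4], ref_wd=[355.0, 359.0]
)
print(speed_up, veer)
```

Both hypothetical samples fall into the same bin (reference wind speed 8 to 9 m/s, northernmost sector), which illustrates why the veer has to be wrapped: the raw differences of about -350° correspond to a veer of only 10° to 16°.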
The flowchart of the above-discussed matrix method is shown in Figure 2-8. 
Figure 2-8. Flowchart of the matrix method 
 
Source: Author’s illustration based on [10] 
As shown in the concurrent period container of Figure 2-8, whenever observed data pairs are 
not available, the sample distribution statistics are used to fit a model. The model is then 
used to conduct interpolations and extrapolations into bins where no data is available. The 
sample distributions are calculated using the Wood and Watson (WW) method, as discussed in 
[61]. The WW method is a sector-bin approach that uses regression analysis to identify the 
transfer function that describes the relationship between observed and reference datasets [63]. 
The parametric distribution is defined by the mean, standard deviation and correlation values. 
The samples and the fitted polynomial model are shown in Figure 2-9 below. 
Figure 2-9. Polynomial model used within matrix method, samples (left) and polynomial 
model (right) 
 
Source: [10] 
The matrix time series (MTS) method by Lambert [64] is an adapted version of the matrix method 
[51] and is applied within the Windographer industry software. The first step of the MTS 
method is to build the joint probability distribution. The algorithm generates a cumulative 
distribution function (CDF) using the joint probability distribution and the reference 
dataset, which is then used to convert the observed dataset to a percentile time series. A 
percentile value of 50% corresponds to the expected average based on the reference time series 
at the corresponding time step [51]. Finally, a Markov-based reconstruction algorithm is used 
to extend the observed percentile time series to the long term. This algorithm generates 
artificial data matching the measured data in terms of frequency distribution, seasonal and 
diurnal patterns, and autocorrelation [43]. 
Windographer converts the synthetic percentile time series results into desired wind speed 
values in the final step. By utilizing the JPD to determine the target wind speed for each 
percentile value and reference wind speed in each time step, Windographer is reversing the 
previous procedure. Windographer employs the percentile value instead of the reference wind 
speed to get the predicted wind speed for that time step. Rather than retaining seasonal and 
diurnal patterns and autocorrelation, this step preserves the statistical link between observed 
and reference wind speeds [43]. 
Mortimer's approach [13] is another nonlinear method similar to the matrix method. The wind 
speed observations are binned by the reference site's wind speed and direction. Two matrices 
are then created: one containing the average ratio of the observed wind speed to the reference 
site's wind speed in each bin, and the other containing the standard deviations of these 
ratios [65]. The equation below is used to predict the wind speed: 
 $$y_i = (r + e)\,x_i$$  [e26] 
 where: 
 r = the average of wind speed ratios at the target site 
 e = the random variable from the triangular distribution corresponding to the standard 
 deviation of the ratios 
 Source: [13], [65] 
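The prediction step in [e26] can be sketched for a single bin as follows. The source does not specify how the triangular distribution is parametrized; this sketch assumes a symmetric triangular distribution centred on zero whose standard deviation equals that of the ratios (for a half-width a, the variance of such a distribution is a²/6). The function name and the numerical values are hypothetical.

```python
import math
import random


def mortimer_predict(x_ref, ratio_mean, ratio_std, rng):
    """Predicted target wind speed for one reference value, following [e26]."""
    # Assumption: symmetric triangular distribution with mean 0 and standard
    # deviation ratio_std, i.e. half-width = ratio_std * sqrt(6).
    half_width = ratio_std * math.sqrt(6.0)
    if half_width == 0.0:
        e = 0.0
    else:
        e = rng.triangular(-half_width, half_width, 0.0)
    return (ratio_mean + e) * x_ref


# Hypothetical bin statistics: mean ratio 1.05, standard deviation of the ratios 0.04
rng = random.Random(42)
predictions = [mortimer_predict(8.0, 1.05, 0.04, rng) for _ in range(5)]
print(predictions)
```

Because e is drawn at random, repeated predictions for the same reference wind speed scatter around r·x, which is intended to preserve the variability of the target wind speeds.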
2.3.4 Novel computational methods 
Besides the linear regression and matrix models, there are also several novel computational 
methods for conducting an MCP. These are mainly artificial neural networks (ANNs) and machine 
learning (ML) methods, including support vector regression (SVR) and decision trees (DTs) 
[17]. 
Due to their capacity to identify patterns in noisy or otherwise difficult data, ANNs have been 
employed to correlate and predict wind data [66]. A neural network comprises linked neurons 
that each take a set of weighted inputs. An activation function causes the neuron to provide 
an output when the weighted inputs are above a threshold. A feedforward neural network has 
layers of neurons with no lateral or backward connections. In the case of MCP, the network's 
input layer receives the data from the reference location, and the network's last layer is the 
output layer providing the extended time series [16]. 
The weights of the interconnections and biases between the neurons in the different levels are 
updated through a learning process. The Levenberg–Marquardt algorithm may be used for this 
process [17]. Feedforward networks with multilayer perceptrons (MLPs) are typically used to 
do the regression [67].  
Within an example study in [9], various reference stations' wind speed and direction were fed 
into an ANN's input layer. The model performed better when the wind direction was added to 
the input signal. As the number of reference stations increased, so did estimation inaccuracies. 
The schematic diagram is shown in Figure 2-10. 
Figure 2-10. Schematic diagram of an ANN with 2N wind speed and wind direction input 
signals of N reference stations and two wind data output signals of the target station 
 
Source: [9] 
The following ANN setup was used for the study in [17], as shown in Table 2-2. 
Table 2-2. ANN settings at the example of regression methodologies used for the MCP 
methodology 
| Parameter | Value |
|---|---|
| WS - input values | Wind speed and wind direction at the reference site |
| WS - output values | Wind speed at the target site |
| WD - input values | Wind velocity vector in selected directions at reference |
| WD - output values | Wind velocity vector in selected directions at target |
| Number of layers | three |
| Number of neurons in layer | 30, 30, 10 |
| Training methodology | Levenberg–Marquardt algorithm |
| Percentage of points used for training | 70% |
| Percentage of points used for verification | 15% |
| Percentage of points used for testing | 15% |
Source: Author’s compilation, extract from [17] 
Provided that input data is accurate enough and the training was done effectively, ANN is a 
potential method that may serve as an alternative for long-term corrections in the wind sector 
[68]. 
2.3.4.1 Machine learning algorithms 
Machine learning is the study of computer algorithms that learn from experience and data. 
Machine learning algorithms create a model using training data to make predictions or 
decisions without being explicitly programmed. Machine learning algorithms can be based on 
ANNs as well [69]. ML is typically divided into supervised learning, unsupervised learning, 
and reinforcement learning. 
By resolving the surrogate model construction problem as a quadratic programming problem, 
the SVR method offers a novel method for constructing smooth, nonlinear regression 
approximations [66]. The transfer function is shown in the below equation: 
 $$\tilde{f}(x) = \langle w, \phi(x) \rangle + b$$  [e27] 
 $$\left|\tilde{f}(x_i) - f(x_i)\right| \le \varepsilon$$ 
 where, 
 $f(x_i)$ = function to be approximated 
 $w$ = set of coefficients 
 $\phi(x)$ = map from input space to feature space 
 $\varepsilon$ = maximum tolerated error, predefined 
It is further noted in [66] that the coefficient w may be found by solving a quadratic programming 
problem with slack variables and a cost function. An exemplary parametrisation of an SVR 
application is provided in Table 2-3. 
Table 2-3. SVR settings at the example of regression methodologies used for the MCP 
methodology 
| Parameter | Value |
|---|---|
| WS - input values | Wind speed and wind direction at the reference site |
| WS - output values | Wind speed at the target site |
| WD - input values | Wind velocity vector in selected directions at reference |
| WD - output values | Wind velocity vector in selected directions at target |
| Method | Hyperparameter optimisation |
| Kernel | Gaussian |
| Solver | Sequential minimal optimisation |
Source: Author’s compilation, extract from [17] 
A decision tree is an ML method with a hierarchical data structure that uses the "divide and 
conquer" strategy [17]. A single decision tree model divides the feature space into regions 
and fits a basic model to each region [70]. Consider an example with a continuous response 
variable y and two independent variables x1 and x2. In the first stage of the regression, the 
space spanned by x1 and x2 is split into two parts, each of which is modelled separately; the 
variable and the split-point are selected so that the best fit is attained. The operation is 
then repeated within each partition until a preset stopping rule is met [70]. 
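The first stage of such a regression tree can be illustrated for a single independent variable. The sketch below searches for the split-point that minimizes the sum of squared errors when each side is modelled by its mean; the use of the mean as the basic model and of the squared error as the fit criterion are common choices and are assumed here, and the function name is hypothetical.

```python
def best_split(x, y):
    """Split-point on x that minimizes the summed squared error of the two side means."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical x values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = sum((v - mean_l) ** 2 for v in left) + sum((v - mean_r) ** 2 for v in right)
        if best is None or sse < best["sse"]:
            best = {
                "threshold": 0.5 * (pairs[i - 1][0] + pairs[i][0]),
                "left_mean": mean_l,
                "right_mean": mean_r,
                "sse": sse,
            }
    return best


# Hypothetical reference (x) and target (y) wind speeds in m/s
print(best_split([3.0, 4.0, 5.0, 9.0, 10.0, 11.0], [2.0, 2.0, 2.0, 6.0, 6.0, 6.0]))
```

A full tree would apply the same search recursively to each of the two resulting regions, and across all independent variables, until the stopping rule is met.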
Another method, named diffusion-based transformation, was proposed by Nielsen. In this 
method, measurements and reference data are transformed to Gaussian variables prior to 
establishing a statistical correlation. For this purpose, a novel transformation algorithm was 
developed, inspired by Gastner and Newman's cartogram approach, which was initially created 
for showing themed maps in geographic information systems. Additionally, with the wind data 
converted to Gaussian variables, conditional simulation of time series was performed using 
Fourier transformation [71]. 
Gradient boosting is another application of machine learning. The gradient boosting technique 
gradually improves prediction capacity by creating many models and focusing on 
difficult-to-estimate training cases [70]. Körner [6] has shown gradient boosting to be a very 
effective technique for filling gaps in meteorological time series. There are various 
advantages to using gradient boosting over multiple linear regression or neural networks. 
Compared to neural networks and multiple linear regression, the computations may be performed 
in 1/500 to 1/300 of the time on a standard desktop PC [6]. 
2.3.5 Quantile mapping methods 
The primary basis of the U&N method is the Q-Q method. This is a quantile method that 
consists of plotting quantile values derived from probability distributions of two datasets. If the 
relation between the two datasets is linear, then the Q–Q plot shows a straight line. It 
ignores simultaneity and focuses on the statistics of the datasets [20]. The U&N technique is 
oriented around the wind direction and wind speed, focusing on the probability distributions of 
both parameters. In contrast to the majority of other LTC methods, concurrency is merely used 
to ensure that the data represents the same time period. According to the authors, the 
approach could be enhanced by incorporating stability [20]. 
The SpeedSort approach fits, sector by sector, a linear regression model with a non-zero 
intercept by comparing the observed wind speed data to the reference dataset. Because the line 
fitting procedure requires separate sorting of reference and site wind speeds, the fitted line 
assesses the relationship between wind speed frequency distributions rather than hourly 
values. Additionally, a veer analysis is performed, which results in the direction and speed of 
long-term reference sites being adjusted. The technique includes sector binning, sorting wind 
speeds, fitting the line and calculating the average veer for each sector prior to extending 
the short-term time series to the long-term [72]. 
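The effect of sorting the reference and site wind speeds separately before fitting can be illustrated with the sketch below for a single sector. It only demonstrates the sort-then-fit idea; the use of an ordinary least squares fit is an assumption, and the further steps of the procedure listed above (sector binning, veer analysis) are not reproduced. The function name is hypothetical.

```python
def sorted_speed_fit(ref_speeds, site_speeds):
    """Fit a line to independently sorted reference and site wind speeds (one sector)."""
    xs = sorted(ref_speeds)
    ys = sorted(site_speeds)
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(xs, ys))
    s_xx = sum((xi - x_mean) ** 2 for xi in xs)
    m = s_xy / s_xx  # assumption: ordinary least squares on the sorted pairs
    b = y_mean - m * x_mean
    return m, b


# Hypothetical wind speeds in m/s; the time order of the two series does not match
reference = [6.0, 4.0, 8.0, 2.0]
site = [5.0, 9.0, 3.0, 7.0]
print(sorted_speed_fit(reference, site))
```

Although the two hypothetical series are poorly matched in time, the sorted values lie exactly on the line y = x + 1, which shows that the fit relates the frequency distributions rather than the concurrent values.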
2.3.6 Empirical methods 
The bulk speed ratio (BSR) algorithm is an empirical method deployed by ISo1. It uses a 
relatively straightforward approach of matching observed (target) and reference wind speed 
data, assuming a linear relationship with only a slope parameter and no offset. The slope is 
computed by dividing the mean target wind speed by the mean reference wind speed [73]. 
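A minimal sketch of the BSR calculation is given below. The function name and the wind speed values are hypothetical; the means are taken over the concurrent period.

```python
def bsr_slope(target, reference):
    """Bulk speed ratio: mean target wind speed divided by mean reference wind speed."""
    return (sum(target) / len(target)) / (sum(reference) / len(reference))


# Hypothetical concurrent wind speeds in m/s
target = [6.0, 8.0, 10.0]
reference = [5.0, 7.0, 9.0]
slope = bsr_slope(target, reference)

# Applying the slope to hypothetical long-term reference values (no offset)
long_term_reference = [4.0, 7.0, 10.5]
long_term_prediction = [slope * x for x in long_term_reference]
print(slope, long_term_prediction)
```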
The 'Weibull Fit' algorithm is an MCP method proposed by van Lieshout [74] and implemented 
within ISo1. The exponent β of the Weibull fit is equal to the ratio of the Weibull shape 
factors at the reference and target sites, and the scale factor α is equal to the Weibull scale 
factor at the target site divided by the Weibull scale factor at the reference site raised to 
the power β. The Weibull fit method employs a power law model of the following form [73]: 
 $$\bar{y} = \alpha x^{\beta}$$  [e28] 
 where: 
 $$\beta = \frac{k_x}{k_y}$$ 
 $$\alpha = \frac{A_y}{A_x^{\beta}}$$ 
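The power law in [e28] can be sketched as follows, assuming that the Weibull parameters of the reference (x) and target (y) datasets have already been fitted. The parameter values and function names are hypothetical.

```python
import math


def weibull_fit_coefficients(a_ref, k_ref, a_tar, k_tar):
    """Coefficients alpha and beta of the power law in [e28]."""
    beta = k_ref / k_tar
    alpha = a_tar / a_ref ** beta
    return alpha, beta


def weibull_fit_predict(x, alpha, beta):
    """Predicted target wind speed for a reference wind speed x."""
    return alpha * x ** beta


# Hypothetical Weibull parameters: reference A = 9 m/s, k = 2.2; target A = 10 m/s, k = 2.0
alpha, beta = weibull_fit_coefficients(9.0, 2.2, 10.0, 2.0)
print(weibull_fit_predict(9.0, alpha, beta))
```

With these coefficients, each quantile of the reference Weibull distribution is mapped onto the same quantile of the target Weibull distribution; in particular, the reference scale parameter is mapped onto the target scale parameter.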
  
 
Wind index is an empirical MCP approach that is utilized in ISo3. It performs the MCP analysis 
using monthly averages of the energy production without consideration of the directional 
distribution of the wind climate. While this approach may seem simplistic and rudimentary 
compared to more advanced MCP methods, it offers significant benefits in terms of stability 
and performance, even when other MCP methods appear to fail [10]. 
The KH (Knut Harstveit) method is a non-linear MCP technique, utilized at Kjeller Vindteknikk. 
This approach organizes non-zero reference and site concurrent wind speed data into 12 
equal-width direction bins and the zero wind speed values into an additional 13th bin for both 
site and reference datasets. The average wind speed for each bin is then determined and 
weighted based on its frequency. Then, for each bin, the ratio of the site weighted average to 
the reference weighted average is calculated. These ratios are used as adjustment factors. 
While the adjustment factors are based on short-term data, they are assumed to remain valid 
over the long term. Using this assumption, 
the weighted average of the reference long-term data for each bin is corrected. This yields the 
long-term site sector mean wind speed [20]. 
Tallhaug and Nygaard invented the non-regression T&N MCP method, which was published 
in 1993 and is utilized at Kjeller Vindteknikk. The mean and standard deviation of the site's and 
reference wind speeds, as well as the correlation coefficient of their relationship, are 
determined for each direction bin of the reference data. This technique explicitly incorporates 
the correlation coefficient of the relationship between the site and reference data when 
estimating the site's long-term wind speeds but does not employ the relationship's regression 
function. The authors note that the strength of the link between measured site data and 
contemporaneous reference data is critical to the method's accuracy [20]. 
2.4 Definition of the measure-correlate-predict (MCP) algorithms 
Based on the literature review in Section 1.2, the MCP methods are classified, and 
subsequently, an MCP algorithm is selected for the study. 
2.4.1 Type classification of MCP 
The classification of MCP methods proposed by Hanslian [60] is considered a useful tool to 
gain an overview of the applicable methods and thus to select a suitable method for this study. 
This is shown in Table 2-4.  
Table 2-4. Classification of MCP methods according to Hanslian 
| Description | Type 1 | Type 2 | Type 3 |
|---|---|---|---|
| Results based on | Reference | Target | Target |
| Provides time series | Yes | No | No |
| Prediction of wind distribution | No | Yes | No |
| Suitable for the MCP application considered | Yes | No | No |
Source: Author’s own visualisation based on [60] 
As the topic of interest requires a time series output, only type 1 MCP methods are of interest. 
2.4.2 Definition of an algorithm 
An algorithm is a procedure for solving a mathematical problem in a finite number of steps,
typically involving the repetition of an operation [75]. In this study, an MCP
algorithm is defined as the combination of a method, sub-method, model and concept.
The MCP methods are described in Section 2.3. The sub-methods are the primary
tools available within a method, whereas the primary settings used to conduct the model fits are
categorized under the model header. There may be further options for running the MCP algorithm
that require project-specific judgement from the user in order to conduct a robust MCP; these
are grouped into concepts. For the example of linear regression, the sub-methods are LLS and
TLS, which describe how the model is optimised to obtain the linear fit, whereas the model options
concern the details of the model selection. Finally, the model can be fitted repeatedly for
different sectors, and it can be based on many values, as with a high (hourly) temporal
resolution, or on fewer values, as with a monthly resolution. For a monthly
resolution, one might additionally weight the individual months. These choices define the final
MCP algorithm, as shown in Figure 2-11.
In the subsequent section, the classification of the MCP methods is further presented in 
overview tables. 
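The sub-method level for linear regression can be illustrated with a minimal sketch that fits the same reference/target wind speed pairs once with LLS and once with TLS (here in its orthogonal-regression form for a straight line). The function names `fit_lls` and `fit_tls` and the sample values are illustrative assumptions; they are not taken from the thesis or from any of the industry software tools.

```python
import math
from statistics import mean

def fit_lls(ref, tgt):
    """Linear least squares: minimises vertical residuals of tgt = slope * ref + offset."""
    mx, my = mean(ref), mean(tgt)
    sxx = sum((x - mx) ** 2 for x in ref)
    sxy = sum((x - mx) * (y - my) for x, y in zip(ref, tgt))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_tls(ref, tgt):
    """Total least squares: minimises orthogonal distances to the fitted line."""
    mx, my = mean(ref), mean(tgt)
    sxx = sum((x - mx) ** 2 for x in ref)
    syy = sum((y - my) ** 2 for y in tgt)
    sxy = sum((x - mx) * (y - my) for x, y in zip(ref, tgt))
    # Closed-form slope of the orthogonal regression line (requires sxy != 0).
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return slope, my - slope * mx

# Hypothetical concurrent hourly wind speeds in m/s (reference, target).
reference = [4.0, 6.0, 8.0, 10.0, 12.0]
target = [4.6, 6.1, 9.0, 10.4, 13.2]
print("LLS slope, offset:", fit_lls(reference, target))
print("TLS slope, offset:", fit_tls(reference, target))
```

Both fits coincide for perfectly linear data; with scatter, the TLS slope is at least as steep as the LLS slope for positively correlated data, which is one reason the choice of sub-method matters for the predicted wind speeds.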
Figure 2-11. Definition of an MCP algorithm at the example of linear regression 
 
Source: Author’s own illustration 
2.4.3 Classification of MCP methods 
Within this section, the MCP methods are further summarized, with examples given from
engineering software used within the wind industry. The following software solutions were
available at the time of the assessment:
• Industry software 1 (ISo1): Windographer 
• Industry software 2 (ISo2): Windfarmer 
• Industry software 3 (ISo3): WindPRO 
The previously discussed MCP methods are summarized in the following tables. Most of the
methods are classified as Type 1, with a time series as output. Furthermore, based on the
investigated industry software, linear regression, empirical, and matrix methods have the
broadest industry application. As Hanslian stated, Type 1 methods are
considered most appropriate for filling data gaps and for point predictions. In contrast, Type 2
methods should be used for an accurate representation of the wind distribution [60].
Accordingly, only Type 1 methods available within industry software were evaluated in
Section 2.7 to select the base-case algorithm suitable for iterative analysis.
Based on the above classification criteria and literature review discussed in Section 2.3, the 
MCP methods are summarized as shown in Table 2-5 to Table 2-10 with respect to their types 
and applications in the specific software. 
Table 2-5. MCP method 1: Properties of linear regression methods 
Method: Linear regression
ISo1 | ISo2 | ISo3 | Sub-method, Type 1 (reference classification: WD) | Sub-method, Type 2 (reference classification: WD)
✓ | ✓ | ✓ | LLS | -
✓ | ✓ | - | TLS | -
✓ | - | - | VM | -
Source: Author’s own calculation/assessment 
Table 2-6. MCP method 2: Properties of bin methods 
Method: Bin methods
ISo1 | ISo2 | ISo3 | Sub-method, Type 1 (reference classification: WD) | Sub-method, Type 2 (reference classification: WD)
- | - | - | Method of bins | -
✓ | - | - | Vertical slice | -
Source: Author’s own calculation/assessment 
Table 2-7. MCP method 2: Properties of matrix methods
Method: Matrix
ISo1 | ISo2 | ISo3 | Sub-method, Type 1 (reference classification: WD) | Sub-method, Type 2 (reference classification: WS+WD) | Sub-method, Type 1 + Type 2 (reference classification: WS+WD)
- | - | ✓ | Classification | - | WindPRO matrix
✓ | - | - | - | Joint probabilistic [76] | MTS
- | - | - | - | - | Matrix-analog (Hanslian)
Source: Author’s own calculation/assessment based on  
Table 2-8. MCP method 3: Properties of novel computational methods 
Method: ANN
ISo1 | ISo2 | ISo3 | Sub-method, Type 1 (reference classification: -) | Sub-method, Type 2 (reference classification: -)
- | - | - | ANN | -
- | - | - | SVR | -
- | - | - | DT | -
Source: Author’s own calculation/assessment 
Table 2-9. MCP Method 4: Properties of quantile mapping methods 
Method: Quantile mapping
ISo1 | ISo2 | ISo3 | Sub-method, Type 1 (reference classification: -) | Sub-method, Type 2 (reference classification: -) | Sub-method, Type 1 + Type 2 (reference classification: WS)
- | - | - | - | - | U&N
✓ | - | - | - | - | SpeedSort
Source: Author’s own calculation/assessment 
Table 2-10. MCP Method 5: Properties of empirical methods 
Method: Empirical methods
ISo1 | ISo2 | ISo3 | Sub-method, Type 1 (reference classification: -) | Sub-method, Type 2 (reference classification: WS+WD)
✓ | - | ✓ | Bulk speed ratio | -
✓ | - | ✓ | - | Weibull scaling
- | - | ✓ | - | Wind index
- | - | - | - | KH method
- | - | - | - | T&N method
Source: Author’s own calculation/assessment 
2.5 Questionnaire results 
The questionnaire was designed based on the research questions in the empiro environment 
[77], which is a free survey tool for students. It has been distributed to key industry 
analysts/experts through direct links using the LinkedIn platform as well as the online 
community “wind resource assessment group” (WRAG) with more than 400 registered 
members [78]. It is noted that the questionnaire was not accessible by other persons through 
a search engine or a publicly available link on the LinkedIn platform. The detailed charts of the 
answers to the questionnaire are presented in Annex A. The following paragraphs give a brief 
summary of the outcome as well as comments and recommendations of the participants. 
A total of 25 analysts answered 31 questions, with an average completion time of 15
minutes. The largest group of respondents (50%) were consultants, followed by developers (25%)
and other groups (WTG OEM, research and miscellaneous). More than 80% of the analysts
held a master’s degree or PhD, and 60% had more than 10 years of industry experience. For about
half of the participants, offshore work accounted for more than 25% of their daily wind
analysis work.
The majority of respondents (55%) believed that new approaches were essential to address 
data gaps in FLS measurements, while 40% expressed no view, and one analyst said it was 
unnecessary. There was a consensus with more than 75% that the interim step of data-filling 
(DF) should be applied prior to the long-term correction. According to the majority of the 
participants, an algorithm's output should be a time series with the same temporal resolution 
as the measurement time series (88%), or at least a time series with lower temporal resolution 
(16%). Responses to the equivalent question on the end result of a completed
long-term correction were more evenly spread: output with a temporal resolution identical to the
measurement dataset led with 64%, followed by a lower-resolution time series with 48%.
In terms of data filling, the largest share of respondents (36%) agreed that the longest
permissible gap duration per year should be less than 15 days, followed by 30 days (28%).
The longest duration selected was 60 days, chosen by just one analyst out of a
total of 25.
Among the survey participants, 72% used in-house tools based on Python and
Excel in their MCP workflow, while 36% used Windographer and 24% used WindPRO.
Other solutions named by the participants included an
internally designed tool programmed in R with a web interface, Vortex LTC, the open-source
Brightwind Python library, in-house Java analysis and database software, and Matlab.
Regarding the question of which metrics (key performance indicators, KPIs) should be used to
assess the performance of a DF / LTC process, the coefficient of determination received the
highest share of responses (72%), followed by the root mean square error (RMSE),
which received 60%. A majority of participants (56%) considered the number of samples
to be an essential factor when carrying out the MCP analysis. The
distribution of responses per metric and participant type is presented in Table 2-11.
Table 2-11. Summarized survey response to the question regarding KPI metrics 
Metric | Consultancy | Developer | Other | Research | Skipped | WTG OEM | Total
Mean bias error (MBE) | 8 | 3 | 0 | 1 | 0 | 1 | 13
Mean absolute error (MAE) | 7 | 3 | 1 | 0 | 0 | 1 | 12
Root mean square error (RMSE) | 7 | 5 | 1 | 1 | 0 | 2 | 16
R² (coefficient of determination) | 10 | 5 | 1 | 0 | 0 | 2 | 18
Mean wind direction | 3 | 3 | 1 | 0 | 1 | 1 | 9
Wind veer | 2 | 3 | 0 | 0 | 1 | 1 | 7
Weibull scale parameter (A) | 1 | 2 | 0 | 1 | 0 | 1 | 5
Weibull shape parameter (k) | 1 | 2 | 1 | 1 | 0 | 1 | 6
Wind power density | 3 | 2 | 1 | 1 | 0 | 0 | 7
Kolmogorov-Smirnov test statistic regarding wind speed distribution | 4 | 2 | 0 | 0 | 0 | 1 | 7
Number of samples in bin/sector (depending on the method) | 8 | 4 | 1 | 0 | 0 | 1 | 14
Other (please use next question to enter your preference) | 1 | 2 | 1 | 1 | 0 | 0 | 5
Source: Author’s own calculation/assessment 
Additional important criteria, stated by the experts, included the number of overlapping data 
points, the (theoretical) power production of a wind turbine using a real power curve (or several 
power curves), and whether the data filling is conducted inter- (with nearby measurements) or 
intra- (from the same measurement location and instrumentation). 
One response emphasised that it was critical to pay close attention to how effectively the 
reconstruction captured the energy content of the ensuing wind regime. This might be 
accomplished by using a synthesis check on data from an identical period. 
One of the experts stressed that all adjustments, whether they are data filling or long-term 
corrections, should be evaluated in terms of their influence on the uncertainty of the annual 
energy output estimate. Another expert stated that it is also necessary to compare the final 
long-term mean of monthly means (LTMOMM) wind speeds in order to determine the influence 
of the data-filling procedure chosen. In other words, if the source of the data-filling reference data
and the technique used to process it do not have a significant influence on the final LTMOMM,
then greater confidence may be placed in the results.
The participants were asked to rate the previously described KPIs according to their
importance to the MCP process, on a scale from one (not significant) to ten (highly
important). For MBE, MAE, and RMSE, the share of ratings above 5 was 72%, 68%, and 76%,
respectively. The coefficient of determination achieved the same share as the root mean
square error (RMSE), at 76%. The distribution of responses was more uniform for wind
direction, wind veer, and the Weibull scale and shape parameters. The KS statistic was
also regarded as important, with 56% of ratings above 5.
80% of the experts indicated that they use the linear regression method for data-filling
and long-term correction MCP. The matrix (48%) and ANN (40%) methods were the next most
popular. The Variance Ratio approach was mentioned explicitly under the category
"alternative ways." One participant replied that he/she had no visibility of the details
of the in-house algorithm.
The next question asked participants to elaborate on their choice of sub-method. The LLS
and TLS sub-methods of linear regression received 52% and 28% of the votes, respectively,
but a sizable share (48%) answered that it depends on the study and that they have no
predefined preference.
When utilizing linear regression for DF and LTC, 64% of respondents reported that they use a 
linear first-order polynomial. Linear regression forced through zero, linear regression forced 
through zero with cut-off wind speed, and second-order polynomial each obtained 16% of the 
vote, while the “other” choice received 20%. 
For a data-filling / long-term correction study of mean wind speed, 12 sectors was
the dominant response (80%) for the typical number of wind direction sectors. This was
followed by 16 sectors (32%), 36 sectors (24%), and omnidirectional (single sector, 20%).
The leading temporal domain for the data-filling & long-term correction used by the analysts 
was an hourly resolution with 52%, followed by a 10-minutes resolution (44%). 84% of 
respondents indicated that they take seasonality into account during the data-filling/LTC 
process, either through seasonal balancing prior to MCP (32%), using monthly intervals (32%), 
or applying yearly divisions (20%). Another 20% of responders said that seasonality was not 
considered in their MCP workflow. For linear regression applications based on monthly values, 
52% of respondents indicated using a weighted technique, while the remainder indicated that 
it was not relevant. 
68% of respondents indicated that atmospheric stability should be taken into account when
performing a data-filling/MCP activity. Agreement dropped to 48% in the following
question, which asked participants whether metocean conditions such as waves,
currents, pressure, or air or water temperature should be included in the process. It is worth
noting that 24% and 32% of respondents, respectively, abstained from answering these two
questions. When analysts were asked which metocean parameter was most important
to study in relation to the data-filling process, a slight
majority (60%) chose wave height, followed by air temperature (44%).
Finally, experts were asked whether their choice of data-filling (DF) / long-term
correction (LTC) methodology was based on performance testing and/or uncertainty analysis.
The overwhelming majority (88%) answered yes. The experts then described their
recommended performance tests and approaches in greater detail. Two experts stated that they
carry out LTC performance evaluations using industry-standard software (WindPRO/Windographer).
Another stated that they employ a variety of approaches and analyze the statistical
distribution of all results in order to determine a consensus. Along with
determining how well the synthesised data and correlation capture wind speed and energy
content, KPIs such as jackknife uncertainty, MBE, MBA, and distribution error were
mentioned. Finally, another participant proposed applying methods used with well-known
offshore meteorological mast and offshore lidar data sets, such as the FINO mast.
The questionnaire ended with expert advice and recommendations for this study.
One expert objected that the questionnaire omitted a question on analysts'
willingness to agree to data filling. One participant recommended applying
as many different LTC approaches as possible and then selecting the most appropriate
methods after comparing the final LTMOMM estimates, as this would give a good sense of the
sensitivity of the final result to the approaches employed. Another participant noted that it
would be interesting to compare long-term results obtained from a measured dataset that was
not data-filled with those obtained from the same dataset after data filling
based on measurements at the same location with the same instrumentation. A
final comment from another expert suggested that data filling should be performed
when it can be shown that the uncertainty associated with filling with unmeasured or
non-targeted data is less than the uncertainty associated with leaving the gaps unfilled.
2.6 Definition of the key performance indicators and uncertainties 
Based on the findings of the literature research and the results of the questionnaire, the key
performance indicators (KPIs) shown in Table 2-12 have been established. These KPIs are
divided into two primary categories: test statistic and test parameter.
Lower MAE, MBE and RMSE values, as well as a higher R² value, indicate that the
predictions are closer to the measured values [6].
Table 2-12. Definition of KPIs 
KPI [Generic] | Test statistic? | Test parameter | Generic parameter
Mean bias error (MBE) | Yes | - | -
Mean absolute error (MAE) | Yes | - | -
Root mean square error (RMSE) | Yes | - | -
R² (coefficient of determination) | Yes | - | -
Mean wind speed (MWS) | - | Yes | -
Mean wind direction (MWD) | - | Yes | -
Wind direction deviation (WDD) | - | Yes | -
Weibull scale parameter (A, Weib_A) | - | Yes | -
Weibull shape parameter (k, Weib_k) | - | Yes | -
Wind power density (WPD) | - | Yes | -
Kolmogorov-Smirnov test statistic regarding wind speed distribution (KS) | Yes | - | -
Number of samples in bin/sector (depending on the method) (TS) | - | - | Yes
Source: Author’s own calculation/assessment 
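The four error-based test statistics in Table 2-12 can be sketched as follows. This is a minimal illustration under stated assumptions: the error is taken as predicted minus measured, and R² is computed as one minus the ratio of residual to total sum of squares. The thesis may use a different sign convention or the squared correlation coefficient; the function name `kpi_statistics` is hypothetical.

```python
import math
from statistics import mean

def kpi_statistics(measured, predicted):
    """Return MBE, MAE, RMSE and R² for paired measured/predicted wind speeds."""
    errors = [p - m for m, p in zip(measured, predicted)]  # assumed sign convention
    mbe = mean(errors)                              # systematic over/underestimation
    mae = mean(abs(e) for e in errors)              # average error magnitude
    rmse = math.sqrt(mean(e * e for e in errors))   # penalises large deviations
    m_bar = mean(measured)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((m - m_bar) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot                      # assumed definition of R²
    return {"MBE": mbe, "MAE": mae, "RMSE": rmse, "R2": r2}

# Hypothetical example values in m/s.
print(kpi_statistics([5.0, 7.0, 9.0], [6.0, 7.0, 8.0]))
```

The example is chosen so that positive and negative errors cancel: the MBE is zero while MAE and RMSE are not, which is why the bias alone is not sufficient to judge a prediction.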
The reference dataset must consist of high-quality and consistently measured wind
speeds in order to obtain accurate estimates of the target site's wind resource [62]. As
described in Section 2.9.2, the reference dataset is often a modelled dataset owing to a set of
limitations. Even so, the requirements on consistency and quality still apply. Furthermore, the
representativeness of the reference dataset is another important criterion [35]. Consequently,
the measured and reference datasets need to be compared before conducting an MCP. The
KPIs for this comparison are summarized as the prerequisite to data-filling KPIs (PreDF). The
test statistics used for the PreDF KPIs are shown in Table 2-13 below.
Table 2-13. PreDF KPI 
Test statistic | MWS | MWD | WDD | Weib_A | Weib_k | WPD | Wind rose
MBE | ✓ | - | Single value | ✓2 | ✓1 | ✓ | -
MAE | ✓ | - | - | - | - | ✓ | -
RMSE | ✓ | - | - | - | - | ✓ | -
R² | ✓ | ✓ | - | - | - | - | -
KS | ✓ | - | - | - | - | - | -
Representativeness parameter | - | - | - | - | - | - | ✓
Number of samples in bin/sector | ✓ | - | - | - | - | - | -
Source: Author’s own calculation/assessment 
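Three of the test parameters compared in the PreDF step can be sketched as below: Weibull A and k, wind power density, and the two-sample KS statistic between the measured and reference wind speed distributions. The sketch rests on assumptions not stated in the thesis: the Weibull parameters are estimated with the empirical moment approximation k = (σ/μ)^(-1.086), which may differ from the fitting method actually used; the air density defaults to 1.225 kg/m³; and all function names are hypothetical.

```python
import math
from bisect import bisect_right
from statistics import mean, stdev

def weibull_moments(speeds):
    """Approximate Weibull scale A and shape k from mean and standard deviation."""
    mu, sd = mean(speeds), stdev(speeds)
    k = (sd / mu) ** -1.086               # empirical approximation (assumption)
    a = mu / math.gamma(1.0 + 1.0 / k)    # follows from mean = A * Gamma(1 + 1/k)
    return a, k

def wind_power_density(speeds, air_density=1.225):
    """Mean wind power density in W/m² for a constant, assumed air density."""
    return 0.5 * air_density * mean(v ** 3 for v in speeds)

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:  # the supremum is attained at one of the data points
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

# Hypothetical measured and reference wind speeds in m/s.
measured = [4.0, 6.0, 8.0, 10.0, 12.0]
reference = [3.5, 6.5, 7.5, 9.0, 11.0]
print("Weibull A, k:", weibull_moments(measured))
print("WPD:", wind_power_density(measured))
print("KS:", ks_statistic(measured, reference))
```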
Table 2-14 and Table 2-15 illustrate the test statistics that were utilized for the SelfDF and 
ValDF KPIs, respectively, in the same manner. 
Table 2-14. SelfDF KPI 
Test statistic | MWS | Weib_A | Weib_k | WPD
MBE | ✓ | ✓2 | ✓2 | ✓
MAE | ✓ | - | - | ✓
RMSE | ✓ | - | - | ✓
R² | ✓ | - | - | -
KS | ✓ | - | - | -
Source: Author’s own calculation/assessment 
 
 
2 Measured and reference values are calculated sectorwise, MBE is obtained from a weighted average 
Table 2-15. ValDF KPI for the gap 
Test statistic MWS 
MBE ✓ 
MAE ✓ 
RMSE ✓ 
Source: Author’s own calculation/assessment 
2.6.1 The interface of the KPIs to the uncertainty method 
Validating MCP approaches by quantifying and modelling the uncertainty would improve the 
confidence in the long-term analysis. Uncertainty in on-site wind conditions might be evaluated 
by modelling the uncertainty in the MCP process [66].  
Rogers advocated using the standard deviation of the long-term estimates as an uncertainty
measure. He suggested predicting the properties of the long-term target site data using
shorter concurrent data sets drawn from the longer set. The uncertainty associated with the
prediction is then assessed using the standard deviation of the predictions across many data sets.
The disadvantage of this strategy is that it can only be used when sufficiently high-quality,
long-term measured (target) data are available. Saarnak used a similar technique, calculating the
mean and standard deviation of the long-term correction biases for each
subgroup of a longer dataset and using them as a measure of uncertainty [63].
Klinkert [68] did a very comprehensive literature review of uncertainty estimators in long-term 
correction procedures. His conclusion was that the correlation and standard deviation were the 
most common estimators used within the industry, at the same time presenting different 
metrics suitable for different purposes. Seasonality and long-term trends, for example, may 
contribute to uncertainty [68]. 
Even though Klinkert’s final evaluation of this research comprised just 19 papers on long-term 
uncertainty correction, the breadth of uncertainty approaches and parameter applications is 
extensive [68]. The parameters are often used to determine their sensitivity to other variables. 
Long-term studies often analyze uncertainty in terms of the time period between the 
measurement station and the reference dataset. Another typical approach is a comparison of 
the uncertainty introduced by the length of on-site measurements versus the use of sufficiently 
accurate reference datasets.  
Table 2-16. Uncertainty estimators in the area of long-term correction method uncertainty
Parameter | Description | Usage
MBE | MBE is, to some extent, a measure of the systematic errors in a measurement sample. | To determine whether the inaccuracy is systematic and, if so, whether it over- or underestimates the wind speed. Industry-wide application.
MAE | The MBE's magnitude. The average difference is displayed to ensure that all variations are captured during the analysis. | MAE is used in displaying the error and analyzing a process. While the disparities may fluctuate significantly, the sign change might bring them to zero. MAE demonstrates the magnitude of the oscillations independent of their sign. Used in normalized and percentage forms and to a large extent in the ANN approach.
Coefficient of determination | The fraction of the variability of the observed response variable that can be explained by a linear regression model. | Used to determine the degree to which linear regression, e.g. the MCP approach, adequately explains the variability. Frequently used in the reporting of wind assessment uncertainty.
RMSE | A simple-to-understand error indication, as it uses the same unit as the estimated variable. | RMSE is involved in all instances involving error analysis. This is a frequent occurrence in short term analysis, as the length of the prediction interval is dependent on the length of the prediction interval. The purpose of long term analysis is to demonstrate the error's convergence with the period of concurrent data.
Standard deviation | The most often used method of expressing uncertainty in long-term adjustments. | The standard deviation for each result should be included. It is critical to determine and appropriately estimate the standard deviation, which can be challenging when dealing with serially correlated data.
Source: [68]. 
In a recent study, Basse [74] noted that further research is needed to determine how 
systematic biases and, ultimately, the uncertainty associated with long-term correction of short-
term wind data may be decreased efficiently and expeditiously. 
2.6.2 Uncertainties in the long-term correction 
Figure 2-12 below gives a comprehensive overview of the different uncertainty
components relevant for the energy production of a WTG. It can be observed that the
long-term adjustment (correction) is a sub-component of the historical wind resource category in
the proposed draft IEC 61400-15 framework [79].
Figure 2-12. Mind map of energy production uncertainty according to the draft 
IEC 61400-15 
 
Source: [79] 
The proposed framework is mapped in Figure 2-15 below to the current technical guideline 
TG6 and an example from the industry practice [18]. It may be noticed that there is no 
unanimity in the nomenclature used to describe the components of the uncertainty impacting 
the historical wind resource. The fundamental purpose of this research is to get an 
understanding of the representativeness of the experimental performance test (method 
uncertainty) in the event of missing data in line with the research questions. This study does 
not aim to thoroughly compare and test the remaining uncertainty components of long-term 
correction. The topic of this thesis is correlation and on-site data synthesis uncertainties which 
are the sub-components of the method uncertainty. The method uncertainty is covered in more 
detail in the next section. 
2.6.3 MCP method uncertainty 
TG6 states that the quality of the chosen long-term correction technique should be checked by 
the long-term correction procedure's reconstruction of the original measurement data or yield 
data [35]. The overlap period of existing short-term measurement and a long-term reference 
time series is separated into training and test periods [35]. The correlation calculated during 
training is applied to the long-term reference time series during testing [35]. The generated 
dataset is compared to the test period's original short-term data. This is done using statistical 
factors like mean and standard deviation, as well as measurement data like wind speed and 
direction frequency distribution. The mean bias error, mean absolute error, root mean squared 
error, and distribution error may all be calculated [35]. It is also a requirement that a self-
consistency test must be used to determine the quality of the applied MCP extrapolation and 
the validity of the MCP result [35]. An illustration of the assessment of MCP uncertainty sub-
uncertainty components using an example from industry practice and the TG6 technical 
guideline is shown in Figure 2-16. 
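The training/test procedure described above can be sketched as follows, assuming an omnidirectional linear least squares fit as the long-term correction method and a simple chronological split of the overlap period. The split fraction, the function names and the sample data are illustrative assumptions and are not prescribed by TG6.

```python
import math
from statistics import mean

def fit_line(x, y):
    """Linear least squares fit y = slope * x + offset."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def train_test_check(ref, tgt, train_fraction=0.5):
    """Fit on the training part of the overlap, then score on the test part."""
    n_train = int(len(ref) * train_fraction)
    slope, offset = fit_line(ref[:n_train], tgt[:n_train])
    # Apply the training correlation to the reference data of the test period.
    predicted = [slope * r + offset for r in ref[n_train:]]
    measured = tgt[n_train:]
    errors = [p - m for p, m in zip(predicted, measured)]
    return {
        "MBE": mean(errors),
        "MAE": mean(abs(e) for e in errors),
        "RMSE": math.sqrt(mean(e * e for e in errors)),
    }

# Hypothetical concurrent wind speeds in m/s (reference, target).
reference = [5.1, 7.3, 6.2, 9.8, 4.4, 8.0, 6.9, 10.5]
target = [5.6, 7.9, 6.5, 10.9, 4.6, 8.8, 7.2, 11.3]
print(train_test_check(reference, target))
```

A distribution error or a comparison of the wind speed and direction frequency distributions could be added in the same way by comparing `predicted` and `measured` for the test period.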
The standard deviation of the estimates proposed by Rogers, introduced in Section 2.6.1,
is not suitable for use cases in which no long-term measured (target)
dataset is available [8]. Additionally, Rogers referred to Derrick, noting that the uncertainty
of the slope and offset is typically used to model the uncertainty of the relationship between the
reference and target sites in linear regression. Derrick [12] described estimating the standard
deviation of the expected wind speed using the variances and covariance of the slope and offset [8].
However, Rogers dismissed this approach because it assumes that the data are not serially
correlated, which does not hold in this use case.
The Windfarmer documentation describes the correlation uncertainty as being calculated from the
scatter of the correlation between the reference and site masts: the smaller the scatter, the
lower the uncertainty of the relationship [46]. This is not regarded as an objective, test-based
technique appropriate for this kind of analysis.
Brower [80] suggests estimating the method uncertainty using an empirical formula. The 
following simple equation approximates the overall uncertainty in the long-term mean wind 
speed as a function of the correlation coefficient, assuming normally distributed yearly wind 
speed variations and a homogenous reference station data record [80]. This is given in the 
following equation, valid only if the concurrent dataset is longer than a single year: 
 
$$\sigma = \sqrt{\frac{r^{2}}{N_R}\,\sigma_R^{2} + \frac{1 - r^{2}}{N_T}\,\sigma_T^{2}} \qquad \text{[e29]}$$
 where
 $r$ = correlation coefficient
 $N_R$ = number of years of reference data
 $N_T$ = number of years of concurrent reference and target data
 $\sigma_R$ = standard deviation of the annual mean wind speed of the reference site as a percentage of the mean
 $\sigma_T$ = standard deviation of the annual mean wind speed of the target site as a percentage of the mean
 
Similar to the method referred to in the Windfarmer manual, the empirical method of Brower is 
not a test-based approach and was not used in this study.  
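For orientation, the empirical formula [e29] can be evaluated with a short sketch. It assumes that the second term uses the target-site standard deviation σT, consistent with the symbol list accompanying the equation; the function name and the input values are hypothetical.

```python
import math

def brower_uncertainty(r, n_ref_years, n_concurrent_years, sigma_ref, sigma_target):
    """Uncertainty of the long-term mean wind speed, in the unit of the sigmas (e.g. %).

    Assumes the second term uses the target-site standard deviation (sigma_target).
    """
    reference_term = r ** 2 / n_ref_years * sigma_ref ** 2
    target_term = (1.0 - r ** 2) / n_concurrent_years * sigma_target ** 2
    return math.sqrt(reference_term + target_term)

# Hypothetical inputs: r = 0.9, 20 years of reference data, 2 concurrent years,
# and 6 % inter-annual variability at both sites.
print(round(brower_uncertainty(0.9, 20, 2, 6.0, 6.0), 2))
```

With these inputs the second term dominates, so the result is driven mainly by the short concurrent period; as r approaches 1, the formula reduces to σR divided by the square root of NR.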
It is possible to execute a performance test using Windographer [43], which offers KPIs for 
MBE, MAE, RMSE, and DE of the various MCP approaches that have been applied. Version 
4 of Windographer does not have any feature for evaluating uncertainty. 
Another common method for assessing uncertainties is bootstrapping, which is closely linked
to Monte Carlo techniques. Monte Carlo techniques, a general term for any method employing
random numbers, have been used to simulate and approximate
distributions in various sectors, including the wind industry. Because it
uses the LTC approaches without restricting the computations to likely values and without
implying any underlying distribution, bootstrapping may be regarded as a branch of
Monte Carlo simulation [68]. Valk notes that resampling serves the same purpose as
Monte Carlo simulation for the evaluation of MCP uncertainties. However, unlike the latter,
resampling does not need an explicit probabilistic model [81].
According to Nielsen, the bootstrap technique is the most used resampling approach. The idea 
is to create artificial data sets of the same size as the actual time series by randomly sampling 
from it [71].  An example of the bootstrap for uncertainty assessment is provided in [81]. The 
authors noted that simulations might be used to determine the effect of random error on long-
term correction methods. However, when assessing the uncertainty of a wind resource 
estimate, it is not always required to explicitly address random errors in the data source [81]. 
Valk added that the block length should be adequate for the bootstrap strategy to succeed. In 
the study [81], the authors have chosen a block length of 62.5 days corresponding to the 
satisfactory de-correlation of wind speed.  
The random sampling procedure of bootstrapping can be repeated several times, and the DF 
and LTC algorithms can be used to generate a new synthetic dataset. The uncertainty 
associated with the MCP approach can be expressed as the standard deviation of the final 
estimates. It is necessary to perform a large number of simulations in order to obtain reliable 
estimates. Unfortunately, this is one of the disadvantages of this method, which renders it 
unsuitable for the sliding gap window analysis used in this study. 
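The resampling idea can be sketched as a moving block bootstrap: blocks of consecutive values are drawn at random with replacement until a synthetic series of the original length is obtained, an estimate is computed from each synthetic series, and the standard deviation of the estimates serves as the uncertainty. In the sketch below the `estimator` argument stands in for the complete DF and LTC chain and defaults to the plain mean; this default, the function name, and the parameter values are assumptions made for illustration only.

```python
import random
from statistics import mean, stdev

def block_bootstrap_std(series, block_length, n_resamples=500, estimator=mean, seed=0):
    """Standard deviation of an estimator over moving-block bootstrap resamples."""
    rng = random.Random(seed)          # fixed seed for reproducible results
    n = len(series)
    n_blocks = -(-n // block_length)   # ceiling division
    estimates = []
    for _ in range(n_resamples):
        sample = []
        for _ in range(n_blocks):
            # Draw a block of consecutive values to preserve serial correlation.
            start = rng.randrange(0, n - block_length + 1)
            sample.extend(series[start:start + block_length])
        estimates.append(estimator(sample[:n]))  # trim to the original length
    return stdev(estimates)

# Hypothetical daily mean wind speeds in m/s with a slow oscillation.
speeds = [8.0 + 2.0 * ((i // 10) % 2) + 0.1 * (i % 7) for i in range(200)]
print(block_bootstrap_std(speeds, block_length=20))
```

The cost scales with the number of resamples times the cost of one DF and LTC run, which illustrates why the approach becomes expensive when it has to be repeated for every position of a sliding gap window.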
In DNV's recommended practice DNV-RP-J101, the jack-knife (JK) and bootstrap techniques 
are recommended for assessing uncertainty. The jack-knife estimate of variance quantifies the 
uncertainty of a study's conclusions by taking into account the variability of outcomes as 
succeeding subsets of data are excluded from the analysis [82]. 
When conducting the JK, Rogers arrived at a final decision regarding the number of jackknife 
subsets. The number of jackknife subsets was chosen to represent the median of the 12 data 
sets' best-performing number of jackknife subsets he used in his study [8]. It denotes the 
number of jackknife subsets that produces the lowest overall root mean square error for these 
data sets [8]. As a result, four subsets are relevant for the context of this research as a two-
year assessment period is used. This quantity of jackknife subsets was employed throughout 
the entirety of this master thesis' analysis. 
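A delete-a-group jackknife with four subsets can be sketched as follows. The concurrent data are split into four contiguous subsets, the long-term estimate is recomputed with each subset left out in turn, and the spread of the four estimates yields the uncertainty. The variance formula used is the standard delete-a-group jackknife estimate; whether it matches the exact formulation in [8] is an assumption, and the ratio-based `ratio_estimator` is a hypothetical placeholder for the actual MCP algorithm.

```python
import math
from statistics import mean

def ratio_estimator(ref, tgt, long_term_ref_mean=8.0):
    """Hypothetical MCP stand-in: scale the long-term reference mean by the speed ratio."""
    return mean(tgt) / mean(ref) * long_term_ref_mean

def jackknife_std(ref, tgt, estimator=ratio_estimator, n_subsets=4):
    """Jackknife standard deviation from leaving out one contiguous subset at a time."""
    n = len(ref)
    bounds = [round(i * n / n_subsets) for i in range(n_subsets + 1)]
    estimates = []
    for i in range(n_subsets):
        lo, hi = bounds[i], bounds[i + 1]
        # Re-run the estimator on all data except subset i.
        estimates.append(estimator(ref[:lo] + ref[hi:], tgt[:lo] + tgt[hi:]))
    centre = mean(estimates)
    variance = (n_subsets - 1) / n_subsets * sum((e - centre) ** 2 for e in estimates)
    return math.sqrt(variance)

# Hypothetical concurrent wind speeds in m/s (reference, target).
reference = [4.0, 6.0, 8.0, 10.0, 5.0, 7.0, 9.0, 11.0]
target = [4.6, 6.3, 9.1, 10.8, 5.9, 7.4, 10.2, 11.9]
print(jackknife_std(reference, target))
```

Only four additional runs of the estimator are needed, compared with hundreds for a bootstrap, which is the computational advantage referred to in the text.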
Figure 2-13. Selection of number of subsets based on concurrent period 
 
Source: [8] 
The difference between JK and bootstrapping is presented in Figure 2-14. 
Figure 2-14. Sketch of the difference between JK and bootstrap resampling 
 
Source: [71] 
In conclusion, it is proposed that the RMSE of the validation phase is employed as an 
uncertainty metric for the interim data-filling stage. This is compared with Brower's empirical 
uncertainty method. For the uncertainty associated with the MCP method's long-term 
correction, the JK method is considered suitable, primarily due to the computational 
limitations of bootstrapping. An exemplary bootstrapping uncertainty is produced in Section 3.3 
for a single gap duration of 60 days at a given gap start time to allow a rudimentary comparison. 
Figure 2-15. Mapping of sub-uncertainty components 
 
Source: Author’s own illustration based on the example of industry practice [18], TG6 technical guideline  [35] and new proposed framework within IEC 61400-15 [79] 
 
 
[Figure 2-15: mapping diagram of partial uncertainty components and sub-components across three frameworks. The mapping is not 1:1; some components map to more than one counterpart. Labels recovered from the figure:]
Industry practice (example), long-term correction:
1 - Long-term representation (statistical)
2 - MCP method uncertainty
2.1 Correlation (statistical)
2.2 Quality of the long-term dataset (assumption)
2.3 Representativeness of the comparison period - wind speed distribution (assumption)
2.4 Representativeness of the comparison period - wind rose (assumption)
2.5 On site data synthesis (statistical)
TG 6 Rev 11, areas:
Reference data
On-site data synthesis
Long-term adjustment
Wind speed and direction distribution
Long-term period
IEC 61400-15, energy uncertainty, historic wind resource, long-term correction:
8.1.3a - Representativeness of the long-term data for the site
8.1.3b - Consistency of the long-term data sources
8.1.3c - Method uncertainty
8.1.3d - Selection of the reference time period
8.1.3e - Selection of the operation period
Figure 2-16. Flowchart of evaluation of MCP uncertainty sub-uncertainty components at 
the example of industry practice [18] and TG6 technical guideline [35], with "correlation" 
uncertainty shown in amber as target of this study 
 
Source: Author’s own illustration based on [18] and [35] 
2.7 Selection of the base-algorithm 
Whilst each strategy has advantages and disadvantages, Brower [80] recommends tried-and-
true methods for day to day applications in the wind industry. In that regard, linear regression 
methods are highlighted as simple to use and calculate long-term mean wind speed as 
accurately as any linear technique [80].  
Table 2-17 lists the possible MCP algorithms based on a linear regression method. 
In the case of linear regression, up to 20 scenarios are easily possible for a single method. As 
the focus is on understanding the impact of data filling through an iterative analysis, it should be 
possible to implement the selected method in an open-source programming language without 
significant computational effort. 
Table 2-17. MCP algorithms for implementation of linear regression (LinReg) 
Algorithm ID | Method | Sub Method | Model | Sector | Time Domain | Weights | Algorithm identifier
1 | LinReg | TLS | 1OFtO | 12 | Hourly | nW | LinReg_TLS_1OF_12_Ho_nW
2 | LinReg | TLS | 1OFtO | 16 | Hourly | nW | LinReg_TLS_1OF_16_Ho_nW
3 | LinReg | TLS | 1OFtO | 36 | Hourly | nW | LinReg_TLS_1OF_36_Ho_nW
4 | LinReg | TLS | 1OFtO | omni | Hourly | nW | LinReg_TLS_1OF_omni_Ho_nW
5 | LinReg | TLS | 1OFtO | N/A | Monthly | We | LinReg_TLS_1OF_nS_Mo_We
6 | LinReg | TLS | 1OFtO | N/A | Monthly | iW | LinReg_TLS_1OF_nS_Mo_iW
7 | LinReg | TLS | 1OwOf | 12 | Hourly | nW | LinReg_TLS_1Ow_12_Ho_nW
8 | LinReg | TLS | 1OwOf | 16 | Hourly | nW | LinReg_TLS_1Ow_16_Ho_nW
9 | LinReg | TLS | 1OwOf | 36 | Hourly | nW | LinReg_TLS_1Ow_36_Ho_nW
10 | LinReg | TLS | 1OwOf | omni | Hourly | nW | LinReg_TLS_1Ow_omni_Ho_nW
11 | LinReg | TLS | 1OwOf | N/A | Monthly | We | LinReg_TLS_1Ow_nS_Mo_We
12 | LinReg | TLS | 1OwOf | N/A | Monthly | iW | LinReg_TLS_1Ow_nS_Mo_iW
13 | LinReg | LLS | 1OFtO | 12 | Hourly | nW | LinReg_LLS_1OF_12_Ho_nW
14 | LinReg | LLS | 1OFtO | 16 | Hourly | nW | LinReg_LLS_1OF_16_Ho_nW
15 | LinReg | LLS | 1OFtO | 36 | Hourly | nW | LinReg_LLS_1OF_36_Ho_nW
16 | LinReg | LLS | 1OFtO | omni | Hourly | nW | LinReg_LLS_1OF_omni_Ho_nW
17 | LinReg | LLS | 1OFtO | N/A | Monthly | We | LinReg_LLS_1OF_nS_Mo_We
18 | LinReg | LLS | 1OFtO | N/A | Monthly | iW | LinReg_LLS_1OF_nS_Mo_iW
19 | LinReg | LLS | 1OwOf | 12 | Hourly | nW | LinReg_LLS_1Ow_12_Ho_nW
20 | LinReg | LLS | 1OwOf | 16 | Hourly | nW | LinReg_LLS_1Ow_16_Ho_nW
21 | LinReg | LLS | 1OwOf | 36 | Hourly | nW | LinReg_LLS_1Ow_36_Ho_nW
22 | LinReg | LLS | 1OwOf | omni | Hourly | nW | LinReg_LLS_1Ow_omni_Ho_nW
23 | LinReg | LLS | 1OwOf | N/A | Monthly | We | LinReg_LLS_1Ow_nS_Mo_We
24 | LinReg | LLS | 1OwOf | N/A | Monthly | iW | LinReg_LLS_1Ow_nS_Mo_iW
25 | LinReg | VR | - | 12 | Hourly | nW | LinReg_VR_12_Ho_nW
26 | LinReg | VR | - | 16 | Hourly | nW | LinReg_VR_16_Ho_nW
27 | LinReg | VR | - | 36 | Hourly | nW | LinReg_VR_36_Ho_nW
28 | LinReg | VR | - | omni | Hourly | nW | LinReg_VR_omni_Ho_nW
Source: Author’s own calculation/assessment 
Table 2-18 presents the other MCP algorithms possible for the study, which were tested prior 
to the implementation of the iterations. 
Table 2-18. MCP algorithms for implementation of other methods3 
Algorithm ID | Method | Sub Method | Model | Sector | Time Domain | Weights | Algorithm identifier
29 | Bin Method | VS | - | 12 | Hourly | 10_WSbins | BinMethod_VS_12_Hourly_10_WSbins
30 | Bin Method | VS | - | 16 | Hourly | 10_WSbins | BinMethod_VS_16_Hourly_10_WSbins
31 | Bin Method | VS | - | 36 | Hourly | 10_WSbins | BinMethod_VS_36_Hourly_10_WSbins
32 | Bin Method | VS | - | omni | Hourly | 10_WSbins | BinMethod_VS_omni_Hourly_10_WSbins
33 | Matrix | MTS | Def | 12 | Hourly | - | Matrix_MTS_Def_12_Hourly
34 | Matrix | MTS | Def | 16 | Hourly | - | Matrix_MTS_Def_16_Hourly
35 | Matrix | MTS | Def | 36 | Hourly | - | Matrix_MTS_Def_36_Hourly
36 | Matrix | MTS | Def | omni | Hourly | - | Matrix_MTS_Def_omni_Hourly
37 | Matrix | WindPRO | Def | 12 | Hourly | - | Matrix_WindPRO_Def_12_Hourly
38 | Matrix | WindPRO | Def | 16 | Hourly | - | Matrix_WindPRO_Def_16_Hourly
39 | Matrix | WindPRO | Def | 36 | Hourly | - | Matrix_WindPRO_Def_36_Hourly
40 | Matrix | WindPRO | Def | omni | Hourly | - | Matrix_WindPRO_Def_omni_Hourly
41 | QM | Speed Sort | Def | 12 | Hourly | - | QM_SpeedSort_Def_12_Hourly
42 | QM | Speed Sort | Def | 16 | Hourly | - | QM_SpeedSort_Def_16_Hourly
43 | QM | Speed Sort | Def | 36 | Hourly | - | QM_SpeedSort_Def_36_Hourly
44 | QM | Speed Sort | Def | omni | Hourly | - | QM_SpeedSort_Def_omni_Hourly
45 | EM | BSR | ISo1 | 12 | Hourly | - | EM_BSR_ISo1_12_Hourly
46 | EM | BSR | ISo1 | 16 | Hourly | - | EM_BSR_ISo1_16_Hourly
47 | EM | BSR | ISo1 | 36 | Hourly | - | EM_BSR_ISo1_36_Hourly
48 | EM | BSR | ISo1 | omni | Hourly | - | EM_BSR_ISo1_omni_Hourly
49 | EM | Weibull scale | ISo3 | 12 | N/A | - | EM_Weibull scale_ISo3_12_N/A
50 | EM | Wind index | ISo3 | N/A | N/A | - | EM_Wind index_ISo3_N/A_N/A
Source: Author’s own calculation/assessment 
 
3 Selected list of methods available in industry software. 
The particular focus of this study is on offshore applications. Thus, the complexity of the 
transfer functions between the target and reference sites is expected to be low. As stated by 
Duncan, the diurnal and annual changes of near-surface wind speed differ vastly between land 
and water [83]. Wind speeds offshore are often thought to be stronger and less turbulent 
than onshore. The diurnal cycle is almost non-existent at sea 
throughout the whole year due to the considerable thermal inertia of the sea surface. Because 
of the increased synoptic activity in the winter, wind speeds are higher than in the summer 
[83]. Accordingly, an MCP method suitable for offshore applications does not necessarily need 
to be complex. A widely used, simple MCP algorithm might therefore provide sufficiently good 
results, while being easy to validate during the coding process. In other words, the 
analysis of the gap-filling impact is easier to repeat with a simple but proven method. 
Regarding the sub-method, LLS was selected, as there was high confidence 
in the measured dataset. A first-order linear regression model with offset (1OwOf) was 
selected, as this is a well-known and widespread method that provides robust results. This 
assumption is further tested and confirmed for this specific analysis with the performance 
testing algorithm in Windographer in Section 3.1. 
Similarly, the treatment of wind direction, i.e. the number of sectors, is an essential 
feature of the MCP algorithm. In general, terrain greatly influences wind direction, with the 
distance to the coastline from offshore locations having a considerable impact on the 
directional distribution [68]. The omnidirectional analysis was based on 41910 iterations for a 
total of 60 gap durations in sequential steps. Due to the design of the code, a sectorwise 
approach would multiply the number of iterations by the number of sectors. Therefore, the 
necessity of a directional MCP was assessed with sectorwise sensitivity runs. The selected 
target dataset location, MMIJ, is located far offshore without any coastal effects in the 
different sectors. An omnidirectional analysis was therefore found to be suitable, as there 
were no directional influences, and the omnidirectional MCP was considered a reasonable 
simplification for the purpose of the study. 
The hourly temporal resolution was selected for the study, as this was considered important to 
understand the impact of the data filling. 
Finally, considering the above-mentioned criteria, the linear regression method with the LLS 
sub-method (MCP algorithm ID 22) has been chosen as the base case scenario for this study. 
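The base-case relationship can be sketched as an ordinary least squares fit of the target wind speed on the reference wind speed, followed by a prediction for the missing time steps. This is a minimal illustration, assuming NaN marks the gap; the function names `fit_lls` and `fill_gaps` and the sample values are hypothetical and do not reproduce the thesis code.

```python
import numpy as np


def fit_lls(reference, target):
    """Least-squares fit of target = slope * reference + offset (1OwOf)."""
    valid = ~np.isnan(reference) & ~np.isnan(target)
    slope, offset = np.polyfit(reference[valid], target[valid], 1)
    return slope, offset


def fill_gaps(reference, target, slope, offset):
    """Replace missing target values by predictions; observations are kept."""
    filled = target.copy()
    gap = np.isnan(target) & ~np.isnan(reference)
    filled[gap] = slope * reference[gap] + offset
    return filled


# Hourly wind speeds in m/s; NaN marks the artificial gap (illustrative values).
reference = np.array([4.0, 6.0, 8.0, 10.0, 12.0, 14.0])
target = np.array([4.5, 6.6, np.nan, np.nan, 12.9, 15.0])

slope, offset = fit_lls(reference, target)
filled = fill_gaps(reference, target, slope, offset)
print(slope, offset, filled)
```

The model is trained only on time steps where both series are available, so the synthesized values extend the measured series without replacing observations.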
2.8 Design of the code for iterative analysis 
Table 2-19 illustrates the relationships and datasets used for data filling and long-term 
correction. The study begins with the complete measured dataset covering a two-year period 
of 17472 hours. Gaps ranging from one day to sixty days are introduced in an outer loop, with a 
24-hour (1-day) increment in gap duration. Each gap is removed from the measured period in a 
sliding window that advances in 24-hour increments. This is referred to as the inner loop. At 
the inception of the analysis, the code was implemented sectorwise. As a result, the inner loop 
comprises a secondary loop across the sector bins. As previously noted, KPIs are gathered for 
each sector for the PreDF, SelfDF, and ValDF groups, based on the relationships provided in 
Table 2-20. During the validation step, the decision was taken to change the code to an 
omnidirectional (1 sector) version, primarily due to computational constraints. 
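The loop structure can be sketched as a generator of gap windows. The sketch assumes that the sliding window advances by 24 hours, which reproduces the 41910 omnidirectional iterations quoted in Section 2.7; the names `gap_windows` and `n_iterations` are illustrative only.

```python
HOURS_TOTAL = 17472   # two-year measured period, as stated in the text
STEP_HOURS = 24       # assumed sliding-window increment (1 day)
MAX_GAP_DAYS = 60     # longest gap duration


def gap_windows(hours_total=HOURS_TOTAL, max_gap_days=MAX_GAP_DAYS,
                step_hours=STEP_HOURS):
    """Yield (gap_days, start_hour, end_hour) for every gap position."""
    for gap_days in range(1, max_gap_days + 1):  # outer loop: gap duration
        gap_hours = gap_days * 24
        # inner loop: slide the gap through the measured period
        for start in range(0, hours_total - gap_hours + 1, step_hours):
            yield gap_days, start, start + gap_hours


n_iterations = sum(1 for _ in gap_windows())
print(n_iterations)  # 41910
```

A sectorwise version would repeat each window once per sector, so 12 sectors would require 12 times as many model fits.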
The outer loop was designed in JupyterLab, a web-based interactive development 
environment for notebooks, code, and data. Its flexible interface allows users to create 
and organize data science and scientific computing workflows 
[84]. The inner loops were developed using Python [85] within the latest Anaconda 
environment [86]. NumPy [83] and pandas [84] were used within the Python environment for 
calculations. The Matplotlib [87] module was used for visualisations, whereas 
sklearn.metrics [88], scipy.stats [89] and dc_stat_think [90] were deployed for statistical 
analysis. The xlsxwriter module was used to export the results to Excel. The overall 
design of the code is presented in Figure 2-17.  
It is noted that the training and test periods as defined in Table 2-19 are not random and do 
not have equal durations, but are always complementary. The extension (creation of the 
synthesized data) does not replace observations. Throughout the development of the code, 
the output of the Python code was compared with the output of Windographer 
in numerous phases to validate the findings. For bin analysis, a separate function was built to 
partition the data into matching bins. Furthermore, directional averaging was performed using 
the wind direction's vector components during sectorwise analysis. 
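Both helper steps can be sketched briefly. The example assumes unit-vector averaging (no wind speed weighting) and sectors centred on north; the function names `mean_wind_direction` and `sector_index` are hypothetical, and the thesis code may differ in both conventions.

```python
import numpy as np


def mean_wind_direction(direction_deg):
    """Vector-average wind directions given in degrees (unit vectors)."""
    rad = np.deg2rad(np.asarray(direction_deg, dtype=float))
    # Average the east-west (sin) and north-south (cos) components separately.
    mean_sin = np.nanmean(np.sin(rad))
    mean_cos = np.nanmean(np.cos(rad))
    return float(np.rad2deg(np.arctan2(mean_sin, mean_cos)) % 360.0)


def sector_index(direction_deg, n_sectors=12):
    """Assign directions to sectors, with sector 0 centred on north."""
    width = 360.0 / n_sectors
    shifted = (np.asarray(direction_deg, dtype=float) + width / 2.0) % 360.0
    return (shifted // width).astype(int)


# An arithmetic mean of 350° and 10° would give 180°; the vector mean points north.
print(mean_wind_direction([350.0, 10.0]))
print(sector_index([355.0, 14.0, 16.0, 200.0]))
```

Averaging the components avoids the discontinuity at 0°/360°, which matters most for the northerly sectors.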
Figure 2-17. Flow chart of the code 
 
Source: Author’s own illustration 
Table 2-19. Relationships and datasets for data filling at the example of data segments 
Abb. | Dataset | Period | Goal | Note | KPI Group | KPI Description | Relationship for model
Y.1 | Measured w artificial gap | - | - | - | - | - | -
Y.2 | Measured gap | - | - | - | - | - | -
Y.3 | Measured full | - | - | - | - | - | -
X.1 | Reference-DF | - | - | - | - | - | -
X.2 | Reference-DF | - | - | - | - | - | -
X.3 | Reference full | - | - | - | - | - | -
X.4 | Reference LT | - | - | - | - | - |
B.0 | Concurrent_w_gap | Training | Suitability | Prerequisite | PreDF | Reference-observed | None
B.1 | Model_w_gap | Training | Uncertainties | 1. Step | SelfDF | Predicted-observed | Y.1-X.1
B.2 | Model-gap_self_prediction | Not part of this study | Y.2-X.2
B.3 | Model_self_prediction | Not part of this study | Y.3-X.3
C | Model_gap | Test | Validation | 2. Step | ValDF | Predicted-observed (gap) | Y.1-X.1
D | Model_gapfilled | DF | Input to F | 3. Step | PostDF | Predicted-observed | Y.1-X.1
E | Model_ltc | LTC | Impact of DF | 4. Step | Ltc | Jackknife uncertainty | Y.1-X.1
F | Model_df_ltc | LTC | LTC | 5. Step | LtcDF | Jackknife uncertainty | D-X.3
[Segment columns LT 1, LT …, S1 to S12, LT …-19: cell shading showing the data segments covered by each dataset]
Source: Author’s own calculation/assessment 
Table 2-20: Reference relationships for the KPI classification 
Dataset | Scenarios | Uncertainties | Case | PreDF (no model) | SelfDF (self-predictions) | ValDF (for verification - onsite metrics)
ConcurrentPeriod | | - | Use | PreDF_KPI_p0; Relation: -; Reference: X.3; Target: Y.3 | PreDF_KPI_p0; Relation: Y.3-X.3; Reference: X.3; Target: B.3 | Not available in the use case.
ConcurrentPeriod_gap | 1-day to 60-days | - | Test | PreDF_KPI_p2; Relation: -; Reference: X.2; Target: Y.2 | SelfDF_KPI_p2; Relation: Y.2-X.2; Reference: X.2; Target: B.2 | ValDF_KPI_p1; Relation: Y.1-X.1; Reference: X.2; Target: C
ConcurrentPeriod_w_gap | 1-day to 60-days | - | Test | PreDF_KPI_p1; Relation: -; Reference: X.1; Target: Y.1 | SelfDF_KPI_p1; Relation: Y.1-X.1; Reference: X.1; Target: B.1 | -
ConcurrentPeriod_gap_filled | 1-day to 60-days | RMSE-MWS (ValDF) | Test | - | SelfDF_KPI_p3; Relation: Y.1-X.1; Reference: X.3; Target: D | -
Source: Author’s own calculation/assessment 
The greyed relationships in the above Table show possible investigations that were not part of this study. 
2.9 Datasets 
The measurement and reference datasets are discussed in the subsequent sub-sections. 
2.9.1 Selection of the measurement dataset 
The meteorological mast Ijmuiden (MMIJ) dataset was preselected and provided by Dr 
Gottschall for this analysis based on a previous investigation of the impact of gaps on offshore 
datasets [4]. 
TNO Energy Transition's wind energy division conducted a four-year meteorological 
measurement campaign by installing and operating the MMIJ in the Dutch North Sea between 
2011 and 2015, commissioned by the Ministry of Economic Affairs, Agriculture, and 
Innovation [91].   
MMIJ was located approximately 75 km west of Ijmuiden’s coast. Sensors were positioned at 
various heights (between 25 m and 100 m) to observe and record wind speed, direction, 
temperature, and pressure changes. A light detection and ranging (lidar) system was installed, 
measuring wind speed and direction up to 300 meters above the mast. The campaign included 
measurements of sea currents and waves using a wave buoy in order to support the design of 
safe and cost-effective foundations for future offshore wind turbines. 
The MMIJ dataset can be requested for research purposes from TNO's data cloud manager 
[92]. 
A dataset covering two full years was provided by Dr Gottschall for the analysis at the top 
height, with wind direction and wind speed data at a temporal resolution of 10 minutes. 
2.9.2 Selection of the long-term reference dataset 
ERA5 was used as the reference dataset in the initial study conducted by Gottschall [4] and 
was therefore pre-selected for this study. It satisfies the criteria for 
reference dataset properties established by TG6 [35]. 
ECMWF produces the ERA5 reanalysis as part of the Copernicus Climate Change Service 
(C3S). It contains a thorough record of the global atmosphere, land surface, and ocean 
waves from 1950 to the present. ERA5 benefits from a decade of advances in model physics, 
core dynamics, and data assimilation. In addition to a greatly improved horizontal resolution of 
31 km, ERA5 includes hourly output [93]. ERA5 has global coverage, 
is well-documented, and has been extensively validated. 
In addition to ERA5, MERRA-2 and KNMI datasets were also evaluated during the analysis of 
the long-term period, as discussed in Section 2.9.4.3. 
2.9.3 Measurement campaign overview 
The details of the MMIJ instrumentation are provided in the ECN-Wind Memo-12-010 [94]. Thies 
First Class anemometers were deployed during the measurement campaign by ECN [94], as 
shown in Table 2-21.  
Table 2-21. MMIJ Instrumentation 
Sensor type | Height above LAT [m] | Analysis use case | Arrangement | Distance from mast [m] | Vertical distance from boom [cm] | Sensor | Measured variable
Anemometer | 92 | Primary | Top anemometer, dual boom, 17.5° and 197.5° orientation | - | 1500 | Thies First Class Advanced anemometer | 10-minute average, standard deviation, minimum and maximum values
Wind vane | 87 | Primary | Triple boom arrangement with 46.5°, 166.5° and 286.5° orientation | 4.6 | 70 | Thies First Class wind vane |
Anemometer | 58.5 | Secondary | Triple boom arrangement with 46.5°, 166.5° and 286.5° orientation | 7.0 | 150 | Thies First Class Advanced anemometer |
Anemometer | 27 | Secondary | Triple boom arrangement with 46.5°, 166.5° and 286.5° orientation | 9.2 | 150 | Thies First Class Advanced anemometer |
Source: Author’s own summary based on [94] 
The MMIJ is shown in Figure 2-18. 
Figure 2-18. Picture of the MMIJ station 
 
Source: [94] 
2.9.4 Pre-processing and data preparation 
Fraunhofer IWES has conducted the screening and pre-processing of the measured time 
series for the time period 01 June 2012 until 30 June 2014 (short-term period). The pre-
processed time series at the 92 m wind speed and 87 m wind direction level 
(Ijmuiden_filled_2012-2014) was provided as a “txt” file as input into this analysis.  
The measurements were done using several anemometers at the same heights. The 
anemometers were combined into a virtual anemometer representing the relevant height by 
removing tower shadow effects following the screening. An example methodology of obtaining 
the virtual anemometer without the tower shadow effects is provided within [94] in 
“Chapter 7.5”, header “True wind speed”. 
Further, the data coverage of the top height was increased by means of intra-mast correlation 
analysis. This was done to have the highest data coverage possible for the research exercise. 
Figure 2-19 compares the Fraunhofer IWES dataset with the results obtained by ECN [83] for 
the period in question and shows very good alignment. As the short-term period does not cover 
the full years 2012 and 2014, the year 2013 is suitable for a like-for-like comparison. In 
Figure 2-19, it can be observed that the difference in Weibull fit 
and histogram between the Fraunhofer IWES and ECN datasets is negligible.  
Figure 2-19. Weibull fit and histogram of MMIJ measurements in 2013 (left: ECN 
analysis, right: Fraunhofer IWES dataset) 
  
Source: Left: [83], right: Author’s own illustration via Windographer 
2.9.4.1 Summary statistics of the short-term dataset 
The summary statistics of the short-term dataset are shown in Table 2-22. 
Table 2-22. MMIJ short-term statistics 
Variable Value – WS92 
Measurement height [m] 92 
Mean wind speed [m/s] 9.88 
Median wind speed [m/s] 9.48 
Minimum wind speed [m/s] 0.29 
Maximum wind speed [m/s] 37.92 
Standard deviation [m/s] 4.78 
Weibull k [-] 2.18 
Weibull A [m/s] 11.16 
Possible data points 105120 
Available data points 104844 
Data availability [%] 99.74 
Variable Value - WD87 
Measurement height [m] 87 
Mean wind direction [°] 223.1 
Median wind direction [°] 207.5 
Possible data points 105120 
Available data points 104842 
Data availability [%] 99.74 
Source: Author’s own calculation/assessment 
2.9.4.2 Time synchronisation 
Following the selection of the reference dataset, a combined dataset consisting of the 
reference and measured (target) datasets was created. The Pearson correlation coefficient was 
used within Windographer to find the maximum correlation between the two wind speed 
datasets. This analysis step automatically shifts the reference time stamps to obtain the offset 
that maximises the degree of correlation. The results are presented in Figure 2-20, 
showing that a shift of minus one hour was required. This shift was subsequently applied in Windographer. 
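The principle of this step can be sketched with pandas by shifting the reference series over a small range of lags and keeping the lag with the highest Pearson correlation. This is an illustration of the idea only, not of Windographer's implementation; the function name `best_time_shift`, the sign convention of the shift and the synthetic series are assumptions.

```python
import numpy as np
import pandas as pd


def best_time_shift(target, reference, max_shift=3):
    """Find the reference shift (in time steps) that maximises Pearson's r."""
    scores = {
        shift: target.corr(reference.shift(shift))  # NaN pairs are ignored
        for shift in range(-max_shift, max_shift + 1)
    }
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic hourly series: the target leads the reference by one time step.
rng = np.random.default_rng(0)
reference = pd.Series(rng.normal(size=500).cumsum())
target = reference.shift(-1) + rng.normal(scale=0.05, size=500)

shift, scores = best_time_shift(target, reference)
print(shift)
```

For hourly data, the returned number of time steps equals the offset in hours.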
Figure 2-20. Time synchronisation 
 
Source: Author’s own illustration via Windographer 
2.9.4.3 Definition of the long-term reference period 
It should be determined whether there are any trends in the long-term reference dataset. The 
study looked at different long-term durations ranging from 10 to 20 years using the 
method proposed in [18]. The slope of the trend was calculated by fitting a linear regression 
to the normalized yearly wind speeds. The analysis was repeated for each of the 
MERRA-2, KNMI and ERA5 nodes nearest to the MMIJ location. A time range was chosen that 
minimizes the impact of a probable trend while still being representative of the long-term 
reference period. 
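A minimal sketch of such a trend check is shown below. It assumes that annual means are normalized by the mean of the period considered and that candidate periods end at the last available year; the function names and the annual values are hypothetical and serve only to illustrate the calculation.

```python
import numpy as np


def normalised_trend(years, annual_mean_ws):
    """Slope of a linear fit to normalised annual mean wind speeds (per year)."""
    years = np.asarray(years, dtype=float)
    ws = np.asarray(annual_mean_ws, dtype=float)
    normalised = ws / ws.mean()
    # Centre the years to keep the fit well conditioned.
    slope, _ = np.polyfit(years - years.mean(), normalised, 1)
    return slope


def trend_by_period_length(years, annual_mean_ws, min_len=10, max_len=20):
    """Trend slope for each candidate period length ending at the last year."""
    years = np.asarray(years)
    ws = np.asarray(annual_mean_ws, dtype=float)
    return {
        n: normalised_trend(years[-n:], ws[-n:])
        for n in range(min_len, min(max_len, len(years)) + 1)
    }


years = np.arange(2000, 2019)
# Hypothetical annual means with a built-in trend of +1 % per year.
annual = 9.0 * (1.0 + 0.01 * (years - years.mean()))
slopes = trend_by_period_length(years, annual)
for n, slope in slopes.items():
    print(n, f"{100 * slope:.2f} % per year")
```

The period with the smallest absolute slope that is still long enough to be representative would then be a candidate for the long-term reference period.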
Following the trend analysis, the ERA5 reference dataset “R5” located at 52.69° North and 
3.60° East from 2000 to 2018 with 19 years of duration has been selected as the reference 
dataset for the analysis. 
The trend analysis is shown in Figure 2-21. 
Figure 2-21. Annual trend analysis and comparison of reference datasets for the selected long-term period 2000-2018 
 
Source: Author’s own illustration 
 
2.9.4.4 Summary statistics of the concurrent and long-term reference period 
The summary statistics of the long-term reference dataset are shown in Table 2-23. 
Table 2-23. Reference dataset statistics for the concurrent and long-term periods 
Variable  Long-term period (Value – WS100)  Concurrent period (Value – WS100) 
Model height [m] 100 m 100 m 
Mean wind speed [m/s] 9.28 9.34 
Median wind speed [m/s] 8.91 9.01 
Minimum wind speed [m/s] 0.02 0.07 
Maximum wind speed [m/s] 32.98 29.24 
Standard deviation [m/s] 4.44 4.50 
Weibull k [-] 2.23 2.25 
Weibull A [m/s] 10.50 10.63 
Possible data points 166559 17519 
Available data points 166559 17472 
Data availability [%] 100.0 99.7 
Variable Value - WD87 Value - WD87 
Model height [m] 100 100 
Mean wind direction [°] 247.7 229.00 
Median wind direction [°] 218.1 211.9 
Possible data points 166559 17519 
Available data points 166559 17472 
Data availability [%] 100.0 99.7 
Source: Author’s own calculation/assessment 
 
3 Results 
Based on the evaluation of the MCP algorithms, the base-case scenario for the Python code 
was developed. The most critical performance indicators during the procedure were evaluated 
and presented. The outcomes of data-filling and long-term correction are detailed in the 
following sections. The uncertainty evaluation is given at the end of this section. The detailed 
results presented in this section are provided in the annexes from Annex B to Annex N. 
3.1 Evaluation of the MCP algorithms 
In addition to the linear regression concepts shown in Table 2-17, the following concepts were 
tested to understand the suitability of the base case scenario. 
In order to gain confidence and select a reasonably robust MCP algorithm, the MCP methods 
presented in Section 2.7 were tested with an omnidirectional selection. This was done in ISo1 
using the performance test functionality. The test is a cross-validation experiment, in which 
a selected number of segments are created within the concurrent 
period and the datasets are divided into training and test periods. The model is fitted using the 
data within the training segments, and the output is generated for the test periods. The 
observed and predicted values are compared for a total of 400 randomized datasets, and the 
test statistics shown in Figure 3-1 are generated for this study. 
Figure 3-1. MBE, MAE and DE results of the investigated MCP methods 
  
 
Source: Author’s own illustration, generated in ISo1 
It can be observed that MTS and LLS perform best in the case of MBE and MAE with error 
values of -0.0005 m/s and 1.0 m/s for MBE and MAE, respectively, whereas TLS, VR, SS and 
BSR methods perform slightly better regarding the distribution error. 
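The error metrics referred to here, together with the RMSE used elsewhere in this work, can be sketched as follows. The sign convention (predicted minus observed) and the function name `error_kpis` are assumptions; the distribution error is not covered by this sketch.

```python
import numpy as np


def error_kpis(observed, predicted):
    """MBE, MAE and RMSE of predicted versus observed values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    error = predicted - observed  # assumed convention: positive = over-prediction
    return {
        "MBE": float(np.mean(error)),
        "MAE": float(np.mean(np.abs(error))),
        "RMSE": float(np.sqrt(np.mean(error ** 2))),
    }


# Illustrative hourly wind speeds in m/s.
kpis = error_kpis(observed=[8.0, 10.0, 12.0], predicted=[9.0, 10.0, 14.0])
print(kpis)
```

Positive and negative errors cancel in the MBE, which is why a method can show an MBE close to zero while its MAE remains around 1 m/s.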
The coefficient of variation (COV) is defined as the ratio of the standard deviation to the mean. COV 
results of the considered methods and submethods are shown in Table 3-1. It can be seen 
that despite the high number of algorithms considered for linear regression, the COV values 
are similar to those of the other methods. 
Table 3-1. Coefficients of variation of considered MCP methods 
Method Count Subtotal submethods COV 
BinMethod 4 1 0.05% 
EM 6 3 0.17% 
LinReg 21 3 0.30% 
Matrix 8 2 0.46% 
QM 4 1 0.03% 
Source: Author’s own calculation/assessment 
The long-term wind speed (LTWS) results of the different methods are presented in Figure 3-2, 
showing that the LTWS of the base-case algorithm is in good alignment with the other results. 
Figure 3-2. Comparison of LTWS with different MCP methods 
 
Source: Author’s own illustration, generated in ISo1 
3.2 Evaluation of the base-case algorithm results 
The base-case algorithm was applied in line with the flow chart 
shown in Figure 2-17. During the validation and simulation runs, the 
output of the Python code was monitored for plausibility. The findings of the simulations are 
presented in the following subsections. 
3.2.1 Key performance indicators during the process 
The KPIs defined for PreDF in Table 2-12 were evaluated as a prerequisite to running an MCP 
for both data filling and long-term correction. This analysis was conducted sectorwise. A further 
use of the PreDF KPIs is to relate the performance of the self-predictions to changes in the 
observed metrics. Exemplary sectorwise results for the concurrent periods are presented in 
detail in Annex B. 
The heatmaps of measured Weibull scale and shape factors, as well as R² values of sectorwise 
hourly wind speed correlations, are shown in Figure 3-3 and Figure 3-4, respectively, for the 
1-day and 60-days gap scenarios. The description “feature” represents the sector, and the 
colours within the vertical columns represent the iteration results within the gap. In each 
heatmap, the evolution of the iteration results is shown from top to bottom. As shown in the 
figures below, the longer gaps cause a minor distortion in the Weibull A and k parameters. 
Figure 3-3. PreDF – Heatmap of measured Weibull scale and shape factors for 1-day 
(left) and 60-days gap scenarios (right) in each column, respectively 
Scale factors Shape factors 
 
 
 
 
Source: Author’s own illustration, the description “feature” represents the sector, the colours within the vertical 
columns represent the iteration results within the gap. Scale factor in m/s, shape factor dimensionless. 
Figure 3-4 depicts the heatmaps of R² values of sectorwise hourly wind speed correlations for 
the 1-day and 60-days gap scenarios. The hourly wind speed correlations (R²) are 
very good (>0.85) across all sectors and uniform throughout the sliding gap window in the 
respective period. A slight decrease in correlations can be observed in the longer 60-days gap 
period, especially in the easterly sectors. In general, the R² values are considered very good, 
showing a significant correlation between the reference and measured datasets. As a result, 
a sector-based MCP approach is deemed appropriate. 
Figure 3-4. PreDF – Heatmap of R² values of sectorwise hourly wind speeds correlation 
for 1-day (left) and 60-days gap scenarios (right) 
 
 
Source: Author’s own illustration, the description “feature” represents the sector, the colours within the vertical 
columns represent the iteration results within the gap. 
The coefficients of determination of sectorwise hourly wind speeds for the 1-day and 60-days gap 
periods are presented as whisker plots (also box plots) in Figure 3-5 and Figure 3-6. The box is 
bounded by the third quartile at the top and the first quartile at the bottom, and is divided 
by the median. The whiskers represent error bars, with one extending upward from the third 
quartile to the maximum and the other extending downward from the first quartile to the minimum. 
Dot markers identify outliers in the data.  
For the single-day gap period, the sectorwise R² values show excellent correlation (R²>0.9) 
throughout the majority of sectors, as well as a very narrow distribution between the 25% and 
75% quantiles for the whole duration. Similarly, the correlations do not diminish throughout the 
60-day gap period, while the extent of the boxes increases somewhat during this time. 
Accordingly, the sectorwise correlations are deemed appropriate for hourly modelling of a 
linear regression MCP, both for data filling and long-term correction. 
Figure 3-5. PreDF – Box plot of R² values of sectorwise hourly wind speeds correlation 
for 1-day gap
 
Source: Author’s own illustration, the description “feature” represents the sector. 
Figure 3-6. PreDF – Box plot of R² values of sectorwise hourly wind speeds correlation 
for 60-days gap 
 
Source: Author’s own illustration, the description “feature” represents the sector. 
The MBE, MAE, and RMSE of mean wind speeds over concurrent periods are summarized for 
1-day and 60-days in Table 3-2. The reader is reminded that the PreDF metrics do not involve 
any modelling and just show a comparison of the reference and target datasets in order to 
determine the dataset's appropriateness and representativeness for an MCP method, as 
specified in the technical standards [35]. 
The MBE, MAE, and RMSE values for the PreDF period are relatively high, indicating that 
despite its good correlations, the reference dataset cannot match the precision of wind speed 
observations. This is to be expected, given that the reference dataset is a global reanalysis 
with a coarse grid resolution, as opposed to a mesoscale modelling dataset. Although this is 
not a concern for this type of study, it does provide an opportunity to evaluate the algorithm's 
performance against a mesoscale modelling solution in a future exercise. It is noted at this 
stage that mesoscale simulations are not entirely independent from reanalysis solutions as 
they use dynamic downscaling methods driven by reanalysis data [35]. 
Summary statistics of the RMSE of MWS for 1-day and 60-days period are presented in Table 
3-2. The summary statistics of the MBE and MAE of MWS are shown in Annex C.  
Table 3-2. PreDF - Summary statistics of RMSE of MWS for 1-day and 60-days gap 
scenarios 
Sector  Mean [m/s]                Standard deviation [m/s]  Max [m/s]
        1-day        60-days      1-day        60-days      1-day        60-days
0       1.184        1.084        0.206        0.378        1.229        1.283
1       1.182        1.087        0.206        0.381        1.230        1.366
2       1.460        1.343        0.254        0.468        1.515        1.546
3       1.673        1.535        0.291        0.536        1.736        1.804
4       1.692        1.559        0.294        0.544        1.764        1.824
5       1.677        1.545        0.292        0.539        1.743        1.787
6       1.600        1.476        0.278        0.515        1.658        1.755
7       1.427        1.313        0.248        0.458        1.476        1.529
8       1.500        1.381        0.261        0.481        1.551        1.587
9       1.228        1.130        0.214        0.394        1.272        1.302
10      1.199        1.101        0.209        0.383        1.245        1.264
11      1.151        1.054        0.200        0.367        1.194        1.234
Source: Author’s own calculation/assessment 
Heatmaps of the mean bias error observed in the Weibull shape and scale factors for all 
iterations and gap scenarios – weighted from sectorwise analysis - are shown in Figure 3-7 for 
1-day, 30-days and 60-days gap scenarios. The evolution of the scale and shape factors is 
provided in the following figures, indicating a good alignment between the reference and 
measured datasets. The MBE for shape factor ranges from -0.03 (blue) to -0.00 (yellow), 
indicating that the Weibull shape factor is nearly identical between the measured and reference 
datasets. The MBE of the scale factors demonstrates a greater discrepancy but similar low 
dispersion, ranging from 0.61 m/s (blue) to 0.67 m/s (red) (yellow). 
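To illustrate how a bias in the Weibull factors between two datasets might be obtained, the sketch below estimates the shape factor k and scale factor A from the sample mean and standard deviation, using the well-known empirical moment approximation k ≈ (σ/v̄)^-1.086 and A = v̄ / Γ(1 + 1/k). This is a swapped-in, simplified estimator chosen for brevity; the fitting method and the sign convention (reference minus measured) are assumptions and are not necessarily those used in this work.

```python
import math
import statistics


def weibull_moments(wind_speeds):
    """Estimate Weibull shape k [-] and scale A [m/s] from mean and standard deviation."""
    mean = statistics.fmean(wind_speeds)
    std = statistics.stdev(wind_speeds)
    k = (std / mean) ** -1.086          # empirical moment approximation
    a = mean / math.gamma(1.0 + 1.0 / k)
    return k, a


def weibull_bias(reference, measured):
    """Bias of the Weibull factors (assumed convention: reference minus measured)."""
    k_ref, a_ref = weibull_moments(reference)
    k_meas, a_meas = weibull_moments(measured)
    return k_ref - k_meas, a_ref - a_meas
```

In a sectorwise analysis, the per-sector biases would additionally be weighted by sector frequency, which is omitted here.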
Figure 3-7. PreDF – Heatmap of MBE of Weibull shape (left) and scale (right) factors for 
all iterations and gap scenarios – weighted from sectorwise analysis 
  
Source: Author’s own illustration, grey means no data values, feature stands for gap. 
The MBE, MAE and RMSE of WPD for the 1-day and 60-days gap scenarios were recorded for
each iteration and are presented in Annex C. The greatest difference in wind power
density is seen in the south sector with the fewest samples. The MBE, MAE, and RMSE of
the WPD distributions are comparable across sectors, and they decrease marginally for the
largest gap scenario.
The overall statistics of the KS values are presented in Table 3-3, showing a moderate 
performance. The KS-statistic has a similar error margin distribution to that of the wind
power density, with greater errors in the easterly sectors. A somewhat reduced KS error is
detected in the primary wind direction components of sectors 7 and 8, and it does not
increase with gap size. This indicates that the distribution is unlikely to be influenced by
increasing gap size. It should be highlighted that a mesoscale model product would be
expected to show a better KS-statistic performance (lower value) when compared to this
observed dataset.
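For orientation, a two-sample KS statistic can be computed as the largest vertical distance between the empirical cumulative distribution functions of two samples. The sketch below assumes that the KS values reported in this work are of this form, expressed as a percentage; the function name is hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum distance between the empirical CDFs of two samples (0 to 1)."""
    a, b = sorted(sample_a), sorted(sample_b)
    d_max = 0.0
    for value in a + b:  # the empirical CDFs only change at observed values
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max
```

A value of 0 means the two empirical distributions coincide, while 1 means they do not overlap at all; multiplying by 100 gives the percentage form used in Table 3-3.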
Table 3-3. PreDF - Summary statistics of KS of MWS for 1-day and 60-days gap 
scenarios 
Sector  Mean                      Standard deviation        Max
        1-day        60-days      1-day        60-days      1-day        60-days
0       4.9%         4.4%         0.8%         1.6%         5.2%         6.1%
1       5.6%         5.3%         1.0%         1.9%         6.1%         7.0%
2       8.6%         8.0%         1.5%         2.8%         9.2%         10.3%
3       11.2%        10.5%        2.0%         3.7%         11.9%        13.2%
4       8.6%         8.1%         1.5%         2.9%         9.2%         10.3%
5       8.7%         8.3%         1.5%         2.9%         9.2%         10.9%
6       6.4%         6.1%         1.1%         2.1%         6.7%         7.8%
7       4.1%         3.8%         0.7%         1.4%         4.3%         4.9%
8       5.0%         4.7%         0.9%         1.7%         5.3%         5.9%
9       4.5%         4.2%         0.8%         1.5%         4.7%         5.3%
10      7.1%         6.6%         1.2%         2.4%         7.6%         8.3%
11      6.4%         6.0%         1.1%         2.1%         7.0%         7.6%
Source: Author’s own calculation/assessment 
Annex C contains heatmaps showing the sectorwise wind direction deviation for the 1-day and
60-day gap scenarios. Wind direction discrepancies are moderate throughout the sectors,
ranging from -4.5° to -1.1°, with the primary sectors having the largest offsets.
In light of the aforementioned metrics, it is clear that the reference dataset chosen is 
appropriate and representative of the target location and that it may be utilized to make 
predictions with the chosen LLS algorithm. 
SelfDF KPIs compare the model predictions with a known outcome, namely the true measured
values, and therefore allow the performance of the model to be assessed. The relationships
for SelfDF were previously detailed in Table 2-19 and Table 2-20. It should be mentioned at
this point that, in real-life circumstances, no long-term dataset is available to analysts for
assessing the true performance of a model. Furthermore, because the time for technical
analysis is often limited, a simplified procedure is needed in order to evaluate the
performance of any chosen MCP approach as rapidly as possible. To gain further insight into
the performance of the model, it is critical to judge the SelfDF performance and, if possible,
to look for a relationship with the validation performance, in which the predictions of a
trained model are compared with true observed values that were withheld from training. This
is covered in further detail under the ValDF KPI. In the subsequent paragraphs, tables and
figures, the performance of the LLS model during the concurrent period is presented.
The R² correlations of hourly wind speeds between the model predictions and actual values 
are shown in Annex C as heat maps for the scenarios with a 1-day and a 60-day gap period. The
results demonstrate that the distributions of R² are consistent across the sliding gap periods. 
A similar pattern can be observed across the bins: the correlations are very high, reaching
up to 0.92, with the exception of the eastern sectors, where they fall to 0.77. The
corresponding box plots are shown in Figure 3-8 and Figure 3-9.
Figure 3-8. SelfDF – Boxplot of R² values of sectorwise hourly wind speeds correlation 
for 1-day scenario 
 
Source: Author’s own illustration, the description “feature” represents the sector. 
Figure 3-9. SelfDF – Boxplot of R² values of sectorwise hourly wind speeds correlation 
for 60-day scenario 
 
Source: Author’s own illustration, the description “feature” represents the sector. 
Figure 3-10. SelfDF - 3D evolution of RMSE of MWS for all sectors and gaps 
 
Source: Author’s own illustration with Paraview [95] 
For the examined periods, the model's mean bias error is zero, and for the 1-day gap scenario, 
the MAE is roughly 1 m/s throughout the bins, as shown in Table 3-4. These findings are readily
comparable to those previously given in Figure 3-1 for the various MCP investigations. An MAE
of 1 m/s corresponds to a relative MAE of approximately 10%, which is in excellent agreement
with previous MCP methods and demonstrates better performance. The 60-days scenario results
in a somewhat lower average MAE. The root mean square error of mean wind speeds in the bins
is visualized in a ParaView plot in Figure 3-10, where the x-axis represents the number of
iterations in time, the y-axis the gap duration from 1 to 60 days and the z-axis the
directional sectors from 1 to 12. The magnitude of the RMSE is represented by colour.
The plot is shown firstly to emphasize that an RMSE value exists for each bin, iteration
within the gap, and gap period, and secondly to demonstrate that the results are remarkably
uniform around 1.2 m/s and consistent across bins, iterations, and gap periods, with the minor
exception of sectors 4 to 7, where a higher error of up to 1.6 m/s can be observed.
Table 3-4. SelfDF - Summary statistics of RMSE of MWS for 1-day and 60-days gap 
scenarios 
Sector  Mean [m/s]                Standard deviation [m/s]  Max [m/s]
        1-day        60-days      1-day        60-days      1-day        60-days
0       1.133        1.038        0.197        0.362        1.175        1.218
1       1.153        1.061        0.201        0.372        1.200        1.325
2       1.324        1.219        0.230        0.425        1.378        1.452
3       1.281        1.177        0.223        0.411        1.329        1.408
4       1.461        1.347        0.254        0.470        1.521        1.583
5       1.473        1.356        0.256        0.473        1.528        1.577
6       1.404        1.295        0.244        0.452        1.455        1.566
7       1.353        1.243        0.235        0.433        1.399        1.439
8       1.435        1.322        0.250        0.461        1.485        1.525
9       1.162        1.070        0.202        0.373        1.204        1.238
10      1.088        0.999        0.189        0.348        1.127        1.147
11      1.087        0.996        0.189        0.347        1.127        1.156
Source: Author’s own calculation/assessment 
The heatmaps of the MBE of Weibull shape and scale factors are shown in Annex C of this 
document. When comparing the differences between the measured and reference periods, an 
anticipated improvement in the MBE values is seen, with a minor variation between the 
measured and model scale factors of 0.13 to 0.15 for the scale factors between the two 
periods. Scale factor deviations are insignificant with values between 0.008 m/s and -0.001 
m/s. 
Figure 3-11. SelfDF – Heatmap of MBE of Weibull shape (left) and scale (right) factors 
for all iterations and gap scenarios – weighted from sectorwise analysis 
  
Source: Author’s own illustration, grey means no data values, feature stands for gap. 
The WPD statistics for the SelfDF period are provided in Annex C. Compared with the PreDF
KPI, an improvement in the MBE, MAE, and RMSE of the WPD can be observed. The root
mean square error of WPD is reduced by 18% in the sector with the maximum error. This is to 
be anticipated, given that the global reanalysis dataset was not originally intended to align
well with the absolute values in a measurement dataset. The comparison of WPD revealed no
further information.
A similar observation was made for the KS-statistic. Following the model fit, the predicted 
distribution exhibits good performance with an error margin of 1.9% to 2.4% in the primary 
wind directions. 
Figure 3-12 depicts the progression of the root mean square error of the MWS for the 1-day 
(top) and 60-day (bottom) scenarios of the omnidirectional analysis, and Figure 3-13 shows the
corresponding heatmap for all iterations and gap scenarios. The omnidirectional root mean
square error (RMSE) is in good agreement with the results of the initial Windographer
performance test of different MCP methods. In the 1-day gap case, the spread of the RMSE
during the measurement period is very limited. Although this spread grows significantly in
the 60-day case, the range of the RMSE of MWS stays within a 0.05 m/s interval.
Figure 3-12. SelfDF – Evolution of RMSE of MWS for 1-day (top) and 60-days (bottom) 
scenarios – omnidirectional analysis 
 
 
 
Source: Author’s own illustration 
Figure 3-13. SelfDF – Heatmap of RMSE of MWS for all iterations and gap scenarios – 
omnidirectional analysis 
 
Source: Author’s own illustration, grey means no data values, feature stands for gap. 
The MBE, MAE, and RMSE of mean wind speeds were examined throughout the validation period
to determine the genuine performance of the tested approach. The ValDF KPIs are calculated
from the difference between predicted and observed values, where the model producing the
predictions is trained on the concurrent period rather than on the validation period. Because
such a comparison is not attainable in real-world projects, any knowledge acquired from this
part might prove very valuable. It is noted that the validation period KPIs are derived from
omnidirectional LLS modelling.
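The validation idea described above can be sketched as follows: an omnidirectional linear least squares model is fitted on the concurrent data outside an artificial gap, and its predictions inside the gap are compared with the withheld observations. This is a minimal illustration with hypothetical function names; the error sign convention (predicted minus observed) is an assumption.

```python
import math


def fit_lls(x, y):
    """Ordinary least squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def validate_gap(ref, meas, gap_start, gap_len):
    """Train outside the gap, predict inside it, and return MBE, MAE and RMSE."""
    gap = range(gap_start, gap_start + gap_len)
    train = [(r, m) for i, (r, m) in enumerate(zip(ref, meas)) if i not in gap]
    intercept, slope = fit_lls([r for r, _ in train], [m for _, m in train])
    errors = [intercept + slope * ref[i] - meas[i] for i in gap]
    n = len(errors)
    return {
        "MBE": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
    }
```

Sliding `gap_start` over the measurement period for each `gap_len` would yield one set of KPIs per iteration and gap duration.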
The evolutions of MBE, MAE and RMSE of MWS are shown in Figure 3-14, Figure 3-15 and 
Figure 3-16, respectively, for 1-day  and 60-days scenarios. 
Figure 3-14. ValDF – Evolution of MBE of MWS for 1-day (top) and 60-days (bottom) 
scenarios 
 
 
Source: Author’s own illustration 
The mean bias error shows a strong oscillation around zero. This is considered reasonable 
given the good performance of the model shown in the SelfDF period. Because positive and
negative deviations cancel out in the MBE, the absolute errors are much larger. The MAE and
RMSE of the ValDF mean wind speeds both demonstrate a higher dispersion around the mean. The
coefficient of variation declines from 48% in the case of a single-day gap to 9% in the
scenario of a 60-day gap. It is worth noting that this COV behaviour is the opposite of what
was seen for the SelfDF KPI, as the increase in the gap size results in a higher number of
samples. Hence the downward trend is plausible.
The summary statistics of the ValDF period over the gap periods are shown in Table 3-5. 
Table 3-5. Summary statistics of ValDF for all gap periods 
Description MBE MAE RMSE 
Mean of gap MWS [m/s] -0.003 0.930 1.243 
Mean of standard-deviation [m/s] 0.200 0.293 0.397 
Mean of maximum gap MWS [m/s] 0.446 1.457 1.948 
Standard deviation of the mean [m/s] 0.003 0.026 0.025 
Standard error [m/s] 0.000 0.003 0.003 
Source: Author’s own calculation/assessment 
Figure 3-15. ValDF – Evolution of MAE of MWS for 1-day (top) and 60-days (bottom) 
scenarios 
 
 
Source: Author’s own illustration 
Figure 3-16. ValDF – Evolution of RMSE of MWS for 1-day (top) and 60-days (bottom) 
scenarios 
 
 
Source: Author’s own illustration 
The link between SelfDF and ValDF RMSE was examined with the goal of establishing a proxy 
approach for assessing the uncertainty associated with data filling. For all gap scenarios, a
very strong negative relationship was observed between the SelfDF and ValDF RMSE of MWS.
It is noteworthy that the relatively small RMSE interval for self-prediction is linked with
the larger error interval seen during the validation period. The inverse correlation
suggests that if the self-prediction RMSE of MWS is comparatively large, the uncertainty in
the data-filled gap period is very likely to be reduced. This relationship has the potential
to be used to empirically assess the anticipated uncertainty in data-filling using normalized
transfer functions.
The regression plots of SelfDF and ValDF RMSE of MWS are shown in Figure 3-17, with 
detailed figures shown in Annex H. 
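The strength and sign of such a relationship can be quantified with the Pearson correlation coefficient, as in the sketch below. The RMSE values are purely hypothetical placeholders chosen to show a negative relationship; they are not results of this work.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical RMSE of MWS [m/s] per sliding-gap iteration (illustration only)
self_rmse = [1.20, 1.21, 1.22, 1.23]
val_rmse = [1.40, 1.31, 1.25, 1.12]

r = pearson_r(self_rmse, val_rmse)  # close to -1 for a strong inverse relationship
```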
Figure 3-17. Regression plots of self-prediction and validation RMSE for 1-day (top) and 
60-days (bottom) scenarios 
 
Source: Author’s own illustration 
When a representative measurement campaign is accessible, the strong negative relationship
found between the ValDF and SelfDF KPIs might be used as a proxy to judge the 
performance of a nearby future measurement campaign. More crucially, in a sufficiently 
offshore situation, this can serve as a credible empirical tool for assessing the uncertainties 
associated with data gaps. 
It should be emphasized that this link has not been mentioned or discussed in any related 
literature before. Although these results show a strong association, independent validation
would be required before this novel approach could be used in future studies.
For the remainder of this work, the approach is referred to as the inverse self-prediction error 
(ISPE) method. 
3.2.2 Data filling results 
Figure 3-18 illustrates the evolution of the mean difference between actual and predicted wind 
speeds during a 60-day period. Additionally, Figure 3-19, for the 60-day gap period beginning
on 01.07.2012, gives insight into the findings by displaying both the actual and the predicted
wind speed time series.
Figure 3-18. Evolution of MBE of observed vs predicted wind speeds for 60-days gap 
period 
 
 
Source: Author’s own illustration, generated in ISo1 
The overall mean bias error for the gap period beginning on the first day of July 2012 is
minimal at -0.05 m/s; however, deviations of up to 3-4 m/s between the observed and predicted
time series can be seen distinctly in the time series plot in Figure 3-19. When the scatter
plot in Figure 3-20 is evaluated, the magnitude of this variance becomes even more apparent.
The parameters of the regression fit are shown in Table 3-6, presenting a good correlation
between the independent datasets.
Table 3-6. LLS model parameter of validation period for 60-days gap period (start at 
01.07.2012) 
Gap period Model Time steps Intercept [m/s] Slope R² 
60-days Trained from concurrent time series 1438 0.856 1.058 0.79 
 
Figure 3-19. Time series of observed vs predicted wind speeds for 60-days gap period 
starting on 01.07.2012 
 
 
Source: Author’s own illustration, generated in ISo1 
Figure 3-20. Scatter plot of observed vs predicted wind speeds for 60-days gap period 
starting on 01.07.2012 
  
Source: Author’s own illustration, generated in ISo1 
A comparison of the wind direction frequency of observed versus predicted wind speeds for the
60-days gap period starting on 01.07.2012 is shown in Figure 3-21. While there is acceptable
agreement across the broad sectors, it should be noted that the primary wind direction
predicted by the simplified MCP is offset by approximately one sector.
[Figure 3-19 plot: Comparison of time series, 60 days gap period, start 01.07.2012; x-axis: July 2012 to August 2012; y-axis: Wind speed (m/s), 0 to 20; series: Predicted, Observed]
[Figure 3-20 plot: Comparison (60 days gap - start: 01.07.2012); x-axis: Observed (m/s), 0 to 25; y-axis: Predicted (m/s), 0 to 25; series: Data, Line of best fit]
Figure 3-21. Comparison of wind direction frequency of observed vs predicted wind 
speeds for 60-days gap period starting on 01.07.2012 
  
Source: Author’s own illustration, generated in ISo1 
The standard deviation of all calculated STWS is 0.007 m/s across all gap periods. The
STWS has a low coefficient of variation, which follows a linear trend over the gap durations,
ranging from 0.01% for a single-day gap to 0.12% for a 60-day gap. The largest positive and
negative deviations from the recorded short-term wind speed are 0.26% and -0.34%,
respectively, indicating outstanding performance. Figure 3-22 illustrates the progression of
STDF-WS for the 1-day (top) and 60-day (bottom) gap scenarios.
[Figure 3-21 plot: Wind Direction Frequency (60 days gap period, start: 01.07.2012); polar axis from 0° to 330° in 30° steps; radial axis 0% to 24%; series: Observed, Predicted]
Figure 3-22. Evolution of STDF-WS for 1-day (top) and 60-days (bottom) gap scenarios 
 
 
Source: Author’s own illustration 
3.2.3 Long term correction results 
The study’s key research question was whether an interim phase of data filling is required prior 
to applying the long-term correction. Therefore, two versions of LTWS were constructed using
the Python code and the procedures outlined above for each sliding window of the gap, with
gap durations ranging from 1 day to 60 days. The first LTWS was produced by fitting an
omnidirectional linear regression model to the data-filled concurrent time series and the
reference dataset. Only concurrent measurements with gaps were utilized in the second
relationship. Figure 3-23 depicts the evolution of the LTWS for the 1-day and 60-day gap
scenarios.
Figure 3-24 compares the data-filled long-term time series with the long-term wind speed time
series that did not go through the intermediate step of data filling, for a 60-day gap period
beginning on the first of July 2012.
Figure 3-23. Evolution of LTWS without DF and LTWS with DF for 1-day (top) and 60-
days (bottom) gap scenarios 
 
 
Source: Author’s own illustration 
 
Figure 3-24. Scatter plot of DF predicted vs LTC predicted wind speeds for 60-days gap 
period starting on 01.07.2012 
  
Source: Author’s own illustration, generated in ISo1 
[Figure 3-24 plot: Comparison (60 days gap - start: 01.07.2012); x-axis: LLS without DF (m/s), 0 to 25; y-axis: LLS with DF (m/s), 0 to 25]
As seen above, both variants of the LTWS are identical and do not differ for the largest
gap studied. This is expected, given the omnidirectional regression parameters and the
limited influence of any change in model fit caused by the proportion of gaps. This conclusion
is supported by Table 3-7: the slope and intercept parameters of the LLS model are equal for
the data-filled and the original (with gap) time series. The largest gap amounts to 6.9% of
the data, which means that any change in the relationship after the gap is filled affects
just 7% of the final linear relationship.
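One possible way to see why the two parameter sets coincide is sketched below, under the assumption that the gap is filled with predictions from the same omnidirectional LLS fit that is later used for the long-term correction. Filled values then lie exactly on the fitted line and have zero residuals, so refitting on the data-filled series cannot change the least squares solution. The function names and sample values are hypothetical.

```python
def fit_lls(x, y):
    """Ordinary least squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fill_gap(ref, meas):
    """Fit on available pairs and replace missing values (None) with predictions."""
    pairs = [(r, m) for r, m in zip(ref, meas) if m is not None]
    intercept, slope = fit_lls([r for r, _ in pairs], [m for _, m in pairs])
    filled = [m if m is not None else intercept + slope * r for r, m in zip(ref, meas)]
    return filled, (intercept, slope)


# Hypothetical hourly wind speeds [m/s]; None marks the gap
ref = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
meas = [3.4, 5.1, None, None, 11.9, 13.2]

filled, params_with_gap = fill_gap(ref, meas)
params_filled = fit_lls(ref, filled)  # matches params_with_gap up to rounding
```

Because the added points contribute no residual scatter, R² can rise slightly for the data-filled series while slope and intercept stay the same, which is in line with the pattern in Table 3-7.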
Table 3-7. LLS model parameter of LTC for 1-day, 20-days and 60-days scenarios 
Gap period  Data-filling              Fraction of data gap [%]  Time steps  Intercept [m/s]  Slope  R²
1-day       Data-filled time series   -                         17472       0.441            1.011  0.918
            Without data-filling      0.1%                      17448       0.442            1.011  0.918
20-days     Data-filled time series   -                         17472       0.435            1.012  0.922
            Without data-filling      2.7%                      17018       0.435            1.012  0.921
60-days     Data-filled time series   -                         17472       0.404            1.014  0.926
            Without data-filling      6.9%                      16057       0.404            1.014  0.922
Source: Author’s own illustration, x-axis start of the gap-time. 
Additionally, the study is interested in observing and comprehending the effect of gaps on the 
LTWS. It can be seen that the long-term correction results vary considerably more than a 
quarter downward during the gap periods beginning in the early weeks of January 2013. 
Similarly, for the gap periods beginning in September 2013 and ending at the end of the 
corresponding year, the divergence is more upward. Annex K documents the LTWS for each gap
period in detail, including the measured wind speeds for comparison. While examining these
figures, it is critical to note that the gap periods listed above omit periods of high wind.
Basse [96] examined the seasonality and behaviour of reanalysis datasets in considerable 
detail using the linear regression method with residuals. For the majority of the investigated 
cases,  the mean of the adjusted wind speed time series is underestimated for summer 
measurements, whereas it is overestimated for the winter season, where the outcome was 
dominated by the reanalysis data’s significant seasonality. 
Considering the aforementioned observations and the literature findings, the modest increase
in LTWS may be explained by the expected “overcorrection” of the model fit, which slightly
overestimates the average short-term wind speeds. This also highlights the importance of
having a seasonally balanced short-term dataset when conducting an MCP.
While the above argument may explain the overestimation of LTWS in that particular case, it 
does not fully explain the underestimation of wind speeds in the winter period from February
to March 2013, as well as in February to March 2014. Figure 3-25 shows the course of monthly 
wind speeds throughout the measurement period for the concurrent combined measured and 
reference dataset. On the other hand, Figure 3-26 illustrates the normalized monthly wind 
speeds obtained from the data, as well as the projected average normalized wind speed at the 
target site. As seen in Figure 3-26, the period from October 2013 to January 2014 was an 
exceptional high-wind season. As a result, it is expected that a significant gap established over 
such a period will raise the LTWS. In comparison, during a typical average year, normalized 
wind speeds fall below the 100% range beginning in February and gradually recover until 
September, when an underestimation of LTWS is predicted. 
Figure 3-25. Concurrent measured and referenced monthly wind speeds during short-
term period 
 
Source: Author’s own illustration, via ISo1 
Figure 3-26. Monthly windiness comparison of the short and long-term period 
 
Source: Author’s own illustration 
[Figure 3-26 plot: Monthly windiness based on LTWS; x-axis: Date in month and year; y-axis: Annualized monthly wind speeds [-], 0.6 to 1.8; series: Measured annualized with LTWS, Reference long-term average]
It is noted that the underestimation of the LTWS during the first February-March period is more 
pronounced than during the second February-March period of the measurement campaign.
Figure 3-27. Measured wind frequency roses, measurement period 2013 (top left), 
measurement period 2014 (top right), measurement period 2015 (bottom left), long-term 
reference period (bottom right) 
  
  
Source: Author’s own illustration, via ISo1 
As may be seen in Figure 3-27, the wind rose was slightly different in 2013, with a high 
frequency of easterly winds in February and March. An omnidirectional linear model, as 
demonstrated by the sectorwise linear model fit parameters in Table 3-8, is bound to
underestimate such periods. This argument identifies a disadvantage of omnidirectional 
evaluation and suggests that sector-specific analyses may be more suitable. Nonetheless, it 
should be highlighted that a sector-based correlation has the disadvantage of fewer data
points per sector for rare weather events. There might also be a restriction in analysing
very brief gaps. This is left as a subject for future study.
Table 3-8. Sectorwise LLS model parameter – full measurement period 
Sector Range Time steps Intercept [m/s] Slope R² 
0 345° - 15° 1280 -0.179 1.062 0.905 
1 15° - 45° 1066 0.22 1.006 0.863 
2 45° - 75° 1131 0.428 1.024 0.885 
3 75° - 105° 1307 -0.183 1.130 0.917 
4 105° - 135° 834 0.73 1.019 0.879 
5 135° - 165° 792 0.64 1.023 0.896 
6 165° - 195° 1600 0.75 1.004 0.924 
7 195° - 225° 2651 0.754 0.974 0.923 
8 225° - 255° 2513 0.431 1.002 0.903 
9 255° - 285° 1828 0.45 0.996 0.922 
10 285° - 315° 1280 0.314 1.023 0.938 
11 315° - 345° 1190 0.32 1.009 0.917 
Source: Author’s own illustration 
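To make the sectorwise argument concrete, the sketch below applies the parameters of Table 3-8 to a single reference wind speed and compares the result with an omnidirectional fit (1-day, data-filled row of Table 3-7). It assumes that the regression predicts the target wind speed from the reference as target = intercept + slope × reference and that sectors are assigned by the reference wind direction; both are assumptions, and the function names are hypothetical.

```python
# Sectorwise LLS parameters from Table 3-8: sector -> (intercept [m/s], slope)
SECTOR_PARAMS = {
    0: (-0.179, 1.062), 1: (0.22, 1.006), 2: (0.428, 1.024), 3: (-0.183, 1.130),
    4: (0.73, 1.019), 5: (0.64, 1.023), 6: (0.75, 1.004), 7: (0.754, 0.974),
    8: (0.431, 1.002), 9: (0.45, 0.996), 10: (0.314, 1.023), 11: (0.32, 1.009),
}

# Omnidirectional parameters from Table 3-7 (1-day gap, data-filled time series)
OMNI_PARAMS = (0.441, 1.011)


def sector_index(direction_deg):
    """Twelve 30° sectors, with sector 0 covering 345° to 15°."""
    return int(((direction_deg + 15.0) % 360.0) // 30.0)


def predict_sectorwise(ref_ws, ref_dir):
    """Assumed transfer function: target = intercept + slope * reference."""
    intercept, slope = SECTOR_PARAMS[sector_index(ref_dir)]
    return intercept + slope * ref_ws


def predict_omni(ref_ws):
    intercept, slope = OMNI_PARAMS
    return intercept + slope * ref_ws
```

Under these assumptions, a reference wind speed of 10 m/s from 90° (sector 3) maps to about 11.1 m/s with the sectorwise parameters and to about 10.6 m/s with the omnidirectional ones, which is consistent with the remark that an omnidirectional model tends to underestimate periods dominated by easterly winds.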
In conclusion, whilst it is self-evident that the intermediate step of data filling was unnecessary 
within this study for the purpose of generating long-term wind speeds, generalizing this result 
without testing more sophisticated methods would be incorrect. Combining alternative data-
filling procedures and/or using more advanced methodologies may result in a different output. 
For instance, ISo1 uses a Markov-based reconstruction mechanism to generate synthetic data 
to fill in gaps in a measured time series. This synthetic data has the same frequency 
distribution, seasonal and diurnal trends, and autocorrelation as the observed data [73]. 
Additionally, it would be beneficial to use statistical testing techniques with hypothesis testing 
where such methods are implemented. 
3.3 Evaluation of the DF and LTC uncertainties 
Figure 3-28 illustrates the progress of DF uncertainty for the 1-day and 60-day gap scenarios,
as well as the percentage deviation from the observed short-term mean. The coefficient
of variation for short-term wind speed estimates is 0.01% and 0.15% for the 1-day and
60-day gaps, respectively.
With a 52-day gap, the calculated maximum variation of the STWS average is -0.34%, which 
is considered a modest level. A similar deviation can be observed in Figure 3-28 for the 60-
days gap scenario for the gap period starting in February 2013. The detailed evolution of the 
DF uncertainties can be seen in Annex L, alongside the measured time series at the bottom of 
each chart. A visual similarity between the deviation and uncertainty bounds is visible. Similar 
to the discussion in Section 3.2.3, the evolution of DF uncertainties is driven primarily by the 
seasonality of the reference dataset and MCP method. This is a direct consequence of the 
inverse relationship from the predictions in the validation period. Furthermore, it can be 
observed that the averaged mean deviation in percentage is significantly lower than the 
associated uncertainty. 
Figure 3-28. Evolution of DF uncertainties for 1-day and 60 days gap scenarios 
 
 
Source: Author’s own illustration 
Figure 3-29 below illustrates the evolution of JK uncertainties in LT correction for scenarios 
with a 1-day and 60-day gap, respectively.  
The difference between the JK uncertainties for the scenario with data-filling and the scenario 
without data-filling is quite minor, with the difference growing somewhat for the scenario with 
the largest 60-day gap. While it can be observed that the JK uncertainties for the scenario 
without data-filling are more uniform, the other scenario, with data-filling, exhibits greater 
variability in the 60 days scenario with an increased COV of 38%, as compared to an increased 
COV of 18% in the scenario without data-filling. Throughout the 2013/2014 winter season, for 
example, the compensatory impact of the linear model for the very high wind period is highly 
visible in the JK uncertainties with DF. During that time, a reduction in the JK uncertainty is 
apparent, which can be attributed to the more uniform dataset due to data filling. 
The overall level of uncertainty with DF is around 0.21%, whereas the figure for the scenario
without data-filling is slightly lower, by 0.02%.
Figure 3-29. Evolution of JK uncertainties in LT correction for 1-day and 60 days gap 
scenarios 
 
 
Source: Author’s own illustration 
It was suggested to take into account both the DF and JK uncertainties when assessing the MCP 
method's uncertainty. Assuming that each source of uncertainty is statistically independent of 
the others, the total uncertainty is defined as the square root of the sum of the squared 
uncertainty estimates. This is referred to as the final uncertainty in DF and LTC, as shown in 
Figure 3-30. 
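The combination rule described above can be illustrated with a minimal sketch. It assumes statistically independent components expressed in the same units. The function name and the numerical values are hypothetical and are not taken from the thesis code or results.

```python
import math

def combine_uncertainties(*components):
    """Root-sum-square of independent uncertainty components (same units, e.g. %)."""
    return math.sqrt(sum(u ** 2 for u in components))

# Illustrative values only (not results from this study)
u_df = 1.2   # hypothetical data-filling (DF) uncertainty in %
u_jk = 0.5   # hypothetical jackknife (JK) uncertainty in %
u_total = combine_uncertainties(u_df, u_jk)
print(f"Combined uncertainty: {u_total:.2f} %")
```

Because the components are squared before summing, the larger term dominates the result, which is consistent with the combined uncertainty following the larger of the two contributions.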
Figure 3-30. Evolution of combined uncertainties in LT correction for 1-day and 60 days 
gap scenarios 
 
 
Source: Author’s own illustration 
It can be seen in the above figure that the total uncertainty is predominantly driven by the 
data-filling uncertainty, which is not surprising. In light of the residual mean square error 
metrics from the validation periods, this is deemed reasonable. This also suggests that data 
gaps relative to an ideal, representative 1-year assessment might account for a considerable 
portion of the LTC uncertainty.  
The standard error of LTWS predictions was determined to be 0.0% for all gap periods, 
indicating that the model is consistent. This is reasonable given the large number of 
predictions made throughout the gap period, which totals more than 669 for each gap. Regarding 
the expected uncertainty, the standard deviation of the LTWS predictions might be a more 
appropriate comparison metric than the standard error. This metric is sometimes referred to 
as “standard error” in the literature [72]. Nevertheless, it is clear that the standard deviation of 
the LTWS considerably underestimates the uncertainty margin, as shown in Figure 3-31. 
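The difference between the two metrics discussed above can be sketched as follows. The standard error of the mean is the sample standard deviation divided by the square root of the number of predictions, so it shrinks towards zero for several hundred predictions even when their spread stays the same. The sample below is synthetic and the function name is hypothetical; neither comes from the thesis.

```python
import numpy as np

def spread_metrics(predictions):
    """Return the sample standard deviation and the standard error of the mean."""
    x = np.asarray(predictions, dtype=float)
    sd = x.std(ddof=1)            # spread of the individual predictions
    se = sd / np.sqrt(x.size)     # uncertainty of their mean, scales with 1/sqrt(n)
    return sd, se

# Synthetic LTWS predictions in m/s (hypothetical values, not thesis data)
rng = np.random.default_rng(0)
ltws = rng.normal(loc=10.0, scale=0.05, size=669)
sd, se = spread_metrics(ltws)
print(f"SD = {sd:.4f} m/s, SE = {se:.4f} m/s")
```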
Figure 3-31. Comparison of empirical and calculated uncertainties in wind speeds for 
60 days gap period starting on 01.07.2012 
  
Source: Author’s own illustration  
In a recent wind resource assessment study conducted in the Dutch North Sea [18], the 
omnidirectional correlation uncertainty was assessed as 1.47% with a Monte Carlo 
simulation for an FLS measurement campaign with a 69-day gap. This aligns well with 
Figure 3-31 above, from which a total uncertainty of at least 1.4% would be expected in the 
MCP method for a similar gap size.  
Section 2.6.3 introduced bootstrapping, which may be thought of as a variant of Monte Carlo 
simulation. This technique was not implemented due to the high computational power needed 
for bootstrapping each iteration as a sliding analysis with several loops. Nonetheless, based 
on the findings of this study, it is proposed that bootstrapping be studied for MCP 
corrections, preferably in a comparable research project. 
Figure 3-32 presents a sensitivity case using the newest test version 5 of ISo1, which includes 
a bootstrapping analysis algorithm to estimate the uncertainties in long-term correction. The 
graph represents the results of a 500-iteration bootstrapping simulation utilizing an hourly 
omnidirectional LLS technique for the 60-day gap for both the data-filled and gap-free versions 
of the concurrent time series. Clearly, the resulting uncertainty level is substantially higher 
than the estimate obtained in this research, which is around 1.4% for the 60-day gap scenario. 
This might be related to the large number of simulations or to other components of the analysis 
that were not examined at this point. This is another area of research that warrants further 
exploration. 
Figure 3-32. Comparison of bootstrap and calculated uncertainties in wind speeds for 
60-days gap period starting on 01.07.2012 
  
Source: Author’s own illustration  
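To make the bootstrapping idea discussed above concrete, the following is a minimal sketch and not the algorithm used in ISo1 or in this study. It resamples the concurrent hourly pairs with replacement, refits an omnidirectional linear least-squares model with offset, and takes the spread of the resulting long-term mean wind speeds as the uncertainty. All names and the synthetic data are assumptions. A simple pairwise resampling like this ignores the autocorrelation of hourly wind speeds, which a block bootstrap would address.

```python
import numpy as np

def bootstrap_ltws(ref_conc, tgt_conc, ref_long, n_iter=500, seed=0):
    """Bootstrap the long-term mean wind speed predicted by a linear MCP fit."""
    rng = np.random.default_rng(seed)
    ref_conc = np.asarray(ref_conc, dtype=float)
    tgt_conc = np.asarray(tgt_conc, dtype=float)
    ref_long_mean = float(np.mean(ref_long))
    n = ref_conc.size
    estimates = np.empty(n_iter)
    for i in range(n_iter):
        idx = rng.integers(0, n, size=n)  # resample hourly pairs with replacement
        slope, offset = np.polyfit(ref_conc[idx], tgt_conc[idx], 1)
        # For a linear model, the mean prediction equals the prediction at the mean
        estimates[i] = slope * ref_long_mean + offset
    return estimates.mean(), estimates.std(ddof=1)

# Synthetic reference and target series (hypothetical, not MMIJ or ERA5 data)
rng = np.random.default_rng(1)
ref_long = rng.weibull(2.0, size=5000) * 9.0
ref_conc = ref_long[:2000]
tgt_conc = 1.05 * ref_conc + 0.3 + rng.normal(0.0, 1.0, size=2000)

mean_ltws, u_boot = bootstrap_ltws(ref_conc, tgt_conc, ref_long, n_iter=200)
print(f"LTWS = {mean_ltws:.2f} m/s, bootstrap uncertainty = {100 * u_boot / mean_ltws:.2f} %")
```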
3.4 Proposed combined MCP uncertainty method 
If a high-quality, wake-free measurement dataset (benchmark dataset) with at least two years 
of data in an offshore environment is available, a combined ISPE & JK approach might be used 
to estimate the uncertainty in the long-term correction of a nearby FLS measurement campaign 
with data availability issues. 
Provided that the FLS measurement campaign is at a location representative of the benchmark 
dataset, the combined ISPE & JK method could be tested as follows: 
• Conduct a gap analysis for the benchmark dataset 
• Obtain the self-prediction RMSE and the validation-period RMSE of mean wind speed as 
described in this study 
• Investigate the linear relationship and, if a strong correlation exists as found in this study 
for the benchmark dataset, obtain the transfer functions 
• Apply the transfer function of the benchmark dataset to estimate the DF uncertainty based 
on the data gap period (1 to 60 days) 
• Conduct a JK uncertainty analysis for the long-term correction 
• Combine the DF uncertainties with the JK uncertainty to obtain a final uncertainty in the 
long-term correction 
In the case of an FLS measurement campaign in the Dutch North Sea at a location 
representative of MMIJ, the transfer functions provided in Annex I of this study could be tested. 
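The JK step in the procedure above could be sketched as a delete-one-block jackknife. This is an assumption about the form of the analysis, not the implementation used in this study. The concurrent period is split into contiguous blocks, the linear model is refitted with each block left out in turn, and the jackknife standard error of the resulting long-term mean wind speeds is reported. The function name, block count, and synthetic data are hypothetical.

```python
import numpy as np

def jackknife_ltws(ref_conc, tgt_conc, ref_long, n_blocks=10):
    """Delete-one-block jackknife of the LTWS from a linear least-squares MCP fit."""
    ref_conc = np.asarray(ref_conc, dtype=float)
    tgt_conc = np.asarray(tgt_conc, dtype=float)
    ref_long_mean = float(np.mean(ref_long))
    blocks = np.array_split(np.arange(ref_conc.size), n_blocks)
    estimates = []
    for block in blocks:
        keep = np.ones(ref_conc.size, dtype=bool)
        keep[block] = False  # leave one contiguous block out
        slope, offset = np.polyfit(ref_conc[keep], tgt_conc[keep], 1)
        estimates.append(slope * ref_long_mean + offset)
    estimates = np.array(estimates)
    k = estimates.size
    # Jackknife standard error for k leave-one-out estimates
    jk_se = np.sqrt((k - 1) / k * np.sum((estimates - estimates.mean()) ** 2))
    return estimates.mean(), jk_se

# Synthetic reference and target series (hypothetical, not MMIJ or ERA5 data)
rng = np.random.default_rng(4)
ref_long = rng.weibull(2.0, size=5000) * 9.0
ref_conc = ref_long[:2000]
tgt_conc = 1.05 * ref_conc + 0.3 + rng.normal(0.0, 1.0, size=2000)

ltws, u_jk = jackknife_ltws(ref_conc, tgt_conc, ref_long)
print(f"LTWS = {ltws:.2f} m/s, JK uncertainty = {100 * u_jk / ltws:.2f} %")
```

The resulting JK uncertainty would then be combined with the DF uncertainty as the square root of the sum of squares, as in the last step of the list.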
Finally, maintaining offshore measurement masts in far-offshore conditions, in a wake-free 
environment and representative of broader regions, is considered highly valuable for research. 
This could be achieved through joint-industry projects and would provide a valuable function 
for the verification and validation of FLS campaigns prior to deployment.  
4 Discussion and conclusions 
Wind resource assessment for offshore projects is critical for project financing. The duration of 
data gaps is a critical criterion for determining the robustness of a wind resource 
assessment. Gap filling of meteorological time series is required for a variety of applications 
that need continuous data series. Fraunhofer IWES examined the effect of data gaps on the 
estimation of siting parameters in order to identify an appropriate method for filling data gaps 
in offshore measurements. This study sought to determine the effect of data gaps on 
long-term wind speeds as part of the "Digital Wind Buoy" project. 
This problem was investigated by recording key performance indicators (KPIs) for the different 
analysis steps. The study aimed to establish the maximum acceptable gap duration in a year 
for an offshore measurement campaign that still allows a robust wind resource assessment. 
Secondary investigations can be done to confirm the robustness of the gap-filling process.  
After a literature review and a stakeholder questionnaire, the MCP method was selected and 
the target (MMIJ) and reference (ERA5) datasets for MCP were prepared. A performance test 
algorithm was run to compare the available MCP methods. The omnidirectional linear 
regression method, with a least-squares model fit with offset, was identified as a suitable 
solution. Gap periods from one day up to sixty days were investigated to find a quantifiable 
metric for predicting the performance of the data-filling and long-term correction algorithm. 
An omnidirectional linear regression model was used both to obtain self-predictions and to 
predict the wind speeds in the introduced artificial gap. 
The performance of a measure-correlate-predict (MCP) algorithm for data-filling with linear 
least squares was analysed in detail using two years of the IJmuiden met mast (MMIJ) 
measurements. A temporal resolution of one hour was selected for the correlations and model. 
This model fit was used both to obtain self-prediction performances and to predict the wind 
speeds in the introduced artificial gap. An inner loop repeated the predictions with a moving 
gap within the concurrent period, whereas an outer loop increased the gap duration in 1-day 
increments, from one day up to a total of 60 days. This modelled relationship was used to 
derive the LTWS in two ways. The first scenario generated short-term data-filled time series, 
which were then used to re-establish the model with the reference dataset and generate the 
final LTWS. The second scenario was created to obtain the extended (long-term) time series 
without the need for data-filling. 
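The nested loop structure described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions and not the thesis code. The gap is moved in steps of one day, the model is an ordinary least-squares line with offset fitted outside the gap, and the score is the RMSE of the hourly predictions inside the gap, whereas the study evaluated the RMSE of mean wind speeds. The function name and the synthetic series are hypothetical.

```python
import numpy as np

def sliding_gap_rmse(ref, tgt, gap_hours, step_hours=24):
    """For each gap position, fit LLS outside the gap and score predictions inside it."""
    ref = np.asarray(ref, dtype=float)
    tgt = np.asarray(tgt, dtype=float)
    results = []
    for start in range(0, ref.size - gap_hours + 1, step_hours):
        gap = np.zeros(ref.size, dtype=bool)
        gap[start:start + gap_hours] = True                   # artificial gap
        slope, offset = np.polyfit(ref[~gap], tgt[~gap], 1)   # fit on remaining data
        pred = slope * ref[gap] + offset                      # fill the gap
        results.append(np.sqrt(np.mean((pred - tgt[gap]) ** 2)))
    return np.array(results)

# 120 days of synthetic hourly data (hypothetical, not MMIJ or ERA5 data)
rng = np.random.default_rng(2)
ref = rng.weibull(2.0, size=24 * 120) * 9.0
tgt = 1.05 * ref + 0.3 + rng.normal(0.0, 1.0, size=ref.size)

# Outer loop: gap duration in days; inner loop (inside the function): gap position
for gap_days in (1, 30, 60):
    rmse = sliding_gap_rmse(ref, tgt, gap_hours=24 * gap_days)
    print(f"{gap_days:>2}-day gap: {rmse.size} positions, mean RMSE = {rmse.mean():.2f} m/s")
```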
Different MCP methods were tested with an omnidirectional sectoral selection within 
Windographer using the performance test functionality, and the base-case algorithm was 
selected as the omnidirectional linear regression with offset for the Python code. 
The KPI defined for PreDF was evaluated as a prerequisite to running an MCP for data filling 
and long-term correction. The MBE, MAE, and RMSE of mean wind speeds over concurrent 
periods were summarized for the 1-day and 60-day gap scenarios. Despite its good 
correlations, the reference dataset could not match the precision of wind speed observations. 
Although this was not a concern for this type of study, it does provide an opportunity to 
evaluate the algorithm's performance against a mesoscale modelling solution. 
SelfDF KPIs were used to compare the outcome of predictions to a known outcome, which is 
represented by the true measured values. An MAE of about 1 m/s corresponded to a relative 
MAE of roughly 10%, which was in excellent agreement with results reported for previous MCP 
methods. For the examined periods, the model's mean bias error was zero, and for the 1-day 
gap scenario, the MAE was roughly 1 m/s throughout the bins. 
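For reference, the KPIs named above can be computed as in the following minimal sketch. The sign convention (predicted minus measured), the normalisation of the relative MAE by the measured mean, and the four sample values are assumptions made for illustration.

```python
import numpy as np

def kpis(predicted, measured):
    """Mean bias error, mean absolute error, RMSE, and relative MAE in percent."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    err = predicted - measured  # assumed sign convention: predicted minus measured
    mae = np.abs(err).mean()
    return {
        "MBE": err.mean(),
        "MAE": mae,
        "RMSE": np.sqrt((err ** 2).mean()),
        "rel_MAE_%": 100.0 * mae / measured.mean(),
    }

# Illustrative wind speeds in m/s (not thesis data)
measured = np.array([8.0, 10.0, 12.0, 10.0])
predicted = np.array([9.0, 9.0, 13.0, 9.0])
result = kpis(predicted, measured)
print(result)
```

In this example every error is +1 or -1 m/s around a measured mean of 10 m/s, so the MBE is 0, the MAE and RMSE are 1 m/s, and the relative MAE is 10%. A zero bias can therefore coexist with a non-zero absolute error.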
When comparing the differences between the measured and reference periods, an anticipated 
improvement in the MBE values was observed. MBE, MWS, and RMSE of mean wind speeds 
were examined throughout the validation period to determine the genuine performance of the 
tested approach. In the 1-day gap case, the spread of the RMSE over the measurement period 
was very limited. Although the spread grew significantly for the 60-day case, the RMSE 
stayed within a narrow 0.05 m/s interval. 
A strong negative correlation was observed for all gap scenarios between the SelfDF and 
ValDF RMSE of MWS. This relationship had not been addressed or discussed in any related 
literature. Although the data revealed a substantial correlation, independent validation is 
essential before using this novel technique in future investigations. This method is referred 
to as the inverse self-prediction error (ISPE) method. The ISPE method might serve as a 
credible empirical tool for assessing the uncertainties associated with data gaps at a 
sufficiently far-offshore site. 
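The way such a relationship could be turned into a transfer function can be sketched as follows. The numbers are entirely synthetic and the coefficients are arbitrary. They only mimic a strong negative linear relationship and do not reproduce the transfer functions of this study. The variable names are hypothetical.

```python
import numpy as np

# Synthetic pairs of self-prediction and validation RMSE of MWS (arbitrary values)
rng = np.random.default_rng(3)
self_rmse = rng.uniform(0.00, 0.05, size=200)
val_rmse = 0.30 - 5.0 * self_rmse + rng.normal(0.0, 0.005, size=200)

# Strength of the linear relationship and the fitted transfer function
r = np.corrcoef(self_rmse, val_rmse)[0, 1]
slope, offset = np.polyfit(self_rmse, val_rmse, 1)
print(f"Pearson r = {r:.3f}, transfer function: ValDF = {slope:.2f} * SelfDF + {offset:.3f}")

# Applying the transfer function to a new self-prediction RMSE (hypothetical input)
new_self_rmse = 0.02
estimated_val_rmse = slope * new_self_rmse + offset
print(f"Estimated validation RMSE: {estimated_val_rmse:.3f} m/s")
```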
The evolution of the mean difference between actual and predicted wind speeds was 
investigated following the data-filling procedure. The short-term average wind speed (STWS) 
predictions had a low coefficient of variation, which increased linearly with the gap duration, 
ranging from 0.01% for a single-day gap to 0.12%. The STWS's maximum and minimum 
deviations from the measured short-term wind speed were 0.26% and -0.34%, respectively, 
indicating exceptional performance. Considering that a 60-day gap equates to 83% 
availability, the study reaffirmed the industry standard of 80% for measurement campaign 
data availability. 
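The availability figure quoted above follows from simple arithmetic, sketched below under the assumption that the gap is related to a 365-day campaign. The function name is hypothetical.

```python
def availability_percent(gap_days, campaign_days=365):
    """Data availability in percent for a single gap within a campaign."""
    return 100.0 * (campaign_days - gap_days) / campaign_days

print(f"{availability_percent(60):.1f} %")  # 83.6 %
```

A 60-day gap in one year therefore leaves the campaign slightly above the 80% threshold, whereas a gap of 73 days would reach it exactly.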
One of the study's main questions was whether a short-term data filling phase was required 
before applying the long-term correction. The LTWS predictions were identical in both 
versions. This was mainly due to the omnidirectional regression parameters and the lessened 
influence of any change in model fit caused by the proportion of gaps.  
The long-term correction predictions varied seasonally. Extremely strong winds and sectoral 
fluctuations during certain periods influenced the predictions slightly. As expected, the LTWS 
results showed the overcorrection typical of linear regression methods. In conclusion, the 
intermediary stage of data filling was redundant in this investigation. However, generalizing 
this result without additional tests might be misleading. It is recommended to explore more 
advanced approaches for generating synthetic data to fill gaps in a measured time series. 
The standard error of LTWS predictions was determined to be 0.0% for all gap periods, 
indicating that the model was consistent. This is plausible given the large number of 
predictions made throughout the gap period, which totals more than 669 for each gap. 
The evolution of DF uncertainties was driven primarily by the seasonality of the reference 
dataset and MCP method as a direct consequence of the inverse relationship from the 
predictions in the validation period. The total uncertainty was assessed as the square root of 
the sum of the squared data-filling (ISPE method) and jackknife uncertainty estimates. 
The combined uncertainty was driven by the data-filling uncertainty, suggesting that data 
gaps relative to an ideal, representative 1-year assessment might account for a considerable 
portion of the LTC uncertainty.  
Furthermore, it was observed that the standard deviation of the LTWS considerably 
underestimated the uncertainty margin. Therefore, it is suggested to take into account both the 
DF and JK uncertainties when assessing the MCP method's uncertainty. Bootstrapping should 
be studied in further detail as a suitable method for MCP corrections, preferably in a 
comparable research project. 
The questionnaire's answers are considered extremely valuable and may help shape future 
studies' conceptualizations. These may include additional variables that may affect the MCP, 
more advanced non-linear MCP algorithms, data-filling approaches, and sensitivity analysis of 
metocean parameters.  
Finally, it is important to highlight that there might be significant year-to-year fluctuations in 
windiness, which may affect data-filling and MCP operations. According to Burton, many 
factors might contribute to these changes, and global climate phenomena such as El Niño, 
volcanic eruptions, and solar activity oscillations may be connected. Additionally, the expected 
effects of human-induced global warming on the climate are controversial and are likely to 
affect wind conditions in the coming decades [97]. 
This master's thesis contains a thorough set of appendices and a summary of the data collected 
to allow for verification and investigation of any obtained results. 
5 References 
[1] MEHR FORTSCHRITT WAGEN: BÜNDNIS FÜR FREIHEIT, GERECHTIGKEIT UND 
NACHHALTIGKEIT. KOALITIONSVERTRAG ZWISCHEN SPD, BÜNDNIS 90/DIE 
GRÜNEN UND FDP. [Online]. Available: https://www.spd.de/fileadmin/Dokumente/
Koalitionsvertrag/Koalitionsvertrag_2021-2025.pdf (accessed: Oct. 1 2021). 
[2] Measnet, “SITE-SPECIFIC WIND CONDITIONS Version 2 April 2016,” April, 2016. 
[Online]. Available: https://www.measnet.com/wp-content/uploads/2016/05/Measnet_
SiteAssessment_V2.0.pdf 
[3] J. Gottschall, B. Gribben, D. Stein, and I. Würth, Floating lidar as an advanced offshore 
wind speed measurement technique: current technology status and gap analysis in 
regard to full maturity, 2041840X, vol. 6. [Online]. Available: https://
onlinelibrary.wiley.com/doi/10.1002/wene.250 
[4] J. Gottschall and M. Dörenkämper, “Understanding and mitigating the impact of data 
gaps on offshore wind resource estimates,” Wind Energy Science, vol. 6, no. 2, pp. 505–
520, 2021, doi: 10.5194/wes-6-505-2021. 
[5] EnArgus Vorhaben '03EE3024' aus Suche nach ''. [Online]. Available: https://
www.enargus.de/detail/?id=1407485 (accessed: Jan. 13 2022). 
[6] P. Körner, R. Kronenberg, S. Genzel, and C. Bernhofer, “Introducing Gradient Boosting 
as a universal gap filling tool for meteorological time series,” Meteorologische Zeitschrift, 
vol. 27, no. 5, pp. 369–376, doi: 10.1127/metz/2018/0908. 
[7] R. B. Stull, An introduction to boundary layer meteorology: Kluwer Academic; 
Atmospheric Sciences Library, 13, 1988. [Online]. Available: https://books.google.de/
books?id=eRRz9RNvNOkC&newbks=1&newbks_redir=0&lpg=PP1&dq=
An%20introduction%20to%20boundary%20layer%20meteorology&hl=de&pg=PR4
#v=onepage&q=An%20introduction%20to%20boundary%20layer%20meteorology&f=fal
se 
[8] A. Rogers, J. Rogers, and J. Manwell, “Uncertainties in Results of Measure-Correlate-
Predict Analyses,” European Wind Energy Conference and Exhibition 2006, EWEC 
2006, vol. 3, 2005. [Online]. Available: https://www.researchgate.net/publication/
237439775_Uncertainties_in_Results_of_Measure-Correlate-Predict_Analyses 
[9] J. A. Carta, S. Velázquez, and P. Cabrera, A review of measure-correlate-predict (MCP) 
methods used to estimate long-term wind characteristics at a target site, 13640321, vol. 
27. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S1364032113004498 
[10] Morten Lybech Thogersen, WindPRO / MCP Measure-Correlate-Predict: An Introduction 
to the MCP Facilities in WindPRO: EMD International A/S, 2010. 
[11] J. Addison, A. Hunter, J. Bass, and M. Rebbeck, “A neural network version of the 
measure correlate predict algorithm for estimating wind energy yield,” 2022. [Online]. 
Available: https://www.semanticscholar.org/paper/A-neural-network-version-of-the-
measure-correlate-Addison-Hunter/3c323cabd4d960e605560059194a528cfffa5959 
[12] A. Derrick, “Development of the Measure-correlate-predict strategy for site assessment,” 
Proceedings of the BWEA, 1993. [Online]. Available: https://www.researchgate.net/
publication/245913250_Development_of_the_Measure-correlate-predict_strategy_for_
site_assessment 
[13] A. A. Mortimer, “A new correlation/prediction method for potential wind farm sites,” Proc 
BWEA, pp. 349–352, 1994, doi: 10.1016/j.energy.2013.10.007. 
[14] M. Taylor, M. C. Brower, M. Markus, S. Meteorologist, and A. W. S. Truewind, “An 
Analysis of Wind Resource Uncertainty in Energy Production Estimates,” Proceedings of 
the European wind energy conference & exhibition, 2004. [Online]. Available: https://
vibdoc.com/an-analysis-of-wind-resource-uncertainty-in-energy-productio.html 
[15] D. Bechrakis, J. Deane, and E. Mckeogh, “Wind resource assessment of an area using 
short term data correlated to a long term data set,” Solar Energy, vol. 76, pp. 725–732, 
2004, doi: 10.1016/j.solener.2004.01.004. 
[16] C. J. Sheppard, “Analysis of the measure-correlate-predict methodology for wind 
resource assessment,” 2009. [Online]. Available: https://humboldt-dspace.calstate.edu/
handle/2148/542 
[17] M. Denis Mifsud, T. Sant, and R. Nicholas Farrugia, “Analysing uncertainties in offshore 
wind farm power output using measure-correlate-predict methodologies,” Wind Energy 
Science, vol. 5, no. 2, pp. 601–621, 2020, doi: 10.5194/wes-5-601-2020. 
[18] A. Pulo, O. Sargin, S. Schmidt, W. Schlez, and M. Stoaelinga, “Ten noorden van de 
Waddeneilanden Wind Farm Zone Wind Resource Assessment,” 2021. [Online]. 
Available: https://offshorewind.rvo.nl/file/download/55041024 
[19] M. L. Thøgersen, M. Motta, T. Sørensen, and P. Nielsen, “Measure-correlate-predict 
methods: case studies and software implementation,” European Wind Energy 
Conference & Exhibition, p. 10, 2007. [Online]. Available: 
https://www.semanticscholar.org/paper/Measure-Correlate-Predict-Methods%3A-Case-
Studies-and-Thøgersen-Motta/61b794c53c2744869064c83300600566817d86fe 
http://emd.dk/files/windpro/Thoegersen_MCP_EWEC_2007.pdf 
[20] S. Liléo, E. Berge, O. Undheim, R. Klinkert, and R. E. Bredesen, “Long-term correction 
of wind measurements. State-of-the-art, guidelines and future work,” January, 2013. 
[Online]. Available: https://www.researchgate.net/publication/285769739_Long-term_
correction_of_wind_measurements_State-of-the-art_guidelines_and_future_work 
[21] Datacadamia - Data and Co, Statistics - (Residual|Error Term|Prediction error|Deviation) 
(e| ). [Online]. Available: https://datacadamia.com/data_mining/residual (accessed: Jan. 
7 2022). 
[22] The MCP (Measure-Correlate-Predict) module - Learn moreEMD International. 
Accessed: Dec. 12 2021. [Online]. Available: https://www.emd-international.com/
windpro/windpro-modules/energy-modules/mcp/ 
[23] P. Pramod Jain, Wind Energy Engineering. New York: McGraw-Hill Education, 2011. 
[Online]. Available: https://www.accessengineeringlibrary.com/content/book/
9780071714778 
[24] E. Hau and H. von Renouard, Wind Turbines: Fundamentals, Technologies, Application, 
Economics: Springer Berlin Heidelberg, 2005. [Online]. Available: https://
books.google.de/books?id=Z4bhObd65IAC 
[25] S. Emeis, “Wind energy meteorology : atmospheric physics for wind power generation,” 
1865-3529, 2013. [Online]. Available: https://link.springer.com/book/10.1007/978-3-642-
30523-8 
[26] Statistical population - Wikipedia. Accessed: Dec. 11 2021. [Online]. Available: https://
en.wikipedia.org/wiki/Statistical_population#cite_note-4 
[27] BIPM et al., Evaluation of measurement data ‐ Guide to the expression of uncertainty in 
measurement: JCGM, 2008. [Online]. Available: https://www.bipm.org/documents/
20126/2071204/JCGM_100_2008_E.pdf/cb0ef43f-baa5-11cf-3f85-4dcd86f77bd6 
[28] IEC, “IEC 61400-12-1:2017 Edition 2.0 Wind energy generation systems – Power 
performance measurements of electricity producing wind turbines,” International 
Standard, 2017. [Online]. Available: https://webstore.iec.ch/publication/26603 
[29] Arithmetic mean - Wikipedia. Accessed: Dec. 11 2021. [Online]. Available: https://
en.wikipedia.org/wiki/Arithmetic_mean 
[30] Variance - Wikipedia. Accessed: Dec. 11 2021. [Online]. Available: https://
en.wikipedia.org/wiki/Variance#Sample_variance 
[31] Covariance ‐ from Wolfram MathWorld. Accessed: Dec. 12 2021. [Online]. Available: 
https://mathworld.wolfram.com/Covariance.html 
[32] Pearson Correlation - SPSS Tutorials - LibGuides at Kent State University. Accessed: 
Dec. 12 2021. [Online]. Available: https://libguides.library.kent.edu/SPSS/PearsonCorr 
[33] Coefficient of determination - Wikipedia. Accessed: Dec. 12 2021. [Online]. Available: 
https://en.wikipedia.org/wiki/Coefficient_of_determination 
[34] Coefficient of Determination (R Squared): Definition, Calculation - Statistics How To. 
Accessed: Dec. 12 2021. [Online]. Available: https://www.statisticshowto.com/
probability-and-statistics/coefficient-of-determination-r-squared/ 
[35] FGW e.V., “Technical Guidelines for Wind Turbines Part 6 (TG6) Determination of Wind 
Potential and Energy Yields,” Tg 6, 2020. [Online]. Available: https://wind-fgw.de/wp-
content/uploads/2021/03/200921_TR6_Revision11_EN_ST_prev.pdf 
[36] Bias (statistics) - Wikipedia. Accessed: Dec. 12 2021. [Online]. Available: https://
en.wikipedia.org/wiki/Bias_(statistics) 
[37] Mean absolute error - Wikipedia. Accessed: Dec. 12 2021. [Online]. Available: https://
en.wikipedia.org/wiki/Mean_absolute_error 
[38] Root-mean-square deviation - Wikipedia. Accessed: Dec. 12 2021. [Online]. Available: 
https://en.wikipedia.org/wiki/Root-mean-square_deviation 
[39] RMSE: Root Mean Square Error - Statistics How To. Accessed: Dec. 12 2021. [Online]. 
Available: https://www.statisticshowto.com/probability-and-statistics/regression-analysis/
rmse-root-mean-square-error/ 
[40] Free Statistics Book. Accessed: Dec. 12 2021. [Online]. Available: https://
onlinestatbook.com/ 
[41] What is the Standard Error of a Sample ? - Statistics How To. Accessed: Dec. 12 2021. 
[Online]. Available: https://www.statisticshowto.com/probability-and-statistics/statistics-
definitions/what-is-the-standard-error-of-a-sample/ 
[42] Kolmogorov–Smirnov test - Wikipedia. Accessed: Dec. 12 2021. [Online]. Available: 
https://en.wikipedia.org/wiki/Kolmogorov–Smirnov_test#Two-sample_Kolmogorov–
Smirnov_test 
[43] UL International, Windographer: UL. Accessed: Jul. 1 2021. [Online]. Available: https://
www.windographer.com/ 
[44] Normal distribution - Wikipedia. Accessed: Dec. 12 2021. [Online]. Available: https://
en.wikipedia.org/wiki/Normal_distribution 
[45] I. Troen and E. Petersen, “European Wind Atlas,” Roskilde: Riso National Laboratory, 
1989, vol. -1, 1989. [Online]. Available: https://www.osti.gov/etdeweb/biblio/5920204 
[46] DNV GL - Energy, “WindFARMER Theory Manual Version 5.3,” April, 2014. 
[47] Linear Regression. Accessed: Dec. 24 2021. [Online]. Available: http://
www.stat.yale.edu/Courses/1997-98/101/linreg.htm 
[48] Wikipedia, Linear least squares. [Online]. Available: https://en.wikipedia.org/w/index.php
?title=Linear_least_squares&oldid=1054104043 (accessed: Dec. 25 2021). 
[49] “Renewables Renewables Software Data/Analytics,” [Online]. Available: https://
collateral-library-production.s3.amazonaws.com/uploads/asset_file/attachment/2498/
UL_Wind_SoftwareData_163.02.1018.EN.EPT_Digital.pdf 
[50] P. Baas, F. C. Bosveld, and G. Burgers, “The impact of atmospheric stability on the 
near-surface wind over sea in storm conditions,” Wind Energy, vol. 19, no. 2, pp. 187–
198, 2016, doi: 10.1002/we.1825. 
[51] M. Anderson and J. Bass, “A Review of MCP Techniques,” RES 03, 2004. 
[52] Principal component analysis - Wikipedia. [Online]. Available: https://en.wikipedia.org/
wiki/Principal_component_analysis (accessed: Dec. 25 2021). 
[53] Linear least squares example2 - Linear least squares - Wikipedia. Accessed: Dec. 24 
2021. [Online]. Available: https://en.wikipedia.org/wiki/Linear_least_squares
#/media/File:Linear_least_squares_example2.svg 
[54] Total least squares - Total least squares - Wikipedia. Accessed: Dec. 24 2021. [Online]. 
Available: https://en.wikipedia.org/wiki/Total_least_squares
#/media/File:Total_least_squares.svg 
[55] V. A. Barbur, D. C. Montgomery, and E. A. Peck, Journal of the Royal Statistical Society. 
Series D (The Statistician), vol. 43, no. 2, pp. 339–341, 1994, doi: 10.2307/2348362. 
[56] Wikipedia, Regression analysis. [Online]. Available: https://en.wikipedia.org/w/index.php
?title=Regression_analysis&oldid=1060800391 (accessed: Jul. 1 2022). 
[57] J. Beltran, L. Cosculluela, C. Pueyo, and J. J. Melero, “Comparison of measure-
correlate-predict methods in wind resource assessments,” in European Wind Energy 
Conference and Exhibition 2010, EWEC 2010, 2010, pp. 3280–3286. [Online]. 
Available: https://www.researchgate.net/publication/266242232_Comparison_of_
measure-correlate-predict_methods_in_wind_resource_assessments 
[58] M. Leblanc, D. Schoborg, S. Cox, A. Haché, and A. Tindal, “Is a Non-linear MCP 
method a useful tool for North American wind regimes,” in Proceedings of the AWEA 
2009 Windpower Conference and Exhibition, Chicago, IL, USA, 2009. [Online]. 
Available: https://www.yumpu.com/en/document/view/51447119/non-linear-mcp-gl-
garrad-hassan 
[59] J. V. Miguel, E. A. Fadigas, and I. L. Sauer, “The influence of the wind measurement 
campaign duration on a measure-correlate-predict (MCP)-based wind resource 
assessment,” Energies, vol. 12, no. 19, 2019, doi: 10.3390/en12193606. 
[60] D. Hanslian, “The Matrix of Measure- Correlate-Predict Methods,” in 2017. [Online]. 
Available: https://businessdocbox.com/Green_Solutions/85349495-The-matrix-of-
measure-correlate-predict-methods.html 
[61] J. C. Woods and S. Watson, “A new matrix method of predicting long-term wind roses 
with MCP,” Journal of Wind Engineering and Industrial Aerodynamics, vol. 66, pp. 85–
94, 1997, doi: 10.1016/S0167-6105(97)00009-3. 
[62] S. C. Ramli and M. H. Windolf, “Uncertainty in the application of the Measure-Correlate-
Predict(MCP) method in wind resource assessment,” [Online]. Available: http://
c2wind.com/f/content/sundus_ramli_p0355.pdf 
[63] E. Saarnak, “Case Study of Uncertainties Connected to Long-term Correction of Wind 
Observations,” Uppsala universitet. [Online]. Available: https://www.diva-portal.org/
smash/get/diva2:622452/FULLTEXT01.pdf 
[64] T. Lambert and A. Grue, “The Matrix Time Series method for MCP,” in Proceedings of 
the WINDPOWER 2012 Conference, Atlanta, Georgia, USA, 2012. 
[65] N. D. Waars, “Lidar and MCP in wind resource estimations above measurement-mast 
height,” DTU. [Online]. Available: http://repository.tudelft.nl/ 
[66] J. Zhang, S. Chowdhury, A. Messac, and B.-M. Hodge, “Assessing Long-Term Wind 
Conditions by Combining Different Measure-Correlate-Predict Algorithms,” in 2014. 
[Online]. Available: https://www.nrel.gov/docs/fy13osti/57647.pdf 
[67] E. Alpaydin, Introduction to machine learning. Cambridge, MA, London: MIT Press, 
2004. [Online]. Available: https://books.google.de/books?hl=de&lr=&id=
tZnSDwAAQBAJ&oi=fnd&pg=PR7&dq=introduction+to+machine+learning+&ots=
F3VR518nyj&sig=tw5aptPDqObfKsvlzeYUa1vktC0&redir_esc=y
#v=onepage&q=introduction%20to%20machine%20learning&f=false 
[68] R. Klinkert, “Master of Science Thesis Uncertainty Analysis of Long Term Correction 
Methods for Annual Average Winds,” UMEÅ UNIVERSITY. [Online]. Available: http://
umu.diva-portal.org/smash/record.jsf?pid=diva2%3A556297&dswid=6024 
[69] Wikipedia, Machine learning. [Online]. Available: https://en.wikipedia.org/w/index.php?
title=Machine_learning&oldid=1063012883 (accessed: Jan. 1 2022). 
[70] M. Petrelli, Introduction to Python in Earth Science Data Analysis : From Descriptive 
Statistics to Machine Learning, Springer Textbooks in Earth Sciences, Geography and 
Environment, 1st ed. Cham. 
[71] M. Nielsen, “Long-term correction of wind observations by diffusion-based 
transformation,” DTU, 2019. [Online]. Available: https://backend.orbit.dtu.dk/ws/
portalfiles/portal/180029732/dtu_wind_e_0183.pdf 
[72] C. King and B. Hurley, “The SpeedSort, DynaSort and Scatter Wind Correlation 
Methods,” Wind Engineering, vol. 29, no. 3, pp. 217–241, 2005, doi: 
10.1260/030952405774354868. 
[73] UL International, “Windographer Helpfile,” Accessed: Jul. 1 2021. [Online]. Available: 
https://www.windographer.com/ 
[74] Paul van Lieshout, Improvements in AEP Calculations Using IEC 61400. [Online]. 
Available: https://www.windtech-international.com/editorial-features/improvements-in-
aep-calculations-using-iec-61400 (accessed: Jan. 13 2022). 
[75] Definition of ALGORITHM. [Online]. Available: https://www.merriam-webster.com/
dictionary/algorithm (accessed: Jan. 2 2022). 
[76] A. Romo Perea, J. Amezcua, and O. Probst, “Validation of three new measure-correlate-
predict models for the long-term prospection of the wind resource,” Journal of 
Renewable and Sustainable Energy, vol. 3, no. 2, p. 23105, 2011, doi: 
10.1063/1.3574447. 
[77] Questionnaire on "Analysis and Method Selection of a Measure-Correlate-Predict 
Procedure". [Online]. Available: https://www.empirio.de/s/dd9bXys1XW (accessed: 
Jan. 7 2022). 
[78] wrag groups.io Group. [Online]. Available: https://groups.io/g/wrag (accessed: Jan. 7 
2022). 
[79] J. C. Y. Lee and M. J. Fields, “An overview of wind-energy-production prediction bias, 
losses, and uncertainties,” Wind Energy Science, vol. 6, no. 2, pp. 311–365, 2021, doi: 
10.5194/wes-6-311-2021. 
[80] M. Brower et al., Wind Resource Assessment: A Practical Guide to Developing a Wind 
Project. Somerset, United States: John Wiley & Sons, Incorporated, 2012. [Online]. 
Available: https://books.google.de/books?id=5dSzcF_cowkC&newbks=1&newbks_redir=0&hl=de&redir_esc=y
[81] C. de Valk and I. L. Wijnant, “Uncertainty analysis of climatological parameters of the Dutch 
Offshore Wind Atlas (DOWA),” Royal Netherlands Meteorological Institute; Ministry of 
Infrastructure and Water Management, De Bilt, TR-379, 2019. [Online]. Available: https://www.dutchoffshorewindatlas.nl/binaries/dowa/documents/reports/2019/12/10/knmi-report---uncertainty-analysis-of-climatological-parameters/Uncertainty+analysis+of+climatological+parameters+of+the+DOWA.pdf
[82] Det Norske Veritas, “Use of Remote Sensing for Wind Energy Assessments,” Apr. 2011. [Online]. Available: https://rules.dnv.com/docs/pdf/dnvpm/codes/docs/2011-11/RP-J101.pdf
[83] J. B. Duncan, P. A. van der Werff, and E. Bot, “Understanding of the Offshore Wind 
Resource up to High Altitudes (≤ 315 m),” TNO, 2018. [Online]. Available: https://repository.tno.nl/islandora/object/uuid%3Ab15f4402-f78f-41b5-bcf1-2d5cad45abf6
[84] Project Jupyter. [Online]. Available: https://jupyter.org/ (accessed: Jan. 10 2022). 
[85] Python.org, Welcome to Python.org. [Online]. Available: https://www.python.org/ 
(accessed: Jan. 10 2022). 
[86] Anaconda | The World's Most Popular Data Science Platform. [Online]. Available: 
https://www.anaconda.com/ (accessed: Jan. 10 2022). 
[87] Matplotlib — Visualization with Python. [Online]. Available: https://matplotlib.org/ 
(accessed: Jan. 10 2022). 
[88] scikit-learn, 3.3. Metrics and scoring: quantifying the quality of predictions. [Online]. 
Available: https://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics (accessed: Jan. 10 2022).
[89] Statistical functions (scipy.stats) — SciPy v1.7.1 Manual. [Online]. Available: https://docs.scipy.org/doc/scipy/reference/stats.html (accessed: Jan. 10 2022).
[90] PyPI, dc-stat-think. [Online]. Available: https://pypi.org/project/dc-stat-think/ (accessed: 
Jan. 10 2022). 
[91] Meteomast IJmuiden (MMIJ) – Wind op Zee. [Online]. Available: https://www.windopzee.net/en/locations/meteomast-ijmuiden-mmij/ (accessed: Jan. 10 2022).
[92] This is Nimbus! Accessed: Dec. 6 2021. [Online]. Available: https://nimbus.windopzee.net/
[93] H. Hersbach et al., “The ERA5 global reanalysis,” Quarterly Journal of the Royal 
Meteorological Society, vol. 146, no. 730, pp. 1999–2049, 2020, doi: 10.1002/qj.3803. 
[94] E. Werkhoven and J. P. Verhoef, “Abstract of instrumentation report - Offshore 
Meteorological Mast IJmuiden,” ECN, 2012. [Online]. Available: https://www.windopzee.net/wp-content/uploads/2019/07/ecn-wind_memo-12-010_abstract_of_instrumentatierapport_meetmast_ijmuiden.pdf
[95] ParaView. [Online]. Available: https://www.paraview.org/ (accessed: Jan. 13 2022). 
[96] A. Basse, D. Callies, A. Grötzner, and L. Pauscher, “Seasonal effects in the long-term 
correction of short-term wind measurements using reanalysis data,” Wind Energy 
Science, vol. 6, no. 6, pp. 1473–1490, 2021, doi: 10.5194/wes-6-1473-2021. 
[97] T. Burton, D. Sharpe, N. Jenkins, and E. Bossanyi, Wind Energy Handbook: John Wiley 
& Sons, 2001. [Online]. Available: https://books.google.de/books?id=4UYm893y-34C 
 
  
Berlin, 05.01.2022 
 
 
Affirmation 
 
I herewith assure that I wrote the present thesis independently, that the thesis has not 
been partly or fully submitted as graded academic work, and that I have used no 
means other than the ones indicated. I have indicated all parts of the work in which sources 
are used according to their wording or to their meaning. I declare my agreement to the 
inspection of my work with software to detect plagiarism. For this purpose I provide 
an anonymised electronic version of my work in a prevalent text editing format. 
Annex A Questionnaire 
 
Annex B PreDF - Sectorwise exemplary results of the concurrent period
 
Annex C KPI Results 
 
 
Annex D Evolution of self-prediction RMSE of MWS results 
 
 
Annex E Evolution of validation MBE of MWS results 
 
Annex F Evolution of validation MAE of MWS results 
 
 
Annex G Evolution of validation RMSE of MWS results 
 
 
Annex H Regression plots of self-prediction and validation RMSE 
 
 
Annex I MMIJ transfer functions to obtain data-filling uncertainties in a representative location
 
 
 
Annex J Evolution of DFWS 
 
 
Annex K Evolution of LTWS 
 
 
Annex L Evolution of DF uncertainties 
 
 
Annex M Evolution of JK uncertainties 
 
Annex N Evolution of final uncertainties in LTWS