Yield predictions of timothy ( Phleum pratense L.) in Norway under future climate scenarios

.


Introduction
Global climate change is now the most important challenge for the breeding of forage crops and crop management. The warming will be more rapid in northern regions compared to the global average (Rantanen et al. 2022). Both winter and summer temperatures across Norway are projected to increase (Hansen-Bauer et al. 2015), and the temperature increase is expected to extend the growing season for grasslands (Olesen et al. 2011). Furthermore, precipitation patterns are projected to change with more extreme events and more unstable winters across Norway (Hansen-Bauer et al. 2015). Less snow cover and more frequent mild spells and rain on snow may increase the risks of damage due to freezing and ice encasement (Rapacz et al. 2014, Bjerke et al. 2015. These changes will create abiotic stresses that may affect forage yields. Also, drought (caused by periods of decreased precipitation and increased evapotranspiration) and increased climate variability are expected to have negative impacts on grasslands. The magnitude of changes (both positive and negative) is expected to be the largest in northernmost Europe (Olesen et al. 2011). Due to this complex situation with contrasting effects following climate change, more detailed projections of location-specific changes across Norway are needed.
Timothy (Phleum pratense L.) is a temperate perennial forage grass widely used in northern parts of Europe (Nordic countries), Japan, Canada, and USA (Tamaki et al. 2010, Stewart andEllison 2016). Stable and high yields, high forage quality, good winter hardiness and persistency are decisive factors for a sustainable forage grass production in Norway. As timothy possesses most of these traits, it is a key factor for the production of high-quality local forage and vital to economically viable livestock production in Norway (Steinshamn et al. 2016). There is a need for cultivars with specific adaptation to the short growing season with long days and long winters with variable stress conditions. Rapid and ongoing climate changes raise questions of how current timothy varieties will perform in future climates. The availability of extensive data from Norwegian official variety testing of timothy combined with statistical modeling gives a unique opportunity to study this question.
Several simulation or process-based models have been developed to predict timothy yield (Korhonen et al. 2018), simulating in detail the distinct phases of crop growth and development. Combined with climate projections these simulation models have been used to project future yield output, e.g., the use of the CATIMO (Bonesmo and https://doi.org/10.23986/afsci.127935 Belanger 2002) to predict yield of timothy across Canada for 2040-2069 (Jing et al. 2013). Persson and Höglind (2014) simulated timothy dry matter yields using the LINGRA model for the periods 1961-1990, 2046-2056 and 2080-2099 for five locations in various parts of Norway based on climate projections. The total biomass yield was predicted to increase for all five locations in both future time periods. The model, however, focused on summer growth processes. Full-year grassland models, such as BASGRA (Höglind et al. 2016) also accounting for cold hardening and the effect of winter conditions, have to date not been used with climate projections.
Using statistical modeling and a machine learning approach to predict yield avoids the detailed specification and calibration of process-based models and lets the prediction framework directly account for uncertainty. Statistical modeling has for instance been used to predict yields of 12 major Californian crops ) but has never been used for timothy. A statistical model accounts for the relationship between predictive variables, or predictors, and produces an outcome with stochastic noise (Kuhn and Johnson 2013). The most important agroclimatic variables for grasslands are temperature and water availability. In general, the growth of temperate (C3) plant species is slow at low temperatures and increases as the temperature rises to an optimum level, above which it decreases when the temperature becomes too high. This implies a non-linear relation between temperature and yields, and Baker and Jung (1968), found that the top growth of timothy was at its optimum at a temperature of 18-20 °C. Later experiments have shown, however, that cool-season grasses can grow rapidly also at lower temperatures (Thorvaldsson and Martin 2004). Water availability is equally important, as timothy is known to be sensitive to drought (Sheaffer et al. 1992). Based on data from Finland, Mäkinen et al. (2015) found that higher temperatures during the growing season increased yield, while heat stress at the beginning of the season and increased precipitation during autumn had a negative impact.
In this study, we built a statistical prediction model for timothy yield based on agro-climatic predictors, in particular temperature and precipitation, using data available from Norwegian variety testing of timothy. We aimed to determine the most predictive model and assess its predictive ability attributable to weather patterns. Based on climate projections, the optimal model was used to predict yield output for different locations across Norway for the decades 2050-2059 and 2090-2099, to quantify how timothy yield will be affected by changes in temperature and precipitation during the growing season.  Table 1 for further details.

Timothy data
The present study was conducted using data originating from the Norwegian Value for Cultivation and Use (VCU) testing of timothy varieties. Results from trials in the period 1988-2017 were retrieved from different digital and written sources. Each trial was established in two consecutive years at each location in a monoculture and harvested for three ley years. Historically, a total of 24 locations were used for the VCU trials of timothy. We focused on data from 8 research stations (Table 1) which comprised 81% of the data material. For these 8 locations, there were 18 477 observations available for the period 1988-2017, covering in total 166 combinations of location and establishment year and 119 different varieties.
The VCU trials use a completely randomized block experimental design with three blocks and a plot size of 10.5m 2 (1.5m × 7.0m). Phenotypic traits are recorded in three consecutive years after the establishment year and include spring cover, time of heading before the first cut, weed occurrence before each cut, disease occurrence, dry matter yield (DMY), and forage quality. In this study, we focused on the accumulated total DMY for all three years and used the median over all repetitions and varieties per location and three-year testing period as the dependent variable in the analyses. Trial plots were harvested two to three times per growing season following the customary practice at each location. All varieties were harvested at the same time at each location and the first harvest was made at about 50% heading of timothy. Trials were harvested using Haldrup harvest machines (Haldrup GmbH, Ilshofen, Germany). During the harvesting years, the trials were fertilized according to soil type, nutrient status, and local climate i.e., according to common practice at each location with about 280 kg N ha -1 each season at the southernmost location and 160 kg N ha -1 at the northernmost location. The trials were not irrigated.

Weather and climate data
Previous studies have shown that a range of agro-climatic variables impacts the DMY of timothy, such as temperature, water availability, length of the growing season, and light conditions (Mäkinen et al. 2015). Historical agro-climatic variables for the eight research stations such as daily mean, maximum and minimum temperature (°C), and daily precipitation (mm) were collected from Agrometeorology Norway (lmt.nibio.no). Based on the daily measurements, monthly and seasonal summary statistics, representing the climatic conditions, were calculated.
We quantified temperature over the growing season in terms of growing degree days (GDD), defined as the cumulative daily temperature above a base temperature that is specific for each species (Baskerville and Emin 1969). As the daily mean temperature was available, the growing degree for a single day i was calculated by , where is the daily mean temperature (McMaster and Wilhelm 1997). GDD was summarized per month from March through August, as the last cut was typically early September. For timothy, the base temperature T base was set to 5 °C as is common in models for the growth and development of temperate crops (e.g., Skjelvåg 1998, Bonesmo 1999, Nissinen 2001. Korhonen et al. (2018), for example, started simulations in the CATIMO timothy model and the BASGRA ryegrass model when the consecutive five-day mean temperature in spring exceeded 5 °C. While others have used 0 °C as the base temperature (e.g. Bonesmo and Bélanger 2002), 5 °C was used as the Table 1. Overview of VCU trial locations where the data was collected from, with latitude, longitude, and altitude, the period of active operation, and the number of trials with complete data for all three ley years for each location. Annual GDD and precipitation for each location is found in Table 5.

Location
Latitude ( metabolic activity of timothy at 0-5 °C is more related to acclimation than to above-ground plant growth, especially in northern varieties (Dalmannsdottir et al. 2017).
We quantified precipitation over the growing season by the number of rainy days, defined as a day with precipitation over a threshold of 1.0 mm. Thus, the variable measured the frequency and not the intensity of precipitation. The number of rainy days was summarized per month and for the period from March through August, as the last cut was early September.
Future projections of temperature and precipitation were provided by the EURO-CORDEX initiative (Jacob et al. 2020). EURO-CORDEX provided a multi-model ensemble of regional climate projections for Europe at a daily temporal resolution and spatial resolution of 12km, obtained by running a regional climate model (RCM) using the output of a global general circulation model (GCM) as boundary conditions. In addition, the RCM output was bias-corrected using a transformation of a cumulative distribution function (CDF) (Michelangeli et al. 2009, Vrac et al. 2012) and distribution-based scaling (DBS) (Yang et al. 2010). The scaling used data from the regional reanalysis MESAN (Yang et al. 2010), and the observation-based gridded data product E-OBS (

Data quality
Both the phenotypic and the agro-climatic variables had missing observations. DMY was recorded separately for each ley year and summed over all three years to produce the total DMY. Hence, if the yield from one or more years was missing, the total DMY was also missing. There were in total 166 different three-year trials established over all locations. Among these, DMY was not recorded for 12 trials in the first ley year, 18 trials in the second ley year, and 26 trials in the third ley year. There were 35 trials where DMY was not recorded in at least one ley year, such that total DMY was available for 131 complete three-year trials for all locations.
For the daily agro-climatic variables, the number of missing observations varied substantially between years and locations across the data period . The missing observations were, however, largely the same for temperature and precipitation, implying that if the temperature was recorded, so was precipitation. A practical limit of 15% missing recordings per month (a maximum of 4 missing days) was set to produce a monthly summary. There were 55 trials with GDD or the number of rainy days missing for at least one month. Hence, there were 94 trials overall with complete observations for all variables.

Methodology
We analyzed the data in two steps: First, we determined the best prediction model for DMY based on the agroclimatic variables. Second, we projected the future yield output at the 8 locations for the two decades 2050-2059 and 2090-2099. For the prediction model of total DMY, we use linear regression (Hastie et al. 2019) with the yearly agro-climatic variables at the locations as predictors. We consider the model with linear predictors: and the model with the quadratic predictors: where y ij is the median of total DM yield after the establishment year i at location j and var1 i+k is the agro-climatic predictor for the k th ley year at location j. Further, α l is the linear effect of the l th predictor, β l is the corresponding quadratic effect, μ is an overall intercept, and ϵ ij is a noise term with expectation zero. As the focus was total DMY over all ley years, the climatic variables for all three years were considered as predictors. For instance, if a trial was established in 2003, the total yield was given by the sum of the yields from the growing seasons in 2004, 2005, and 2006. Hence the agro-climatic variables for all three years were included together as a set of predictors. The median of the total DMY per year and location was used as the outcome to ensure the robustness of the effect estimates. This minimized the negative effect of outliers and ensured that years with different numbers of observations at a given location would be comparable.

Assessing predictive ability
Earlier work studying the effect of agro-climatic predictors on DMY did not directly assess the predictive power of the variables but focused on the statistical significance of predictors (see e.g., Mäkinen et al. 2015Mäkinen et al. , 2018. This work emphasized instead the predictive ability of the agro-climatic variables, so that future yield output may be accurately predicted under climate change, particularly when extrapolation is required. To rigorously evaluate the predictive ability of a model, predictions must be assessed on an independent data set not used to estimate the predictive model. The resulting predictions are usually referred to as being out-ofsample (Hastie et al. 2019). K-fold cross-validation is the most used procedure to assess the out-of-sample prediction error and at the same time properly exploit all available training data (Hastie et al. 2019). Then the training data is divided into K parts (folds), and each fold is held out of the model building and predicted in turn as an outof-sample data set. The final prediction error is calculated by averaging over all folds.
To determine the optimal prediction model, we utilized a cross-validation (CV) scheme where all recorded yields for a given location and establishment year constituted a separate fold. This scheme equals leave-one-out (N-fold) CV, where each year-location combination constitutes a separate fold. The prediction error was quantified by the root mean square error (RMSE) over all folds for the N combinations of years and locations where is the prediction of the median yield for a given year and location, when the associated data was excluded from the training data. If a predictor decreased the prediction error, it would improve the predictive power. The model with the lowest RMSE was selected as the optimal model. We also considered the coefficient of determination, R 2 , and the adjusted R 2 to assess the model fit (Kuhn and Johnson 2013) and used a likelihood ratio test to determine if a difference in fit between the two models was significant. The R 2 can be interpreted as the proportion of explained variability, while the adjusted R 2 is corrected for in-sample optimism (and may be negative), thus more accurately reflecting the out-of-sample predictive power.
The model was selected following a forward stepwise procedure (Claeskens and Hjort 2008), where the best predictor in subsequent steps, was included and the prediction error and fit recalculated until no improvement in RMSE and adjusted R 2 could be achieved.

Results
This section presents the results of the two steps in the analysis. First, an overview of the procedure to determine the optimal predictive model for total DMY is given, and then the results of the projected future yield for the periods 2050-2059 and 2090-2099 are presented.

Optimal predictive model
The constant baseline model with no agro-climatic predictors, i.e. the average of overall yield, achieved an RMSE of 474. Table 2 shows the prediction error quantified by the RMSE, R 2 , and adjusted R 2 for the linear and quadratic models of all agro-climatic predictors. Including GDD from March through August as a linear term in the model decreased the prediction error to 458 (Table 2). When adding a quadratic effect of GDD from March through August to the linear term, the RMSE further decreased to 456 and the adjusted R 2 increased from 0.11 to 0.14 ( Table 2). This indicates a modest predictive ability from GDD over the entire growing season. Including instead GDD for July only gave a substantial reduction in the RMSE to 413 and increased the R 2 to 0.31 (Table 2) 6 https://doi.org/10. 23986/afsci.127935 Similarly, the inclusion of rainy days over the entire growing season from March through August increased the RMSE for the linear and quadratic models to 487 and 511 (Table 2), respectively. The inclusion of the number of rainy days in July only as a quadratic effect, decreased the RMSE to 459 and increased the R 2 from 0.01 to 0.12 (Table 2). This shows that both GDD and rainy days had a non-linear effect on timothy DM yield and that the cumulative temperature and precipitation in July were more predictive than the cumulative temperature and precipitation throughout the entire growing season.
For the forward stepwise model selection procedure, Table 3 shows the RMSE, R 2, and adjusted R 2 for the best model in each step. Details for all models in each step are shown in Tables S.1-S.3 in the Supplementary material. Based on the results in Table 2, the GDD for July was added in the first step. After including GDD for July in the model, the best-performing predictor in the second step was found to be the quadratic effect of rainy days in June (see Table S.1 for details). Including both GDD for July and rainy days in June gave an RMSE of 415 and increased the adjusted R 2 to 0.30 (Table 3). This is better than both models of the individual predictors, indicating that GDD and rainy days contributed predictive skill independently of each other. After including GDD for July and rainy days for June in the model, the best-performing predictor in the third step was a quadratic effect of rainy days in July (Table S.2). Including GDD for July and the rainy days in both June and July gave an RMSE of 415 and further increased the adjusted R 2 to 0.34. Given a model with GDD for July and rainy days in June and July, no other predictors gave an improvement in either RMSE or the adjusted R 2 (Table S.3).
Finally, as June and July are successive months, the number of rainy days for both months was combined into a single predictor, representing the number of rainy days for the period June-July. By using the combined Table 2. Root mean squared error (RMSE), R 2 and adjusted R 2 for the linear and quadratic models of all individual predictors. The differences in RMSE quantify the predictive contribution of each predictor, and the predictor with the lowest prediction error is marked in bold. predictor (final line, Table 3), the RMSE was lowered to 406 and the adjusted R 2 increased slightly to 0.35. According to the likelihood ratio test, the improvements in the model fit of all consecutive models were significant under a significance level of 0.1 (Table 3). For all the best-performing predictors, it was beneficial to include the effect as a quadratic term. The final model improved the out-of-sample prediction error by 14% compared to the baseline model with no agro-climatic variables (Table 3).

Linear Quadratic
To summarize, the optimal predictive model for total DMY was given by the quadratic functions of the GDD for July and the number of rainy days in June and July, for each of the three ley years (Table 4). A scatter plot of the observed and the out-of-sample predicted DMY from the final model (using leave-one-out crossvalidation) is presented in Figure 2. The correlation between the observed and predicted DMY is 0.512. Figure 3 visualizes the estimated quadratic effects of GDD and rainy days for each ley year with 95% confidence bands, over the observed total DMY. The effects of each predictor are given with all other predictors fixed at their mean value.

Future scenarios
Based on the climate projections described previously, we predicted the DMY of timothy using the model in Table 4. The yield was predicted at 8 locations for each year in the two decades: 2050-2059 and 2090-2099. The values of the predictors for the three consecutive ley years were assigned randomly within the two decades, and negative predictions were truncated to zero. Figure 4 shows the median (diamond) with the 80% quantile bands (10% and 90% quantiles) of the yield for the observed data from 1988-2017 (black), together with the projected yield for the 2050s (red) and the 2090s (green) for all 8 locations. For Apelsvoll (interior of Southeastern Norway), the median yield decreases for the period 2050-2059 and decreases further for 2090-2099. Even though the upper 90% quantile remains similar, the lower 10% quantile decreases substantially (up to 78%), hence greatly increasing the variability. Common to all these locations is a predicted increase in yield variability, although not as conspicuous as for Apelsvoll in the southeast. It may also be observed that the historical measurements at Tjøtta consistently underperformed without any known reason (additional investigation of the source material was conducted), and the location should therefore be viewed as an outlier.
An overview of the climate at each individual VCU trial location is given in Table 5, in terms of the average GDD in July and the number of days of precipitation in June-July for the historic period 1988-2017 and projected for the periods 2050-2059 and 2090-2099. Figure 5 shows the scatter plots of the GDD (July) and rainy days (June-July) for the first ley year for the observed period (black) and for the projected periods 2050-2059 (red) and 2090-2099 (green). The left panel shows the distributions for all locations, while the middle and right panels show the  distributions for Apelsvoll and Holt. At all locations, it is seen that GDD will increase while the number of rainy days and the negative correlation between the two predictors remains the same. To see the more distinct differences between individual locations, we show the most contrasting locations: Apelsvoll and Holt. For Apelsvoll the median of the GDD distribution increases for the 2050s and 2090s, while the median of the number of rainy days decreases slightly (Table 5). For Holt, in the right panel, both the GDD and the number of rainy days will increase for the periods 2050-2059 and 2090-2099 (Table 5).

Optimal predictive model
The optimal predictors of timothy DMY were found to be cumulative GDD in July and the number of rainy days (>1mm) in June and July. These two agro-climatic variables explained 43% of the yield variability, which is consistent with other statistical models for crop yield.  showed that the share of the year-to-year variations in global average yields attributed to temperature and precipitation during the growing season for the world's six most widely grown crops ranged from 29% (for rice and sorghum) to 65% (for barley).
In comparison, Persson and Höglind (2014) used daily temperature and precipitation as inputs in the process-based LINGRA model to predict future yield. Jing et al. (2013) used the process-based CATIMO model that also considered temperature and precipitation as input to their model. We found in initial analyses that the number of days with precipitation (>1mm) gave better predictions than using the amount of precipitation itself (see Supplementary  Table S.4). This means that an even distribution of precipitation in June and July has a higher impact on yield than the actual amount of precipitation. This is reasonable, considering that June and July are the warmest months in Norway with the highest evapotranspiration and the highest risk of a water deficit. A similar phenomenon was seen by Virkajärvi (2003) who observed that smaller but frequent rain showers (<5mm) were more important for sward production than the soil moisture content (at 20 cm depth). Timothy has, in general, a low regrowth capacity after 1st cut compared to many other grasses (Lemeziene 2004). The root system is rather shallow and has low tolerance to drought during this period ).
The optimal agro-climatic predictors were found to be monthly rather than seasonal variables, which may be surprising as the plant grows throughout the season. However, the growth will be relatively more affected by temperature in the month with the highest temperatures than in cooler months. The warmest month therefore acts as a more parsimonious description of the driver of yield. Even though the information content may be regarded as higher for seasonal variables, there is also more random noise, and to exploit the additional information, one would require a more complex model at the cost of generalizability.
The signs of quadratic terms were found to be negative for both GDD and rainy days and each of the three ley years. This implies that there is an optimal temperature and an optimal rain frequency, as seen in Figure 3. The GDD optimum for July was observed at 300-320, which corresponds to an average of 9.7-10.3 growing degrees per day. Adjusting for the base temperature of 5 °C, this can be interpreted as an average daily temperature of around 15 °C. This is lower than the optimal temperature found by Baker and Jung (1968) but agrees with Bertrand et al. (2008) who found an optimum daytime temperature of 17 °C for the growth of timothy in summer. These results   also agree with experiments from countries with low summer temperatures, showing that cool-season grasses can grow quickly at lower temperatures (Thorvaldsson and Martin 2004). Longer photoperiods at higher latitudes may also partly compensate for the lower growth temperature (Hay 1990, Mølmann et al. 2021. The optimal precipitation frequency in June-July was found at 19 rainy days in ley year 1 and 30 rainy days in ley year 2 but showed a rather flat pattern with no distinct optimum in ley year 3. This suggests that timothy is less sensitive to both more and less frequent precipitation as plants become older, which may be attributed to a higher root density with increasing plant age. In timothy and other perennial grasses, the increase in root biomass with increasing age mostly occurs in the upper 15 cm (Bolinder et al. 2002), which is important to make the sward more drought tolerant in dry summers and more robust to traffic damage in rainy summers.

Projection of future DM yields in various parts of Norway
The projections showed that changes in forage yields in the future may differ substantially across Norway. The results for the eight locations suggested three groupings based on yield projections: Eastern (Apelsvoll), southern (Saerheim), and central Norway (Kvithamar) were characterized by a reduction in yield in the future. At Fureneset, Tjøtta, and Vågønes the yield was predicted to increase until 2050-2059 (especially at Tjøtta and Vågønes) and then decrease again towards 2100. The yield at the northernmost location Holt in Tromsø was predicted to increase right up to 2100, but with the most rapid increase from the present until 2050-2059.
Interior areas (Eastern Norway) may experience a substantial decrease and higher variability in DM yields in the future due to increasing temperatures and decreasing frequency of precipitation. Our projections, therefore, suggest that the current varieties of timothy may not be suitable for a location such as Apelsvoll. For coastal and mountainous Southern Norway and southern parts of Northern Norway, there seems to be a minor increase in average yield in the near future, but a minor decrease towards the end of the century. For Løken, representing the mountainous areas in Southern Norway, our results agree with Höglind et al. (2013) who predicted higher DMY in 2040-2065 in this area, along with 13 other sites in Northern Europe. In contrast, our results for Saerheim were not in agreement with the predictions of Höglind et al. (2013) for the nearby location Sola in Southwestern Norway. Our predictions for a decrease in DMY already in the 2050s in Eastern Norway and in the 2090s in the rest of Southern and Central Norway were not in accordance with Persson and Höglind (2014) who predicted the DMY to increase across a wide range of Norwegian locations.
A reason for the differences between our projections and those by Persson and Höglind (2014) could be that our analysis did not account for changes in crop management (the timing and number of harvests), as done by Persson and Höglind (2014). Rather, our projections are to be interpreted as conditional projections, where we assume no changes in crop management. As such, our results thus give good indications for regions where adaptation of crop management practices and/or timothy varieties will be needed to keep up or increase current yield levels.
Northern Norway, in particular the Troms and Finnmark counties, on the other hand, can become optimal locations for production of the current timothy varieties, as both the temperature and the days of precipitation are projected to increase. An important assumption for these projections is that there is no change in genetic composition either intentionally or by natural selection. Our results mirror the conclusions of Dellar et al. (2018) where regions with warmer and wetter conditions (the northern and Alpine regions of Europe) can expect higher pasture yields, while warmer and drier areas can expect a reduction in yield. Likewise, Golinski et al. (2018) found a significant positive effect of increasing GDD on grassland yields at Holt. Jing et al. (2013) found a similar pattern in Canada, where the annual DM yield is expected to increase in eastern Canada but decrease in western Canada under climate change. Increased variability in annual yield was also projected by Chang et al. (2017).
The projections suggest that the overall DM production across Norwegian locations will not change for the two future decades, mainly because decreases in southeast Norway may be compensated by increased yield in Northern Norway. Increased total DMY in the Northern parts of Norway can be considered a positive effect of climate change, especially if we keep in mind that almost the entire agricultural area in this region is grassland-based. Note that the extrapolation error to future decades will be larger in Eastern Norway as the projected future climate at this location is unprecedented in the historical record. For Holt and the Northern locations, on the other hand, we are more certain of the conclusion as the projected future climate at these locations has been observed at other locations in the training data. The variation in DMY will, however, increase, in accordance with the likelihood of extreme weather events. Warmer and wetter autumn in combination with the mostly unchanged light conditions may also cause reduced winterhardiness and lead to less persistent grasslands (Dalmannsdottir et al. 2017). These interactions were not accounted for in this paper.
12 https://doi.org/10.23986/afsci.127935 As with any analysis, our investigation has some limitations. The statistical approach presented here cannot assess the effect of changing conditions if the future condition was not present in the training data, for instance, changes in the timing and number of harvests, or changes in the CO 2 levels. We have also not considered the aspects of a longer growing season, increased risks of winter damage, or changes in the nutritive value of timothy.
Similarly, using predictors aggregated over single months instead of over the full season may be a limitation when extrapolating if the peak of summer shifts in the future. The climate projections were based on the RCP4.5 scenario, representing an intermediate pathway where emissions peak around 2040 and decline afterward. If the real-world development of climate gas emissions deviates from this scenario, the conditions underlying the projections will change. However, this is a limitation facing any climate change study and it has recently been strongly argued (e.g. Hausfather and Peters 2020) that the more extreme RCP8.5 scenario should not be used in studies focusing on the most likely outcome.
Extrapolation beyond the historical range of a predictor is always difficult, and we emphasize the importance of using a non-linear relation that captures the physiological processes outside the observed predictor range in a reasonable fashion. We ensured this by specifying the non-linear relations to be quadratic and not using spline functions or additive models, which would be too flexible outside the predictor range. The choice of the quadratic function is founded on the knowledge of plant physiology (Thorvaldsson and Martin 2004). Additional factors that may affect the predictability include genetic differences, micro-climate at a given location, soil characteristics, and systematic errors.
Even though yearly weather patterns have been found to explain 30-40% of year-to-year variations in yield, this does not translate directly to improvement in out-of-sample prediction. Our best weather predictors gave a 14% improvement in the out-of-sample prediction error compared to the constant baseline model. From a statistical point of view, one may argue that weather, therefore, has limited predictive power. However, we believe that this prediction performance can be satisfactory given the many sources of noise affecting plant growth. The natural limitation of data when observing yearly time series may cause the prediction errors to be sensitive to outliers or influential observations. Therefore, also the statistical significance of the model fit was evaluated to choose the best parsimonious model.

Conclusion
In this study we have, based on all available Norwegian VCU data for timothy: 1) identified the most predictive agro-climatic variables for total DMY, and 2) projected future timothy production for 2050-2059 and 2090-2099 for 8 locations across Norway, using climate projections. We have shown that quadratic functions of growing degree days (GDD) in July and the number of days with precipitation above 1mm in June and July are the most predictive climatic variables for the total DMY of timothy. Our projections indicate that the production of today's timothy varieties may decrease substantially in South-Eastern Norway but increase in Northern Norway by the middle of the century, assuming other aspects such as crop management practices remain unchanged.