By Dr Ahmed Abdullah
Damascus University, Syria
Cite as: Abdullah, A. (2018), "Managerial Use of Factor Analysis for Identifying Stability of Yield Parameters and Morphological Characters in Barley Trials", International Journal of Management and Applied Research, Vol. 5, No. 2, pp. 69-81. https://doi.org/10.18646/2056.52.18-006
This paper aims to demonstrate the usefulness of factor analysis for managing barley trials. It examines whether the relationship between barley yield parameters and morphological characters was affected by genotype and environment. The objectives were to determine, using factor analysis, whether the interrelationship between barley yield parameters (total plant yield, grain yield, straw yield and thousand grain weight, TGW) and morphological characters (vegetative duration, plant height, length of growing season and leafiness) was stable in each hybrid (hybrid 1, hybrid 2 and hybrid 3) and in each of two areas in Syria. The study also aimed to ascertain whether the relationship between yield parameters and morphological characters was affected by genotype-environment interaction. Three hybrids were grown in each of the two Syrian areas, Tel Hadya and Breda, and each hybrid had 102 families. Correlation and factor analysis indicated that the interrelationship between yield parameters, except TGW, was stable, while the interrelationship between morphological characters was not stable. Factor analysis was then applied to the barley breeding data to determine the relationship between yield parameters and morphological characters. The results demonstrate that this relationship was stable in Tel Hadya but not in Breda. Genotype and environment both influenced the relationship between yield parameters and morphological characters. Consequently, this relationship is not suitable for forecasting yield.
Barley is one of the best cereals for cultivation in dry and semi-dry areas. It is the world’s fourth most important cereal in terms of production (after wheat, rice and maize) and the second most important (after wheat) in the Middle East and North Africa (Amabile et al., 2014). The effective management of barley production is complicated by a number of contingency factors, which make it almost unpredictable. Therefore, the author introduced the formative use of factor analysis to ensure that all factors influencing barley growth are evaluated within a unified framework.
Syria is one of the major cereal-producing countries of the Middle East. However, the country’s water supply is becoming progressively scarcer, and future demand for water threatens to surpass available resources. Thus, much research has focused on increasing barley yields. Factor analysis was envisaged as the most suitable tool to support the efficient processing of barley trial outcomes and the associated managerial decisions on choosing higher-yielding barley for cultivation.
In 1901, Pearson introduced a new method in multivariate analysis, which came to be known as factor analysis. However, it is Spearman, who devoted more than 40 years of his life to the development of factor analysis, who is regarded as the father of the subject (Harman, 1976). Factor analysis is one of the techniques used for reducing a set of variables to a smaller number of unobserved variables called factors (Johnson and Wichern, 2006), and it has been used for decision-making in many business and management related areas. For example, Kalutara et al. (2012) used factor analysis to establish a decision-making framework for the sustainable management of community buildings in Australia. In 2003, Pett, Lackey and Sullivan published a book on the use of factor analysis for research instrument development in health care.
Factor analysis has also made its way into agricultural science, where it informs decisions about the many conditions that affect crop growth. It has been used, for example, to study the relationships between several site variables and corn yields in the fields of five producers (Mallarino et al., 1999). The main result was that the set of measured factors explained the variability in yield. Factor analysis has also been applied to observe the effects of the genetics of beans on yield, in order to increase yield and improve quality (Nasser, 2002). Statistical analysis of agronomic traits can be informative in selecting parents for the generation of new varieties or breeds, because agronomic traits are a reflection of gene effects. The use of factor analysis provides reliable information for the calculation of genetic distance and gives a complete picture of the genetic variability contributed by each trait (Parvathaneni et al., 2011).
Walton (1972) used factor analysis to identify growth and plant characters related to yield in wheat. He studied fourteen characters, including yield parameters, certain morphological structures above the flag leaf node and three developmental stages. Factor analysis showed that the factor concerned with flag leaf area and duration was the most important factor affecting the yield components. Factor analysis has also been used to study the yield of dry beans through the analysis of plant variables. Denis and Adams (1978) applied it to 22 morphological and yield-determining traits of 16 cultivars and strains of dry beans, and found that plant height, numerous nodes, leaves and reproductive structures were the most important variables affecting yield.
Factor analysis has also been used in breeding programmes. For example, Smith et al. (2001) used several kinds of analysis, including factor analysis, to study the genetics of barley. They used a large set of barley data from different trials in South Australia to reduce the number of factors and to see how the variance and covariance changed. In 2002, Volis et al. employed factor analysis to reduce 20 measured morphological and phenological traits of wild barley before using multiple regression analysis. Their study showed that, under high water, nutrients (as a variable) and ecotype plant size (RF1 as a factor) significantly affected plant fitness, whether estimated by reproductive biomass or by yield. In 1999, Bertholdsson used factor analysis to find the best barley for malting. He identified a type of barley with low and stable grain protein content (GPC) suitable for malting. In a similar way, Parvathaneni et al. (2011) used factor analysis to understand the morphological and molecular diversity among genotypes of cucumber.
It is important to study the divergence and genetic relationships among barley accessions, since the characterisation of genetic variability helps to identify better genotypes that could be used as parents in hybridisation and improves the quality of agricultural products (Amabile et al., 2014). In this context, factor analysis can help to reduce a large set of traits to a smaller but meaningful set, thereby allowing researchers and practitioners to identify the traits that contribute most to variability, since genetic improvement largely depends on the magnitude of genetic variation (Parvathaneni et al., 2011).
The goal of this research is to study three relationships in barley trials: the interrelationship between yield parameters, the interrelationship between morphological characters, and the relationship between yield parameters and morphological characters. It also examines whether each of these relationships was stable. Factor analysis was used to determine whether the relationship between yield parameters and morphological characters was stable or was affected by genotype-environment interaction, in order to identify the most effective parameters for the optimal harvesting of barley.
2. Materials and Methods
The experiments were conducted at two experimental stations in Syria (Tel Hadya and Breda). Three barley hybrids were grown in each area, and each hybrid had 102 families. The experiments were laid out in a randomized block design with two replicates at each station. The seed rate was 100 kg.ha-1 and the nitrogen rate was 60 kg.ha-1.
Eight parameters were recorded:
- Total plant yield (kg.ha-1): a plot of barley two metres square was harvested from each experimental plot after removing the border rows, and the total plant yield was recorded.
- Grain yield (kg.ha-1): measured after mechanically separating the grain from the straw.
- Straw yield (kg.ha-1).
- Thousand grain weight (TGW).
- Vegetative duration: the number of days from germination until 50% ear emergence.
- Plant height (cm): estimated as the mean of three random samples from each experimental plot.
- Length of growing season: the number of days from germination to harvest.
- Leafiness: scored from 1 to 5 (score 1: very low leafiness; score 2: low; score 3: medium; score 4: high; score 5: very high).
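The scaling of plot measurements to the units used above can be sketched as follows. This is only an illustration: the function names are hypothetical, the plot area of 4 m² is an assumption (reading "two metres square" as 2 m × 2 m), and the sample values are invented.

```python
def yield_kg_per_ha(harvest_kg: float, plot_area_m2: float = 4.0) -> float:
    """Scale a plot harvest (kg) to kg per hectare (1 ha = 10,000 m^2).

    The default area of 4 m^2 is an assumption, not a value from the study.
    """
    if plot_area_m2 <= 0:
        raise ValueError("plot area must be positive")
    return harvest_kg / plot_area_m2 * 10_000


def mean_plant_height(samples_cm: list[float]) -> float:
    """Plot-level plant height as the mean of the random samples taken."""
    return sum(samples_cm) / len(samples_cm)


# Hypothetical plot: 2.0 kg harvested from 4 m^2, three height samples.
print(yield_kg_per_ha(2.0))                    # 5000.0
print(mean_plant_height([60.0, 66.0, 63.0]))   # 63.0
```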
2.2. Statistical methods
Factor analysis was used to find the interrelationships among parameters and to study the stability of these relationships. The principal component method is probably the most widely used extraction method in factor analysis (Kollo and von Rosen, 2006). It was used to calculate the eigenvalues, factor loadings, communalities, unique (error) variances and cumulative proportion of variance. Other extraction methods, for example maximum likelihood, can also be used with factor analysis (Johnson and Wichern, 2006).
- The eigenvalues λ: factors with eigenvalues greater than 1.00 are considered significant, while factors with eigenvalues less than 1.00 are considered insignificant and are disregarded (Child, 1990; Härdle and Hlávka, 2007; Johnson and Wichern, 2006). The eigenvalues are the solutions of the equation: $|A - \lambda I| = 0$ (1)
where $A$ is a $K \times K$ matrix, $I$ is the $K \times K$ identity matrix and $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_K \ge 0$. A vector $E$ is called an eigenvector of the matrix $A$ associated with the eigenvalue $\lambda$ (Child, 1990) when: $AE = \lambda E$ (2)
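- The factor loadings: the correlations between variables and factors are called factor loadings (Manly and Alberto, 2016). The loading of variable $i$ on factor $j$ is given by the equation: $l_{ij} = \sqrt{\lambda_j}\, e_i^{(j)}$ (3)
where $\lambda_j$ is the $j$-th eigenvalue, $e^{(1)}, \dots, e^{(K)}$ are the eigenvectors of $R - \Psi$ (the reduced correlation matrix) and $e_i^{(j)}$ is the $i$-th element of $e^{(j)}$.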
- The communalities $h_i^2$: the communality of a measured variable reflects how much of the variance of that variable is accounted for by the factors as a set. It is the sum of the squared factor loadings for that variable, and its value lies between 0.0 and 1.0 (Kline, 1984). It is given by the equation: $h_i^2 = \sum_{j=1}^{m} l_{ij}^2$ (4), where $l_{ij}$ is the loading of variable $i$ on factor $j$ and $m$ is the number of retained factors.
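- The unique (error) variance ($\Psi_i$): the variance of a variable that is not explained by the factors and is not shared with the other variables (Johnson and Wichern, 2006). For standardised variables it is given by the equation: $\Psi_i = 1 - h_i^2$ (5), where $h_i^2$ is the communality of variable $i$.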
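- Cumulative proportion (total % of variance; $R^2$): the proportion of the total variance that is explained by the factors (Hair et al., 2005). For example, with $K$ standardised variables the proportion of variance explained by the first factor is: $\lambda_1 / K$
The cumulative proportion for $m$ retained factors is given by: $R^2 = (\lambda_1 + \lambda_2 + \dots + \lambda_m) / K$ (6)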
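The quantities listed above can be sketched in Python with NumPy. This is a minimal illustration and not the SPSS procedure used in the study: it applies the principal component method directly to a correlation matrix R rather than to the reduced matrix R − Ψ, the function name is hypothetical, and the 4 × 4 correlation matrix is invented for the example.

```python
import numpy as np


def principal_component_factors(R: np.ndarray):
    """Principal component factor solution for a correlation matrix R."""
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    m = int(np.sum(eigvals > 1.0))              # keep factors with eigenvalue > 1
    loadings = eigvecs[:, :m] * np.sqrt(eigvals[:m])   # l_ij = sqrt(lambda_j) * e_ij
    communalities = np.sum(loadings ** 2, axis=1)      # h_i^2: row sums of squares
    uniqueness = 1.0 - communalities                   # psi_i = 1 - h_i^2
    cumulative = np.cumsum(eigvals[:m]) / R.shape[0]   # running share of total variance
    return eigvals, loadings, communalities, uniqueness, cumulative


# Hypothetical correlation matrix: two pairs of correlated variables.
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
])
eigvals, L, h2, psi, cum = principal_component_factors(R)
print("eigenvalues:", np.round(eigvals, 3))
print("loadings:\n", np.round(L, 3))
print("communalities:", np.round(h2, 3))
print("unique variances:", np.round(psi, 3))
print("cumulative proportion:", np.round(cum, 3))
```

The sign of each column of loadings is arbitrary, since an eigenvector multiplied by −1 is still an eigenvector; only the pattern of large and small loadings is interpreted.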
3. Results and Discussion
Factor analysis was used to study the interrelationship between yield parameters (total plant yield X1, grain yield X2, straw yield X3 and TGW X4) and the interrelationship between morphological characters (vegetative duration X5, plant height X6, length of growing season X7 and leafiness X8). The analysis used the correlation matrix between variables to determine which sets of variables cluster together (Dillon et al. 1984).
3.1. The interrelationship among variables in Tel Hadya
The simple correlation matrix between parameters for hybrid 1 is given as:
According to the above simple correlation matrix, total plant yield was strongly correlated with grain yield and straw yield (r=0.897 and 0.884 respectively, P<0.05). Grain yield was significantly correlated with straw yield (r=0.588, P<0.05). Grain yield was not correlated with vegetative duration (r=-0.161, P>0.05), while total plant yield, straw yield and TGW had significant but weak correlations with vegetative duration (r=-0.218, -0.230 and 0.319 respectively, P<0.05). Plant height was significantly correlated with all parameters except TGW and vegetative duration (r=0.026 and 0.108 respectively, P>0.05) (Mohammadi et al., 2012). Leafiness was also significantly correlated with total plant yield, straw yield and plant height (r=0.210, 0.214 and 0.290 respectively, P<0.05). These correlations suggest three factors. The first consisted of total plant yield, grain yield and straw yield; the second of TGW, vegetative duration and length of growing season; and the third of plant height and leafiness. The results of applying principal component factor analysis in SPSS are given in Table 1. For hybrid 1, only three factors were retained because the fourth eigenvalue (λ4=0.805) was less than 1.
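The significance statements above can be checked with the usual t-test for a correlation coefficient, t = r·sqrt(n − 2)/sqrt(1 − r²). The sketch below is illustrative only: the sample size n = 102 (the number of families per hybrid) is an assumption, since the text does not state the number of observations behind each correlation, and the function names are hypothetical.

```python
import math

# Two-sided 5% critical value of Student's t for 100 degrees of freedom.
# It is only valid for n = 102; another n needs another critical value.
T_CRIT_DF100 = 1.984


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing that a correlation coefficient is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def is_significant(r: float, n: int = 102, t_crit: float = T_CRIT_DF100) -> bool:
    """True when |t| exceeds the critical value (assumed n = 102)."""
    return abs(t_statistic(r, n)) > t_crit


# Correlations quoted in the text for hybrid 1, Tel Hadya.
for r in (-0.161, -0.218, 0.210, 0.108, 0.026):
    print(f"r={r:+.3f}  t={t_statistic(r, 102):+.2f}  significant={is_significant(r)}")
```

Under this assumed sample size, the outcome of the test agrees with the P<0.05 and P>0.05 labels quoted for these five coefficients.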
The first factor accounted for about 36.63% of total variance, while factors two and three explained 20.47% and 13.40% respectively. The total variance explained by the three significant factors was 70.5% (Table 1). According to the factor loadings for hybrid 1, the first factor correlated with total plant yield (0.985), grain yield (0.898) and straw yield (0.856). The second factor correlated with TGW (0.729), vegetative duration (0.721) and length of growing season (0.712). The third factor correlated with plant height (0.728) and leafiness (0.832). All of these loadings for hybrid 1 were positive. The three factors explained 99.1% of the variance of total plant yield, 81% of that of grain yield and 78% of that of straw yield, with most of this variance explained by the first factor. Likewise, the three factors explained 56%, 58.2% and 51.8% of the variance of TGW, vegetative duration and length of growing season respectively, and 66% and 73.7% of the variance of plant height and leafiness respectively (Walton 1972).
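The arithmetic behind these percentages can be retraced in a few lines. The sketch assumes that the analysis was run on the correlation matrix of the K = 8 variables, so that the total variance equals K; the variable names in the code are illustrative.

```python
K = 8  # number of observed variables (X1 to X8)

# Percentages of total variance reported for hybrid 1, Tel Hadya.
explained_pct = [36.63, 20.47, 13.40]

# With a correlation matrix the total variance equals K,
# so each eigenvalue is (share of variance) * K.
implied_eigenvalues = [p / 100 * K for p in explained_pct]
cumulative_pct = sum(explained_pct)

# Share of variance of the discarded fourth factor (lambda_4 = 0.805).
share_4_pct = 0.805 / K * 100

# Squared loading = share of a variable's variance due to one factor.
first_factor_loadings = {"total plant yield": 0.985,
                         "grain yield": 0.898,
                         "straw yield": 0.856}
first_factor_pct = {name: round(l ** 2 * 100, 1)
                    for name, l in first_factor_loadings.items()}

print([round(v, 2) for v in implied_eigenvalues])  # [2.93, 1.64, 1.07]
print(round(cumulative_pct, 1))                    # 70.5
print(round(share_4_pct, 1))                       # 10.1
print(first_factor_pct)
```

The implied eigenvalues are all above 1, in line with the retention rule, and the squared first-factor loadings (about 97%, 81% and 73%) lie close to the reported communalities of 99.1%, 81% and 78%, which is what "most of this variance is explained by the first factor" amounts to.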
Most of the entries in the residual matrix were very small (less than 0.2, consistent with Kline, 1994), except for the residual covariance between vegetative duration and length of growing season (25%) and between length of growing season and leafiness (28%) (Table 2). According to the factor analysis, the morphological characters did not affect the yield parameters, because the yield parameters (except TGW) and the morphological characters did not group together in one factor. These results contradict the findings of Fonge et al. (2016).
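A residual matrix of this kind can be sketched as the difference between the observed correlation matrix and the matrix reproduced by the retained factors, R − (LLᵀ + Ψ). The code below is a hedged illustration using NumPy: the function names and the 4 × 4 correlation matrix are hypothetical, the principal component solution is applied to R itself, and the 0.2 cut-off follows the threshold quoted in the text.

```python
import numpy as np


def residual_matrix(R: np.ndarray, m: int) -> np.ndarray:
    """Residual R - (L L^T + Psi) for an m-factor principal component solution."""
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:m]            # indices of the m largest eigenvalues
    L = eigvecs[:, order] * np.sqrt(eigvals[order])  # factor loadings
    psi = np.diag(1.0 - np.sum(L ** 2, axis=1))      # unique variances on the diagonal
    return R - (L @ L.T + psi)


def large_residuals(res: np.ndarray, cutoff: float = 0.2):
    """Index pairs (i, j), i < j, whose absolute residual exceeds the cut-off."""
    i, j = np.triu_indices_from(res, k=1)
    mask = np.abs(res[i, j]) > cutoff
    return list(zip(i[mask].tolist(), j[mask].tolist()))


# Hypothetical correlation matrix (not data from the trials).
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
])
res = residual_matrix(R, m=2)
print(np.round(res, 3))
print(large_residuals(res))
```

Pairs returned by `large_residuals` are those whose correlation is poorly reproduced by the retained factors, which is how the two exceptions named above would be identified.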
The results of applying factor analysis to hybrid 2, hybrid 3 and Tel Hadya as a whole are given in Table 1. Table 1 shows that there were only three factors in each field (hybrid 2, hybrid 3 and Tel Hadya) because the fourth eigenvalue (λ4) was less than 1 (Manly, 1994). The variance explained by this model ranged from 71.8% (hybrid 2) to 79.1% (hybrid 3), and the cumulative variances (R2) were statistically significant. In all three fields, the first factor was strongly associated with total plant yield, grain yield and straw yield. This factor explained between 34.3% (hybrid 2) and 40.2% (hybrid 3) of the variance (Table 1). The second factor, which accounted for about 20.5% of the variation, was strongly correlated with vegetative duration and length of growing season in all three fields (Table 1). TGW correlated with the second factor only for hybrid 2 (Walton 1972); however, its communality (h2=29.9%) was very weak. The third factor, which accounted for 15.2%, 17.0% and 18% of the variation (Tel Hadya, hybrid 2 and hybrid 3 respectively), was strongly correlated with plant height and leafiness in all three fields (Table 1). Nearly 99% of the variation in total plant yield was explained by the three factors, ranging from 98% (Tel Hadya) to 99.7% (hybrid 3; Table 1). Additionally, the variance of grain yield explained by the factors was between 84.1% and 88.7%, and about 90% of the variation in straw yield was explained by the three factors in all fields. Most of the variance of total plant yield, grain yield and straw yield was explained by the first factor. The variance of TGW explained by the three factors was not stable: it was 29.9% for hybrid 2 but 69.8% for hybrid 3 (Table 1) (Walton 1972). The variance of leafiness was also not stable (48.4% for Tel Hadya, 65.8% for hybrid 1).
The variances explained by the three factors for vegetative duration, length of growing season and plant height were stable and exceeded 60%.
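One simple way to make this notion of stability explicit is to compare, for each variable, the lowest and highest communality across fields. The sketch below uses only the percentages quoted in the text for Tel Hadya; the 15-point tolerance and the function name are hypothetical choices for illustration, not criteria used in the study.

```python
# Lowest and highest communality (%) quoted in the text for each variable.
communality_range = {
    "total plant yield": (98.0, 99.7),
    "grain yield": (84.1, 88.7),
    "TGW": (29.9, 69.8),
    "leafiness": (48.4, 65.8),
}

MAX_SPREAD = 15.0  # hypothetical tolerance, in percentage points


def unstable_variables(ranges, max_spread=MAX_SPREAD):
    """Variables whose communality varies across fields by more than max_spread."""
    return [name for name, (low, high) in ranges.items()
            if high - low > max_spread]


print(unstable_variables(communality_range))  # ['TGW', 'leafiness']
```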
The residuals shown in Table 2 were small; however, the residual covariance between leafiness and plant height was nearly 25%, and the residual covariance between TGW and leafiness was 28.4% for Tel Hadya. The residuals also indicated that TGW and leafiness were not stable (Table 2). In conclusion, the interrelationship between yield parameters (except TGW) and the interrelationship between morphological characters were not affected by genotype.
3.3. The interrelationship among variables in Breda
The results of applying factor analysis to each of the three hybrids and to Breda as a whole are given in Table 3. According to the factor analysis, there were only three factors in each field (hybrid 1, hybrid 2, hybrid 3 and Breda) because the fourth eigenvalue (λ4) was less than 1. The total variance explained by the three factors was 72.7% for hybrid 1, 74.8% for hybrid 2, 69.8% for hybrid 3 and 66.4% for Breda. The first factor in all fields was strongly associated with total plant yield, grain yield, straw yield and leafiness. This factor explained between 36.1% (hybrid 3) and 39.4% (hybrid 1) of the variance (Table 3). TGW and length of growing season loaded on the second factor for hybrids 2 and 3 and on the third factor for Breda. The second factor (hybrid 1 and Breda) and the third factor (hybrid 2) were strongly associated with vegetative duration and plant height. The second factor explained between 18.1% (hybrid 1) and 21.1% (hybrid 2) of the variance (Table 3), while the third factor explained between 13.2% and 15.3%.
The variance of total plant yield, grain yield and straw yield explained by the three factors is shown in Table 4. The three factors explained more than 73% of the variance, although most of it was explained by the first factor. All the variables were stable except leafiness, for which the variance explained by the three factors was 20.2% for hybrid 3 and 69.8% for hybrid 2 (Table 3).
The residuals shown in Table 4 were small, but the residual covariance between length of growing season and TGW was nearly 34% (hybrid 3 and Breda). Also, the residual covariance between vegetative duration, length of growing season and plant height was about 24% (hybrid 1). In conclusion, the interrelationship between yield parameters (except TGW) was not affected by genotype, while the interrelationship between morphological characters was affected by genotype.
3.4. Stability of the relationship between variables
To study the stability of the relationships, factor analysis was used to identify how the parameters (total plant yield X1, grain yield X2, straw yield X3, TGW X4, vegetative duration X5, plant height X6, length of growing season X7 and leafiness X8) were distributed among the factors. Table 6 shows that the relationship between yield parameters and morphological characters in Tel Hadya was stable, because the yield parameters and morphological characters had the same distribution across factors for the three hybrids. The first factor was influenced by total plant yield, grain yield and straw yield, the second factor was related to vegetative duration and length of growing season, and the third factor was related to plant height and leafiness (Table 5). Consequently, genotype did not influence the relationship between the yield parameters and morphological characters in Tel Hadya. In Breda, the relationship was affected by genotype, because the interrelationship between morphological characters was not stable (Table 5). For example, plant height was not linked to any other morphological character in the first factor, while it was correlated with vegetative duration in a different factor. For hybrid 2 (Breda), plant height was grouped with vegetative duration in factor 3, while for Breda as a whole this grouping appeared in factor 2. Plant height was connected to TGW and length of growing season for hybrid 3 (Table 5). The relationship between yield parameters and morphological characters was also affected by location, since the factors in Tel Hadya and Breda did not contain the same groups of parameters. For example, in Tel Hadya leafiness did not correlate with the yield parameters (total plant yield, grain yield and straw yield), while in Breda it did. Vegetative duration was grouped with length of growing season in Tel Hadya, while in Breda length of growing season was grouped with TGW.
Consequently, the relationship between yield parameters and morphological characters was affected by genotype-environment interaction.
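The comparison of factor groupings between locations can be made explicit by listing which pairs of variables share a factor in each location, ignoring the order of the factors. The sketch below encodes the groupings described in the text for Tel Hadya and for Breda as a whole; TGW is left out of the Tel Hadya grouping because the text does not assign it to a factor there, and the function names are hypothetical.

```python
from itertools import combinations

# Variables grouped on each factor, as described in the text.
tel_hadya = [
    {"total plant yield", "grain yield", "straw yield"},
    {"vegetative duration", "length of growing season"},
    {"plant height", "leafiness"},
]
breda = [
    {"total plant yield", "grain yield", "straw yield", "leafiness"},
    {"vegetative duration", "plant height"},
    {"TGW", "length of growing season"},
]


def co_clustered_pairs(groups):
    """All unordered pairs of variables that share a factor."""
    return {frozenset(pair)
            for group in groups
            for pair in combinations(sorted(group), 2)}


def grouping_differences(a, b):
    """Pairs grouped together in a but not in b, and in b but not in a."""
    pairs_a, pairs_b = co_clustered_pairs(a), co_clustered_pairs(b)
    return pairs_a - pairs_b, pairs_b - pairs_a


only_tel_hadya, only_breda = grouping_differences(tel_hadya, breda)
print("Only in Tel Hadya:", sorted(sorted(p) for p in only_tel_hadya))
print("Only in Breda:", sorted(sorted(p) for p in only_breda))
```

Two empty difference sets would indicate identical groupings; here the pairs that appear in only one location are the ones the text cites as evidence of a location effect.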
The use of factor analysis discussed here supports the effective management of barley production, which is otherwise almost unpredictable because of the large number of contingency factors. The formative use of factor analysis developed in this study allowed all factors influencing barley growth to be identified and evaluated within a unified framework.
The outcomes of the factor analysis indicate that about 73% of total variance was explained by three factors in each hybrid (hybrid 1, hybrid 2 and hybrid 3) and each area (Tel Hadya and Breda). The interrelationships between yield parameters (total plant yield, grain yield and straw yield) were stable in each hybrid and each area, and the communalities for these yield parameters were high (>0.85). The interrelationship between morphological characters was stable in Tel Hadya but not in Breda. Likewise, the relationship between yield parameters and morphological characters was stable in Tel Hadya but not in Breda. Consequently, the relationship between yield parameters and morphological characters was affected by genotype and environment. The use of factor analysis discussed here would help decision-makers to determine and optimise the most effective parameters for cultivating barley under pre-specified conditions.
The main research limitation of the study was that the relationship for each hybrid was not clear because the sample size for each hybrid was insufficient. The directions for future research include: a) supporting the study of the genotype-environment interaction with the use of Partial Least Squares analysis, and b) extending the locations and seasons to introduce a more indicative data set.
The author gratefully acknowledges the provision of a PhD grant from the University of Damascus, Syria. The author is also grateful for the fellowship grant provided by CARA (the Council for At-Risk Academics).
- Amabile R. F., Faleiro F. G., Capettini F., Sayd R. M., Peixoto J. R. and Guercia R. F. (2014), “Characterization and genetic variability of barley accessions (Hordeum vulgare L.) irrigated in the savannas based on malting quality traits”, Journal of the Institute of Brewing, Vol. 120, No. 4, pp. 404–414. https://doi.org/10.1002/jib.179
- Bertholdsson, N. (1999), "Characterization of malting barley cultivars with more or less stable grain protein content under varying environmental conditions", European Journal of Agronomy, Vol. 10, No. 1, pp. 1-8. https://doi.org/10.1016/S1161-0301(98)00043-4
- Child, D. (1990), The essentials of factor analysis, London: Cassell Educational.
- Denis, J. and Adams, M. (1978), "A Factor Analysis of Plant Variables Related to Yield in Dry Beans. I. Morphologic Traits", Crop Science, Vol. 18, No. 1, pp. 74-78. https://doi.org/10.2135/cropsci1978.0011183X001800010020x
- Dillon, W. R. G., Dillon, M. W. R. and Goldstein, M. (1984), Multivariate analysis methods and applications, New York: Wiley.
- Fonge, B., Bechem, E. and Awo, E. (2016), "Fertilizer rate on growth, yield, and nutrient concentration of leafy vegetables", International Journal of Vegetable Science, Vol. 22, No. 1, pp. 274-288. https://doi.org/10.1080/19315260.2015.1005726
- Hair, J., Anderson, R. E. and Tatham, R. L. (2005), Multivariate data analysis, Upper Saddle River: Prentice Hall International.
- Härdle, W. and Hlávka, Z. (2007), Multivariate statistics, Berlin: Springer.
- Harman, H. H. (1976), Modern factor analysis, USA: University of Chicago Press.
- Johnson, R. A. and Wichern, D. W. (2006), "Multivariate Analysis", In: S. Kotz, C. B. Read, N. Balakrishnan, B. Vidakovic and N. L. Johnson (ed.): Encyclopedia of Statistical Sciences, https://doi.org/10.1002/0471667196.ess6094
- Kline, P. (1984), An easy guide to factor analysis, New York: Routledge.
- Kollo, T. and von Rosen, D. (2006), Advanced multivariate statistics with matrices, Netherlands: Springer Science & Business Media.
- Mallarino, A., Oyarzabal, E. and Hinz, P. (1999), "Interpreting within-field relationships between crop yields and soil and plant variables using factor analysis", Precision Agriculture, Vol. 1, No. 1, pp. 15-25. https://doi.org/10.1023/A:1009940700478
- Manly, B. F. and Alberto, J. A. N. (2016), Multivariate statistical methods: a primer, Boca Raton: CRC Press.
- Mohammadi, M., Sharifi, P., Karimizadeh, R. and Shefazdeh, M. K. (2012), "Relationships between grain yield and yield components in bread wheat under different water availability (dryland and supplemental irrigation conditions)", Notulae Botanicae Horti Agrobotanici Cluj-Napoca, Vol. 40, No. 1, pp. 195-200. https://dx.doi.org/10.15835/nbha4017350
- Nasser, M. (2002), Applied Multivariate analysis and SPSS in Statistical Analysis, Syria: Damascus University.
- Parvathaneni, R.K., Natesan, S., Devaraj, A.A. and Muthuraja, R. (2011), "Fingerprinting in cucumber and melon (Cucumis spp.) Genotypes using morphological and ISSR markers", Journal of Crop Science and Biotechnology, Vol. 14, No. 1, pp. 39–43. https://doi.org/10.1007/s12892-010-0080-1
- Pett, M. A., Lackey, N. R. and Sullivan, J. J. (2003), Making sense of factor analysis: The use of factor analysis for instrument development in health care research, USA: Sage.
- Smith, A., Cullis, B. and Thompson, R. (2001), "Analyzing variety by environment data using multiplicative mixed models and adjustments for spatial field trend", Biometrics, Vol. 57, No. 1, pp. 1138-1147. https://doi.org/10.1111/j.0006-341X.2001.01138.x
- Volis, S., Mendlinger, S. and Ward, D. (2002), "Differentiation along a gradient of environmental productivity and predictability in populations of Hordeum spontaneum Koch: multilevel selection analysis", Biological Journal of the Linnean Society, Vol. 75, No. 1, pp. 313-318. https://doi.org/10.1046/j.1095-8312.2002.00020.x
- Walton, P. (1972), "Factor analysis of yield in spring wheat (Triticum aestivum L.)", Crop Science, Vol. 12, No. 1, pp. 731-733. https://doi.org/10.2135/cropsci1972.0011183X001200060003x