Sensorial analysis of categorized data of special coffee to identify similar crop seasons pairs using Kappa

Main Article Content

Jackelya Silva
Marcelo Angelo Cirilo
Flávio Meira Borém
Diego Egídio Ribeiro
Lourenço Manuel


This paper proposes a statistical method for analysing dependent agreement data with ordinal categorical responses in a longitudinal study of the sensory analysis of special coffee. The sensory attributes of the special coffees were assessed by certified raters using a continuous grading scale. The approach applies data categorization methods commonly used in machine learning, which not only produced a concise summary of the continuous attributes describing the data but also made it possible to maximize the agreement between grades in the longitudinal study. A preliminary analysis was carried out to identify the similarity of grades across all sample units. The categorization allowed marginal models to be constructed for all distinct pairs of time points observed in the longitudinal study, in order to model the kappa concordance coefficients. It also led to the conclusion that samples from harvests of yellow-fruited coffee have similar sensory characteristics. Higher altitudes significantly favour obtaining samples with similar sensory characteristics, and the analysis identified the set of covariates that contributed positively or negatively to the estimated kappa.
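
As a rough illustration of the quantity being modelled, the sketch below categorizes continuous grades from two crop seasons and computes the unweighted Cohen's kappa between the resulting categories. It is not the method of the paper: the grades, the cut points, and the function names `categorize` and `cohen_kappa` are all hypothetical, fixed cut points stand in for the machine-learning categorization methods mentioned above, and no marginal model or covariates are involved.

```python
from collections import Counter


def categorize(scores, cut_points):
    """Map each continuous score to an ordinal category index.

    A score falls in category k when it is >= exactly k of the cut points.
    The cut points here are hypothetical, not those used in the paper.
    """
    return [sum(s >= c for c in cut_points) for s in scores]


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of samples placed in the same category.
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from the two marginal distributions.
    count_a, count_b = Counter(a), Counter(b)
    p_exp = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_exp == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_obs - p_exp) / (1.0 - p_exp)


# Made-up grades for the same six samples in two crop seasons.
season_1 = [82.0, 84.5, 86.0, 88.5, 90.0, 83.0]
season_2 = [81.5, 85.0, 86.5, 87.0, 91.0, 85.5]
cut_points = [84.0, 87.0]  # hypothetical category boundaries

cat_1 = categorize(season_1, cut_points)
cat_2 = categorize(season_2, cut_points)
kappa = cohen_kappa(cat_1, cat_2)
print(round(kappa, 2))  # 0.75 for these made-up grades
```

A value near 1 would indicate that the two seasons place the samples in nearly the same categories; a value near 0 indicates agreement no better than chance.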

Article Details

How to Cite
Silva, J., Cirilo, M. A., Borém, F. M., Ribeiro, D. E., & Manuel, L. (2023). Sensorial analysis of categorized data of special coffee to identify similar crop seasons pairs using Kappa. Brazilian Journal of Biometrics, 41(1), 30–43.

