BAYESIAN AND FREQUENTIST APPROACHES FOR FITTING THE GAMMA-TIME-DEPENDENT MODEL TO DESCRIBE NEUTRAL DETERGENT FIBER DEGRADATION
Hugo Colombarolli BONFÁ¹, Edenio DETMANN¹, Fabyano Fonseca e SILVA¹, Janderson Florêncio FIGUEIRAS¹

ABSTRACT: The aim of this study was to evaluate and compare the efficiency of the Bayesian and frequentist approaches in describing the rumen degradation of neutral detergent fiber (NDF). The simulated data comprised four scenarios: regular restriction in the number of incubation times, random loss of incubation times, loss of specific parts of the degradation curves, and variation in the precision of the incubation procedures. Two real datasets were also used; they encompassed the evaluation of NDF degradation of a tropical grass (Brachiaria decumbens). The model was fitted according to the characteristics of each approach, and the fits were compared by plots and assessors. Both approaches presented reliable estimates of the degradation parameters for the majority of the data tested. However, in specific cases with a small number of randomly retained records, the Bayesian approach showed greater bias in the estimates of incubation residue and produced estimates of degradation rate without biological coherence, compared with frequentist inference. In other words, the Bayesian approach fitted with a diffuse prior was less flexible. Nevertheless, the importance of background information before modeling is emphasized, mainly for the Bayesian approach, in order to define proper prior distributions. Further thorough studies on the influence of non-informative priors for the parameters are necessary.

Introduction
The use of frequentist inference, or classical inference, was almost unanimous among scientists in the early years of the twentieth century. However, with improvements in computing, Bayesian inference reappeared as a viable alternative for statistical modeling and analysis. Bayesian inference was avoided by researchers for a long time because of its highly complex mathematical resolution, which was not considered feasible using simple algebraic algorithms (LESAFFRE and LAWSON, 2012). In the early 1960s, Bayesian inference reappeared in a theoretical work (JEFFREYS, 1961), but it only became widely available from the 1990s (GELFAND et al., 1990), when complex integrals could be solved by simulation.
One of the most important objectives of statistics is to fit or build models. When classical frequentist inference is applied to data, population parameters are considered fixed, so information about the parameters is obtained only by sampling. In Bayesian inference, however, the population parameters are considered random and can be described probabilistically. Therefore, Bayesian estimation would require a smaller dataset than frequentist inference. In other words, a model fitted by the Bayesian method depends less on data size than one fitted by the frequentist approach (BEAUMONT and RANNALA, 2004). Recently, it has become more difficult to perform experiments with a large number of animals because of the high labor demand, high cost and, mainly, stricter ethical regulations. In particular, experiments in which a surgical intervention is necessary have been more affected by the requirements of ethics committees. The evaluation of rumen kinetic parameters (i.e., degradation and transit) is therefore one of the most affected, because the majority of the experiments are performed using the in situ methodology. Consequently, statistical approaches that are more flexible in terms of the number of experimental animals have been demanded.
When studies on ruminal kinetics are performed in the tropics, the degradation of neutral detergent fiber (NDF) must be considered one of the most relevant pieces of information, as NDF is the main source of energy for cattle production (DETMANN et al., 2008) and around 90 to 95% of its utilization occurs in the rumen (HUHTANEN et al., 2010). The ruminal degradation pattern of NDF is described by nonlinear models and follows the mass-action law. The rumen kinetics of NDF is typically a time-dependent process, as the probability that a fiber particle either escapes to the lower gut or is degraded by microorganisms varies with its residence time in the rumen. Time-dependency of rumen kinetics is most commonly incorporated into the nonlinear description by using gamma time-dependent models (ELLIS et al., 1994).
However, fitting nonlinear models can be complex, mainly for rumen kinetics data, because the errors may not follow a normal distribution. Sometimes the normal distribution is assumed to be achieved asymptotically, but the actual number of records is usually low. Moreover, the pattern of degradation or transit is described over time, which creates typical heteroscedasticity, or a "funnel" effect (DETMANN et al., 2001), and dependency between errors. As an alternative to the frequentist approach, the Bayesian methodology does not require the assumption of normality as a necessary condition, and inferences on the parameters are based on their posterior distributions. In this case, a model is assumed for each dataset and the parameters of each model are compared based on their posterior distributions (ROSSI et al., 2010).
It therefore seems necessary to evaluate the model fitting capacity and to compare the parameter estimates obtained by the Bayesian and frequentist approaches, in order to clarify the advantages, disadvantages, and limitations of each approach for ruminal degradation modeling. Thus, the aim of this study was to evaluate and compare the efficiency of the Bayesian and frequentist approaches to describe the ruminal degradation of neutral detergent fiber by using a gamma time-dependent model.

Model applied to interpret ruminal degradation of NDF
The basic model applied to interpret the ruminal degradation profile of NDF was based on a Gamma-2 order of time-dependency according to derivations of Van Milgen et al. (1991):

$$R(t) = U \, (1 + c\,t) \, e^{-c\,t} + I \qquad (1)$$

where R(t) is the residue of NDF at incubation time t (%); U is the potentially degradable fraction of NDF (%); c is the common rate of lag and degradation (h⁻¹); and I is the asymptote reached when t → ∞, which corresponds to the undegradable fraction of NDF (%). It must be noted that parameters U and I represent the two fractions of the NDF, and their sum must reproduce the total NDF (100%), as this fiber analytical approach does not encompass a soluble fraction (MERTENS, 2005).
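As an illustration, the residue curve can be evaluated numerically. The sketch below assumes the usual Gamma-2 form with a common rate for lag and degradation, R(t) = U(1 + ct)exp(-ct) + I, and uses the parametric values reported for the simulated data (U = 62.92, I = 37.08, c = 0.059); the function name `ndf_residue` is hypothetical.

```python
import math

def ndf_residue(t, U, c, I):
    """NDF residue (%) at incubation time t (h) under the assumed Gamma-2 form."""
    return U * (1.0 + c * t) * math.exp(-c * t) + I

# Parametric values reported in the text for the simulated data.
U, c, I = 62.92, 0.059, 37.08

# At t = 0 nothing has been degraded, so the residue equals U + I.
print(ndf_residue(0, U, c, I))

# For long incubations the residue approaches the undegradable fraction I.
print(ndf_residue(144, U, c, I))
```

Because the factor (1 + ct)exp(-ct) decreases from 1 towards 0, the curve always stays between I and U + I for positive U and c.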

Simulated data
The simulated data used for model fitting and evaluation by the two statistical approaches consisted of a main database split into four scenarios, with four subsets for each scenario. Each complete scenario consisted of 145 observations simulated according to the model described in Equation (1). The values of the parameters of Equation (1) were assumed to be the average values obtained in a previous work carried out by Figueiras (2013), who evaluated the degradation of NDF from a tropical grass in cattle under grazing. The average values were 62.92, 37.08, and 0.059 for U, I, and c, respectively. Each subset was simulated according to a theoretical hour-by-hour incubation design, varying from 0 to 144 hours of incubation. From the whole subsets (n = 145), different scenarios were produced as described below.
Scenario A - Regular restriction in the number of incubation times: the subsets were produced to simulate a decrease in the number of incubation times while keeping an equal time interval between incubation points. Therefore, the four subsets represented sampling designs with degradation times equally spaced at 3, 6, 12, and 24 hours (subsets A1, A2, A3, and A4, respectively). In this way, these subsets were composed of 48, 24, 12, and 6 incubation points, respectively. To simulate this scenario, the standard deviation of the residual random error among incubation points was assumed to be equal to 1.0.
Scenario B - Random loss of incubation times: the subsets were produced to simulate different degrees of random loss of incubation times. Therefore, the four subsets represented sampling designs with degradation times randomly distributed from 0 to 144 hours of incubation. These subsets were composed of 48, 24, 12, and 6 incubation points (subsets B1, B2, B3, and B4, respectively). To simulate this scenario, the standard deviation of the residual random error among incubation points was assumed to be equal to 1.0.
Scenario C - Loss of specific parts of the degradation curves: the subsets were produced to simulate the loss of incubation points located in the same part of the degradation curve. This scenario was composed of four subsets (C1, C2, C3, and C4), each consisting of 75% of the total number of records. The main database was divided into four parts (quarters), and three of these parts composed each subset. Therefore, each subset had 108 records: the first subset consisted of the main database without the first part (without the initial quarter), the second subset consisted of the main database without the second part (without the second quarter), and so on. To simulate this scenario, the standard deviation of the residual random error among incubation points was assumed to be equal to 1.0.
Scenario D - Variation in the precision of the incubation procedures: the subsets were produced to simulate information obtained with different levels of precision. This scenario was composed of four subsets (D1, D2, D3, and D4) with 145 records each. However, they were simulated considering standard deviations of the residual random error among incubation points of 2.0, 2.5, 3.0, and 3.5, respectively.
For all scenarios, each subset was simulated ten times in order to allow a more robust evaluation of the ability of the different approaches to fit the model under the different scenarios.
The objectives of the four scenarios were: (A) to evaluate circumstances where minimal sampling procedures are demanded; (B) to evaluate circumstances where an intense loss of information occurred; (C) to evaluate the robustness of the statistical approach regarding the loss of a specific sequential number of samples; and (D) to evaluate the robustness of the statistical approach when only a low-precision dataset is available to estimate the parameters (Figure 1).
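The four sampling designs can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the procedure actually used: the residue curve is taken as R(t) = U(1 + ct)exp(-ct) + I with normally distributed residual error; the equally spaced grids of Scenario A are assumed to start at the first multiple of the spacing (which reproduces the 48, 24, 12, and 6 points reported); and each quarter removed in Scenario C is assumed to be a block of 37 consecutive hours, leaving 108 records. The seed and all function names are hypothetical.

```python
import math
import random

def ndf_residue(t, U=62.92, c=0.059, I=37.08):
    # Assumed Gamma-2 form with the parametric values reported in the text.
    return U * (1.0 + c * t) * math.exp(-c * t) + I

def simulate(times, sd, rng):
    """Return (time, residue) pairs with normal residual error of the given SD."""
    return [(t, ndf_residue(t) + rng.gauss(0.0, sd)) for t in times]

def build_scenarios(seed=2024):
    rng = random.Random(seed)      # hypothetical seed, for reproducibility only
    hours = list(range(145))       # hour-by-hour design, 0 to 144 h
    out = {}
    # Scenario A: equally spaced incubation times (3, 6, 12, 24 h), SD = 1.0.
    for name, step in [("A1", 3), ("A2", 6), ("A3", 12), ("A4", 24)]:
        out[name] = simulate(range(step, 145, step), 1.0, rng)
    # Scenario B: random retention of 48, 24, 12 or 6 incubation times, SD = 1.0.
    for name, n in [("B1", 48), ("B2", 24), ("B3", 12), ("B4", 6)]:
        out[name] = simulate(sorted(rng.sample(hours, n)), 1.0, rng)
    # Scenario C: one quarter of the curve removed (assumed 37-h blocks), SD = 1.0.
    for q in range(4):
        kept = [t for t in hours if not (36 * q <= t <= 36 * q + 36)]
        out[f"C{q + 1}"] = simulate(kept, 1.0, rng)
    # Scenario D: complete design with increasing residual SD.
    for name, sd in [("D1", 2.0), ("D2", 2.5), ("D3", 3.0), ("D4", 3.5)]:
        out[name] = simulate(hours, sd, rng)
    return out

scenarios = build_scenarios()
for name in sorted(scenarios):
    print(name, len(scenarios[name]))
```

Calling `build_scenarios` with ten different seeds would mimic the ten replicates simulated for each subset.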

Actual data
After the evaluation of the simulated data, two real datasets were used to evaluate the Bayesian and frequentist approaches. Both datasets encompassed the evaluation of NDF degradation of a tropical grass (Brachiaria decumbens) in two experiments carried out according to a complete 5 × 5 Latin square design with five treatments, which consisted of different supplementation schemes for growing cattle under grazing (FIGUEIRAS, 2013). The NDF degradation was measured within each experimental period (five for each experiment) at the incubation times 0, 3, 6, 9, 12, 24, 36, 48, 60, 72, 96, 120, and 144 hours. Therefore, ten different adjustments were performed (one for each treatment within each experiment). In spite of the loss of a few bags (incubation points), each adjustment was performed, on average, with n = 65. Details of the treatments and incubation procedures can be found in Figueiras (2013).

Model fitting
Model fitting following the frequentist approach was performed according to the ordinary least squares method, with the solutions obtained by the Gauss-Newton algorithm using the lme4 package of the R software (BATES et al., 2015). Significance of the parameters was checked using asymptotic confidence intervals.
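To make the least squares step concrete, a minimal Gauss-Newton iteration with step halving is sketched below. It is an illustrative stand-in written in plain Python, not the R routine used in the study; it assumes the Gamma-2 form R(t) = U(1 + ct)exp(-ct) + I, and the starting values and all function names are hypothetical.

```python
import math

def model(t, U, c, I):
    return U * (1.0 + c * t) * math.exp(-c * t) + I

def jacobian_row(t, U, c):
    # Partial derivatives of the model with respect to U, c and I.
    e = math.exp(-c * t)
    return [(1.0 + c * t) * e, -U * c * t * t * e, 1.0]

def sse(theta, times, y):
    try:
        return sum((yi - model(t, *theta)) ** 2 for t, yi in zip(times, y))
    except OverflowError:          # reject steps that make exp() overflow
        return float("inf")

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def gauss_newton(times, y, start, max_iter=100, tol=1e-8):
    theta = list(start)            # order: U, c, I
    for _ in range(max_iter):
        U, c, I = theta
        resid = [yi - model(t, U, c, I) for t, yi in zip(times, y)]
        J = [jacobian_row(t, U, c) for t in times]
        JtJ = [[sum(r[a] * r[b] for r in J) for b in range(3)] for a in range(3)]
        Jtr = [sum(r[a] * e for r, e in zip(J, resid)) for a in range(3)]
        step = solve(JtJ, Jtr)     # normal equations of the linearized problem
        current = sse(theta, times, y)
        factor = 1.0
        # Halve the step until the residual sum of squares does not increase.
        while factor > 1e-10 and sse(
                [p + factor * s for p, s in zip(theta, step)], times, y) > current:
            factor /= 2.0
        theta = [p + factor * s for p, s in zip(theta, step)]
        if max(abs(factor * s) for s in step) < tol:
            break
    return theta

# Noise-free illustration on a 3-hour grid with the parametric values of the text.
times = list(range(0, 145, 3))
y = [model(t, 62.92, 0.059, 37.08) for t in times]
U_hat, c_hat, I_hat = gauss_newton(times, y, start=(60.0, 0.05, 40.0))
print(U_hat, c_hat, I_hat)
```

No bounds are imposed on the parameters in this sketch, which mirrors the unconstrained fitting described for the simulated data.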
For the fit according to the Bayesian approach, diffuse prior distributions were assigned to the variance components. Gamma and uniform distributions are standard choices of diffuse prior distributions for the variance components of a model (GELMAN et al., 2004). Considering that the errors of rumen kinetics data may not follow a normal distribution, the choice of prior distributions for the variance components was based on the construction of minimally informative priors. For the model parameters, gamma prior distributions were specified: U, I, c ~ Gamma(10⁻³, 10⁻³).
The posterior distribution was simulated by a Markov chain Monte Carlo (MCMC) process using the statistical software OpenBUGS (THOMAS et al., 2016), which was interfaced with R via the BRugs package (LIGGES, 2013; LUNN et al., 2009). Briefly, the MCMC method simulates the posterior distribution by constructing Markov chains whose distributions approximate the posterior density and by using Monte Carlo integration to compute integrals and expectations. The Monte Carlo approximation error is reduced by increasing the number of samples up to the point at which no further gain in accuracy is achieved in the approximation of the posterior summaries. Two chains with overdispersed initial values were run for each parameter, and chain mixing, autocorrelation, posterior distribution, and the Heidelberger-Welch diagnostic (HEIDELBERGER; WELCH, 1992), implemented via the BOA (Bayesian Output Analysis) package (SMITH, 2007), were evaluated. A minimum of 500,000 iterations was determined by simulation, of which 200,000 iterations composed a burn-in period; the 300,000 iterations after the burn-in period were saved to obtain posterior distribution estimates. The chains were thinned by a factor of 100, totaling 3,000 simulated values in the final sample.
In the Bayesian approach, the posterior probabilities of the parameters were calculated as the proportion of chain iterations of the MCMC sampling method spent in each subset.
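The bookkeeping of burn-in and thinning, and the reading of a posterior probability as a proportion of retained draws, can be sketched as follows. The chain used here is a hypothetical stand-in generated from a normal distribution, not output from the BUGS software, and the function name `posterior_probability` is an assumption made for illustration.

```python
import random

total_iterations = 500_000
burn_in = 200_000
thin = 100

# Indices of the iterations that survive burn-in removal and thinning.
kept_indices = range(burn_in, total_iterations, thin)
print(len(kept_indices))  # 3000 retained values, as stated in the text

def posterior_probability(samples, lower, upper):
    """Proportion of retained draws falling in the interval [lower, upper)."""
    inside = sum(1 for s in samples if lower <= s < upper)
    return inside / len(samples)

# Hypothetical retained draws for the rate parameter c (illustration only).
rng = random.Random(0)
draws = [rng.gauss(0.059, 0.005) for _ in kept_indices]
print(posterior_probability(draws, 0.05, 0.07))
```

Thinning reduces autocorrelation between the stored values at the cost of discarding draws, which is why the 300,000 post-burn-in iterations yield only 3,000 values.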

Evaluation of adjustment and comparison of parameter estimates
The quality of the adjustments was evaluated through the asymptotic standard deviation of the residual error (ASDR), using the likelihood estimator as follows:

$$\mathrm{ASDR} = \sqrt{\frac{\sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2}{n}}$$

where $\hat{Y}_i$ is the predicted value, $Y_i$ is the simulated or real (observed) value, and n is the number of records.
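A minimal sketch of this criterion is given below, assuming the likelihood estimator divides the residual sum of squares by n rather than by the residual degrees of freedom; the function name `asdr` and the example values are hypothetical.

```python
import math

def asdr(observed, predicted):
    """Asymptotic standard deviation of the residual error (divisor n)."""
    n = len(observed)
    rss = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    return math.sqrt(rss / n)

# Hypothetical residues (%) and model predictions for three incubation times.
observed = [98.0, 80.0, 45.0]
predicted = [99.0, 79.0, 45.0]
print(asdr(observed, predicted))
```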
Specifically for the simulated dataset, the accuracy of the estimates of each parameter was evaluated by computing the bias (B). This was based on the fact that the estimates of the parameters obtained by each method should converge to the parametric values used to simulate the scenarios and subsets. The average estimates of B (%) were calculated as:

$$B = \frac{100}{10} \sum_{i=1}^{10} \frac{\hat{\theta}_i - \theta}{\theta}$$

where $\hat{\theta}_i$ is the estimate of the parameter in evaluation i for a specific scenario and subset (i = 1, 2, …, 10), and $\theta$ is the parametric value.
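The average relative bias can be computed as in the sketch below, which assumes the deviation is taken as estimate minus parametric value, so that negative values indicate underestimation; the function name and the example estimates are hypothetical.

```python
def average_bias(estimates, true_value):
    """Average relative bias (%) of a set of estimates around the parametric value."""
    relative = [(est - true_value) / true_value for est in estimates]
    return 100.0 * sum(relative) / len(relative)

# Hypothetical estimates of the rate c from repeated fits; parametric value 0.059.
c_estimates = [0.057, 0.060, 0.061, 0.058]
print(average_bias(c_estimates, 0.059))
```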
The pattern of the average difference between the estimates of the parameters obtained by Bayesian and frequentist approaches was evaluated by using a paired t-test (α = 0.05).
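For reference, the paired t statistic is the mean of the pairwise differences divided by its standard error. The sketch below computes only the statistic and its degrees of freedom with the standard library; the P-value would still have to be obtained from a t distribution. The function name and the example estimates are hypothetical.

```python
import math
import statistics

def paired_t(x, y):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return statistics.mean(diffs) / se, n - 1

# Hypothetical estimates of fraction I (%) from the two approaches.
frequentist = [37.5, 36.9, 38.1, 37.2]
bayesian = [36.8, 36.5, 37.0, 36.9]
t_stat, df = paired_t(frequentist, bayesian)
print(t_stat, df)
```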
When necessary, descriptive plots were also drawn for a better understanding of the pattern of the results.
The statistical computations were performed using the R statistical software (R CORE TEAM, 2017), except for the MCMC method, which was performed in the R statistical software (R CORE TEAM, 2017) assisted by the statistical software WinBUGS (LUNN et al., 2000).

Simulated data
In the Bayesian simulation, the samples came from 300,000 iterations, already discounting the burn-in period, which resulted in 3,000 recorded iterations in total. The recorded iterations of each subset were checked for chain convergence by the Heidelberger-Welch diagnostic. This diagnostic did not indicate non-convergence for any subset. Thus, 3,000 recorded iterations were sufficient to achieve convergence. Similarly, the frequentist approach did not show any case of unsuccessful convergence. During model fitting, no bounds were imposed on the parameter estimates, in order to evaluate whether the approaches would be able to estimate the parameters within biologically coherent limits.
At first glance, both inferences showed a similar performance. On average, the parameter estimates and their precision (Table 1), the bias, and the residual error (Table 2) were very similar when the two methods were numerically contrasted. It is important to note that, for most subsets, the sums of the potentially degradable and undegradable fractions (U+I, Table 1) were very close to 100%, which agrees with the biological and chemical assumptions of NDF degradation and adds reliability to the model adjustments. However, subset B4, which was developed to test the highest level of random loss of incubation points, showed discrepant values of U+I for both approaches (103.57 for Bayesian and 103.09 for frequentist). In particular, the adjustment of subset B4 by the Bayesian approach led to estimates of parameter c that did not match any expected biological pattern (Table 1) and that had a highly prominent bias (Table 2). In spite of being very similar, the estimates of the undegradable fraction I differed between the Bayesian and frequentist approaches (P<0.05). On average, fraction I was higher for the frequentist approach (Table 3). It must be emphasized that this pattern was kept even when subset B4 was not considered in the paired t-test (data not shown). There was no difference between approaches with regard to parameters U and c (P≥0.16).
1 See text for details about scenarios. 2 The values correspond to the averages of ten adjustments for each subset within each scenario. 3 The values between parentheses correspond to the standard deviation of bias among ten adjustments.

Actual dataset
In general, the pattern of the results regarding U+I obtained for the actual dataset followed what was observed for the simulated dataset (Table 4). There was little variation among treatments within each experiment for the frequentist approach, and within Experiment 2 when the Bayesian approach was applied. However, the Bayesian approach tended to over- and underestimate the fractions U and I, respectively, for treatments 2 and 5 within Experiment 1 (Table 4), when compared with the other treatments and with the estimates obtained with the frequentist approach. It must be pointed out that those treatments were the ones with the lowest precision among all evaluated treatments.
In spite of that behavior, there were no differences between approaches with regard to the parameter estimates (P≥0.19, Table 3). Nonetheless, similarly to the simulated dataset, the average estimates of parameter I were numerically higher for the frequentist approach (Table 3). The relevance of this pattern is discussed in more detail in the next section.

Discussion
The essence of the Bayesian method is that there is no logical distinction between model parameters and data (BEAUMONT and RANNALA, 2004). Both are random variables with a joint probability distribution that is specified by each evaluated model. The aim of Bayesian inference is to calculate the posterior distribution of the parameters, that is, the conditional distribution of the parameters given the data, assisted by a prior distribution. Conversely, the frequentist approach is based on maximizing the probability of the data given the parameters (that is, maximizing the likelihood as a function of the parameters for a fixed dataset).
In fact, both inferences aim to estimate the parameters with the least possible error. Bayesian inference has a specific advantage over frequentist inference, which is that it takes prior information into account (SILVA et al., 2011; 2013). However, it is common in practice to use a default or objective prior distribution. In that case, Bayes' theorem does not provide any guarantee as to performance (BAYARRI and BERGER, 2004). Another problem is the choice of an improper prior distribution. In this case, Bayes' theorem may perform incorrectly and, consequently, generate an improper posterior distribution.
In mathematical terms, the results coming from frequentist inference and from Bayesian inference with non-informative priors would tend to be the same. Browne and Draper (2006) compared the Bayesian approach (without prior influence) and the frequentist approach (likelihood-based) in different types of models and concluded that both methods lead to similar and unbiased estimates. Such a conclusion agrees, in general, with the results obtained for scenarios A, C, and D, where both approaches seem to present similar robustness when faced with an organized minimal sampling (Scenario A); the loss of an entire section of the curve while the other parts are preserved (Scenario C); and a dataset with increased random error (Scenario D).
However, the choice of a non-informative prior distribution can affect the inferences, noticeably in cases where the number of records/groups evaluated is small or when the variance of the records/groups is close to zero (GELMAN, 2004; 2006). For instance, the uniform distribution is commonly considered an improper prior distribution for the variance parameter. The uniform distribution on (0, A) (i.e., a uniform distribution with values in the range from 0 to A) can produce a limiting proper posterior distribution as A → ∞, as long as the number of records/groups is large enough (at least three groups in the hierarchical setting discussed by GELMAN, 2006). Thus, for a finite but sufficiently large A, the inferences are not sensitive to the choice of A (GELMAN, 2006). In order to evaluate both inferences in as balanced a way as possible in this paper, a "weakly informative prior" gamma distribution (U, I, c ~ Gamma(10⁻³, 10⁻³)) was used for the parameters. The choice of the gamma distribution was based on the fact that it takes only positive values and no parameter of the evaluated model can assume negative values (Equation 1).
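The diffuseness of a Gamma(10⁻³, 10⁻³) prior can be checked from its first two moments. The sketch below assumes the shape-rate parameterization used by BUGS-family software for the gamma distribution, under which the mean is shape/rate and the variance is shape/rate²; the variable names are hypothetical.

```python
shape = 1e-3
rate = 1e-3

# Moments of a gamma distribution in the shape-rate parameterization.
prior_mean = shape / rate
prior_variance = shape / rate ** 2

print(prior_mean)      # close to 1
print(prior_variance)  # close to 1000
```

A variance roughly a thousand times the mean illustrates why this prior carries very little information about positive-valued parameters such as U, I, and c.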
However, the constraint arising from limited data availability, along with the use of a non-informative prior distribution, seemed to affect the estimates for subset B4, leading to great bias in the estimates of incubation residue (Figure 2) and to estimates of degradation rate without biological coherence (Tables 1 and 2). It should be noted that a similar pattern was not observed for the frequentist approach (Figure 3). When random losses are simulated, it is not possible to know where the remaining incubation points will be located in time. Therefore, points may be located very close to each other, which could cause a lack of agreement between the real information and the prior distribution used here. This suggests that, under a high level of random loss of incubation points, a more detailed study of the prior distribution (perhaps based on information from other studies or meta-analyses) must be carried out to assure the reliability and accuracy of predicted values. It must be highlighted that the use of the Bayesian approach in rumen kinetics is not routine in nutritional studies and that knowledge about the distribution of the related parameters is not available yet.
In spite of the previous discussion, one constraint was observed for the Bayesian approach. The evaluation of the simulated dataset showed a systematic underestimation of the undegradable fraction of NDF (Tables 1, 2, and 3). The dimensions of fractions U and I are inherent to the feed and cannot be influenced by dietary characteristics (DETMANN et al., 2008). In other words, when a diet favors or disfavors the ruminal degradation of fiber, its effects would be perceived only on the degradation rate. In spite of the non-significant difference between the approaches evaluated here, the evaluation of the real dataset showed numerically lower estimates of fraction I when the Bayesian approach was used (Tables 3 and 4). Fraction I was evaluated in both experiments by Figueiras (2013) using a long-term incubation procedure (288 hours), as recommended by Valente et al. (2011). This procedure is based on the minimal time necessary for the incubation residue to become statistically similar to the residue theoretically obtained at infinite time. Figueiras (2013) found estimates of fraction I of 29.68 and 33.76% in Experiments 1 and 2, respectively. The frequentist approach produced estimates closer to those values (26.31 and 32.61%) than the Bayesian approach (22.51 and 32.48%). Moreover, the high variation among treatments in the estimates of fractions U and I in Experiment 1 (Table 4) indicates problems with the Bayesian approach. The forage evaluated within each experiment was the same. Therefore, variations between treatments should be only minimal and random, as the dimensions of the fractions are inherent to the feed itself and cannot vary according to different supplements.
Mertens (2005) highlighted that the final points of an incubation procedure should provide information that allows a good adjustment of the model regarding the asymptote. In other words, at least the final incubation point should be located at a specific time in order to give reliable information on the dimension of the undegradable fraction and allow a more accurate adjustment. The endpoint at 144 hours seemed adequate when the frequentist approach was used, but not when the Bayesian approach was applied. This may be a reflection of the lack of more reliable prior information, as discussed before.

Conclusions
The Bayesian and frequentist approaches presented reliable estimates for the majority of the data tested. However, in specific cases with a small number of randomly retained records, the Bayesian approach showed greater bias in the estimates of incubation residue and produced estimates of degradation rate without biological coherence, compared with frequentist inference. In other words, the Bayesian approach fitted with a non-informative prior was less flexible than frequentist inference. Nevertheless, the importance of background information before modeling is emphasized, especially in a Bayesian approach, in order to define proper prior distributions. Further thorough studies on the influence of non-informative priors on the estimated parameters are necessary.

Figure 1 - Examples of the simulated scenarios: A, regular restriction in the number of incubation times; B, random loss of incubation times; C, loss of specific parts of the degradation curves; and D, variation in the precision of the incubation procedures.

Figure 2 - Descriptive relationship between actual NDF residues (from subsets) and NDF residues predicted by the Bayesian approach (from model adjustment) as a function of time for scenario B. The solid lines represent the equality line.

Figure 3 - Descriptive relationship between actual NDF residues (from subsets) and NDF residues predicted by the frequentist approach (from the fitted model) as a function of time for scenario B. The solid lines represent the equality line.

Table 1 - Average estimates of the parameters potentially degradable fraction (U, %), common rate of lag and degradation (c, h⁻¹), and undegradable fraction (I, %) obtained on simulated data using the Bayesian or frequentist approaches

Table 2 - Asymptotic standard deviation of residual error (ASDR) for the adjusted model and average bias (B, %) for the estimates of the parameters potentially degradable fraction (U, %), common rate of lag and degradation (c, h⁻¹), and undegradable fraction (I, %) obtained on simulated data using the Bayesian or frequentist approaches

Table 3 - Paired t-test to compare the estimates of the parameters potentially degradable