Master of Science (MS)
Oceanography and Coastal Sciences
Recent advances in statistical understanding have focused fisheries research on the theoretical and statistical issues encountered in standardizing catch-rate data. In that context, the present study evaluates the performance of boosted regression trees (BRT), a product of recent progress in machine learning, as a potential tool for catch-rate standardization. The BRT method provides a number of advantages over the traditional GLM and GAM approaches, including, but not limited to: robust parameter estimates as a result of the integrated stochastic gradient boosting algorithm; model structure learned from the data rather than determined a priori, thereby avoiding assumptions required for model specification; and easy implementation of complex and/or multi-way interactions. Performance of the BRT method was evaluated comparatively: GLM, GAM and BRT main-effects models, and a BRT two-way interaction model, were trained on zero-truncated, lognormal catch-rate data using identical predictors and the same dataset. Data used were observer-collected records of yellowfin tuna catch from the Gulf of Mexico longline fishery, 1998-2005. Model comparisons were based primarily on the percent deviance explained by the trained models and on prediction error for a test dataset, measured as root mean squared error (RMSE). Secondarily, the relative influence of model predictors and the handling of spatially correlated error structures by each of the four models were examined. Fitted GLM, GAM, BRT and BRT two-way models accounted for 19.56%, 25.10%, 26.10% and 37.3% of total model deviance, respectively. RMSE values for the GLM (0.3552), GAM (0.3554), BRT (0.3546) and BRT two-way (0.3509) models indicate that the BRT-based models performed marginally better than the traditional GLM and GAM methods, with lower prediction error.
Indices of predictor influence and spatial analysis of model residuals for the main-effects models suggest that GAM and BRT models perform comparably in partitioning variance among predictors and in handling autocorrelated variance structures. Overall, results of the main-effects models indicate that the BRT method is as adept as the GAM at fitting non-linear responses; however, unlike the GAM, the BRT avoided overfitting the data, thereby providing more robust estimates. The BRT two-way interaction model further demonstrates: the ability of the BRT method to fit complex models while avoiding overfitting; the ease with which interactions can be incorporated and specific terms, such as the year term, extracted; and the potential role of complex interactions in accounting for non-stationary processes. Although the results presented here are not definitive, for every measure of performance examined the BRT-based models performed as well as or better than the traditional GLM/GAM standardization methods, thereby confirming the utility of the BRT method for catch standardization purposes.
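The two primary comparison measures named in the abstract, percent deviance explained and RMSE, can be illustrated with a minimal Python sketch. It assumes a Gaussian error model on the log scale, for which deviance reduces to a sum of squared residuals; the function names and the small data vectors are hypothetical and are not taken from the thesis.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    squared_errors = ((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sum(squared_errors) / n)


def percent_deviance_explained(observed, fitted):
    """Percent of null deviance accounted for by the fitted values.

    Assumption: Gaussian errors, so deviance is the sum of squared
    residuals and the null model is the mean of the observations.
    """
    mean_obs = sum(observed) / len(observed)
    null_deviance = sum((o - mean_obs) ** 2 for o in observed)
    residual_deviance = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return 100.0 * (1.0 - residual_deviance / null_deviance)


# Hypothetical log-transformed catch rates and model output (illustrative only).
log_cpue_train = [0.2, 0.5, 0.9, 1.4]
fitted_train = [0.3, 0.5, 0.8, 1.3]
log_cpue_test = [0.4, 1.0, 1.1]
predicted_test = [0.5, 0.8, 1.2]

print(percent_deviance_explained(log_cpue_train, fitted_train))
print(rmse(log_cpue_test, predicted_test))
```

Under these assumptions, percent deviance explained is computed on the training data and RMSE on held-out test data, mirroring the split between fit and prediction error described above.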
Document Availability at the Time of Submission
Release the entire work immediately for access worldwide.
Abeare, Shane, "Comparisons of boosted regression tree, GLM and GAM performance in the standardization of yellowfin tuna catch-rate data from the Gulf of Mexico lonline [sic] fishery" (2009). LSU Master's Theses. 2880.