LSU Master's Theses

Comparison of data mining and statistical techniques for classification model

Identifier

etd-11012006-192748

Author

Rochana Lahiri, Louisiana State University and Agricultural and Mechanical College

Degree

Master of Science (MS)

Department

Information Systems and Decision Sciences (Business Administration)

Document Type

Thesis

Abstract

The purpose of this study is to observe the performance of three statistical and data mining classification models, namely logistic regression, decision tree, and neural network models, for different sample sizes and sampling methods on three sets of data. It is a 3 by 2 by 3 by 8 study in which each statistical or data mining method is used to build a model for each of 8 different sample sizes and two different sampling methods. The effect of sample size on the overall performance of each model against two sets of test data is observed and compared. For a given dataset, none of the three methods is found to outperform the others; their performances are comparable. This is in contrast to many of the existing studies cited in the literature review chapter of this thesis. However, the absolute value of prediction accuracy varied among the three datasets, indicating that the data distribution and data characteristics play a role in the actual prediction accuracy, especially the ratio of the binary values of the dependent variable in the training dataset and in the population. The models built with each sample size and sampling method for each technique were run on two sets of test data to test whether the prediction accuracy was replicated. In every case, the prediction accuracy was replicated across the test datasets.
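To make the size of the 3 by 2 by 3 by 8 design concrete, the following minimal sketch enumerates its cells. It assumes the four factors are the three modelling methods, two sampling methods, three datasets, and eight sample sizes; the abstract does not name the sampling methods, datasets, or sample sizes, so the labels below are placeholders and not the values used in the thesis.

```python
from itertools import product

# The three methods are named in the abstract.
METHODS = ["logistic_regression", "decision_tree", "neural_network"]

# Placeholder labels: the abstract does not name these levels.
SAMPLING_METHODS = ["sampling_method_1", "sampling_method_2"]
DATASETS = ["dataset_1", "dataset_2", "dataset_3"]
SAMPLE_SIZES = [f"sample_size_{i}" for i in range(1, 9)]


def design_cells():
    """Return every (method, sampling method, dataset, sample size) combination."""
    return list(product(METHODS, SAMPLING_METHODS, DATASETS, SAMPLE_SIZES))


cells = design_cells()
print(len(cells))  # 3 * 2 * 3 * 8 = 144 models
```

Under these assumptions the design yields 144 fitted models. If each model is scored on both test sets, as the abstract describes, that gives 288 accuracy figures to compare.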

Date

2006

Document Availability at the Time of Submission

Release the entire work immediately for access worldwide.

Recommended Citation

Lahiri, Rochana, "Comparison of data mining and statistical techniques for classification model" (2006). LSU Master's Theses. 1857.
https://repository.lsu.edu/gradschool_theses/1857

Committee Chair

Helmut S. Schneider

DOI

10.31390/gradschool_theses.1857

Included in

Management Sciences and Quantitative Methods Commons
