Explain why cross-validation is used in both supervised learning (classification) and unsupervised learning (clustering).

ASSIGNMENT

Provide a brief description and examples of each of the following methods of clustering:

Partitioning methods.

Hierarchical methods.

Density-based methods.

Grid-based methods.
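As an illustration of the first family only, the following is a minimal sketch of k-means, a well-known partitioning method. The function name `kmeans`, the sample points, and the starting centroids are all assumptions made up for this example; they do not come from the assignment or from Weka.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, centroids, iterations=10):
    """Tiny k-means sketch: assign points to the nearest centroid, then re-average.

    `points` and `centroids` are sequences of equal-length numeric tuples.
    The starting centroids are supplied by the caller (an assumption of this
    sketch; real implementations usually pick them randomly).
    """
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters
```

Calling `kmeans([(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)], [(0, 0), (10, 10)])` on this made-up data separates the three points near the origin from the three points near (8, 8).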

Load the soybean diagnosis data set in Weka (found in Weka-3.6/data/soybean.arff), then perform the following:

Build a decision tree by selecting J48 as the classifier with 10-fold cross-validation. Then fill out the following table:

Correctly Classified Instances  
Incorrectly Classified Instances  
Kappa statistic  
Mean absolute error  
Root mean squared error  
Relative absolute error  
Root relative squared error  
Total Number of Instances  
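To make two of the rows above more concrete, here is a hedged sketch of how accuracy and Cohen's kappa can be derived from a confusion matrix. The helper names `accuracy` and `kappa` and the 2x2 matrix in the usage note are invented for illustration; they are not Weka output and not results for the soybean data.

```python
def accuracy(confusion):
    """Fraction of instances on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def kappa(confusion):
    """Cohen's kappa: agreement beyond what chance alone would produce.

    `confusion[i][j]` counts instances of actual class i predicted as class j.
    Undefined (division by zero) when chance agreement is exactly 1.
    """
    total = sum(sum(row) for row in confusion)
    observed = accuracy(confusion)
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(len(confusion))
    ) / (total * total)
    return (observed - expected) / (1 - expected)
```

For the made-up matrix `[[45, 5], [10, 40]]`, observed agreement is 0.85 and chance agreement is 0.5, so kappa is (0.85 - 0.5) / (1 - 0.5) = 0.7.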

Build a Naïve Bayes classifier and select 10-fold cross-validation. Then fill out the following table:

Correctly Classified Instances  
Incorrectly Classified Instances  
Kappa statistic  
Mean absolute error  
Root mean squared error  
Relative absolute error  
Root relative squared error  
Total Number of Instances  

Compare the results of the previous two parts (a and b). Which algorithm gives the better result, and why?

Constructing a classifier and evaluating its accuracy on a dataset requires partitioning the labeled data into a training set and a test set. Explain three main methods used for such partitioning.
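As one possible illustration of such partitioning, the sketch below generates k-fold cross-validation splits over instance indices. It is only an assumption-laden example: the function name `k_fold_splits`, the fixed `seed`, and the simple round-robin fold assignment are choices made here, and the folds are not stratified by class.

```python
import random


def k_fold_splits(n_instances, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Indices are shuffled once, dealt into k folds, and each fold serves as
    the test set exactly once while the remaining folds form the training set.
    """
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split repeatable
    folds = [indices[i::k] for i in range(k)]  # deal indices round-robin into k folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

With `k_fold_splits(25, k=5)`, each of the five splits holds out 5 instances for testing and trains on the other 20, and every instance is tested exactly once across the splits.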

Explain why cross-validation is used in both supervised learning (classification) and unsupervised learning (clustering).
