Classification predicts categorical class labels (discrete or nominal). It constructs a model from the training set and the values (class labels) of a classifying attribute, then uses that model to classify new data. Prediction, by contrast, models continuous-valued functions, i.e., it predicts unknown or missing values.
Machine learning is normally not rule based; instead, it is statistically based.
Bayes' Theorem
Given training data X, the posterior probability of a hypothesis H, P(H|X), follows Bayes' theorem: P(H|X) = P(X|H) P(H) / P(X).
Informally, this can be written as
posterior = likelihood x prior / evidence.
Practical difficulty: it requires initial knowledge of many probabilities and incurs significant computational cost.
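The "posterior = likelihood x prior / evidence" formula above can be made concrete with a small worked example. The numbers here are purely illustrative (a hypothetical diagnostic test with 99% sensitivity, a 5% false-positive rate, and 1% prevalence), chosen to show how the evidence P(X) is obtained from the law of total probability:

```python
def posterior(likelihood, prior, evidence):
    """Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)."""
    return likelihood * prior / evidence

p_h = 0.01               # prior P(H): disease prevalence (illustrative)
p_x_given_h = 0.99       # likelihood P(X|H): test sensitivity (illustrative)
p_x_given_not_h = 0.05   # false-positive rate P(X|not H) (illustrative)

# evidence P(X) via the law of total probability
p_x = p_x_given_h * p_h + p_x_given_not_h * (1 - p_h)

p_h_given_x = posterior(p_x_given_h, p_h, p_x)
print(round(p_h_given_x, 3))  # prints 0.167
```

Even with a highly sensitive test, the low prior keeps the posterior well below 50%, which is exactly the kind of correction Bayes' theorem provides.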
Advantages
Easy to implement
Good results obtained in most of the cases
Disadvantages
Assumption: class conditional independence, which can cause a loss of accuracy
In practice, dependencies exist among variables
E.g., in hospital data about patients: profile attributes (age, family history, etc.), symptoms (fever, cough, etc.), and diseases (lung cancer, diabetes, etc.)
Dependencies among these cannot be modeled by a naïve Bayesian classifier
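The class-conditional independence assumption can be sketched on a toy symptom table. All data here is made up for illustration; the point is that P(X|C) is approximated as the product of the per-attribute probabilities P(x_i|C), with no smoothing:

```python
# Toy training set: (attributes, class label). Values are illustrative only.
train = [
    ({"fever": "yes", "cough": "yes"}, "flu"),
    ({"fever": "yes", "cough": "no"},  "flu"),
    ({"fever": "no",  "cough": "yes"}, "cold"),
    ({"fever": "no",  "cough": "no"},  "cold"),
]

def naive_bayes_predict(x, data):
    """Pick the class maximizing P(C) * product of P(x_i|C) over attributes."""
    classes = {label for _, label in data}
    best_class, best_score = None, -1.0
    for c in classes:
        rows = [feats for feats, label in data if label == c]
        score = len(rows) / len(data)        # prior P(C)
        for attr, value in x.items():        # independence assumption
            matches = sum(1 for feats in rows if feats[attr] == value)
            score *= matches / len(rows)     # estimate of P(x_i | C)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

print(naive_bayes_predict({"fever": "yes", "cough": "yes"}, train))  # prints flu
```

Because each attribute contributes an independent factor, any correlation between fever and cough within a class is invisible to this model, which is the limitation the notes point out.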
How to deal with these dependencies?
Bayesian Belief Networks
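A Bayesian belief network handles such dependencies by factoring the joint distribution along a directed graph rather than assuming full independence. Below is a minimal hand-coded sketch of a chain Smoker → LungCancer → Cough; every probability value is invented for illustration:

```python
# Joint factorization along the graph:
#   P(S, L, C) = P(S) * P(L|S) * P(C|L)
# All numbers below are illustrative, not real medical statistics.
p_smoker = {True: 0.3, False: 0.7}
p_cancer_given_smoker = {True: 0.1, False: 0.01}  # P(L=True | S)
p_cough_given_cancer = {True: 0.8, False: 0.2}    # P(C=True | L)

def joint(s, l, c):
    """P(Smoker=s, LungCancer=l, Cough=c) via the network factorization."""
    p_l = p_cancer_given_smoker[s] if l else 1 - p_cancer_given_smoker[s]
    p_c = p_cough_given_cancer[l] if c else 1 - p_cough_given_cancer[l]
    return p_smoker[s] * p_l * p_c

# Marginal P(Cough=True) by summing out the other variables:
p_cough = sum(joint(s, l, True) for s in (True, False) for l in (True, False))
print(round(p_cough, 4))  # prints 0.2222
```

Unlike the naïve model, here Cough depends on LungCancer, which in turn depends on Smoker, so the dependencies among profile, symptoms, and disease can be expressed explicitly.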