PAC Learning Model
Intro
This model intends to fix the major problem with the consistency model. It should say something about generalizing from a smaller set of data to a larger set of examples.
Generalization Error
Our goal is to obtain an accurate hypothesis. We need to define an error to measure the accuracy.
where x is an example that comes from an unknown target distribution D. h is the hypothesis. testing examples also come from D.
Now we no longer seek a hypothesis that is consistent with training set. Rather, we want to formulate a hypothesis that minimize the .
Note: Since the training data is randomly selected from an unknown distribution, there is always the chance the training set is very unrepresentative of the source distribution.
The Probably Approximately Correct Model
A target concept class is PAC-learnable by a hypothesis space if an algorithm A such that , any target distribution D, any positive and , A uses a training set where examples taken from iid from D and produces such that and
Explanation:
m is a polynomial dependent on and . for a more accurate hypothesis, you need to a larger training set.
is the accuracy parameter, is the confidence parameter. They are both user specified.
We want a hypothesis that is highly probably () approximately correct (-good). The probably approximately correct model, get it?
Last updated