PAC Learning Model
The PAC model is meant to fix the major shortcoming of the consistency model: it should say something about how a hypothesis learned from a finite training set generalizes to unseen examples.
Our goal is to obtain an accurate hypothesis, so we need to define an error measure for accuracy:

$$\mathrm{err}_D(h) = \Pr_{x \sim D}\left[h(x) \neq c(x)\right]$$

where $x$ is an example drawn from an unknown target distribution $D$, $h$ is the hypothesis, and $c$ is the target concept. Test examples also come from $D$.
Now we no longer seek a hypothesis that is consistent with the training set. Rather, we want a hypothesis that minimizes the true error $\mathrm{err}_D(h)$.
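As a minimal sketch (not part of the original notes), the true error can be approximated by Monte Carlo sampling from $D$. The target concept, the distribution, and the candidate hypothesis below are hypothetical stand-ins chosen only for illustration.

```python
import random

# Hypothetical setup: examples are points in [0, 1], the (unknown) target
# concept c labels x positive iff x <= 0.7, and D is the uniform distribution.
def target_concept(x):
    return x <= 0.7

def sample_from_D():
    return random.random()  # uniform on [0, 1] stands in for the unknown D

# A candidate hypothesis (a slightly wrong threshold).
def hypothesis(x):
    return x <= 0.65

def estimate_error(h, c, n=100_000):
    """Monte Carlo estimate of err_D(h) = Pr_{x~D}[h(x) != c(x)]."""
    draws = (sample_from_D() for _ in range(n))
    return sum(h(x) != c(x) for x in draws) / n

print(estimate_error(hypothesis, target_concept))  # ~0.05: the mass where h and c disagree
```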
Note: Since the training data is drawn randomly from an unknown distribution, there is always a chance that the training set is very unrepresentative of the source distribution.
A target concept class $C$ is PAC-learnable by a hypothesis space $H$ if there exists an algorithm $A$ such that, for any target concept $c \in C$, any target distribution $D$, and any positive $\epsilon$ and $\delta$, $A$ uses a training set of $m$ examples drawn i.i.d. from $D$ and produces $h \in H$ such that $\mathrm{err}_D(h) \leq \epsilon$ with probability at least $1 - \delta$.
Explanation:
$m$ is polynomial in $1/\epsilon$ and $1/\delta$: for a more accurate hypothesis, you need a larger training set.
$\epsilon$ is the accuracy parameter and $\delta$ is the confidence parameter. Both are user-specified.
We want a hypothesis that is, with high probability (at least $1 - \delta$), approximately correct ($\epsilon$-good). The probably approximately correct model, get it?
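To see the $(\epsilon, \delta)$ guarantee at work, here is a hedged simulation sketch (again not from the original notes): a learner for a simple threshold concept on $[0, 1]$ with $D$ uniform. The names THETA, learn_threshold, and failure_rate are hypothetical. The experiment estimates how often the learned hypothesis fails to be $\epsilon$-good, and that failure probability shrinks as $m$ grows.

```python
import random

# Hypothetical example: the target concept is c(x) = [x <= THETA] on [0, 1],
# with D uniform. The learner outputs the largest threshold consistent with
# the positive training examples.
THETA = 0.7

def learn_threshold(m):
    sample = [random.random() for _ in range(m)]
    positives = [x for x in sample if x <= THETA]
    return max(positives, default=0.0)  # consistent hypothesis threshold

def failure_rate(m, epsilon, trials=2000):
    # Under uniform D, err_D(h) for a learned threshold t <= THETA is exactly THETA - t,
    # so a trial "fails" when the hypothesis is not epsilon-good.
    failures = sum((THETA - learn_threshold(m)) > epsilon for _ in range(trials))
    return failures / trials

for m in (10, 50, 200):
    print(m, failure_rate(m, epsilon=0.05))  # failure probability drops as m grows
```

With $\epsilon = 0.05$, the failure probability here is $(1 - \epsilon)^m$, so increasing $m$ from 10 to 200 drives it from roughly 0.6 down to nearly 0, matching the intuition that $m$ grows with the desired accuracy and confidence.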