PAC Learning Model

Intro

This model intends to fix the major problem with the consistency model. It should say something about generalizing from a smaller set of data to a larger set of examples.

Generalization Error

Our goal is to obtain an accurate hypothesis. We need to define an error to measure the accuracy. PrxD[h(x)c(x)]=errD(h)Pr_{x\sim D}[h(x) \neq c(x)] = err_D(h) where x is an example that comes from an unknown target distribution D. h is the hypothesis. testing examples also come from D.

Now we no longer seek a hypothesis that is consistent with training set. Rather, we want to formulate a hypothesis that minimize the errD(h)err_D(h).

Note: Since the training data is randomly selected from an unknown distribution, there is always the chance the training set is very unrepresentative of the source distribution.

The Probably Approximately Correct Model

A target concept class CC is PAC-learnable by a hypothesis space HH if \exists an algorithm A such that cC\forall c \in C, any target distribution D, any positive ϵ\epsilon and δ\delta, A uses a training set S={(x1,c(x1)),(x2,c(x2),...,(xm,c(xm))}S = \{(x_1, c(x_1)), (x_2, c(x_2), ..., (x_m, c(x_m))\} where m=ploy(1ϵ,1δ,...)m = ploy(\frac{1}{\epsilon}, \frac{1}{\delta}, ...) examples taken from iid from D and produces hHh \in H such that errD(h)ϵerr_D(h) \leq \epsilon and Pr[errD(h)]1δPr[err_D(h)] \leq 1 - \delta

Explanation:

  • m is a polynomial dependent on ϵ\epsilon and δ\delta. for a more accurate hypothesis, you need to a larger training set.

  • ϵ\epsilon is the accuracy parameter, δ\delta is the confidence parameter. They are both user specified.

  • We want a hypothesis that is highly probably (1δ1 - \delta) approximately correct (ϵ\epsilon-good). The probably approximately correct model, get it?

Last updated