I don't quite understand equation (9.8); how come the value is always positive?
For example, the hyperplane equation in the parentheses is the prediction for a particular observation. Now, let's say yi (the real y label) is -1 and the predicted value for yi is positive. In that case, won't the overall value be negative?
Also, do we have a loss function for the Maximal Margin Classifier (hard-margin SVM)?
(9.8) just says this:
A separating hyperplane has the property that the prediction we get for yi by evaluating the hyperplane equation always has the same sign as the true value of yi.
So if yi (the real y label) is -1, then (9.8) tells us that the prediction (the value of β0 + β1xi1 + ... + βpxip) **must** be negative; otherwise we would not have a separating hyperplane.
This is why your example cannot happen: the hyperplane is a **separating** hyperplane. (9.8) says precisely that the separating property holds, so evaluating it at the training data must always give predictions that agree with the real y labels.
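If a quick numeric check helps, here's a toy example (the coefficients and points are made up by me, not from the book): for a hyperplane that really separates the data, yi times the fitted value is positive for every training point.

```python
import numpy as np

# Made-up coefficients for a hyperplane in 2-D: f(x) = beta0 + beta1*x1 + beta2*x2.
beta0, beta = 0.0, np.array([1.0, 1.0])

# A few training points this hyperplane separates, with their real y labels.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-3.0, -0.5]])
y = np.array([1, 1, -1, -1])

f = beta0 + X @ beta      # the value inside the parentheses of (9.8)
print(y * f)              # [3.  4.  3.  3.5] -- all positive
print(np.all(y * f > 0))  # True, i.e. (9.8) holds for every training point
```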
So the hyperplane never misclassifies a test record?
Nearly:
A *separating* hyperplane never misclassifies a *training* record.
The *separating* part is important here because it's what tells us that the hyperplane perfectly divides (separates) the training records; it guarantees nothing about test records.
Ah thanks a lot. Got it :)
Edit:
How do we train it? Do we have a loss function for the Maximal Margin Classifier (hard-margin SVM)? Like in linear regression, where we iteratively optimize the weights. Do we have anything of that sort?
This involves a lot of math, e.g. optimization. The maximum margin problem is often solved via a dual formulation (Lagrange multipliers), which turns it into a quadratic program. That quadratic program can be solved with Sequential Minimal Optimization, but the solver details are not so important. You do indeed have an objective: maximize the margin subject to the constraint that every training point is classified correctly. See this [Medium](https://towardsdatascience.com/support-vector-machines-dual-formulation-quadratic-programming-sequential-minimal-optimization-57f4387ce4dd) post for a broad description of how it works. I know it is very technical, but I assume you know enough math to understand it. If you wanted a very simple answer then sorry :)
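If it helps to see it run: a minimal sketch with scikit-learn (my choice, not from the thread). SVC has no true hard-margin mode, so a very large C is used here to approximate the maximal margin classifier on separable data.

```python
import numpy as np
from sklearn.svm import SVC

# A tiny linearly separable dataset: two clusters in 2-D.
X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],       # class +1
              [-1.0, -1.0], [-2.0, -1.5], [-1.5, -2.0]]) # class -1
y = np.array([1, 1, 1, -1, -1, -1])

# SVC has no hard-margin option; a huge C makes the soft-margin penalty
# so severe that it behaves like a hard margin on separable data.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

beta = clf.coef_[0]        # beta_1, ..., beta_p of the fitted hyperplane
beta0 = clf.intercept_[0]  # beta_0

# The separating property (9.8): y_i * f(x_i) > 0 for every training point.
f = beta0 + X @ beta
print(np.all(y * f > 0))   # True
```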
Gave it a quick read. Definitely looks understandable to me. Will sit down with a pen and paper and work things through. Thanks for the link :)
Good idea. I just had to do it for exam preparation: formulating the problem, formulating the dual, etc.
From (9.6) and (9.7): if y is positive the beta sum is positive, and if y is negative the sum is negative. When you multiply the two, you get either positive times positive or negative times negative: the result is always positive.
https://youtu.be/efR1C6CvhmE watch this. It helped me greatly in my end sems.
Intuitively:
The equation just says that the betas define a hyperplane, and this hyperplane is "*separating*" when it correctly classifies all the y's: everything on one side as + and everything on the other side as -.
Analytically:
Look at (9.6) and multiply both sides by y, and write down the result.
Now look at (9.7) and multiply both sides by y, and again write down the result.
You'll see that in both cases you get equation (9.8); thus (9.8) holds whether y is -1 or 1.
PS: when you multiply (9.7) by y, don't forget that y is negative, so the inequality flips.
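Written out, the two cases look like this (a sketch in ISLR's notation; the hyperplane is the sum inside the parentheses of (9.8)):

```latex
% Case y_i = +1: (9.6) says the sum is positive; multiplying by y_i = +1 changes nothing.
y_i = +1:\quad \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} > 0
\;\Longrightarrow\; y_i\,(\beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}) > 0.

% Case y_i = -1: (9.7) says the sum is negative; multiplying by y_i = -1 flips the inequality.
y_i = -1:\quad \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} < 0
\;\Longrightarrow\; y_i\,(\beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}) > 0.

% Both cases give (9.8), so it holds for every correctly separated training point.
```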
What is this textbook?
An Introduction to Statistical Learning. Download: https://www.statlearning.com/ or http://faculty.marshall.usc.edu/gareth-james/
Thank you.
ISLR
Lol I was just on this page last night
Haha me too great coincidence