The purpose of this post is to summarize the content of a cs231n lecture for myself, so it may be a little unkind to people who haven't watched the video. In addition, I omitted some content that I don't think is important enough, so use this article only as an aid.
Prologue
Obstacles for Image Classification
For the reasons below!

There Is No Magic in Image Classification
There is no function like the one below.

def classify_image(image):
    # Some magic here?
    return class_label
Instead, image classification follows these two steps.
- function1: takes images (and their labels) as input, outputs a model
- function2: takes the model as input, predicts labels for new images
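A minimal sketch of what this two-function interface might look like, assuming NumPy; the function bodies and names are my own illustration, not the lecture's code.

import numpy as np

def train(images, labels):
    # function1: takes the training images and labels, outputs a model.
    # This toy "model" just memorizes the training data.
    return {"images": images, "labels": labels}

def predict(model, test_images):
    # function2: takes the model, predicts a label for each new image.
    # Toy rule: always predict the most common training label.
    values, counts = np.unique(model["labels"], return_counts=True)
    most_common = values[np.argmax(counts)]
    return np.full(len(test_images), most_common)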
Algorithms for Image Classification
Simple Nearest Algorithm
It's literally simple. All you have to do is compute the difference between the two images at each pixel and average it. If the averaged value is low, the algorithm says the two images are similar. (This is essentially the L1 distance; see the sketch after the K-nearest neighbors description below.)

Limitation

A single nearest neighbor is easily swayed by a noisy or outlier training point, which motivates the next algorithm.
K-nearest Neighbors Algorithm
It is an algorithm that classifies a point by a vote among the points around it. K represents how many neighbors participate in the vote that decides which class a point belongs to.

You can get a feel for what it does here: http://vision.stanford.edu/teaching/cs231n-demos/knn/
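A minimal NumPy sketch of both ideas, assuming images are flattened into arrays of pixel values: l1_distance is the "average gap per pixel" idea from the previous section (sum and mean rank image pairs identically), and knn_predict adds the K-way vote. The names are mine.

import numpy as np

def l1_distance(a, b):
    # sum of the per-pixel absolute differences between two images
    return np.abs(a - b).sum()

def knn_predict(train_images, train_labels, test_image, k=3):
    # distance from the test image to every training image
    dists = np.array([l1_distance(test_image, x) for x in train_images])
    # the k closest training images vote with their labels
    nearest = np.argsort(dists)[:k]
    votes = train_labels[nearest]
    values, counts = np.unique(votes, return_counts=True)
    return values[np.argmax(counts)]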
Not Used in Practice
The simple nearest algorithm goes without saying, but the K-nearest neighbors algorithm is also not used in practice, for the three reasons below.
- slow at test time (every prediction compares against the whole training set)
- pixel-wise 'distance' is not informative
- curse of dimensionality: the number of training examples needed explodes as the dimension grows
Hyperparameters
Definition
According to Wikipedia, a hyperparameter is a parameter of a prior distribution; the term is used to distinguish it from the parameters of the model for the underlying system under analysis. In this lecture, K and the choice of distance function (L1 vs. L2) are the hyperparameters.

How to Determine Hyperparameters
We can determine hyperparameters by splitting the dataset, as below.
- all train: k=1 always fits the training data perfectly, i.e., overfitting. It's bad.
- train/test: choosing hyperparameters on the test data tells us nothing about new data. It's bad too.
- train/validation/test: choose hyperparameters on the validation data and evaluate the result on the test data. It's not bad.
- cross-validation (train/test, with the training set split into fold1~foldn): use each fold m (1<=m<=n) as validation in turn. It's almost perfect for determining hyperparameters, but impractical for expensive models. (See the sketch below.)
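A sketch of that fold scheme applied to choosing K, reusing the knn_predict sketch from above; the names are illustrative. The nested loops also show why it's impractical: we re-evaluate for every fold and every candidate K.

import numpy as np

def choose_k_by_cross_validation(images, labels, candidate_ks, n_folds=5):
    # split the training data into n folds
    image_folds = np.array_split(images, n_folds)
    label_folds = np.array_split(labels, n_folds)
    best_k, best_acc = None, 0.0
    for k in candidate_ks:
        fold_accs = []
        for m in range(n_folds):
            # use fold m as validation, the remaining folds as train
            train_x = np.concatenate(image_folds[:m] + image_folds[m + 1:])
            train_y = np.concatenate(label_folds[:m] + label_folds[m + 1:])
            val_x, val_y = image_folds[m], label_folds[m]
            preds = np.array([knn_predict(train_x, train_y, x, k=k) for x in val_x])
            fold_accs.append((preds == val_y).mean())
        acc = np.mean(fold_accs)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k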
Linear Classification
The professor said both the simple nearest and K-nearest neighbors algorithms aren't used in practice. So what algorithm is suitable for image classification? The answer is linear classification.

I already know what linear classification is because I've learned about it from Andrew Ng's course on Coursera, so I will write this section very briefly.
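For reference, the lecture's linear classifier is the score function f(x, W) = Wx + b. A minimal sketch with CIFAR-10-style shapes; the random values are placeholders, since real W and b come from training.

import numpy as np

# CIFAR-10-style shapes from the lecture: 32*32*3 = 3072 pixels, 10 classes
x = np.random.rand(3072)       # one image, stretched into a single vector
W = np.random.randn(10, 3072)  # one row (one "template") per class
b = np.random.randn(10)        # one bias per class

scores = W @ x + b             # f(x, W) = Wx + b: ten class scores
predicted_class = int(np.argmax(scores))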
Relationship between Linear Classification and Neural Networks
neural_network = sum(linear_classifications)
cnn = neural_network + convolutions  # "but it's too exciting for ch. 2" - Prof. Justin Johnson
Parametric Model
- 'Parameter' indicates the weights W.
- We no longer need to keep the whole training data, only the parameters W.
Limitation
- Only one template per category (no diversity in viewing angle, etc.)
- There are impossible cases
  ex) XOR (see the sketch below)
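To see why XOR is impossible: a single line w1*x1 + w2*x2 + b = 0 would have to put (0,1) and (1,0) on one side and (0,0) and (1,1) on the other, and no such line exists. A small brute-force sketch of my own (a demonstration, not a proof):

import numpy as np

# XOR: (0,0) and (1,1) are one class, (0,1) and (1,0) the other
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

# brute-force a grid of linear classifiers sign(w1*x1 + w2*x2 + b)
best_accuracy = 0.0
grid = np.linspace(-2, 2, 41)
for w1 in grid:
    for w2 in grid:
        for b in grid:
            preds = (X @ np.array([w1, w2]) + b > 0).astype(int)
            best_accuracy = max(best_accuracy, (preds == y).mean())

print(best_accuracy)  # tops out at 0.75: one point is always on the wrong side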
Impressions
- Even Stanford students sometimes ask stupid questions.
- Cats are cute.
Questions
I excluded non-helpful questions.
- What are the white regions (in the K-nearest neighbors demo)?
  -> Regions where the vote result is a draw.
- When do we choose L1 distance over L2?
  -> When each dimension has its own meaning (like the columns of a DB table). See the sketch after this list.
- What is the difference between the training set and the validation set?
  -> The algorithm directly accesses the labels of the training set, but not those of the validation set. # Isn't this the explanation of the test set rather than the validation set?
- Should we fully retrain with the chosen hyperparameters?
  -> It's sometimes done in practice, but it's up to taste.
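On the L1-vs-L2 question, a tiny sketch of the two distances (the vectors are made-up examples): L1 adds up the gap in each coordinate separately, so it respects coordinates that each carry their own meaning, while L2 mixes them together (and, unlike L1, doesn't change if you rotate the coordinate frame).

import numpy as np

a = np.array([30.0, 170.0, 65.0])  # e.g., a DB row: age, height, weight
b = np.array([32.0, 168.0, 65.0])

l1 = np.abs(a - b).sum()            # 2 + 2 + 0 = 4: per-column gaps just add up
l2 = np.sqrt(((a - b) ** 2).sum())  # sqrt(4 + 4 + 0) = 2.83...: gaps get mixed

print(l1, l2)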