What classifiers to use for deciding if two datasets depict the same individual?

Developer, https://www.devze.com, 2023-03-20 09:27. Source: Internet

Suppose I have pictures of faces of a set of individuals. The question I'm trying to answer is: "do these two pictures represent the same individual"?

As usual, I have a training set containing several pictures for a number of individuals. The individuals and pictures the algorithm will have to process are of course not in the training set.

My question is not about image processing algorithms or the particular features I should use, but about classification. I don't see how traditional classifiers such as SVM or AdaBoost can be used in this context. How should I use them? Should I use other classifiers? Which ones?

NB: my real application is not faces (I don't want to disclose it), but it's close enough.

Note: the training dataset isn't enormous, in the low thousands at best. Each dataset is pretty big though (a few megabytes), even if it doesn't hold a lot of real information.

You should probably look at the following methods:

P. Jonathon Phillips: Support Vector Machines Applied to Face Recognition. NIPS 1998: 803-809
Haibin Ling, Stefano Soatto, Narayanan Ramanathan, and David W. Jacobs, A Study of Face Recognition as People Age, IEEE International Conference on Computer Vision (ICCV), 2007.

These papers describe applying SVMs to same-person/different-person problems like the one you describe. If the alignment of the features (eyes, nose, mouth) is good, these methods work very well.
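One way to cast "same individual or not" as an ordinary two-class problem is to train on pairs instead of single samples: each pair becomes one feature vector (here, the element-wise absolute difference) with a same/different label. The sketch below only builds such a pair dataset. The function name `make_pairs`, the identities, and the toy feature values are made up for illustration, and this is not claimed to be the exact procedure of either paper.

```python
from itertools import combinations


def make_pairs(samples):
    """Turn (identity, feature_vector) samples into a two-class pair dataset.

    Each pair is represented by the element-wise absolute difference of its
    two feature vectors; the label is 1 for "same identity", 0 otherwise.
    """
    X, y = [], []
    for (id_a, feat_a), (id_b, feat_b) in combinations(samples, 2):
        X.append([abs(a - b) for a, b in zip(feat_a, feat_b)])
        y.append(1 if id_a == id_b else 0)
    return X, y


# Toy, made-up feature vectors for two hypothetical individuals.
samples = [
    ("alice", [0.10, 0.90]),
    ("alice", [0.12, 0.88]),
    ("bob", [0.80, 0.20]),
    ("bob", [0.79, 0.25]),
]

X, y = make_pairs(samples)
print(y)  # labels for the 6 possible pairs
```

The resulting `X` and `y` can then be fed to any binary classifier (for instance scikit-learn's `SVC`). Because the classifier learns "same vs. different" rather than specific identities, it can in principle be applied to individuals that were not in the training set.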

How big is your dataset? I would start by defining a distance metric (say, Euclidean) that characterizes the differences between images (such as differences in color, shape, or local features). Two images of the same individual should then be closer together than two images of different individuals, though how well this works depends heavily on the kind of dataset you are working with.
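As a rough illustration of the distance idea, assuming each image has already been reduced to a fixed-length feature vector (the vectors below are invented):

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical feature vectors: two from one individual, one from another.
person_a_1 = [0.10, 0.90, 0.30]
person_a_2 = [0.12, 0.88, 0.33]
person_b_1 = [0.80, 0.20, 0.70]

d_same = euclidean(person_a_1, person_a_2)
d_diff = euclidean(person_a_1, person_b_1)
print(d_same < d_diff)
```

Whether plain Euclidean distance separates the two cases on real data depends on the features; it is only a starting point.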

Forgive me for stating the obvious, but why not use any supervised classifier (SVM, GMM, k-NN, etc.), get one label for each test sample (e.g., face, voice, text, etc.), and then see if the two labels match?

Otherwise, you could perform a binary hypothesis test. H0 = two samples do not match. H1 = two samples match. For two test samples, x1 and x2, compute a distance, d(x1, x2). Choose H1 if d(x1, x2) < epsilon and H0 otherwise. Adjusting epsilon will adjust your probability of detection and probability of false alarm. Your application would dictate which epsilon is best; for example, maybe you can tolerate misses but cannot tolerate false alarms, or vice versa. This is called Neyman-Pearson hypothesis testing.
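A minimal sketch of choosing epsilon from held-out pairs, assuming you already have distances for pairs known to match and pairs known not to match. The helper names `pick_epsilon` and `rates` and all the numbers are hypothetical. The idea is to fix a budget for false alarms (non-matching pairs declared a match) and take the largest threshold that stays within it.

```python
def pick_epsilon(nonmatch_dists, max_false_alarm):
    """Largest threshold whose empirical false-alarm rate stays within budget.

    A false alarm is a non-matching pair with d < epsilon.
    """
    ordered = sorted(nonmatch_dists)
    k = int(max_false_alarm * len(ordered))  # number of false alarms allowed
    if k >= len(ordered):
        return float("inf")  # budget allows accepting everything
    # At most k non-match distances are strictly below ordered[k].
    return ordered[k]


def rates(match_dists, nonmatch_dists, epsilon):
    """Empirical (detection rate, false-alarm rate) for the rule d < epsilon."""
    p_detect = sum(d < epsilon for d in match_dists) / len(match_dists)
    p_false = sum(d < epsilon for d in nonmatch_dists) / len(nonmatch_dists)
    return p_detect, p_false


# Made-up validation distances.
match_dists = [0.1, 0.2, 0.3, 0.9]
nonmatch_dists = [0.5, 0.8, 1.0, 1.2]

epsilon = pick_epsilon(nonmatch_dists, max_false_alarm=0.25)
p_detect, p_false = rates(match_dists, nonmatch_dists, epsilon)
print(epsilon, p_detect, p_false)
```

Loosening the false-alarm budget raises epsilon and with it the detection rate; tightening it does the opposite, which is the trade-off described above.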