Description

IEEE Transactions on Knowledge and Data Engineering, Volume 20, Issue 4. The experiments use 20 Newsgroups categories such as comp.graphics, rec.motorcycles, rec.sport.baseball, sci.space, and talk.politics.mideast.

Transcripts

Semisupervised Clustering with Metric Learning Using Relative Comparisons Nimit Kumar, Member, IEEE, and Krishna Kummamuru IEEE Transactions on Knowledge and Data Engineering, Volume 20, Issue 4, Pages 496-503 Advisors: Prof. 陳彥良 and Prof. 許秉瑜 Presenter: 林欣瑾 August 14, 2008

Outline Introduction Related work Problem definition The learning algorithms Experimental study Summary and conclusions

Introduction(1/3) Semisupervised clustering algorithms are becoming more popular mainly because of (1) the abundance of unlabeled data and (2) the high cost of obtaining labeled data. The most common form of supervision used in clustering algorithms is pairwise feedback: →must-link: data points that belong to the same cluster →cannot-link: data points that belong to different clusters
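As a minimal sketch of the pairwise supervision described above (the helper name `satisfies_pairwise` and the pair-list encoding are illustrative assumptions, not from the paper):

```python
def satisfies_pairwise(assign, must_links, cannot_links):
    """Check whether a cluster assignment satisfies pairwise supervision.

    assign:       assign[i] is the cluster index of point i
    must_links:   pairs of point indices that should share a cluster
    cannot_links: pairs of point indices that should be in different clusters
    """
    ok_must = all(assign[i] == assign[j] for i, j in must_links)
    ok_cannot = all(assign[i] != assign[j] for i, j in cannot_links)
    return ok_must and ok_cannot
```

For example, the assignment `[0, 0, 1]` satisfies a must-link on points 0 and 1 together with a cannot-link on points 0 and 2.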

Introduction(2/3) Pairwise constraints have two drawbacks: (1) the points in a cannot-link constraint may actually lie in the wrong clusters and still satisfy the cannot-link constraint; (2) a must-link constraint can mislead the clustering algorithm if the points in the constraint belong to two different clusters of the same class. Supervision is instead assumed to be available as relative comparisons: x is closer to y than to z (triplet constraints).

Introduction(3/3) This paper calls the proposed algorithm Semisupervised SVaD (SSSVaD). Given a set of labeled data, relative comparisons can be derived from any three points in the set in which two of them belong to a class different from the class of the third point. Triplet constraints give more information about the underlying dissimilarity measure than pairwise constraints.
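The derivation of triplets from labeled data described on this slide can be sketched as follows (the function name `derive_triplets` is a hypothetical helper, not from the paper):

```python
from itertools import permutations

def derive_triplets(labels):
    """Derive relative comparisons ("a is closer to b than to c") from
    class labels: for any ordered triple of points whose first two share
    a class that the third does not, emit a triplet constraint.
    """
    return [(a, b, c)
            for a, b, c in permutations(range(len(labels)), 3)
            if labels[a] == labels[b] != labels[c]]
```

With labels `[0, 0, 1]`, points 0 and 1 share a class and point 2 does not, so the constraints "0 is closer to 1 than to 2" and "1 is closer to 0 than to 2" are produced.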

Related work

Problem definition Given a set of unlabeled examples and triplet constraints, the objective of SSSVaD is to find a partition of the data set, along with the parameters of the SVaD measure, that minimizes the within-cluster dissimilarity while satisfying as many triplet constraints as possible.
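The objective described above can be sketched as a single score to minimize: within-cluster weighted dissimilarity plus a penalty per violated triplet. The weighted squared-distance form of the per-cluster dissimilarity and the penalty weight `gamma` are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def sssvad_objective(X, centers, W, assign, triplets, gamma=1.0):
    """Sketch of an SSSVaD-style objective: within-cluster dissimilarity
    under cluster-specific (spatially variant) weights W[j], plus a
    penalty gamma for each violated triplet constraint.
    """
    def d(x, j):
        # cluster-specific weighted squared distance to center j
        return np.sum(W[j] * (x - centers[j]) ** 2)

    within = sum(d(X[i], assign[i]) for i in range(len(X)))
    # a triplet (a, b, c) says "a is closer to b than to c"
    violations = sum(
        1 for a, b, c in triplets
        if np.sum(W[assign[a]] * (X[a] - X[b]) ** 2)
           >= np.sum(W[assign[a]] * (X[a] - X[c]) ** 2)
    )
    return within + gamma * violations
```

A satisfied triplet contributes nothing, so a good partition-plus-weights pair drives both terms down at once.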

The learning algorithms(1/2) 1. Spatially Variant Dissimilarity (SVaD) 2. Semisupervised SVaD (SSSVaD) 3. Metric Pairwise Constrained K-Means (MPCK-Means) 4. rMPCK-Means 5. K-Means Algorithm (KMA)
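The K-Means-style alternation common to the algorithms listed above can be sketched as one assignment-plus-update step under a cluster-specific weighted dissimilarity (a simplified stand-in for the SVaD measure; the per-cluster weight matrix `W` and the function name are assumptions for illustration):

```python
import numpy as np

def weighted_kmeans_step(X, centers, W):
    """One step of K-Means under a cluster-specific weighted dissimilarity:
    assign each point to its least-dissimilar cluster, then recompute centers.
    """
    # assignment: each point goes to the cluster with least weighted distance
    assign = np.array([
        np.argmin([np.sum(W[j] * (x - centers[j]) ** 2)
                   for j in range(len(centers))])
        for x in X
    ])
    # center update: mean of assigned points (keep old center if cluster empty)
    new_centers = np.array([
        X[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
        for j in range(len(centers))
    ])
    return assign, new_centers
```

In SSSVaD the weight update would additionally be driven by the triplet constraints, whereas MPCK-Means uses pairwise constraints at the assignment and metric-learning steps.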

The learning algorithms(2/2) SSSVaD versus MPCK-Means

Experimental study Data sets (20 Newsgroups):

Experimental study Effect of the Number of Clusters (1) Binary

Experimental study (2) Multi5

Experimental study (3) Multi10

Experimental study Effect of the Amount of Supervision (1) Binary

Experimental study (2) Multi5

Experimental study (3) Multi10

Experimental study Effect of Initialization (1) Binary

Experimental study (2) Multi10

Summary and conclusions The advantage of relative comparisons over pairwise constraints was established through comprehensive experiments. The proposed algorithm (SSSVaD) achieves higher accuracy and is more robust than comparable algorithms that use pairwise constraints for supervision.

Thank you for listening