[Deep Learning] T-sne Visualization

T-sne is a dimensionality reduction technique based on clustering. It’s well suited for embedding high-dimensional data, thus useful to visualize high-dim feature vectors output from deep neural networks. (Similar to PCA but more robust)

Usually we reduce the dimension to 2 for the sake of visualization in 2D space. And a common way to visualize the clustering of high-dim vectors is to create a 2D grid and use the calculated (x,y) as coordinate to position the original image. An example is shown below, the data is CIFAR10 and the features are CNN feature vectors:


tsne embedding on CIFAR10 CNN features

And a zoomed in version of a corner; the dataset is pretty well-clustered:


zoomed in

Based on the tsne embedding, you are able to evaluate your trained network, whether the learned features represent the  images in the correct way as you want. Also, you are able to tell the mis-classified data. But since this is a low dimension representation, the distance shown here doesn’t necessarily reflects the real distance between clusters.

A well-written code in MATLAB is kindly provided by Alex Karpathy in his tSNE JS demo page:  Tsne JS demo

Happy embedding!



3 thoughts on “[Deep Learning] T-sne Visualization

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s