Using R to plot clustered output


In [1]:
setwd('~/Codes/DL - Topic Modelling')
library(ggplot2)
library(dplyr)

# load cluster analysis results on DGM data
# (expected to provide Y, X.clust, Y_overview and X.clust_overview, which are used below)
load('data/cluster_results.RData')


Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

Clustered output with respect to original labels (20 classes)


In [3]:
for (i in 1:20) {
  # count how often each original label occurs in cluster i
  plt_dat <- data.frame(labels = Y[X.clust$cluster == i]) %>%
    group_by(labels) %>%
    summarize(freq = n())
  # bar chart of label frequencies; x labels are rotated so they stay readable
  plt <- ggplot(plt_dat, aes(x = labels, y = freq)) +
    geom_bar(stat = "identity") +
    theme(axis.text.x = element_text(angle = 60, hjust = 1))
  print(plt)
  # --- does not work in Jupyter Notebook ---
  # invisible(readline(prompt="Press [enter] to continue"))
  rm(plt)
}
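
The loop above draws one figure per cluster. As a rough alternative sketch, the same label counts can be placed in a single figure with one panel per cluster. The data here are synthetic: `toy_labels` and `toy_cluster` are hypothetical stand-ins, on the assumption that `Y` is a vector of class labels and `X.clust$cluster` is an integer vector of the same length.

```r
library(ggplot2)
library(dplyr)

# Synthetic stand-ins for Y and X.clust$cluster (hypothetical names and data).
set.seed(1)
toy_labels  <- sample(c("alpha", "beta", "gamma"), 300, replace = TRUE)
toy_cluster <- sample(1:4, 300, replace = TRUE)

# Count every (cluster, label) combination in one pass instead of per loop iteration.
plt_dat <- data.frame(cluster = toy_cluster, labels = toy_labels) %>%
  count(cluster, labels, name = "freq")

# One panel per cluster; geom_col() is shorthand for geom_bar(stat = "identity").
plt <- ggplot(plt_dat, aes(x = labels, y = freq)) +
  geom_col() +
  facet_wrap(~ cluster) +
  theme(axis.text.x = element_text(angle = 60, hjust = 1))
print(plt)
```

With the real objects, one would presumably replace the toy vectors with `Y` and `X.clust$cluster`. With 20 clusters and 20 labels the panels get small, so the per-cluster loop may still be easier to read.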


Clustered output with respect to original labels (6 classes)


In [4]:
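for (i in sort(unique(X.clust_overview$cluster))) {
  # announce the cluster being plotted
  print(paste("----- Current cluster: ", i, " -----"))
  # count how often each original label occurs in cluster i
  plt_dat <- data.frame(labels = Y_overview[X.clust_overview$cluster == i]) %>%
    group_by(labels) %>%
    summarize(freq = n())
  plt <- ggplot(plt_dat, aes(x = labels, y = freq)) + geom_bar(stat = "identity")
  print(plt)
  rm(plt)
}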


[1] "----- Current cluster:  1  -----"
[1] "----- Current cluster:  2  -----"
[1] "----- Current cluster:  3  -----"
[1] "----- Current cluster:  4  -----"
[1] "----- Current cluster:  5  -----"
[1] "----- Current cluster:  6  -----"
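
The bar charts show the label mix inside each cluster one at a time. A compact numeric companion is a cluster-by-label contingency table together with a purity score. The sketch below uses base R only and synthetic data: `toy_labels` and `toy_cluster` are hypothetical stand-ins for `Y_overview` and `X.clust_overview$cluster`. Purity is one common summary, not something the original analysis is known to have used.

```r
# Synthetic stand-ins for Y_overview and X.clust_overview$cluster (hypothetical data).
set.seed(1)
toy_labels  <- sample(c("alpha", "beta", "gamma"), 300, replace = TRUE)
toy_cluster <- sample(1:6, 300, replace = TRUE)

# Contingency table: rows are clusters, columns are original labels.
tab <- table(cluster = toy_cluster, label = toy_labels)
print(tab)

# Share of each cluster taken up by its most frequent label.
purity_per_cluster <- apply(tab, 1, max) / rowSums(tab)
print(round(purity_per_cluster, 2))

# Overall purity: fraction of observations that carry their cluster's majority label.
purity <- sum(apply(tab, 1, max)) / sum(tab)
print(purity)
```

A purity close to 1 would suggest that clusters line up with the original labels. Random assignments like the toy data here give values near the share of the largest class.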