Read the data from the CSV file and keep only the rows in which every column is filled in.

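library(dplyr)  # provides %>%, filter(), select() and as_tibble()
youngData <- read.csv("C:/Users/Eren/Desktop/data/data/youth_responses.csv", sep = ",")
# Keep only the rows with no missing value in any column
youngData <- youngData %>% filter(complete.cases(.)) %>% as_tibble()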
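
To make the filtering step concrete, here is a minimal sketch of what `complete.cases()` does, using base R only. The `toy` data frame and its values are made up for illustration and are not taken from the survey.

```r
# Toy data frame (hypothetical values) with some missing answers
toy <- data.frame(
  History = c(1, 5, NA, 3),
  Biology = c(2, NA, 4, 3),
  Law     = c(5, 1, 2, 4)
)

# complete.cases() returns TRUE for rows that have no NA in any column
keep <- complete.cases(toy)
print(keep)

# Keep only the fully answered rows
toyComplete <- toy[keep, ]
print(nrow(toyComplete))
```

One caveat: `complete.cases()` only looks at `NA`. With the default settings of `read.csv`, blank entries in text columns are read as empty strings rather than `NA`, so they would not be dropped by this filter.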

I wanted to see how the following interests relate to each other: History, Psychology, Politics, Mathematics, Physics, Economy Management, Biology, Chemistry, Reading, Geography, Foreign languages, Medicine, Law.

So I selected only these columns (History to Law, excluding PC and Internet).

scienceInterest <- youngData %>% select(History:Law, -PC, -Internet)

Then I applied multidimensional scaling with 2 dimensions, using 1 minus the correlation between two interests as the distance between them.

scienceDistance <- 1 - cor(scienceInterest)
scienceMds <- cmdscale(scienceDistance,k=2)
colnames(scienceMds) <- c("x","y")
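
To show why `1 - cor(...)` works as a distance here, this is a small sketch on simulated data. The columns `A`, `B` and `C`, the sample size and the seed are all assumptions made for the example: `B` is built to follow `A` closely, while `C` is independent of both.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical ratings: B follows A closely, C is unrelated to both
n <- 200
A <- rnorm(n)
B <- A + rnorm(n, sd = 0.3)
C <- rnorm(n)
toy <- data.frame(A, B, C)

# Correlation 1 gives distance 0, correlation 0 gives 1, correlation -1 gives 2
toyDistance <- 1 - cor(toy)

# Classical MDS: place each column in 2 dimensions so that the
# distances between the points approximate toyDistance
toyMds <- cmdscale(toyDistance, k = 2)
colnames(toyMds) <- c("x", "y")

# Distances between the placed points
plotted <- as.matrix(dist(toyMds))
print(round(plotted, 2))
```

In this sketch the points for `A` and `B` should land close together and `C` should sit far from both, which is the same reading I apply to the interests plot.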

I plotted the result using the row names as both labels and colors. (The color does not mean anything here; it only makes the graph prettier.)

library(ggplot2)
ggplot(data.frame(scienceMds), aes(x = x, y = y)) + geom_text(aes(label = rownames(scienceMds), color = rownames(scienceMds)), angle = 45, size = 4)

Comments

From the plot, we see that people who like Biology mostly also like Medicine and Chemistry. A similarly close relation appears between Psychology and Foreign languages.
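
A reading like this can be checked against the distances themselves, since a 2-dimensional plot only approximates them. The sketch below uses a hypothetical helper, `closestPairs`, which is not part of any package, and a made-up distance matrix; the numbers are for illustration only and are not results from the survey.

```r
# Hypothetical helper: list each pair of columns once, closest first
closestPairs <- function(distanceMatrix) {
  # Upper triangle only, so each pair appears once and the diagonal is skipped
  idx <- which(upper.tri(distanceMatrix), arr.ind = TRUE)
  pairs <- data.frame(
    first    = rownames(distanceMatrix)[idx[, "row"]],
    second   = colnames(distanceMatrix)[idx[, "col"]],
    distance = distanceMatrix[idx]
  )
  pairs[order(pairs$distance), ]
}

# Made-up distances, for illustration only (not results from the survey)
toyDistance <- matrix(
  c(0.0, 0.3, 0.9,
    0.3, 0.0, 0.8,
    0.9, 0.8, 0.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), c("A", "B", "C"))
)

print(closestPairs(toyDistance))
```

Calling `closestPairs(scienceDistance)` in the same way would list the interest pairs from closest to farthest, so the pairs that look close on the plot can be compared with the ones that are closest by correlation.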