In [1]:
library(readr)
library(dplyr)
library(stringr)
library(wordcloud)
library(tidytext)
In [2]:
library(tidyRSS)
I'm using a Portuguese stop-word list. If you want to analyze English feeds, just change the next cell to:
data("stop_words")
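For reference, tidytext's bundled `stop_words` dataset is a two-column tibble (`word`, `lexicon`) combining several English lexicons, so it joins directly on `word`. A quick way to inspect it before using it:

```r
library(tidytext)

# Load the bundled English stop-word list; it has
# `word` and `lexicon` columns
data("stop_words")
head(stop_words)
```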
In [10]:
stopwords <- read_csv('portuguese-stopwords.txt', col_names = 'word')
Change the URLs to your favorite news feed providers...
In [11]:
feed <- tidyfeed("http://feed1")
feed2 <- tidyfeed("http://feed2")
#feed3 <- tidyfeed("https://<feed3>")
#feed4 <- tidyfeed("https://<feed4>")
In [12]:
feedall <- bind_rows(feed, feed2)
If you want to analyze English feeds, change the following cell to:
rss_t <- feedall %>%
  unnest_tokens(word, item_title) %>%
  anti_join(stop_words, by = "word")
In [13]:
rss_t <- feedall %>%
  unnest_tokens(word, item_title) %>%
  anti_join(stopwords, by = "word")
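To see what `unnest_tokens` does to the titles before the stop-word join, here is a minimal sketch on a made-up `item_title` (the toy headline is hypothetical, not from the feeds):

```r
library(dplyr)
library(tidytext)

# Hypothetical single-row feed: unnest_tokens lowercases the text
# and splits item_title into one word per row
toy <- tibble(item_title = "Breaking News About The Economy")
toy %>% unnest_tokens(word, item_title)
```

Each output row carries one lowercased token, which is why the subsequent `anti_join` on `word` can strip stop words row by row.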
In [14]:
words <- rss_t %>%
  group_by(word) %>%
  summarise(freq = n()) %>%
  arrange(desc(freq))
words <- as.data.frame(words)
rownames(words) <- words$word
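The `group_by`/`summarise`/`arrange` pipeline above can be written more compactly with `dplyr::count(sort = TRUE)`; a sketch on a hypothetical token table standing in for `rss_t`:

```r
library(dplyr)

# Hypothetical tokens in place of rss_t
rss_toy <- tibble(word = c("economia", "governo", "economia"))

# count(word, sort = TRUE) is equivalent to
# group_by(word) %>% summarise(freq = n()) %>% arrange(desc(freq))
words_toy <- rss_toy %>% count(word, sort = TRUE, name = "freq")
words_toy
```

Either form yields the same word/frequency table that `wordcloud` consumes below.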
In [16]:
options(warn = -1)
wordcloud(words$word, words$freq, scale = c(8, .02), min.freq = 3, max.words = Inf, colors = brewer.pal(8, "Dark2"))
options(warn = 0)
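Instead of toggling the global `warn` option around the plot, `suppressWarnings()` silences warnings for that one call only. A minimal sketch, using a hypothetical frequency table in place of `words` (`library(wordcloud)` also attaches RColorBrewer, which provides `brewer.pal`):

```r
library(wordcloud)

# Hypothetical frequency table standing in for `words`
toy <- data.frame(word = c("economia", "governo", "crise"),
                  freq = c(12, 8, 5))

# Scoped alternative to options(warn = -1) ... options(warn = 0):
# warnings are suppressed for this call only
suppressWarnings(
  wordcloud(toy$word, toy$freq, scale = c(4, .5),
            min.freq = 1, colors = brewer.pal(8, "Dark2"))
)
```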