R: tm.plugin.webmining with GoogleNewsSource

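I am trying to learn text mining in R.

While trying to mine Google News and Google Finance pages, I ran into problems with the tm.plugin.webmining package (see the code and error messages below).

Any help would be appreciated.

Using GoogleNewsSource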

googlenews <- WebCorpus(GoogleNewsSource("Microsoft"))
Unknown IO errorfailed to load external entity "http://news.google.com/news?hl=en&q=Microsoft&ie=utf-8&num=100&output=rss"
Error: 1: Unknown IO error2: failed to load external entity "http://news.google.com/news?hl=en&q=Microsoft&ie=utf-8&num=100&output=rss"

Using GoogleFinanceSource

library(tm.plugin.webmining)
library(purrr)
library(dplyr)  # provides data_frame(), mutate() and %>%

company <- c("Microsoft", "Apple", "Google", "Amazon", "Facebook",
             "Twitter", "IBM", "Yahoo", "Netflix")
symbol <- c("MSFT", "AAPL", "GOOG", "AMZN", "FB", "TWTR", "IBM", "YHOO", "NFLX")

download_articles <- function(symbol) {
  WebCorpus(GoogleFinanceSource(paste0("NASDAQ:", symbol)))
}

stock_articles <- data_frame(company = company,
                             symbol = symbol) %>%
  mutate(corpus = map(symbol, download_articles))
failed to load HTTP resource
Error in mutate_impl(.data, dots) : 1: failed to load HTTP resource

The problem is with NASDAQ:TWTR. Removing "Twitter" from company and "TWTR" from symbol resolves this error (the vectors below also leave out IBM and Yahoo):

company <- c("Microsoft", "Apple", "Google", "Amazon", "Facebook", "Netflix")
symbol <- c("MSFT", "AAPL", "GOOG", "AMZN", "FB", "NFLX")

download_articles <- function(symbol) {
  WebCorpus(GoogleFinanceSource(paste0("NASDAQ:", symbol)))
}
stock_articles <- data_frame(company = company,
                             symbol = symbol) %>%
  mutate(corpus = map(symbol, download_articles))
stock_articles
#     # A tibble: 6 x 3
#     company symbol          corpus
#       <chr>  <chr>          <list>
# 1 Microsoft   MSFT <S3: WebCorpus>
# 2     Apple   AAPL <S3: WebCorpus>
# 3    Google   GOOG <S3: WebCorpus>
# 4    Amazon   AMZN <S3: WebCorpus>
# 5  Facebook     FB <S3: WebCorpus>
# 6   Netflix   NFLX <S3: WebCorpus>
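
Rather than removing entries by trial and error, one option is to wrap the download in tryCatch() so that a single bad ticker does not abort the whole pipeline and the failing ticker is reported. The following is a minimal base R sketch, not the package's own mechanism. It assumes a hypothetical stand-in, fake_fetch(), in place of the real WebCorpus(GoogleFinanceSource(...)) call so that it runs offline; safe_fetch() is likewise an illustrative name.

```r
# Wrap a downloader so that a failure yields NULL (plus a message)
# instead of stopping the whole run.
safe_fetch <- function(ticker, fetch) {
  tryCatch(
    fetch(ticker),
    error = function(e) {
      message("Failed for ", ticker, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Hypothetical stand-in for WebCorpus(GoogleFinanceSource(ticker));
# it fails for one ticker to mimic the error in the question.
fake_fetch <- function(ticker) {
  if (ticker == "NASDAQ:TWTR") stop("failed to load HTTP resource")
  paste("corpus for", ticker)
}

tickers <- paste0("NASDAQ:", c("MSFT", "TWTR", "NFLX"))
results <- lapply(tickers, safe_fetch, fetch = fake_fetch)
names(results) <- tickers

# Tickers whose download failed
failed <- names(results)[vapply(results, is.null, logical(1))]
print(failed)
```

In real use, the fetch argument would be the download_articles() function from the code above, and the NULL entries could be dropped before further processing.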

I suggest you search for the specific company you are interested in, e.g. "Twitter". The search results will then give you the correct symbol: in this case, NYSE:TWTR.

@raoul You might want to add this information to your answer.
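
Following that comment, one way to avoid hard-coding the "NASDAQ:" prefix is to store the exchange next to each symbol and build the full ticker from both columns. This is only a sketch: the exchange column and the stocks data frame are illustrative names, the NYSE value for TWTR comes from the comment above, and the NASDAQ values are carried over from the answer's code.

```r
# Keep the exchange alongside each symbol instead of assuming NASDAQ.
stocks <- data.frame(
  company  = c("Microsoft", "Apple", "Twitter"),
  symbol   = c("MSFT", "AAPL", "TWTR"),
  exchange = c("NASDAQ", "NASDAQ", "NYSE"),
  stringsAsFactors = FALSE
)

# Build the exchange-qualified ticker, e.g. "NYSE:TWTR"
stocks$ticker <- paste(stocks$exchange, stocks$symbol, sep = ":")
print(stocks$ticker)
```

The resulting ticker column could then be passed to GoogleFinanceSource() directly, without paste0("NASDAQ:", symbol).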