
R: Merging tweets by date


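I hope this is not too basic a question. I have a data frame of tweets (in R). My goal is to calculate sentiment by date.

I would be grateful if someone could advise me how to concatenate the tweets (tweet$text) by date, so that each observation becomes a single string of merged tweets/text.

For example, if I have: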


Created_Date       Tweet

2014-01-04         "the iphone is magnificent"

2014-01-04         "the iphone's screen is poor"

2014-01-04         "I will always use Apple products"

2014-01-03         "iphone is overpriced, but I love it"

2014-01-03         "Siri is very sluggish"

2014-01-03         "iphone's maps app is poor compared to Android"

I would like a loop/function that merges the tweets by Created_Date, with a result like this:

Created_Date       Tweet

2014-01-04         "the iphone is magnificent", "the iphone's screen is poor", "I will always use Apple products"

2014-01-03         "iphone is overpriced, but I love it", "Siri is very sluggish", "iphone's maps app is poor compared to Android"
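
One way to get a result of this shape in base R is to split the tweets by date and collapse each group into one string. The following is only a minimal sketch under stated assumptions: the data frame name `tweets_df` and the helper `merge_by_date` are invented for illustration, and the dates are stored as plain character strings rather than date-times.

```r
# Hypothetical sample data mirroring the table above (names are illustrative).
tweets_df <- data.frame(
  Created_Date = c("2014-01-04", "2014-01-04", "2014-01-04",
                   "2014-01-03", "2014-01-03", "2014-01-03"),
  Tweet = c("the iphone is magnificent",
            "the iphone's screen is poor",
            "I will always use Apple products",
            "iphone is overpriced, but I love it",
            "Siri is very sluggish",
            "iphone's maps app is poor compared to Android"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per date, each tweet quoted and joined by ", ".
merge_by_date <- function(df) {
  groups <- split(df$Tweet, df$Created_Date)   # named list: date -> tweets
  merged <- vapply(groups,
                   function(x) paste0('"', x, '"', collapse = ", "),
                   character(1))
  data.frame(Created_Date = names(merged),
             Tweet = unname(merged),
             stringsAsFactors = FALSE)
}

merged_tweets <- merge_by_date(tweets_df)
```

Note that `split()` orders the groups by sorted date, so the earlier date comes first here, unlike the desired output above.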

dat <- structure(list(Created_Date = structure(c(1388793600, 1388793600,
    1388793600, 1388707200, 1388707200, 1388707200), class = c("POSIXct",
    "POSIXt"), tzone = "UTC"), Tweet = c("the iphone is magnificent",
    "the iphone's screen is poor", "I will always use Apple products",
    "iphone is overpriced, but I love it", "Siri is very sluggish",
    "iphone's maps app is poor compared to Android")), .Names = c("Created_Date",
    "Tweet"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
    -6L))  # closing row.names value inferred from the six rows of data


# construction of a sample data.frame
text = c("Some random text.",
         "Yet another line.",
         "Will this ever stop.",
         "This may be the last one.",
         "It was not the last.")
# dates reconstructed from the output listed at the end of this example
date = c("9-11-2017",
         "11-11-2017",
         "10-11-2017",
         "11-11-2017",
         "10-11-2017")
tweet = data.frame(text, date, stringsAsFactors = FALSE)

# vector with the distinct dates in the data.frame
# (levels() returns NULL for character columns, so use unique())
dates = sort(unique(tweet$date))

# initialise results with empty strings
resultString = rep.int("", length(dates))

for (i in seq_along(dates)) {            # loop over different dates
    for (j in seq_along(tweet$text)) {   # loop over tweets
        if (tweet$date[j] == dates[i]) { # concatenate to resultString if dates match
            resultString[i] = paste0(resultString[i], tweet$text[j])
        }
    }
}

# combine concatenated strings with dates in new data.frame
result = data.frame(date = dates, tweetsByDate = resultString)

# output:
# date                               tweetsByDate
# 1 10-11-2017   Will this ever stop.It was not the last.
# 2 11-11-2017 Yet another line.This may be the last one.
# 3  9-11-2017                          Some random text.
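
The nested loop compares every tweet with every date. As a hedged alternative, base R's `tapply` can do the grouping in a single call. The sketch below rebuilds the same sample data (with the date vector inferred from the output shown above) and joins tweets with a space so that sentences do not run together; the names `tweet_df`, `merged` and `result2` are illustrative only.

```r
# Same sample data as in the loop example; dates inferred from its output.
tweet_df <- data.frame(
  text = c("Some random text.",
           "Yet another line.",
           "Will this ever stop.",
           "This may be the last one.",
           "It was not the last."),
  date = c("9-11-2017", "11-11-2017", "10-11-2017", "11-11-2017", "10-11-2017"),
  stringsAsFactors = FALSE
)

# Group the text by date and paste each group into one string.
merged <- tapply(tweet_df$text, tweet_df$date, paste, collapse = " ")

# Turn the named result into a data.frame with one row per date.
result2 <- data.frame(date = names(merged),
                      tweetsByDate = as.vector(merged),
                      stringsAsFactors = FALSE)
```

Because the dates are character strings in day-month-year form, the groups come out in alphabetical rather than chronological order; converting them with `as.Date(date, format = "%d-%m-%Y")` first would avoid that.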
This can also be done with data.table, or with aggregate from base R:

# dat is the sample data frame defined above; tweets on the same date are joined by ","
aggregate(dat$Tweet, by = list(dat$Created_Date), FUN = function(x) paste(x, collapse = ","))

If you are using the corpus package, you can use term_counts or term_matrix:

library(corpus)

# map terms in the AFINN dictionary to Positive/Negative; others to Neutral
stem_sent <- new_stemmer(sentiment_afinn$term,
                         ifelse(sentiment_afinn$score > 0, "Positive", "Negative"),
                         default = "Neutral")
term_counts(dat$Tweet, group = dat$Created_Date, stemmer = stem_sent)
##   group      term     count
## 1 2014-01-03 Negative     2 
## 2 2014-01-04 Negative     1
## 3 2014-01-03 Neutral     17
## 4 2014-01-04 Neutral     14
## 5 2014-01-03 Positive     1
term_matrix(dat$Tweet, group = dat$Created_Date, stemmer = stem_sent)
## 2 x 3 sparse Matrix of class "dgCMatrix"
##            Negative Neutral Positive
## 2014-01-03        2      17        1
## 2014-01-04        1      14        .
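
For the data.table route mentioned in this thread, a minimal sketch might look like the following. It assumes the data.table package is installed, and the object names `dt` and `merged_dt` are illustrative; the dates are stored as `Date` values here rather than the `POSIXct` used in `dat`.

```r
library(data.table)

# Hypothetical sample data mirroring the tweets in the question.
dt <- data.table(
  Created_Date = as.Date(c("2014-01-04", "2014-01-04", "2014-01-04",
                           "2014-01-03", "2014-01-03", "2014-01-03")),
  Tweet = c("the iphone is magnificent",
            "the iphone's screen is poor",
            "I will always use Apple products",
            "iphone is overpriced, but I love it",
            "Siri is very sluggish",
            "iphone's maps app is poor compared to Android")
)

# One row per date: paste all tweets of a date into a single string.
merged_dt <- dt[, .(Tweet = paste(Tweet, collapse = ", ")), by = Created_Date]
```

With `by = Created_Date`, groups appear in order of first occurrence, so 2014-01-04 comes before 2014-01-03, matching the desired output in the question.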