How do I convert searchTwitter results (from library(twitteR)) into a data.frame?


r twitter


I am saving Twitter search results to a database (SQL Server), and I get an error when I pull the search results from twitteR.

If I execute:

library(twitteR)
puppy <- as.data.frame(searchTwitter("puppy", session=getCurlHandle(),num=100))

This matters because, in order to add the results to a table with RODBC's sqlSave, they need to be a data.frame. At least, that is the error message I got:

Error in sqlSave(localSQLServer, puppy, tablename = "puppy_staging",  : 
  should be a data frame

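So, does anyone have suggestions on how to coerce the list to a data.frame, or how to load the list through RODBC?

My final goal is a table that mirrors the structure of the values returned by searchTwitter. Here is an example of what I am trying to retrieve and load: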

library(twitteR)
puppy <- searchTwitter("puppy", session=getCurlHandle(),num=2)
str(puppy)

List of 2
 $ :Formal class 'status' [package "twitteR"] with 10 slots
  .. ..@ text        : chr "beautifull and  kc reg Beagle Mix for rehomes: This little puppy is looking for a new loving family wh... http://bit.ly/9stN7V "| __truncated__
  .. ..@ favorited   : logi FALSE
  .. ..@ replyToSN   : chr(0) 
  .. ..@ created     : chr "Wed, 16 Jun 2010 19:04:03 +0000"
  .. ..@ truncated   : logi FALSE
  .. ..@ replyToSID  : num(0) 
  .. ..@ id          : num 1.63e+10
  .. ..@ replyToUID  : num(0) 
  .. ..@ statusSource: chr "&lt;a href=&quot;http://twitterfeed.com&quot; rel=&quot;nofollow&quot;&gt;twitterfeed&lt;/a&gt;"
  .. ..@ screenName  : chr "puppy_ads"
 $ :Formal class 'status' [package "twitteR"] with 10 slots
  .. ..@ text        : chr "the cutest puppy followed me on my walk, my grandma won't let me keep it. taking it to the pound sadface"
  .. ..@ favorited   : logi FALSE
  .. ..@ replyToSN   : chr(0) 
  .. ..@ created     : chr "Wed, 16 Jun 2010 19:04:01 +0000"
  .. ..@ truncated   : logi FALSE
  .. ..@ replyToSID  : num(0) 
  .. ..@ id          : num 1.63e+10
  .. ..@ replyToUID  : num(0) 
  .. ..@ statusSource: chr "&lt;a href=&quot;http://blackberry.com/twitter&quot; rel=&quot;nofollow&quot;&gt;Twitter for BlackBerry®&lt;/a&gt;"
  .. ..@ screenName  : chr "iamsweaters"

Try this:

ldply(searchTwitter("#rstats", n=100), text)

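twitteR returns an S4 class, so you need to either use one of its helper functions or work with its slots directly. You can see the slots by using unclass(), for instance: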

unclass(searchTwitter("#rstats", n=100)[[1]])
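
To make the slot idea concrete without calling the Twitter API, here is a minimal sketch that uses a stand-in S4 class. The class name demoStatus and its three slots are assumptions for illustration; they only mimic part of the status structure shown in the question.

```r
library(methods)

# Hypothetical stand-in for twitteR's 'status' class (illustration only)
setClass("demoStatus",
         slots = c(text = "character",
                   favorited = "logical",
                   screenName = "character"))

st <- new("demoStatus",
          text = "the cutest puppy followed me on my walk",
          favorited = FALSE,
          screenName = "iamsweaters")

# List the slot names, then read individual slots by name or with `@`
slot_names <- slotNames(st)
first_text <- slot(st, "text")
is_fav <- st@favorited

print(slot_names)
```

Reading slots with slot() or `@` works for any S4 object, but accessor functions, where a package provides them, are usually the safer choice because slot layouts can change between versions.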

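The slots listed above can be accessed directly with the related accessor functions (from the twitteR help: ?statusSource):

As I mentioned, my understanding is that you will have to specify each of these fields yourself in the output. Here is an example using two fields: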

> head(ldply(searchTwitter("#rstats", n=100), 
        function(x) data.frame(text=text(x), favorited=favorited(x))))
                                                                                                                                          text
1                                                     @statalgo how does that actually work? does it share mem between #rstats and postgresql?
2                                   @jaredlander Have you looked at PL/R? You can call #rstats from PostgreSQL: http://www.joeconway.com/plr/.
3   @CMastication I was hoping for a cool way to keep data in a DB and run the normal #rstats off that. Maybe a translator from R to SQL code.
4                     The distribution of online data usage: AT&amp;T has recently announced it will no longer http://goo.gl/fb/eTywd #rstat
5 @jaredlander not that I know of. Closest is sqldf package which allows #rstats and sqlite to share mem so transferring from DB to df is fast
6 @CMastication Can #rstats run on data in a DB?Not loading it in2 a dataframe or running SQL cmds but treating the DB as if it wr a dataframe
  favorited
1     FALSE
2     FALSE
3     FALSE
4     FALSE
5     FALSE
6     FALSE

If you intend to do this frequently, you can turn it into a function.
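
One way such a function might look is sketched below. It is not part of twitteR: the class mockStatus and the helper s4_list_to_df are hypothetical names used only so the example runs without network access. The point it illustrates is that zero-length slots, such as the chr(0) values in the str() output of the question, need to become NA before rows can be stacked.

```r
library(methods)

# Hypothetical stand-in for twitteR's 'status' class (illustration only)
setClass("mockStatus",
         slots = c(text = "character",
                   favorited = "logical",
                   replyToSN = "character"))

# Convert a list of S4 objects into a data.frame, one row per object.
# Indexing with [1] turns a zero-length slot into an NA of the same type,
# so every row ends up with exactly one value per column.
s4_list_to_df <- function(objs, fields = slotNames(objs[[1]])) {
  rows <- lapply(objs, function(obj) {
    vals <- lapply(fields, function(f) slot(obj, f)[1])
    names(vals) <- fields
    as.data.frame(vals, stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

tweets <- list(
  new("mockStatus", text = "first puppy tweet", favorited = FALSE,
      replyToSN = character(0)),
  new("mockStatus", text = "second puppy tweet", favorited = TRUE,
      replyToSN = "puppy_ads")
)

tweet_df <- s4_list_to_df(tweets)
print(tweet_df)
```

Passing a shorter fields vector, for example c("text", "favorited"), would keep only those columns.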

I use this code, which I found a while back:

#get data
tws<-searchTwitter('#keyword',n=10)

#make data frame
df <- do.call("rbind", lapply(tws, as.data.frame))

#write to csv file (or your RODBC code)
write.csv(df,file="twitterList.csv")


For those who run into the same problem I did, which was getting an error saying:
Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double' 

I simply changed the word text in
ldply(searchTwitter("#rstats", n=100), text)

to statusText, like so:
ldply(searchTwitter("#rstats", n=100), statusText)

Just a friendly hint :P

I know this is an old question, but here is what I think is a "modern" way to solve it. Just use the function twListToDF:

gvegayon <- getUser("gvegayon")
timeline <- userTimeline(gvegayon,n=400)
tl <- twListToDF(timeline)


Here is a nice function to convert it into a DF:
TweetFrame<-function(searchTerm, maxTweets)
{
  tweetList<-searchTwitter(searchTerm,n=maxTweets)
  return(do.call("rbind",lapply(tweetList,as.data.frame)))
}

Use it like this:

tweets <- TweetFrame(" ", n)

The twitteR package now includes a function, twListToDF, that will do this for you:
puppy_table <- twListToDF(puppy)
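
Once the tweets are in a data.frame, the load step from the question could look roughly like the sketch below. The helper name save_tweets and the DSN in the commented call are placeholders, not values from the question; only the table name puppy_staging comes from the original error message. It assumes the RODBC package is installed and an ODBC data source is configured.

```r
# Sketch: append a data.frame of tweets to a database table with RODBC.
# save_tweets is a hypothetical helper; dsn must name a configured source.
save_tweets <- function(tweet_df, dsn, table = "puppy_staging") {
  stopifnot(is.data.frame(tweet_df))    # sqlSave requires a data frame
  channel <- RODBC::odbcConnect(dsn)
  if (!inherits(channel, "RODBC")) {
    stop("could not open ODBC connection to ", dsn)
  }
  on.exit(RODBC::odbcClose(channel))    # release the connection on exit
  RODBC::sqlSave(channel, tweet_df, tablename = table,
                 append = TRUE, rownames = FALSE)
}

# Example call (not run here; it needs a real ODBC data source):
# save_tweets(twListToDF(puppy), dsn = "my_sql_server_dsn")
```

The up-front stopifnot check reproduces the "should be a data frame" condition early, before any connection is opened.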

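Shane, what library do I need to load for this? Is it plyr?

I see it is plyr. It does convert the list to a data.frame. Now the 10 columns returned from searchTwitter are in one column in the data.frame. How can I separate them?

Can you update your question? I'm not sure what you want the final output to look like...

I updated my question, thanks for your suggestions. I am going through their documentation trying to get it into the right structure.

@analyticsPierce: please see my answer.

It seems to me this solution only works when searching and processing the tweets of a particular user?

Yes, this is not really a solution... at least not to this question or to the general Twitter use case.

The question is basically how to get a data.frame from twitteR's status objects. If you have a list, which is the case in the original question, then you just apply the function to every object in the list.

This works fine for me. It also works for multiple users: twListToDF(lapply(c("@handle1", "@handle2"), getUser))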