Grouping states and cities into countries in R


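In R, I ran some code to get a two-column data frame containing cities, countries, and their corresponding counts.

I ran summary() on the column and converted the result to a data frame.

I am trying to group all the states together under their country. For example, in the output below, I want to combine all the US states and cities into a single country, "United States". Can I use grep() to find patterns and then use some package to do the grouping? Please suggest an approach.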

location<-summary(pind$userLocation)
location<-as.data.frame(location)
location
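
For readers following along, here is a minimal sketch of what these two lines produce. The values are invented stand-ins for pind$userLocation, which is not shown in the question, and the sketch assumes that column is a factor (on a plain character vector, summary() does not return per-value counts).

```r
# Invented sample values standing in for pind$userLocation (assumed to be a factor).
userLocation <- factor(c("Texas", "USA", "Texas", "Paris", "Texas", "USA"))

location <- summary(userLocation)    # named integer vector: one count per level
location <- as.data.frame(location)  # single column `location`; the names become row names

rownames(location)  # the location strings
location$location   # the counts
```

The point to notice is that the location strings end up as row names, while the column named location holds the counts.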

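I'm not sure I understand the question, but I'll give it a try.

So you want to identify, for each location string, the country it belongs to, then group the rows by country and perform operations on each country group.

If that is the case, what comes to mind is the geocode function in ggmap, which uses the Google Maps API. This only makes sense if you are not making that many queries.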

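library(dplyr)
library(ggmap)

# Look up the country for one location string via the Google geocoding API.
# Note: whether output = "more" returns a `country` column depends on the
# ggmap version, and recent versions need an API key (see register_google()).
MyGeoCode <- function(Location) {
  as.character(geocode(Location, output = "more")$country)
}

# The location strings are the row names; the `location` column holds the counts.
location$country <- sapply(rownames(location), MyGeoCode)

# Sum the counts per country.
location <- location %>%
  group_by(country) %>%
  summarise(TotalPerCountry = sum(location, na.rm = TRUE))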

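Since your data is not that extensive, it is quite easy to do this by hand. I went through each record, determined which country it belongs to, and added a new column containing the result. Once you have the countries, you can use aggregate() to get the totals: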

location <- data.frame(location=c(271286,58145,1027,900,866,755,590,535,438,392,379,375,373,354,335,323,299,275,275,271,259,254,249,247,244,231,224,221,220,218,204,200,200,198,190,189,187,184,182,182,181,173,172,167,167,163,157,149,148,145,144,142,140,138,134,133,128,127,126,125,124,124,123,121,121,121,116,116,116,115,110,109,105,104,103,103,101,101,100,98,98,95,94,92,89,89,89,88,86,83,83,83,82,81,79,78,76,75,75),row.names=c('','null','Texas','United States','USA','Paris','California','Canada','Florida','New York','Australia','London','Ohio','Michigan','Chicago, IL','Los Angeles, CA','Chicago','Colorado','New York, NY','North Carolina','Minnesota','Seattle, WA','Los Angeles','Indiana','Virginia','Wisconsin','Arizona','Atlanta, GA','Dallas, TX','Oregon','Georgia','Houston, TX','Oklahoma','Utah','Austin, TX','Pennsylvania','Illinois','San Diego, CA','Tennessee','UK','Missouri','Kentucky','San Francisco, CA','Louisiana','NYC','Alabama','Nashville, TN','Iowa','Boston, MA','Kansas','Southern California','Denver, CO','New Jersey','Sydney, Australia','South Carolina','Washington, DC','Maryland','Arkansas','Portland, OR','Phoenix, AZ','Atlanta','London, UK','Melbourne, Australia','Ontario, Canada','Seattle','Washington','Las Vegas, NV','New Zealand','United Kingdom','Brooklyn, NY','CA','Minneapolis, MN','Houston, Texas','NC','New York City','Toronto','Austin, Texas','Charlotte, NC','South Africa','Pittsburgh, PA','San Francisco','Vancouver, BC','Germany','Phoenix, Arizona','Barcelona','Dallas, Texas','Portland, Oregon','England','Idaho','.','San Diego','West Virginia','Nevada','The Netherlands','France','Raleigh, NC','Kansas City, MO','Massachusetts','US'));
location$country <- factor(c(NA,NA,'United States','United States','United States','France','United States','Canada','United States','United States','Australia','United Kingdom','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United Kingdom','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','United States','Australia','United States','United States','United States','United States','United States','United States','United States','United Kingdom','Australia','Canada','United States','United States','United States','New Zealand','United Kingdom','United States','Canada','United States','United States','United States','United States','Canada','United States','United States','South Africa','United States','United States','Canada','Germany','United States','Spain','United States','United States','United Kingdom','United States',NA,'United States','United States','United States','Netherlands','France','United States','United States','United States','United States'));
aggregate(location~country,location,sum);
##           country location
## 1       Australia      640
## 2          Canada      964
## 3          France      834
## 4         Germany       94
## 5     Netherlands       81
## 6     New Zealand      116
## 7    South Africa      100
## 8           Spain       89
## 9  United Kingdom      885
## 10  United States    15964
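
The manual classification step could also be partly automated with a lookup. The following is only a sketch on invented toy data: guess_country is a hypothetical helper, not part of any package, and it relies on R's built-in state.name and state.abb vectors, which cover the 50 states only (so "Washington, DC" would not match). Ambiguous strings such as "Georgia" or "CA" are treated as US states here, and anything unrecognized becomes NA and would still need to be labeled by hand.

```r
# Toy data in the same shape as `location`: counts, with location strings as row names.
# The values are invented for illustration only.
loc <- data.frame(
  location = c(10, 7, 5, 4, 3, 2),
  row.names = c("Texas", "Chicago, IL", "Toronto", "USA",
                "Ontario, Canada", "London, UK")
)

# Hypothetical helper: map one location string to a country, or NA if unknown.
guess_country <- function(x) {
  # Text after the last comma, e.g. "IL" in "Chicago, IL";
  # the whole string if there is no comma.
  last <- trimws(sub(".*,", "", x))
  if (last %in% c(state.name, state.abb) ||
      x %in% c("USA", "US", "United States")) {
    return("United States")
  }
  if (grepl("Canada", x, fixed = TRUE)) return("Canada")
  if (grepl("UK|United Kingdom|England", x)) return("United Kingdom")
  NA_character_
}

loc$country <- vapply(rownames(loc), guess_country, character(1))

# Rows with an NA country (here "Toronto") are dropped by aggregate().
totals <- aggregate(location ~ country, loc, sum)
totals
```

Inspecting the rows where loc$country is NA shows which strings still need a manual label before the totals can be trusted.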

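What I am trying to say is: identify all the US states, e.g. Michigan, Texas, New York, and so on, and group them under one country, the United States. So all the US states in the results above, along with their counts, should be grouped under "United States", with the counts of the individual states aggregated.

Can you provide a sample data frame showing how you want the final product to look?

My output data frame should look like the one below (the United States count should combine all the US states and sum their counts). The same applies to Canada: it should include the counts for Toronto, British Columbia, etc., grouped under Canada.

United States 1900
Canada 1535
UK 182
Australia 123
Germany 94
Netherlands 81
France 79

I want every country in the data frame. It is just that the individual states of a country should be grouped under the country name rather than listed explicitly.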