R: Convert a long string of comma-separated integers into x and y columns


r stata


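My data is one long single line of values separated by commas, where every other value is an x or a y coordinate.

The data looks like this: 2622731,1387660,2621628,1444522,2619235,1681640

But I would like it to look like this:

2622731,1387660

2621628,1444522

2619235,1681640

Other than going through the whole file, deleting commas and pressing Enter as I did for the example above, how can I do this automatically in R (or Stata)?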

In R:

## Read in your data
## data = readLines("path/to/your_file.txt")
## Should get you something like this (using the example in your Q)
data = "2622731,1387660,2621628,1444522,2619235,1681640"
data = unlist(strsplit(data, ","))
data = matrix(as.numeric(data), ncol = 2, byrow = TRUE)
data
#         [,1]    [,2]
# [1,] 2622731 1387660
# [2,] 2621628 1444522
# [3,] 2619235 1681640

And perhaps, at that point:

data = as.data.frame(data)
names(data) = c("x", "y")
data
#         x       y
# 1 2622731 1387660
# 2 2621628 1444522
# 3 2619235 1681640
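
The steps above take for granted that the line holds an even number of numeric values. If one value is missing, matrix() only warns and fills the gap by recycling, so a guard can be worth having. A minimal sketch, assuming a helper named string_to_xy (a hypothetical name, not part of base R):

```r
# Hypothetical helper: parse "x1,y1,x2,y2,..." into a data frame with columns x and y.
string_to_xy <- function(s) {
  values <- as.numeric(unlist(strsplit(s, ",", fixed = TRUE)))
  # Stop early on malformed input instead of letting matrix() recycle values.
  if (anyNA(values)) stop("input contains non-numeric values")
  if (length(values) %% 2 != 0) stop("odd number of values: cannot form x/y pairs")
  # Fill row by row so each consecutive pair of values becomes one row.
  xy <- matrix(values, ncol = 2, byrow = TRUE)
  data.frame(x = xy[, 1], y = xy[, 2])
}

coords <- string_to_xy("2622731,1387660,2621628,1444522,2619235,1681640")
coords
```

Calling string_to_xy("1,2,3") stops with an error rather than returning a misaligned table.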


In Stata, an analogue of the accepted R solution would probably involve split and reshape long. Here is another approach:

* data example 
clear
set obs 1
gen strL data = "2622731,1387660,2621628,1444522,2619235,1681640"

* code for data example 
replace data = subinstr(data, ",", " ", .)
set obs `=wordcount(data)/2' 
gen x = real(word(data[1], 2 * _n - 1))
gen y = real(word(data[1], 2 * _n))

list 

     +---------------------------------------------------------------------+
     |                                            data         x         y |
     |---------------------------------------------------------------------|
  1. | 2622731 1387660 2621628 1444522 2619235 1681640   2622731   1387660 |
  2. |                                                   2621628   1444522 |
  3. |                                                   2619235   1681640 |
     +---------------------------------------------------------------------+


Use scan and reshape the result with matrix:

s <- "2622731,1387660,2621628,1444522,2619235,1681640" # test data

matrix(scan(text = s, sep = ",", quiet = TRUE), ncol = 2, byrow = TRUE)
##         [,1]    [,2]
## [1,] 2622731 1387660
## [2,] 2621628 1444522
## [3,] 2619235 1681640
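
Because scan() can also read from a file, the intermediate string is not strictly needed. A sketch under that assumption, where a temporary file stands in for the real data file and the dimnames argument is added here only to label the columns:

```r
# A temporary file stands in for the real data file (assumption for this sketch).
path <- tempfile(fileext = ".txt")
writeLines("2622731,1387660,2621628,1444522,2619235,1681640", path)

# scan() reads the comma-separated numbers straight from the file.
values <- scan(path, sep = ",", quiet = TRUE)

# Fill row by row so consecutive values become one (x, y) pair.
xy <- matrix(values, ncol = 2, byrow = TRUE, dimnames = list(NULL, c("x", "y")))
xy
```

With a real file, only the path needs to change; the tempfile() and writeLines() lines exist just to make the sketch self-contained.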


Thank you so much!! I had to change it to strsplit(as.character(x), ","), but I got it in the end!

Another approach: read.csv(text = gsub("(,\\d+),", "\\1\n", data), header = FALSE)
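
The regex-based read.csv() route from the last comment can be sketched as follows. The pattern used here is an assumption: it captures a comma plus the digits after it and turns the comma that follows into a newline. The col.names argument is added only for illustration.

```r
data <- "2622731,1387660,2621628,1444522,2619235,1681640"

# Replace every second comma with a newline, giving one "x,y" pair per line.
csv_text <- gsub("(,\\d+),", "\\1\n", data)

# Parse the result as CSV; col.names labels the two columns.
xy <- read.csv(text = csv_text, header = FALSE, col.names = c("x", "y"))
xy
```

As written, the pattern matches only runs of digits, so negative or decimal values would need a broader expression.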