
Problem splitting strings in R


I am working with strings like the following:

Col1
--------------------------
554 - partial-completion_3
4011 - structure painted
5459 - 1 int mam-corrosion issue
996 - cast iron countershock

My goal is to split these strings into two parts, like this:

Col1_Id   Col2_Desc
--------------------------
554       partial-completion_3
4011      structure painted
5459      1 int mam-corrosion issue
996       cast iron countershock

I tried using the `separate` function:

library(tidyr)  # provides separate() and re-exports %>%

df_sep <- df %>%
  separate(Col1, c("Col1_ID", "Col2_Desc"), "-")
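
This works only when the string contains a single `-`. When there are two, as in the string

       `5459 - 1 int mam-corrosion issue`

the `separate` function drops the part of the description after the second `-`, and the output looks like this: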

Col1_Id   Col2_Desc
--------------------------
554       partial-completion_3
4011      structure painted
5459      1 int mam
996       cast iron countershock

This is not what I expected. I was expecting output like the following:

    Col1_Id   Col2_Desc
    --------------------------
    554       partial-completion_3
    4011      structure painted
    5459      1 int mam-corrosion issue
    996       cast iron countershock

Any hints or suggestions are much appreciated.

We can use `sub` to replace the first `-` with `,` and then read the result with `read.csv`:

read.csv(text= sub("-", ",", df1$Col1), header=FALSE, 
          col.names=c("Col1_Id",   "Col2_Desc"), stringsAsFactors=FALSE)
#   Col1_Id                  Col2_Desc
#1     554       partial-completion_3
#2    4011          structure painted
#3    5459  1 int mam-corrosion issue
#4     996     cast iron countershock
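
This approach depends on `sub` replacing only the first match, whereas `gsub` would replace every hyphen. A small sketch of the difference on one string from the question follows; the variable names are illustrative only. Matching the literal ` - ` with `fixed = TRUE` is one possible way to also avoid the stray spaces around the comma.

```r
x <- "5459 - 1 int mam-corrosion issue"

# sub() replaces only the first match
first_only <- sub("-", ",", x)

# gsub() replaces every match, which would also break up the description
all_matches <- gsub("-", ",", x)

# matching the literal " - " removes the surrounding spaces as well
no_spaces <- sub(" - ", ",", x, fixed = TRUE)

first_only   # "5459 , 1 int mam-corrosion issue"
all_matches  # "5459 , 1 int mam,corrosion issue"
no_spaces    # "5459,1 int mam-corrosion issue"
```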

With `separate`, there is an `extra` argument that can be used to solve this:

library(tidyr)
separate(df1, Col1, into = c("Col1_Id", "Col2_Desc"), extra="merge")
#  Col1_Id                 Col2_Desc
#1     554      partial-completion_3
#2    4011         structure painted
#3    5459 1 int mam-corrosion issue
#4     996    cast iron countershock
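
As a variation, the separator can be given explicitly, so that only the literal ` - ` is treated as a delimiter and `extra = "merge"` keeps everything after the first split in the second column. This is a sketch only: the `df1` defined here is an assumed reconstruction of the question's data, not the original object.

```r
library(tidyr)

# Assumed reconstruction of the question's data
df1 <- data.frame(
  Col1 = c("554 - partial-completion_3",
           "4011 - structure painted",
           "5459 - 1 int mam-corrosion issue",
           "996 - cast iron countershock"),
  stringsAsFactors = FALSE
)

# Split at the first " - " only; anything after it stays in Col2_Desc
out <- separate(df1, Col1, into = c("Col1_Id", "Col2_Desc"),
                sep = " - ", extra = "merge")

out$Col2_Desc[3]  # "1 int mam-corrosion issue"
```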
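Data: `df1`, a data frame holding the strings above in the column `Col1`.

A base R option is to split the column into a list with `strsplit` and then build a data.frame with `rbind.data.frame`. `setNames` is used to set the column names conveniently in the same line.
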
setNames(do.call(rbind.data.frame, strsplit(df1$Col1, split=" - ")),
         c("Col1_Id", "Col2_Desc"))

  Col1_Id                 Col2_Desc
1     554      partial-completion_3
2    4011         structure painted
3    5459 1 int mam-corrosion issue
4     996    cast iron countershock
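
Splitting on every ` - ` gives more than two pieces for any row whose description itself contains ` - `. If that can happen, one possible base R alternative is to locate only the first ` - ` with `regexpr` and cut the string there. The sketch below assumes the input is a plain character vector `x`; the last element is a made-up string added to show that case.

```r
x <- c("554 - partial-completion_3",
       "5459 - 1 int mam-corrosion issue",
       "1 - a - b")  # hypothetical row with a second " - "

# Position of the first " - " in each string
pos <- as.integer(regexpr(" - ", x, fixed = TRUE))

res <- data.frame(
  Col1_Id   = substr(x, 1, pos - 1),   # text before the first " - "
  Col2_Desc = substring(x, pos + 3),   # text after it (" - " is 3 characters)
  stringsAsFactors = FALSE
)
res
```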

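It looks like akrun has already solved this, but in the future it would be better to share your data in an easily reproducible form, such as the output of `dput()`, or to create the data in the code you provide.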
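
A minimal sketch of what that could look like, assuming the question's strings live in a data frame called `df1` with a single character column `Col1` (the names used in the answers above): build the data in code, then print it with `dput()` so that others can paste it straight into their session.

```r
# Create the data directly in code (assumed reconstruction of the question's input)
df1 <- data.frame(
  Col1 = c("554 - partial-completion_3",
           "4011 - structure painted",
           "5459 - 1 int mam-corrosion issue",
           "996 - cast iron countershock"),
  stringsAsFactors = FALSE
)

# Print a copy-pasteable representation of the object
dput(df1)
```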