Keeping the delimiters when using strsplit with a regex


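I am scraping some data, and the task requires me to use strsplit together with a regex. I already know how to split the string, but I am having trouble applying the guidance on keeping the delimiters.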

Here is an example of the string I am scraping (assigned to text):

You can put the consuming patterns of your regex into lookbehinds:

> text<-c("This activity center is fun and helps give your birds exercise! With climbing ladders, a swing, tightrope and an assortment of engaging toys, the Activity Center has everything your bird needs to relieve stress and boredom all in one place. Relieves stress & boredom Durable & brightly colored wood Easy to clean bottom Simple installation & assemblyMaterial: WoodDimensions (Overall): 12.0 inches (H) x 15.0 inches (W) x 18.5 inches (L)Weight: 6.0 poundsHolds up to: 20.0 poundsIntended Pet Type: BirdCare and Cleaning: Hand washPet activity: ClimbTCIN: 16707835UPC: 030172025594Item Number (DPCI): 083-01-0246Report incorrect product information")
> strsplit(text, "(?<=[0-9])(?=[A-Z])|(?<=[a-z])(?=[A-Z])|(?<=\\))(?=[A-Z])", perl=TRUE)
[[1]]
 [1] "This activity center is fun and helps give your birds exercise! With climbing ladders, a swing, tightrope and an assortment of engaging toys, the Activity Center has everything your bird needs to relieve stress and boredom all in one place. Relieves stress & boredom Durable & brightly colored wood Easy to clean bottom Simple installation & assembly"
 [2] "Material: Wood"                                                                                                                                                                                                                                                                                                                                                
 [3] "Dimensions (Overall): 12.0 inches (H) x 15.0 inches (W) x 18.5 inches (L)"                                                                                                                                                                                                                                                                                     
 [4] "Weight: 6.0 pounds"                                                                                                                                                                                                                                                                                                                                            
 [5] "Holds up to: 20.0 pounds"                                                                                                                                                                                                                                                                                                                                      
 [6] "Intended Pet Type: Bird"                                                                                                                                                                                                                                                                                                                                       
 [7] "Care and Cleaning: Hand wash"                                                                                                                                                                                                                                                                                                                                  
 [8] "Pet activity: Climb"                                                                                                                                                                                                                                                                                                                                           
 [9] "TCIN: 16707835"                                                                                                                                                                                                                                                                                                                                                
[10] "UPC: 030172025594"                                                                                                                                                                                                                                                                                                                                             
[11] "Item Number (DPCI): 083-01-0246"                                                                                                                                                                                                                                                                                                                               
[12] "Report incorrect product information"     
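
To make the difference between a consuming pattern and lookarounds concrete, here is a minimal sketch on a shortened piece of the same string. The variable names s, consumed and kept are only for illustration, and the character classes are a simplified version of the pattern above, so treat this as an assumption-laden demo rather than a replacement for the answer.

```r
# Shortened sample taken from the string in the answer.
s <- "Material: WoodWeight: 6.0 poundsTCIN: 16707835UPC: 030172025594"

# Consuming pattern: the two matched characters are removed from the pieces.
consumed <- strsplit(s, "[a-z0-9][A-Z]")[[1]]

# Zero-width pattern: the split happens between the two characters,
# so nothing is removed. Lookarounds need perl = TRUE.
kept <- strsplit(s, "(?<=[a-z0-9])(?=[A-Z])", perl = TRUE)[[1]]

print(consumed)
print(kept)
```

With the consuming pattern each boundary loses a character on both sides (the first piece becomes "Material: Woo"), while the lookaround version returns the fields intact. Note that the lookbehind matters here: strsplit treats a zero-length match at the start of the remaining string specially, so a lookahead-only pattern tends to split off single characters.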

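Why don't you put the consuming parts into lookbehinds?

Follow-up question (I am happy to post it as a separate question, or to amend my question above): does strsplit let me send the substrings to columns when the contents of the strings are inconsistent? For example, I might be processing a second string that does not contain the Material substring. When converting to a data frame, I would still like to have a Material column, with the value NA for strings that lack that substring.

@RoDeY That is not possible with a splitting approach. You should consider matching, or even capturing, because the number of captured "fields" is always constant (it is defined by the number of capture groups in the regex). There is a nice piece of code written by someone else; see whether you want to follow that approach.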
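
The capturing idea from the last comment could look roughly like the sketch below. This is not the code that comment refers to. The helper extract_field, the list of fields and the second input string are all hypothetical, and the pattern assumes that a value ends where a lowercase letter or digit is directly followed by an uppercase letter, or at the end of the string. Labels containing regex metacharacters, such as "Item Number (DPCI)", would need escaping first.

```r
# Hypothetical helper: return the value after "<label>: ", or NA if absent.
extract_field <- function(x, label) {
  # (.+?) captures lazily up to a lowercase/digit-to-uppercase boundary
  # or the end of the string (an assumption about the data layout).
  pattern <- paste0(label, ": (.+?)(?=(?<=[a-z0-9])[A-Z]|$)")
  m <- regmatches(x, regexec(pattern, x, perl = TRUE))
  # Each element is c(full match, capture), or character(0) when no match.
  vapply(m, function(g) if (length(g) >= 2) g[2] else NA_character_,
         character(1))
}

# First string is shortened from the answer; the second is made up
# and deliberately has no Material field.
x <- c("Material: WoodWeight: 6.0 poundsTCIN: 16707835UPC: 030172025594",
       "Weight: 2.5 poundsTCIN: 12345678")

fields <- c("Material", "Weight", "TCIN")
df <- as.data.frame(
  lapply(setNames(fields, fields), function(f) extract_field(x, f)),
  stringsAsFactors = FALSE
)
print(df)
```

Because every field is looked up by its own pattern, the data frame always has the same columns, and a string without a given label simply gets NA in that column.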