R: converting raw data to character


I am trying to load data from MongoDB into R with the "mongolite" package, using the following code:

df <- db$find('{}', '{"CurrentId":1,"_id":0}')
[[1]]
list()

[[2]]
list()

[[3]]
list()

[[4]]
list()

[[5]]
list()

[[6]]
[[6]][[1]]
[1] 56 cd 5f 02 b8 9b 5b d0 26 cb 39 c9

[[6]][[2]]
[1] 56 cd 6c 13 b8 9b 5b d0 26 cb 39 d5

[[6]][[3]]
[1] 56 cd 6f c6 b8 9b 5b d0 26 cb 39 de
df[[6]][[1]] is:

 [1] 56 cd 5f 02 b8 9b 5b d0 26 cb 39 c9

and typeof(df[[6]][[1]]) is:

 [1] "raw"

I use paste(df[[6]][[1]], collapse='') to convert the raw type to a string in the same format as a MongoDB ObjectId:

 [1] "56cd5f02b89b5bd026cb39c9"
Then I tried to convert all the raw data in df to strings in the same way, so I used sapply:

sapply(df, function(x) paste(as.character(x), collapse=''))

and got this:

[1] ""                                                                                                                                                                                                                                                   
[2] ""                                                                                                                                                                                                                                                   
[3] ""                                                                                                                                                                                                                                                   
[4] ""                                                                                                                                                                                                                                                   
[5] ""                                                                                                                                                                                                                                                   
[6] "as.raw(c(0x56, 0xcd, 0x5f, 0x02, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xc9))as.raw(c(0x56, 0xcd, 0x6c, 0x13, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xd5))as.raw(c(0x56, 0xcd, 0x6f, 0xc6, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xde))"
But I want to get something like this:

[[1]]
list()

[[2]]
list()

[[3]]
list()

[[4]]
list()

[[5]]
list()

[[6]]
[[6]][[1]]
[1] "56cd5f02b89b5bd026cb39c9"

[[6]][[2]]
[1] "56cd6c13b89b5bd026cb39d5"

[[6]][[3]]
[1] "56cd6fc6b89b5bd026cb39de"
Does anyone know how to handle this? Is there a more efficient way to do the whole job?

Update:

I should give some code to reproduce my original dataset:

test = as.raw(as.hexmode(x = c("56","cd","5f","02","b8","9b","5b","d0","26","cb","39","c9")))
df = lapply(1:10,function(x) test)
This code, however, produces the following result:

[[1]]
list()

[[2]]
[[2]][[1]]
[1] 5f

[[2]][[2]]
[1] d0


[[3]]
[[3]][[1]]
[1] 26

[[3]][[2]]
[1] 56


[[4]]
list()

[[5]]
[[5]][[1]]
[1] cb


[[6]]
list()
It is not like the original df, but I really don't know how to put raw data into a nested list. I hope this helps.

The result of sapply(df, function(x) paste(x, collapse='')) looks like this:

[1] ""                                                                                                                                                                                                                                                   
[2] ""                                                                                                                                                                                                                                                   
[3] ""                                                                                                                                                                                                                                                   
[4] ""                                                                                                                                                                                                                                                   
[5] ""                                                                                                                                                                                                                                                   
[6] "as.raw(c(0x56, 0xcd, 0x5f, 0x02, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xc9))as.raw(c(0x56, 0xcd, 0x6c, 0x13, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xd5))as.raw(c(0x56, 0xcd, 0x6f, 0xc6, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xde))"
Just use paste() in the sapply() call without calling as.character(). A short example:

convertRaw = function(x) paste(x,collapse = '') # works identical in sapply
test = as.raw(as.hexmode(x = c("56","cd","5f","02","b8","9b","5b","d0","26","cb","39","c9"))) # line copied from your sample
convertRaw(test)
[1] "56cd5f02b89b5bd026cb39c9"
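
For context on where the "as.raw(c(...))" strings come from, here is a minimal sketch. It assumes that an element of df is itself a list of raw vectors (as df[[6]] is in the question); the names one_id and nested are made up for illustration. paste() coerces its arguments with as.character(), and on a list that deparses each element rather than formatting its bytes.

```r
one_id <- as.raw(c(0x56, 0xcd, 0x5f, 0x02))   # shortened raw vector, illustration only
nested <- list(one_id, one_id)                # stands in for an element like df[[6]]

# On a raw vector, paste() sees one two-digit hex string per byte.
flat_ok <- paste(one_id, collapse = "")

# On a list, each element is deparsed first, with or without as.character().
flat_bad  <- paste(nested, collapse = "")
flat_bad2 <- paste(as.character(nested), collapse = "")

print(flat_ok)
print(flat_bad)
```

So dropping as.character() is enough for a single raw vector, but the function has to reach the raw vectors themselves, not the list that holds them.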
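Update: Using a nested list does indeed cause another problem. Because you are dealing with a nested list, the sapply() call needs to be nested as well. You can do this by wrapping it in lapply(). Here is a short example that will hopefully solve your problem: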

test = as.raw(as.hexmode(x = c("56","cd","5f","02","b8","9b","5b","d0","26","cb","39","c9")))
testList = list(list(),list(test,test)) # here I create a short nested list
res = lapply(testList,function(y) sapply(y,function(x) paste(x,collapse = '')))
print(res) 
The result is:

[[1]]
list()

[[2]]
[1] "56cd5f02b89b5bd026cb39c9" "56cd5f02b89b5bd026cb39c9"
If you prefer:

[[1]]
list()

[[2]]
[[2]][[1]]
[1] "56cd5f02b89b5bd026cb39c9"

[[2]][[2]]
[1] "56cd5f02b89b5bd026cb39c9"
just nest the lapply() calls:

lapply(testList,function(y) lapply(y,function(x) paste(x,collapse = '')))
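
If the nesting depth is not fixed at two levels, the same idea can be written recursively. The following is only a sketch under that assumption, not part of the original answer; raw_to_hex is a hypothetical helper name.

```r
# Hypothetical helper (not from the original answer): walk a nested list and
# turn every raw vector into one hex string, leaving everything else alone.
raw_to_hex <- function(x) {
  if (is.raw(x)) {
    paste(x, collapse = "")
  } else if (is.list(x)) {
    lapply(x, raw_to_hex)   # recurse; an empty list() stays list()
  } else {
    x
  }
}

test <- as.raw(c(0x56, 0xcd, 0x5f, 0x02, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xc9))
testList <- list(list(), list(test, test))
res <- raw_to_hex(testList)
str(res)
```

This keeps the shape of the input, so empty entries remain list() and each raw vector becomes a single character string.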

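Hi David. I tried your solution, but I still get the same result: "as.raw(c(0x56, 0xcd, 0x5f, 0x02, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xc9))as.raw(c(0x56, 0xcd, 0x6c, 0x13, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xd5))as.raw(c(0x56, 0xcd, 0x6f, 0xc6, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xde))"

You misread my suggestion. See my next comment, because I can no longer edit this one...

You misunderstood my solution. Instead of:
sapply(df, function(x) paste(as.character(x), collapse=''))
you should enter:
sapply(df, function(x) paste(x, collapse=''))
The as.character() conversion returns the strange result you don't want. Don't call it; the conversion is unnecessary! In the second line of my answer I only read in an example; in the third line I call the actual function.

I tried it, and I think you are right to remove as.character(). Your code works fine for a single row of data such as test = as.raw(as.hexmode(x = c("56","cd","5f","02","b8","9b","5b","d0","26","cb","39","c9"))), but when I apply it to the nested list it does not work: no matter how many elements the list contains, sapply always pastes them together, and the result is the same as the one I posted in my question.

Please provide an example that includes a nested list starting from raw data, so that I can reproduce your error.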