R: converting raw data to character strings
Tags: python, r, mongodb, lapply, data-cleaning
I am trying to load data from MongoDB into R with the "mongolite" package, using the following code:
df <- db$find('{}', '{"CurrentId":1,"_id":0}')
[[1]]
list()
[[2]]
list()
[[3]]
list()
[[4]]
list()
[[5]]
list()
[[6]]
[[6]][[1]]
[1] 56 cd 5f 02 b8 9b 5b d0 26 cb 39 c9
[[6]][[2]]
[1] 56 cd 6c 13 b8 9b 5b d0 26 cb 39 d5
[[6]][[3]]
[1] 56 cd 6f c6 b8 9b 5b d0 26 cb 39 de
Here df[[6]][[1]] is:
[1] 56 cd 5f 02 b8 9b 5b d0 26 cb 39 c9
and typeof(df[[6]][[1]]) returns:
[1] "raw"
I used paste(df[[6]][[1]], collapse = '') to convert the raw vector into a string in the same format as a MongoDB ObjectId:
[1] "56cd5f02b89b5bd026cb39c9"
Then I tried to convert all the raw data in df to strings in the same way, so I used sapply:
sapply(df, function(x) paste(as.character(x), collapse = ''))
and got this:
[1] ""
[2] ""
[3] ""
[4] ""
[5] ""
[6] "as.raw(c(0x56, 0xcd, 0x5f, 0x02, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xc9))as.raw(c(0x56, 0xcd, 0x6c, 0x13, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xd5))as.raw(c(0x56, 0xcd, 0x6f, 0xc6, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xde))"
But I want to get something like this:
[[1]]
list()
[[2]]
list()
[[3]]
list()
[[4]]
list()
[[5]]
list()
[[6]]
[[6]][[1]]
[1] "56cd5f02b89b5bd026cb39c9"
[[6]][[2]]
[1] "56cd6c13b89b5bd026cb39d5"
[[6]][[3]]
[1] "56cd6fc6b89b5bd026cb39de"
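Does anyone know how to handle this? Is there a more efficient way to do the whole job?
Update:
I should have provided some code to reproduce my raw dataset: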
test = as.raw(as.hexmode(x = c("56","cd","5f","02","b8","9b","5b","d0","26","cb","39","c9")))
df = lapply(1:10,function(x) test)
This code, however, produces the following result:
[[1]]
list()
[[2]]
[[2]][[1]]
[1] 5f
[[2]][[2]]
[1] d0
[[3]]
[[3]][[1]]
[1] 26
[[3]][[2]]
[1] 56
[[4]]
list()
[[5]]
[[5]][[1]]
[1] cb
[[6]]
list()
It is not like the original df, but I really don't know how to put raw data into a nested list. I hope this helps.
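For reference, a nested list shaped like the mongolite result printed at the top could be built by hand roughly as follows. This is only a sketch: it assumes the structure is five empty lists followed by one list of three 12-byte raw vectors, and the names hex_to_raw, id1, id2 and id3 are made up for illustration (they are not part of mongolite).

```r
# Hypothetical helper: turn a hex string such as "56cd..." into a raw vector.
hex_to_raw <- function(hex) {
  starts <- seq(1, nchar(hex), by = 2)                # start of each 2-char byte
  as.raw(strtoi(substring(hex, starts, starts + 1), base = 16L))
}

id1 <- hex_to_raw("56cd5f02b89b5bd026cb39c9")
id2 <- hex_to_raw("56cd6c13b89b5bd026cb39d5")
id3 <- hex_to_raw("56cd6fc6b89b5bd026cb39de")

# Five empty lists followed by one list holding three raw vectors,
# mirroring the printed structure of the original df.
df <- c(rep(list(list()), 5), list(list(id1, id2, id3)))

str(df)
```

With a df built this way, the sapply() and lapply() calls discussed in this thread can be tried without a MongoDB connection.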
The result of sapply(df, function(x) paste(x, collapse = '')) looks like this:
[1] ""
[2] ""
[3] ""
[4] ""
[5] ""
[6] "as.raw(c(0x56, 0xcd, 0x5f, 0x02, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xc9))as.raw(c(0x56, 0xcd, 0x6c, 0x13, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xd5))as.raw(c(0x56, 0xcd, 0x6f, 0xc6, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xde))"
Just use paste() in the sapply() call, without calling as.character().
Short example:
convertRaw = function(x) paste(x, collapse = '') # works the same inside sapply()
test = as.raw(as.hexmode(x = c("56","cd","5f","02","b8","9b","5b","d0","26","cb","39","c9"))) # line copied from your sample
convertRaw(test)
[1] "56cd5f02b89b5bd026cb39c9"
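Update
Working with a nested list actually raises a further problem: because the list is nested, the sapply() call has to be nested as well. You can do that by wrapping it in an lapply() call. Here is a short example that will hopefully solve your problem for good: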
test = as.raw(as.hexmode(x = c("56","cd","5f","02","b8","9b","5b","d0","26","cb","39","c9")))
testList = list(list(),list(test,test)) # here I create a short nested list
res = lapply(testList,function(y) sapply(y,function(x) paste(x,collapse = '')))
print(res)
The result is:
[[1]]
list()
[[2]]
[1] "56cd5f02b89b5bd026cb39c9" "56cd5f02b89b5bd026cb39c9"
If you would rather get:
[[1]]
list()
[[2]]
[[2]][[1]]
[1] "56cd5f02b89b5bd026cb39c9"
[[2]][[2]]
[1] "56cd5f02b89b5bd026cb39c9"
just nest the lapply() calls:
lapply(testList,function(y) lapply(y,function(x) paste(x,collapse = '')))
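If the nesting depth is not known in advance, the same idea can be written as a small recursive function. The following is only a sketch: raw_to_hex is a made-up name, and it assumes that every leaf that needs converting is a raw vector and that everything else should be left untouched.

```r
# Hypothetical helper: walk a (possibly nested) list and replace every
# raw vector by its hex string, keeping the list structure as it is.
raw_to_hex <- function(x) {
  if (is.raw(x)) {
    paste(x, collapse = "")      # raw bytes -> one hex string
  } else if (is.list(x)) {
    lapply(x, raw_to_hex)        # recurse; empty lists stay list()
  } else {
    x                            # leave any other leaf unchanged
  }
}

test <- as.raw(c(0x56, 0xcd, 0x5f, 0x02, 0xb8, 0x9b,
                 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xc9))
testList <- list(list(), list(test, test))

res <- raw_to_hex(testList)
print(res)
```

For the two-level list above this gives the same structure as the nested lapply() call, and it keeps working if further levels are added.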
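Hi David. I tried your solution, but I still got the same result: "as.raw(c(0x56, 0xcd, 0x5f, 0x02, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xc9))as.raw(c(0x56, 0xcd, 0x6c, 0x13, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xd5))as.raw(c(0x56, 0xcd, 0x6f, 0xc6, 0xb8, 0x9b, 0x5b, 0xd0, 0x26, 0xcb, 0x39, 0xde))"
You misread my suggestion. (I am posting this as a new comment because I can no longer edit the previous one.) Instead of sapply(df, function(x) paste(as.character(x), collapse = '')) you should type sapply(df, function(x) paste(x, collapse = '')). The as.character() conversion returns the strange result you don't want. Leave it out; the conversion is unnecessary! In the second line of my answer I only read in an example, and in the third line I call the actual function.
I tried it, and I think you are right to drop as.character(). Your code works fine for a single raw vector such as test = as.raw(as.hexmode(x = c("56","cd","5f","02","b8","9b","5b","d0","26","cb","39","c9"))), but it does not work when I apply it to the nested list: no matter how many elements the inner list has, sapply always pastes them together, and the result is the same as the one I posted in my question.
Please provide an example that includes a nested list of raw data, so that I can reproduce your error.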
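The behaviour described in these comments seems to come from how character conversion treats lists. The sketch below uses a shortened, made-up 4-byte vector (id) to illustrate it: paste() converts its arguments to character, a raw vector becomes one two-digit hex string per byte, but a list of raw vectors is converted element by element into R source text.

```r
id <- as.raw(c(0x56, 0xcd, 0x5f, 0x02))   # shortened example value

# Raw vector: one two-digit hex string per byte.
per_byte <- as.character(id)

# List of raw vectors: each element is turned into R source text
# of the form as.raw(c(...)), which is where the odd output comes from.
deparsed <- as.character(list(id))

# paste() on the whole inner list therefore glues that source text together.
collapsed <- paste(list(id, id), collapse = "")

# Applying paste() to each raw vector separately yields the hex strings.
fixed <- vapply(list(id, id), function(x) paste(x, collapse = ""), character(1))

print(per_byte)
print(deparsed)
print(fixed)
```

This is why the outer sapply() alone is not enough for the nested list: paste() has to be applied to each raw vector separately, for example through a second lapply() or sapply().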