R: Iterating over the rows of a tibble with pmap

Tags: r, dplyr, purrr, rowwise

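I have a very simple tibble, and I would like to iterate over its rows with pmap to apply a function. I think I may have misunderstood something about pmap, because I am having real trouble choosing its arguments. So I am wondering whether I should combine rowwise with pmap in this case, although I have not seen an example that does so. A second question is whether to pick the variables to iterate over with list or with select: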

library(dplyr)
library(purrr)

# Here is my tibble.
# Suppose I would like to apply `n_distinct` to every row using pmap.

df <-  tibble(id = c("01", "02", "03","04","05","06"),
                  A = c("Jan", "Mar", "Jan","Jan","Jan","Mar"),
                  B = c("Feb", "Mar", "Jan","Jan","Mar","Mar"),
                  C = c("Feb", "Mar", "Feb","Jan","Feb","Feb")
)

# It is perfectly achievable with `rowwise` and `mutate` and results in my desired output

df %>%
  rowwise() %>%
  mutate(overal = n_distinct(c_across(A:C)))

# A tibble: 6 x 5
# Rowwise: 
  id    A     B     C     overal
  <chr> <chr> <chr> <chr>  <int>
1 01    Jan   Feb   Feb        2
2 02    Mar   Mar   Mar        1
3 03    Jan   Jan   Feb        2
4 04    Jan   Jan   Jan        1
5 05    Jan   Mar   Feb        3
6 06    Mar   Mar   Feb        2

# But with `pmap` it doesn't work: every row comes out as 1.


df %>%
  select(-id) %>%
  mutate(overal = pmap_dbl(list(A, B, C), n_distinct))


# A tibble: 6 x 4
  A     B     C     overal
  <chr> <chr> <chr>  <dbl>
1 Jan   Feb   Feb        1
2 Mar   Mar   Mar        1
3 Jan   Jan   Feb        1
4 Jan   Jan   Jan        1
5 Jan   Mar   Feb        1
6 Mar   Mar   Feb        1

I just need a short explanation of how pmap applies to row-wise iteration over tibbles, so thank you very much in advance for any help.

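I was able to track the problem down, though I can't say whether it is a bug or a feature. The key point is that pmap passes the values of one row to n_distinct() as separate arguments, and n_distinct() treats that input as a data frame with 3 columns. When n_distinct() is applied to a data frame, it counts the number of distinct rows, and each call here sees exactly one row, so the result is always 1: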

n_distinct(tibble(a = c(1, 2, 2),
                  b = 3))
#> [1] 2
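To make the difference concrete, here is a minimal sketch using the values from the first row of the question's tibble. The variable names as_separate_args and as_one_vector are only illustrative; the sketch assumes dplyr is installed.

```r
library(dplyr)

# Three separate length-1 arguments, which is how pmap() calls n_distinct():
# they behave like one row with three columns, i.e. one distinct combination.
as_separate_args <- n_distinct("Jan", "Feb", "Feb")
as_separate_args
#> [1] 1

# The same three values combined into a single vector: two distinct values.
as_one_vector <- n_distinct(c("Jan", "Feb", "Feb"))
as_one_vector
#> [1] 2
```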

The trick is to first combine the inputs into a single vector and only then count the distinct values:

df %>%
  select(-id) %>%
  mutate(overal = pmap_dbl(list(A, B, C), ~ n_distinct(c(...))))
#> # A tibble: 6 x 4
#>   A     B     C     overal
#>   <chr> <chr> <chr>  <dbl>
#> 1 Jan   Feb   Feb        2
#> 2 Mar   Mar   Mar        1
#> 3 Jan   Jan   Feb        2
#> 4 Jan   Jan   Jan        1
#> 5 Jan   Mar   Feb        3
#> 6 Mar   Mar   Feb        2
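As a possible variation, the selected columns can be handed to pmap directly instead of being wrapped in list(A, B, C), since a data frame is itself a list of columns. The sketch below rebuilds the question's tibble, uses pmap_int so the result is an integer like the rowwise() version, and compares the two. The names via_pmap and via_rowwise are illustrative only.

```r
library(dplyr)
library(purrr)

df <- tibble(
  id = c("01", "02", "03", "04", "05", "06"),
  A  = c("Jan", "Mar", "Jan", "Jan", "Jan", "Mar"),
  B  = c("Feb", "Mar", "Jan", "Jan", "Mar", "Mar"),
  C  = c("Feb", "Mar", "Feb", "Jan", "Feb", "Feb")
)

# pmap over the selected columns; c(...) gathers one row's values into a vector.
via_pmap <- pmap_int(select(df, A:C), ~ n_distinct(c(...)))

# Reference result from the rowwise() approach in the question.
via_rowwise <- df %>%
  rowwise() %>%
  mutate(overal = n_distinct(c_across(A:C))) %>%
  pull(overal)

# Both approaches agree on every row.
all(via_pmap == via_rowwise)
#> [1] TRUE
```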

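Thank you very much! This was a very subtle issue, and I was so annoyed that I couldn't get it to work. I had come across the c(...) trick before but didn't take it seriously; now I think the same applies to some other functions as well. Thanks a lot.

Dear @mnist, I have one more question: when we use pmap to apply a function to every row of a data frame, do we need rowwise, or is the function applied to every row automatically? I ask because I have never seen an example that combines rowwise with pmap; most examples combine rowwise with mutate.

The answer is in the documentation: "Note that a data frame is a very important special case, in which case pmap() and pwalk() apply the function .f to each row." So no, you do not need to combine pmap with rowwise.

Thank you very much for your help, I really appreciate it. I guess I have to reread the manual.

If you want to use pmap(), you need to turn each row into a vector beforehand.

I don't think that is actually what the input to n_distinct looks like inside pmap. You can check it with debugonce(n_distinct).

I think you are right. I have deleted my original answer so as not to mislead anyone. Thanks.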
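As a non-interactive alternative to debugonce(), one way to see what pmap hands to the function on each iteration is to capture the arguments and return them. This is only a sketch: the toy list months and the name args_per_row are made up for illustration and are not part of the original thread.

```r
library(purrr)

# Hypothetical toy input: three parallel vectors, two "rows".
months <- list(c("Jan", "Mar"), c("Feb", "Mar"), c("Feb", "Mar"))

# Capture exactly what pmap() passes to .f on each iteration.
args_per_row <- pmap(months, function(...) list(...))

# The first call received three separate arguments ...
length(args_per_row[[1]])
#> [1] 3

# ... each of which is a length-1 character vector, not a data frame.
str(args_per_row[[1]])
```

Seen this way, the function receives one scalar per column, which is why the values have to be combined with c(...) before counting distinct values.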