R: matching elements in a list
I'm just getting started in R and am stumped on this one, perhaps because I don't know where to begin.

Define a random variable equal to the number of trials until a match occurs. If you have a list of numbers 4, 5, 7, 11, 3, 11, 12, 8, 8, 1, …, the first value of the random variable is 6, because at that point there are two 11s (4, 5, 7, 11, 3, 11). The second value is 3, because at that point there are two 8s (12, 8, 8).

The code below creates the list of numbers u by simulating a uniform distribution.

Thanks for your help and pointers. In case anyone is interested in learning R by coding up problems from a statistics text, the full problem I am working on is described after the code.
set.seed(1); u = matrix(runif(1000), nrow=1000)
u[u > 0 & u <= 1/12] <- 1
u[u > 1/12 & u <= 2/12] <- 2
u[u > 2/12 & u <= 3/12] <- 3
u[u > 3/12 & u <= 4/12] <- 4
u[u > 4/12 & u <= 5/12] <- 5
u[u > 5/12 & u <= 6/12] <- 6
u[u > 6/12 & u <= 7/12] <- 7
u[u > 7/12 & u <= 8/12] <- 8
u[u > 8/12 & u <= 9/12] <- 9
u[u > 9/12 & u <= 10/12] <- 10
u[u > 10/12 & u <= 11/12] <- 11
u[u > 11/12 & u < 12/12] <- 12
table(u); u[1:10,]
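As a rough sketch of what the question asks for, the stream can be walked once while remembering which values have been seen since the last match. The helper name `trials_until_match` is hypothetical, and `sample()` is swapped in here as a shorter way to draw the twelve equally likely values; neither comes from the original code.

```r
# Hypothetical helper: walk through x, count draws until a value repeats,
# record that count, then start counting afresh from the next element.
trials_until_match <- function(x) {
  waits <- integer(0)   # completed counts
  seen <- x[0]          # values seen since the last match (empty to start)
  count <- 0L
  for (value in x) {
    count <- count + 1L
    if (value %in% seen) {
      waits <- c(waits, count)  # match found: record and reset
      seen <- x[0]
      count <- 0L
    } else {
      seen <- c(seen, value)
    }
  }
  waits  # a trailing run with no match is dropped
}

# The example from the question: two 11s after 6 draws, then two 8s after 3.
trials_until_match(c(4, 5, 7, 11, 3, 11, 12, 8, 8, 1))  # returns 6 3

# Assumption: sample() replaces the runif() recoding above.
set.seed(1)
u <- sample(1:12, 1000, replace = TRUE)
waits <- trials_until_match(u)
table(waits)
```

Growing `waits` with `c()` is fine for a thousand draws; for much longer streams it would be worth preallocating.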
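Example 2.6-3, Higgins, Concepts in Probability and Stochastic Modeling

Suppose we ask people at random for the month in which they were born. Let the random variable X denote the number of people we need to ask before we find two born in the same month. The possible values of X are 2, 3, …, 13. That is, at least two people must be asked before a match can occur, and at most 13 need to be asked. Under the simplifying assumption that every month is an equally likely response, use computer simulation to estimate the probability mass function of X. The simulation generates birth months until a match is found. Based on 1000 repetitions of the experiment, the following empirical distribution and sample statistics were obtained…

R has a steep initial learning curve. I don't think it's fair to assume this is your homework, and yes, it's possible to find solutions if you know what you're looking for. However, I remember that it was sometimes hard to research problems online, simply because I didn't know what to search for and wasn't familiar with the terminology.

Below is an explanation of one way to solve the problem in R. Read the commented code and try to work out exactly what it is doing. Even so, I'd recommend working through a good beginner's resource. From memory, there is a good one for getting up and running, but there are many out there.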
# set the number of simulations
nsim <- 10000
# Create a matrix, with nsim columns, and fill it with something.
# The something with which you'll populate it is a random sample,
# with replacement, of month names (held in a built-in vector called
# 'month.abb'). We're telling the sample function that it should take
# 13*nsim samples, and these will be used to fill the matrix, which
# has nsim columns (and hence 13 rows). We've chosen to take samples
# of length 13, because as your textbook states, 13 is the maximum
# number of month names necessary for a month name to be duplicated.
mat <- matrix(sample(month.abb, 13*nsim, replace=TRUE), ncol=nsim)
# If you like, take a look at the first 10 columns
mat[, 1:10]
# We want to find the position of the first duplicated value for each column.
# Here's one way to do this, but it might be a bit confusing if you're just
# starting out. The 'apply' family of functions is very useful for
# repeatedly applying a function to columns/rows/elements of an object.
# Here, 'apply(mat, 2, foo)' means that for each column (2 represents columns,
# 1 would apply to rows, and 1:2 would apply to every cell), do 'foo' to that
# column. Our function below extends this a little with a custom function. It
# says: for each column of mat in turn, call that column 'x' and perform
# 'match(1, duplicated(x))'. This match function will return the position
# of the first '1' in the vector 'duplicated(x)'. The vector 'duplicated(x)'
# is a logical (boolean) vector that indicates, for each element of x,
# whether that element has already occurred earlier in the vector (i.e. if
# the month name has already occurred earlier in x, the corresponding element
# of duplicated(x) will be TRUE (which equals 1), else it will be FALSE (0)).
# So the match function returns the position of the first duplicated month
# name (well, actually the second instance of that month name). e.g. if
# x consists of 'Jan', 'Feb', 'Jan', 'Mar', then duplicated(x) will be
# FALSE, FALSE, TRUE, FALSE, and match(1, duplicated(x)) will return 3.
# Referring back to your textbook problem, this is x, a realisation of the
# random variable X.
# Because we've used the apply function, the object 'res' will end up with
# nsim realisations of X, and these can be plotted as a histogram.
res <- apply(mat, 2, function(x) match(1, duplicated(x)))
hist(res, breaks=seq(0.5, 13.5, 1))
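The textbook problem asks for an estimate of the probability mass function, so it may help to tabulate the simulated values and set them beside the exact probabilities. The sketch below reruns the simulation so that it is self-contained; it uses `which(duplicated(x))[1]` as an equivalent way of finding the first duplicate, and the names `k`, `exact` and `empirical` are introduced here for illustration. The exact formula assumes, as the problem does, that all 12 months are equally likely: the first k-1 answers are all different, and the k-th repeats one of them.

```r
set.seed(1)
nsim <- 10000
mat <- matrix(sample(month.abb, 13 * nsim, replace = TRUE), ncol = nsim)

# Position of the first repeated month in each column (never NA, because
# 13 draws from 12 months must contain a repeat).
res <- apply(mat, 2, function(x) which(duplicated(x))[1])

k <- 2:13

# Exact P(X = k): first k-1 answers all distinct, k-th matches one of them.
exact <- sapply(k, function(kk) {
  prod((12 - 0:(kk - 2)) / 12) * (kk - 1) / 12
})

# Empirical PMF; factor() keeps values of k that were never observed.
empirical <- as.numeric(table(factor(res, levels = k))) / nsim

round(cbind(k, empirical, exact), 4)

# Sample statistics of the simulated X
mean(res)
sd(res)
```

With 10000 replications the two columns should agree to roughly two decimal places; the remaining gaps are sampling noise.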
All of that code for building u can be replaced much more simply with a single assignment to u.

Googling for "R birthday problem" brings up a few matches that may prove very useful. If this is homework, it's worth reading the community guidelines.

No, this is not homework. I am trying to learn R by applying it to an old statistics text. Thanks for all your comments, they proved very useful. I found a PDF by Bruce Trumbo, which led me to a book by Suess and Trumbo that suits my purposes very well: Introduction to Probability Simulation and Gibbs Sampling with R.