
How do I zip 2 data frames and handle missing values? (dataframe, F#, Deedle)


Given a price frame priceFrame as

                       28881              29021              29399
2010-01-01 00:00:00 -> 123.535878499576   195.28635425580265 189.92210186152082
2010-01-04 00:00:00 -> 124.19087548338847 198.10448102247753 190.1571733631235
2010-01-05 00:00:00 -> 123.82028508465247 197.8259452373992  190.31388769752525
2010-01-06 00:00:00 -> 124.17363872065654 197.80956077945342 189.98478759528152
2010-01-07 00:00:00 -> 123.4583130672824  197.58017836821244 190.31388769752527
2010-01-08 00:00:00 -> 124.23396739021821 198.10448102247756 190.25120196376457
2010-01-11 00:00:00 -> 125.12166067091142 197.87509861123658 190.73701640041008
2010-01-12 00:00:00 -> 124.9234378994945  195.0569718445617  191.41088803833776
2010-01-13 00:00:00 -> 125.06133200134975 195.64681233060992 191.50491663897884
2010-01-14 00:00:00 -> 124.97514818769021 196.28580619049552 191.56760237273951
2010-01-15 00:00:00 -> 123.71686450826103 192.5829186947483  192.08475967626538
2010-01-18 00:00:00 -> 123.71686450826103 194.10667328370621 192.31983117786805
2010-01-19 00:00:00 -> 123.15666971947407 195.87619474185092 191.94371677530378
2010-01-20 00:00:00 -> 121.5622691667727  191.79646471335064 192.82131704795376
2010-01-21 00:00:00 -> 121.5450324040408  188.38849746062752 192.9937028157957
2010-01-22 00:00:00 -> 121.81220222638535 186.8647428716696  192.9937028157957
2010-01-25 00:00:00 -> 121.94147794687466 184.83307008639233 192.9937028157957
2010-01-26 00:00:00 -> 121.38990153945363 185.9799821425972  193.19743145051802
2010-01-27 00:00:00 -> 120.94174570842405 184.91499237612123 193.3541457849198
2010-01-28 00:00:00 -> 120.44187958919875 182.5392459739825  193.22877431739838
2010-01-29 00:00:00 -> 119.4938576389439  183.75169586197052 193.35414578491978

and a dividend frame divFrame as

                       28881     29021     29399
2010-01-04 00:00:00 -> 1.3       <missing> <missing>
2010-01-13 00:00:00 -> <missing> 1.3       <missing>
2010-01-22 00:00:00 -> <missing> <missing> 1.3
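Combining them gives the same result each time, with a value only where a price and a dividend share a date: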

                       28881              29021              29399
2010-01-01 00:00:00 -> <missing>          <missing>          <missing>
2010-01-04 00:00:00 -> 125.49087548338846 <missing>          <missing>
2010-01-05 00:00:00 -> <missing>          <missing>          <missing>
2010-01-06 00:00:00 -> <missing>          <missing>          <missing>
2010-01-07 00:00:00 -> <missing>          <missing>          <missing>
2010-01-08 00:00:00 -> <missing>          <missing>          <missing>
2010-01-11 00:00:00 -> <missing>          <missing>          <missing>
2010-01-12 00:00:00 -> <missing>          <missing>          <missing>
2010-01-13 00:00:00 -> <missing>          196.94681233060993 <missing>
2010-01-14 00:00:00 -> <missing>          <missing>          <missing>
2010-01-15 00:00:00 -> <missing>          <missing>          <missing>
2010-01-18 00:00:00 -> <missing>          <missing>          <missing>
2010-01-19 00:00:00 -> <missing>          <missing>          <missing>
2010-01-20 00:00:00 -> <missing>          <missing>          <missing>
2010-01-21 00:00:00 -> <missing>          <missing>          <missing>
2010-01-22 00:00:00 -> <missing>          <missing>          194.2937028157957
2010-01-25 00:00:00 -> <missing>          <missing>          <missing>
2010-01-26 00:00:00 -> <missing>          <missing>          <missing>
2010-01-27 00:00:00 -> <missing>          <missing>          <missing>
2010-01-28 00:00:00 -> <missing>          <missing>          <missing>
2010-01-29 00:00:00 -> <missing>          <missing>          <missing>
Another attempt results in

                       28881              29021              29399
2010-01-01 00:00:00 -> 123.535878499576   195.28635425580265 189.92210186152082
2010-01-04 00:00:00 -> 124.19087548338847 198.10448102247753 190.1571733631235
2010-01-05 00:00:00 -> 123.82028508465247 197.8259452373992  190.31388769752525
2010-01-06 00:00:00 -> 124.17363872065654 197.80956077945342 189.98478759528152
2010-01-07 00:00:00 -> 123.4583130672824  197.58017836821244 190.31388769752527
2010-01-08 00:00:00 -> 124.23396739021821 198.10448102247756 190.25120196376457
2010-01-11 00:00:00 -> 125.12166067091142 197.87509861123658 190.73701640041008
2010-01-12 00:00:00 -> 124.9234378994945  195.0569718445617  191.41088803833776
2010-01-13 00:00:00 -> 125.06133200134975 195.64681233060992 191.50491663897884
2010-01-14 00:00:00 -> 124.97514818769021 196.28580619049552 191.56760237273951
2010-01-15 00:00:00 -> 123.71686450826103 192.5829186947483  192.08475967626538
2010-01-18 00:00:00 -> 123.71686450826103 194.10667328370621 192.31983117786805
2010-01-19 00:00:00 -> 123.15666971947407 195.87619474185092 191.94371677530378
2010-01-20 00:00:00 -> 121.5622691667727  191.79646471335064 192.82131704795376
2010-01-21 00:00:00 -> 121.5450324040408  188.38849746062752 192.9937028157957
2010-01-22 00:00:00 -> 121.81220222638535 186.8647428716696  192.9937028157957
2010-01-25 00:00:00 -> 121.94147794687466 184.83307008639233 192.9937028157957
2010-01-26 00:00:00 -> 121.38990153945363 185.9799821425972  193.19743145051802
2010-01-27 00:00:00 -> 120.94174570842405 184.91499237612123 193.3541457849198
2010-01-28 00:00:00 -> 120.44187958919875 182.5392459739825  193.22877431739838
2010-01-29 00:00:00 -> 119.4938576389439  183.75169586197052 193.35414578491978
All the prices are there, but no dividends have been added.

let dfZipped4 = priceFrame.Zip(divFrame, JoinKind.Left, JoinKind.Left, Lookup.Exact, true, fun (p:float) d -> p + (d |> Option.defaultValue 0.0))
dfZipped4.Print()
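results only in missing values.

How do I add the dividends to the prices where they align, but leave the prices unchanged everywhere else?

Update

I have timed the answers by FRocha and zyzhu. zyzhu's second answer did not produce the correct result.

For 1000 consecutive runs of each technique I got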

frocha1: 572.974400
frocha2: 562.867600
zyzhu1: 1099.057100
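frocha2 is consistently slightly faster than frocha1. zyzhu1 is always slower than the others. So for now I am accepting FRocha's answer.

However, if zyzhu2 can be made to work correctly, it would probably be the fastest, since it is the simplest. In that case I would change the accepted answer.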
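
For anyone who wants to repeat this kind of comparison, here is a minimal timing sketch. The helper name timeRuns and the placeholder workload are hypothetical, and the sketch reports milliseconds, which may or may not be the unit of the figures above.

```fsharp
open System.Diagnostics

/// Runs `f` `runs` times in a row and returns the total elapsed milliseconds.
/// (Hypothetical helper, not taken from any of the answers.)
let timeRuns (runs: int) (f: unit -> 'a) : float =
    let sw = Stopwatch.StartNew()
    for _ in 1 .. runs do
        f () |> ignore
    sw.Stop()
    sw.Elapsed.TotalMilliseconds

// Placeholder workload; swap the lambda for the frame operation under test.
let elapsed = timeRuns 1000 (fun () -> List.sum [ 1 .. 100 ])
printfn "placeholder: %f" elapsed
```

Timing the whole loop rather than each call keeps the Stopwatch overhead out of the per-run figure. The first call still includes JIT compilation, so a warm-up run before measuring may give steadier numbers.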

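My approach, which does not take speed into account, is as follows:
1) Rename the columns so that the join can be performed without errors.
2) Join the frames.
3) Replace the missing values with zero.
4) Sum the corresponding columns.
5) Drop the "dividend" columns.
6) Optional: if the string conversion is not wanted, change the column keys of priceFrame back to their original type.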

module Frame =
    //I usually add this handy function to the Frame module
    let mapReplaceCol col f frame =
        frame
        |> Frame.replaceCol col (Frame.mapRowValues f frame)

let priceFrame' = priceFrame |> Frame.mapColKeys string

//prepends a "D" to each col key so the two frames share no col names
//(dividends is the dividend frame, called divFrame in the question)
let dividends' =
    dividends
    |> Frame.mapColKeys (string >> (+) "D") 

let joinedFrame =
    priceFrame'
    |> Frame.join JoinKind.Right dividends'
    |> Frame.fillMissingWith 0.

(joinedFrame,priceFrame'.ColumnKeys |> List.ofSeq)
||> List.fold (fun acc elem ->
    acc|> Frame.mapReplaceCol elem (fun row ->
        row.GetAs<float>("D" + elem) + row.GetAs<float>(elem))
    |> Frame.dropCol ("D" + elem))


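With zip, the two frames are matched column by column and the two series in each pair are added together.

In your case, divFrame has fewer observations than priceFrame. When two series with different numbers of observations are added together, the results for the unmatched keys are missing.

Here is my solution, which first aligns divFrame with priceFrame by creating a dummy frame.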

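let divFrame2 =
    // dummy frame: a zero for every row key of priceFrame and every column of divFrame
    let dummy =
        priceFrame.RowKeys
        |> Seq.collect (fun row -> divFrame.ColumnKeys |> Seq.map (fun col -> row, col, 0.0))
        |> Frame.ofValues
    (dummy + divFrame).FillMissing(0.0)

priceFrame + divFrame2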
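
The alignment issue can be reproduced on a small scale. The sketch below is only an illustration: it assumes F# Interactive with NuGet access, uses made-up values in miniature stand-ins for priceFrame and divFrame, and swaps in Frame.realignRows for the dummy frame.

```fsharp
// Assumes F# Interactive with NuGet access; all frame values below are made up.
#r "nuget: Deedle"
open System
open Deedle

let day d = DateTime(2010, 1, d)

// Hypothetical miniature versions of priceFrame and divFrame (one column, 28881).
let prices = Frame.ofValues [ day 1, 28881, 100.0; day 4, 28881, 101.0; day 5, 28881, 102.0 ]
let divs = Frame.ofValues [ day 4, 28881, 1.3 ]

// Direct addition: only 2010-01-04 exists in both frames, so the other rows are missing.
let naive = prices + divs

// Realign the dividends to the price dates, treat "no dividend" as 0.0, then add.
let aligned = divs |> Frame.realignRows prices.RowKeys |> Frame.fillMissingWith 0.0
let total = prices + aligned

naive.Print()
total.Print()
```

Once both frames share the same row keys and the dividend gaps are zeros, plain addition leaves the prices unchanged on dates without a dividend.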


I had a similar problem. My solution is similar to zyzhu's, but it can handle multiple frames.

let zipAll (dfs: Frame<_,_>[]) =
    // union of the row keys of all frames, sorted
    let outerKeys = dfs |> Array.collect (fun df -> df.RowKeys |> Array.ofSeq) |> Array.distinct |> Array.sort
    let dfsNew =
        dfs
        |> Array.map (Frame.indexRowsWith outerKeys >> Frame.mapRowValues (Series.fillMissingWith 0.) >> Frame.ofRows)
    Array.fold (Frame.zip (+)) (Array.head dfsNew) (Array.tail dfsNew)

[| priceFrame; dividends2 |] |> zipAll

It may be slow, but it can handle multiple frames.


This is a potentially useful workaround. But the task does not seem uncommon, so it should be available out of the box, perhaps baked into Zip.

Agreed. I first tried the Zip function, without success. That is why I used this "messy" workaround.