DataFrames: joining multiple data frames


I would like to know whether Julia's DataFrames package has a way to join several data frames in a single call.

using DataFrames

employer = DataFrame(
    ID = Array{Int64}([01,02,03,04,05,09,11,20]),
    name = Array{String}(["Matthews","Daniella", "Kofi", "Vladmir", "Jean", "James", "Ayo", "Bill"])
)

salary = DataFrame(
    ID = Array{Int64}([01,02,03,04,05,06,08,23]),
    amount = Array{Int64}([2050,3000,3500,3500,2500,3400,2700,4500])
)

hours = DataFrame(
    ID = Array{Int64}([01,02,03,04,05,08,09,23]),
    time = Array{Int64}([40,40,40,40,40,38,45,50])
)

# I tried passing them as an array, but of course that results in an error
empSalHrs = innerjoin([employer,salary,hours], on = :ID)

# In python you can achieve this using
import pandas as pd 
from functools import reduce

df = reduce(lambda l,r : pd.merge(l,r, on = "ID"), [employer, salary, hours])

Is there a similar approach in Julia?

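You were almost there. As the DataFrames.jl documentation for innerjoin describes, you just need to pass the data frames as separate positional arguments: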

using DataFrames

employer = DataFrame(
    ID = [01,02,03,04,05,09,11,20],
    name = ["Matthews","Daniella", "Kofi", "Vladmir", "Jean", "James", "Ayo", "Bill"]
)

salary = DataFrame(
    ID = [01,02,03,04,05,06,08,23],
    amount = [2050,3000,3500,3500,2500,3400,2700,4500]
)

hours = DataFrame(
    ID = [01,02,03,04,05,08,09,23],
    time = [40,40,40,40,40,38,45,50]
)

empSalHrs = innerjoin(employer,salary,hours, on = :ID)

If for some reason you need to keep the data frames in a Vector, you can use splatting (the ... operator) to get the same result:

empSalHrs = innerjoin([employer,salary,hours]..., on = :ID)
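
For a closer analogue of the Python functools.reduce idiom from the question, the same join can also be written as a pairwise fold. The following is a minimal sketch, assuming a DataFrames.jl version that provides innerjoin; the names tables, viaReduce, and viaSplat are illustrative only and do not come from the original post.

```julia
using DataFrames

employer = DataFrame(
    ID = [1, 2, 3, 4, 5, 9, 11, 20],
    name = ["Matthews", "Daniella", "Kofi", "Vladmir", "Jean", "James", "Ayo", "Bill"]
)
salary = DataFrame(
    ID = [1, 2, 3, 4, 5, 6, 8, 23],
    amount = [2050, 3000, 3500, 3500, 2500, 3400, 2700, 4500]
)
hours = DataFrame(
    ID = [1, 2, 3, 4, 5, 8, 9, 23],
    time = [40, 40, 40, 40, 40, 38, 45, 50]
)

# Hypothetical name: a vector holding the data frames to join.
tables = [employer, salary, hours]

# Pairwise fold: join the first two frames, then join that result with the next.
viaReduce = reduce((l, r) -> innerjoin(l, r, on = :ID), tables)

# Splatting passes the vector's elements as separate positional arguments.
viaSplat = innerjoin(tables..., on = :ID)

# Sort by the key before comparing, so the check does not depend on row order.
println(sort(viaReduce, :ID) == sort(viaSplat, :ID))
```

Only the IDs present in all three frames survive an inner join, so both results contain the rows for IDs 1 through 5. The splatted call is shorter; the reduce form may be handy when the pairwise joins need different options.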
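
Also note that I changed the data frame definitions slightly. Array{Int} is not a concrete type, because it leaves the number of dimensions unspecified, so it should be avoided in type annotations such as struct fields or container element types, where it hurts performance. In this case it probably does not matter, since the constructor call still returns an ordinary one-dimensional array, but it is better to build good habits from the start. You can use any of the following instead: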

  • Array{Int,1}([1,2,3,4])
  • Vector{Int}([1,2,3,4])
  • Int[1,2,3]
  • [1,2,3]
The last form works because, in a simple case like this, Julia infers the container's element type from the literal values on its own.
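
The distinction can be checked directly in the REPL. The sketch below uses only Base functions (isconcretetype, typeof, eltype); the variable names are illustrative and are not part of the original answer.

```julia
# Array{Int} leaves the dimension parameter free, so it is not a concrete type.
println(isconcretetype(Array{Int}))       # false
println(isconcretetype(Array{Int,1}))     # true
println(Array{Int,1} === Vector{Int})     # true: Vector{T} is an alias for Array{T,1}

# All four spellings from the list build the same concrete vector type.
v1 = Array{Int,1}([1, 2, 3, 4])
v2 = Vector{Int}([1, 2, 3, 4])
v3 = Int[1, 2, 3, 4]
v4 = [1, 2, 3, 4]
println(typeof(v1) == typeof(v2) == typeof(v3) == typeof(v4))   # true

# Calling Array{Int} as a constructor still yields a concrete Vector{Int},
# which is why the question's original definitions work correctly.
v0 = Array{Int}([1, 2, 3, 4])
println(typeof(v0) == Vector{Int})        # true

# Where it matters: as an element type, Array{Int} is not concrete.
println(isconcretetype(eltype(Vector{Array{Int}})))    # false
println(isconcretetype(eltype(Vector{Vector{Int}})))   # true
```

So the Array{Int64}(...) wrappers in the question are redundant rather than harmful; the non-concrete type becomes a real cost mainly when it appears as a field type or element type.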