Problem reading a complete table-of-values file with CSV.File and DataFrames

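I'm new to Julia (I only started today!) and I'm trying to read a .data file containing a table of values into Julia as a DataFrame, so that I can eventually extract the two values under columns 42 and 43. This is what the input file looks like: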

                           1                            2                            3                            4                            5                            6                            7 
              version_number                     compiler                        build             MESA_SDK_version                         date                    burn_min1                    burn_min2 
                       11701                   "gfortran"                      "9.2.0"        "x86_64-linux-20.3.2"                   "20200718"      5.0000000000000000E+001      1.0000000000000000E+003 

                                       1                                        2                                        3                                        4                                        5                                        6                                        7                                        8                                        9                                       10                                       11                                       12                                       13                                       14                                       15                                       16                                       17                                       18                                       19                                       20                                       21                                       22                                       23                                       24                                       25                                       26                                       27                                       28                                       29                                       30                                       31                                       32                                       33                                       34                                       35                                       36                                       37                                       38                                       39                                       40                                       41                                       42                                       43                                       44                                       45                                       46                                       47                                       48                                 
      49                                       50                                       51                                       52                                       53                                       54                                       55                                       56                                       57                                       58                                       59                                       60                                       61                                       62 
                            model_number                                 star_age                             star_age_day                                rsp_phase                                rsp_GREKM                        rsp_GREKM_avg_abs                               rsp_DeltaR                             rsp_DeltaMag                       rsp_period_in_days                          rsp_num_periods                               log_dt_sec                                   radius                                    log_R                              v_surf_km_s                      v_surf_div_escape_v                        v_div_csound_surf                         v_div_csound_max                         max_abs_v_div_cs                     dt_div_min_dr_div_cs                               luminosity                                    log_L                              effective_T                                    log_g                                 log_Teff                            photosphere_L                            photosphere_r                            photosphere_T                       photosphere_v_km_s                     photosphere_v_div_cs                              num_retries                                     bc_V                                     bc_I                                     bc_U                                     bc_B                                     bc_R                                     bc_J                                     bc_H                                     bc_K                                     bc_L                                bc_Lprime                                     bc_M                                abs_mag_V                                abs_mag_I                                abs_mag_U                                abs_mag_B                                abs_mag_R                                abs_mag_J                                abs_mag_H                                
abs_mag_K                                abs_mag_L                           abs_mag_Lprime                                abs_mag_M                                  bc_bb_U                                  bc_bb_B                                  bc_bb_V                                  bc_bb_R                                  bc_bb_I                             abs_mag_bb_U                             abs_mag_bb_B                             abs_mag_bb_V                             abs_mag_bb_R                             abs_mag_bb_I 
                                       1                  6.7389636078418866E-007                  2.4614493550114818E-004                  1.9999999999999998E-005                  0.0000000000000000E+000                  0.0000000000000000E+000                  0.0000000000000000E+000                  0.0000000000000000E+000                  1.2307246775057409E+001                                        0                  1.3277046469527609E+000                  1.0355852690247899E+002                  2.0151858640622273E+000                  9.9999999081613875E-002                  6.3180115760696077E-004                  1.5454144687980833E-002                  1.5454144687984667E-002                  1.5454144687984667E-002                  1.6320949498828532E-002                  3.9999980998388105E+003                  3.6020597850205331E+000                  4.5100140752805628E+003                  1.2400265636241821E+000                  3.6541778972675703E+000                  4.0000005694630740E+003                  1.0276598054436276E+002                  4.5100139503744604E+003                  9.5722781222727801E-002                  1.3625500625299287E-002                                        0                 -5.0444989520674666E-001                  5.9105931711038406E-001                 -3.0557187404691080E+000                 -1.7462239207128374E+000                  8.3956903708770375E-002                  1.4560380234244312E+000                  2.0946680279357213E+000                  1.9906767067550131E+000                  2.2200875064387224E+000                  2.2276156474047601E+000                  2.6456661780258299E+000                  8.0200636642834709E+001                  7.9105127430517584E+001                  8.2751905488097066E+001                  8.1442410668340798E+001                  7.9612229843919195E+001                  7.8240148724203536E+001                  7.7601518719692237E+001                  
7.7705510040872952E+001                  7.7476099241189246E+001                  7.7468571100223201E+001                  7.7050520569602128E+001                 -1.5853854762609063E+000                 -1.3814794330855515E+000                 -4.1347249807174019E-001                  2.1272797873251398E-001                  7.9829238167607985E-001                  8.1281572223888872E+001                  8.1077666180713507E+001                  8.0109659245699703E+001                  7.9483458768895446E+001                  7.8897894365951885E+001 
Below is a minimal reproducible version of the code, which tests whether I can read the file into a DataFrame successfully:

using DataFrames
using CSV


cwd = pwd()
ParentDir = joinpath(cwd,"LOGS_A")
dirs = readdir(ParentDir, join=true)


CurrentFile = joinpath(dirs[66],"history.data")
println(CurrentFile)
df = DataFrame(CSV.File(CurrentFile, skipto=5,header=1)) 
println(df)
println(df[:, :42])
println(df[:, :43])
CSV.write("julia_test_file", df)
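However, the read into the DataFrame does not go the way I expected. When I write the DataFrame out to a CSV file and open that file, I see a number of problems: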

Column1,Column2,Column3,Column4,Column5,Column6,Column7,Column8,Column9,Column10,Column11,Column12,Column13,Column14,Column15,Column16,Column17,Column18,Column19,Column20,Column21,Column22,Column23,Column24,Column25,Column26,Column27,1,Column29,Column30,Column31,Column32,Column33,Column34,Column35,Column36,Column37,Column38,Column39,Column40,Column41,Column42,Column43,Column44,Column45,Column46,Column47,Column48,Column49,Column50,Column51,Column52,Column53,Column54,Column55,2,Column57,Column58,Column59,Column60,Column61,Column62,Column63,Column64,Column65,Column66,Column67,Column68,Column69,Column70,Column71,Column72,Column73,Column74,Column75,Column76,Column77,Column78,Column79,Column80,Column81,Column82,Column83,3,Column85,Column86,Column87,Column88,Column89,Column90,Column91,Column92,Column93,Column94,Column95,Column96,Column97,Column98,Column99,Column100,Column101,Column102,Column103,Column104,Column105,Column106,Column107,Column108,Column109,Column110,Column111,4,Column113,Column114,Column115,Column116,Column117,Column118,Column119,Column120,Column121,Column122,Column123,Column124,Column125,Column126,Column127,Column128,Column129,Column130,Column131,Column132,Column133,Column134,Column135,Column136,Column137,Column138,Column139,5,Column141,Column142,Column143,Column144,Column145,Column146,Column147,Column148,Column149,Column150,Column151,Column152,Column153,Column154,Column155,Column156,Column157,Column158,Column159,Column160,Column161,Column162,Column163,Column164,Column165,Column166,Column167,6,Column169,Column170,Column171,Column172,Column173,Column174,Column175,Column176,Column177,Column178,Column179,Column180,Column181,Column182,Column183,Column184,Column185,Column186,Column187,Column188,Column189,Column190,Column191,Column192,Column193,Column194,Column195,7,Column197
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,3,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,4,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,model_number,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,star_age,,,,,,,,,,,,,,,,,,,,,,,,,,,,,star_age_day,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,rsp_phase,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,rsp_GREKM,,,,,,,,,,,,,,,,,,,,,,,,rsp_GREKM_avg_abs,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,,,,,,,,,,,,,,,,,,6.738963607841887e-7,,,,,,,,,,,,,,,,,,0.0002461449355011482,,,,,,,,,,,,,,,,,,1.9999999999999998e-5,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,12.30724677505741,,,,,,,,,,,,,


How can I improve the code so that it accomplishes the task?

It looks like the CSV.File arguments were not chosen correctly. After creating a CSV file from the contents of your post, the following works:

julia> df = DataFrame(CSV.File("tmp.csv", skipto = 7, header = 6, delim=' ', ignorerepeated=true))
1×62 DataFrame. Omitted printing of 53 columns
│ Row │ model_number │ star_age   │ star_age_day │ rsp_phase │ rsp_GREKM │ rsp_GREKM_avg_abs │ rsp_DeltaR │ rsp_DeltaMag │ rsp_period_in_days │
│     │ Int64        │ Float64    │ Float64      │ Float64   │ Float64   │ Float64           │ Float64    │ Float64      │ Float64            │
├─────┼──────────────┼────────────┼──────────────┼───────────┼───────────┼───────────────────┼────────────┼──────────────┼────────────────────┤
│ 1   │ 1            │ 6.73896e-7 │ 0.000246145  │ 2.0e-5    │ 0.0       │ 0.0               │ 0.0        │ 0.0          │ 12.3072            │

julia> df[:,43]
1-element Array{Float64,1}:
 79.10512743051758

julia> names(df)[43]
"abs_mag_I"

julia> df.abs_mag_I
1-element Array{Float64,1}:
 79.10512743051758

To answer your question more precisely, it would help to have access to the exact file you are working with.
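As a rough sketch of how the options above might be applied end to end, the snippet below writes a tiny made-up file with the same layout (a three-line header block, a blank line, a row of column indices, a row of column names, then the data) and reads it back. The three sample columns, their values, and the temporary path are assumptions for illustration only; they are not taken from the real history.data.

```julia
using CSV
using DataFrames

# Hypothetical miniature of the file layout described in the question.
# Line 6 holds the column names and line 7 the first data row.
sample = """
        1           2
  version    compiler
    11701  "gfortran"

             1                 2                 3
  model_number          star_age         abs_mag_I
             1    6.7389636E-007    7.9105127E+001
"""

path = tempname()          # throwaway file, stands in for history.data
write(path, sample)

# delim=' ' with ignorerepeated=true treats each run of padding spaces
# as a single separator instead of a series of empty fields.
df = DataFrame(CSV.File(path; header = 6, skipto = 7,
                        delim = ' ', ignorerepeated = true))

# Selecting by name avoids depending on the exact column position.
println(names(df))
println(df.abs_mag_I)
```

Selecting a column by name (here `abs_mag_I`) rather than by position such as `df[:, 43]` is a design choice: it keeps working if the set of output columns changes, whereas a positional index would silently pick up a different quantity.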

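Your file does not appear to be a well-formed comma-separated values file. My guess is that CSV.File finds no commas and that the separating character is a single space. As a result, every run of padding spaces is read as a string of empty fields, and you get a bunch of empty columns.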
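The effect can be illustrated without CSV.jl at all. The sketch below uses only Base Julia's `split` on a made-up, space-padded header line (the line itself is an assumption for illustration); it is an analogy for what a parser does with each choice of delimiter, not a description of CSV.jl's internals.

```julia
# Hypothetical space-padded line, similar in style to the file's name row.
line = "   model_number     star_age    abs_mag_I "

# Splitting on a comma finds no separator, so the whole line is one field.
by_comma = split(line, ',')

# Splitting on a single space keeps an empty field for every extra space.
by_space = split(line, ' ')

# Splitting on runs of whitespace yields just the three names.
by_runs = split(line)

println(length(by_comma))
println(count(isempty, by_space))
println(by_runs)
```

The third form is the behavior that `delim=' '` together with `ignorerepeated=true` asks CSV.File for.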