Excel: Filtering on multiple conditions in Power Query from a folder containing CSV files

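I need your help correcting, or advising on, the query I use to get data from a folder of CSV files. Fair warning: I don't know how to put this briefly.

First, some background:

  • Tools are limited to Power Query, Excel, and VBA

  • The query will run once a month, so a long load time is not a big problem, although a shorter one is preferable

  • I chose the Power Query approach because the source data has to be used in another Excel file, but with a different rule set (which is part of my current problem)
  • The basic problem with my code is that it runs for a very long time and has to satisfy a large number of conditions, and I will have to use a similar approach for another purpose/tool/file. I want people to be able to just press "Refresh" and get the information they need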

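Description:

My data source is a set of CSV files in a folder. There is no naming convention, because several people export the data from the system. For that reason I used the folder option in PQ.

The data currently totals about 400-600 MB. Column names may vary, which is what the first steps in the M code are meant to work around.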
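For illustration, a minimal sketch of how varying column names can be normalized before the files are combined. The two sample tables and the AMOUNT column are hypothetical stand-ins for the CSV contents. The pattern of upper-casing the headers and then falling back from one column to another mirrors the early steps of the query further down.

```powerquery
let
    // Hypothetical exports with inconsistent header casing and naming.
    FileA = #table({"Poc", "Settl.Amount"}, {{"11", "100"}}),
    FileB = #table({"POC", "19A Amount"}, {{"47", "250"}}),
    // Upper-case every header so the same column always has the same name.
    Normalized = List.Transform(
        {FileA, FileB},
        (t) => Table.TransformColumnNames(t, Text.Upper)),
    // Table.Combine keeps the union of columns and fills the gaps with null.
    Combined = Table.Combine(Normalized),
    // Take SETTL.AMOUNT when present, otherwise fall back to 19A AMOUNT.
    WithAmount = Table.AddColumn(
        Combined,
        "AMOUNT",
        each if [#"SETTL.AMOUNT"] = null then [#"19A AMOUNT"] else [#"SETTL.AMOUNT"])
in
    WithAmount
```

Both rows end up with a populated AMOUNT column regardless of which header the source file used.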

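My main struggle is this:

There are several conditions to apply. I don't want to write multiple if statements, because the code would get very ugly and the conditions number in the tens, spanning multiple columns. For this reason I implemented a translation table (let's call it TT) that holds all the columns that can be used for filtering; the last column of TT is the concatenation of all the others. If a condition does not care about one of the columns, I fill that column with the wildcard "*".

So TT might look like: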

| PC | CLIENT | FN  | TC | STRING       |
|----|--------|-----|----|--------------|
| 11 | *      | NEW | AC | 11*NEWAC     |
| 47 | 000001 | NEW | *  | 47000001NEW* |
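and so on.

PC is PoC, FN is Function, and TC is Transaction Code (as named in the code below).

Then, in the code, I replace each wildcard with the value of the corresponding column in PQ and check whether the concatenated string of those same PQ columns is contained in TT (whose last column is turned into a list). The code below works for the simpler case, but it is fairly hard-coded, because I wanted to know whether it was possible.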
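The check described above can be sketched for a single row as follows. This is a simplified illustration, not the query itself: the sample row and the Fill helper are hypothetical, and a wildcard is matched only when the whole cell equals "*", whereas the full query uses Replacer.ReplaceText.

```powerquery
let
    // Hypothetical TT with the short column names from the table above.
    TT = #table(
        {"PC", "CLIENT", "FN", "TC"},
        {
            {"11", "*", "NEW", "AC"},
            {"47", "000001", "NEW", "*"}
        }),
    // Hypothetical data row, using the column names from the query.
    Row = [POC = "11", CLIENT = "000123", FUNCTION = "NEW", #"TRANSACTION CODE" = "AC"],
    // Replace "*" with the row's own value, otherwise keep the TT value.
    Fill = (pattern as text, actual as text) as text =>
        if pattern = "*" then actual else pattern,
    // One concatenated string per TT row, with wildcards filled in.
    Patterns = List.Transform(
        Table.ToRecords(TT),
        (c) =>
            Fill(c[PC], Row[POC])
                & Fill(c[CLIENT], Row[CLIENT])
                & Fill(c[FN], Row[FUNCTION])
                & Fill(c[TC], Row[#"TRANSACTION CODE"])),
    // The row's own concatenated string.
    RowString = Row[POC] & Row[CLIENT] & Row[FUNCTION] & Row[#"TRANSACTION CODE"],
    // true when at least one TT row matches the data row.
    IsMatch = List.Contains(Patterns, RowString)
in
    IsMatch
```

For this sample row the first TT row matches, so IsMatch is true. Concatenating without a delimiter can make different column combinations produce the same string, which is one reason to compare column by column instead.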

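After the data is refreshed, I run a VBA macro that appends it to a "database" sheet (checking for existing values, of course), which keeps the data load to a minimum. Hence the first part of the code. Basically, the code can be split into three parts:

  • Basic transformation: load from the folder, get rid of the unconventional names, and check whether the other folder already contains files with the same name, to minimize the load
  • Filtering the data: this includes merging the PQ table with the TT table, replacing the wildcards with the correct columns, and then creating the filter strings to check whether the concatenated text from the PQ table matches at least one value in the TT list
  • Final transformation of the data used, to get the information I need (mainly about late settlements on the markets)

Full M code with comments: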

    let
        /*Here starts basic data transformation to limit errors in CSV files due to 
        different conventions */
        Source = Folder.Files(source),
        #"Uppercased Text1" = Table.TransformColumns(Source,{{"Name", Text.Upper, type text}}),
        #"Merged Queries2" = Table.NestedJoin(#"Uppercased Text1", {"Name"}, q_Archive, {"Name"}, "q_Archive", JoinKind.LeftAnti),
        #"Added Custom" = Table.AddColumn(#"Merged Queries2", "Data", each Csv.Document(File.Contents([Folder Path] & "\" & [Name]),[Delimiter=";", Encoding = 1252, QuoteStyle = QuoteStyle.None])),
        #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Data"}),
        #"Added Custom1" = Table.AddColumn(#"Removed Other Columns", "Table", each Table.PromoteHeaders([Data])),
        #"Removed Other Columns1" = Table.SelectColumns(#"Added Custom1",{"Table"}),
        #"Added Custom2" = Table.AddColumn(#"Removed Other Columns1", "Upper", each Table.TransformColumnNames([Table],Text.Upper)),
        #"Removed Other Columns2" = Table.SelectColumns(#"Added Custom2",{"Upper"}),
        #"Expanded Upper" = Table.ExpandTableColumn(#"Removed Other Columns2", "Upper", {"19A AMOUNT", "19A CURRENCY CODE", "35B ISIN", "CLIENT", "EXP.SETTL.DATE", "FUNCTION", "INSTR.ID", "MESSAGE FUNCTION", "POC", "RECEPTION DATE", "SETTL.AMOUNT", "SETTL.CUR.", "TRANSACTION CODE"}, {"19A AMOUNT", "19A CURRENCY CODE", "35B ISIN", "CLIENT", "EXP.SETTL.DATE", "FUNCTION", "INSTR.ID", "MESSAGE FUNCTION", "POC", "RECEPTION DATE", "SETTL.AMOUNT", "SETTL.CUR.", "TRANSACTION CODE"}),
        #"Renamed Columns1" = Table.RenameColumns(#"Expanded Upper",{{"SETTL.AMOUNT", "SETTL.AMOUNT2"}, {"SETTL.CUR.", "SETTL.CUR.2"}, {"19A CURRENCY CODE", "19A CURRENCY CODE2"}, {"19A AMOUNT", "19A AMOUNT2"}}),
        #"Added Custom10" = Table.AddColumn(#"Renamed Columns1", "19A AMOUNT", each if[SETTL.AMOUNT2]=null then [19A AMOUNT2] else [SETTL.AMOUNT2]),
        #"Added Custom11" = Table.AddColumn(#"Added Custom10", "19A CURRENCY CODE", each if [SETTL.CUR.2] = null then [19A CURRENCY CODE2] else [SETTL.CUR.2]),
        #"Renamed Columns" = Table.RenameColumns(#"Added Custom11",{{"FUNCTION", "FUNCTION2"}}),
        #"Added Custom8" = Table.AddColumn(#"Renamed Columns", "FUNCTION", each if[FUNCTION2]=null then [MESSAGE FUNCTION] else[FUNCTION2]),
        #"Removed Other Columns3" = Table.SelectColumns(#"Added Custom8",{"35B ISIN", "CLIENT", "EXP.SETTL.DATE", "INSTR.ID", "POC", "RECEPTION DATE", "TRANSACTION CODE", "19A AMOUNT", "19A CURRENCY CODE", "FUNCTION"}),
        #"Reordered Columns" = Table.ReorderColumns(#"Removed Other Columns3",{"POC", "CLIENT", "FUNCTION", "TRANSACTION CODE", "EXP.SETTL.DATE", "RECEPTION DATE", "19A AMOUNT", "19A CURRENCY CODE"}),
        #"Replaced Value" = Table.ReplaceValue(#"Reordered Columns","""","",Replacer.ReplaceText,{"POC", "CLIENT", "INSTR.ID", "35B ISIN"}),
        #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","=","",Replacer.ReplaceText,{"POC", "CLIENT", "INSTR.ID", "35B ISIN"}),
        #"Uppercased Text" = Table.TransformColumns(#"Replaced Value1",{{"POC", Text.Upper, type text}, {"CLIENT", Text.Upper, type text}, {"FUNCTION", Text.Upper, type text}, {"TRANSACTION CODE", Text.Upper, type text}}),
        #"Filtered Rows" = Table.SelectRows(#"Uppercased Text", each ([FUNCTION] = "NEWM")),
        #"Merged Queries" = Table.NestedJoin(#"Filtered Rows", {"POC"}, tbl_setup_pocList, {"PocList"}, "tbl_setup_pocList", JoinKind.Inner),
            #"Removed Columns" = Table.RemoveColumns(#"Merged Queries",{"tbl_setup_pocList"}),
    
    
    /* Here ends the data transformation part
       and the part for list transformations start*/
            #"Added condition" = Table.AddColumn(#"Removed Columns","COND", each (
                ((Table.FromRecords({
                    [PC = List.ReplaceValue(Table.Column(tbl_filtering_string, "POC"),"*",[POC], Replacer.ReplaceText),
                    CL = List.ReplaceValue(Table.Column(tbl_filtering_string, "CLIENT"),"*",[CLIENT], Replacer.ReplaceText),
                    FN = List.ReplaceValue(Table.Column(tbl_filtering_string, "FUNCTION"),"*",[FUNCTION], Replacer.ReplaceText),
                    TC = List.ReplaceValue(Table.Column(tbl_filtering_string, "TRANSACTION CODE"),"*",[TRANSACTION CODE], Replacer.ReplaceText)]}
                ))))),
            #"Expanded COND" = Table.ExpandTableColumn(#"Added condition", "COND", {"PC", "CL", "FN", "TC"}, {"PC", "CL", "FN", "TC"}),
            #"Added Custom3" = Table.AddColumn(#"Expanded COND", "Test",  each (List.Combine(
                {
                    {_[PC]},{_[CL]},{_[FN]},{_[TC]}
                }
            ))),
            #"Expanded Test" = Table.AddColumn(#"Added Custom3", "Test2", each (Table.FromColumns(_[Test],null))),
            #"Removed Columns2" = Table.RemoveColumns(#"Expanded Test",{"PC", "CL", "FN", "TC", "Test"}),
            #"Added Custom4" = Table.AddColumn(#"Removed Columns2", "String", each Table.ToList([Test2],Combiner.CombineTextByDelimiter(""))),
            #"Removed Columns3" = Table.RemoveColumns(#"Added Custom4",{"Test2"}),
            #"Added Custom6" = Table.AddColumn(#"Removed Columns3", "CONTAIN_STR", each [POC]&[CLIENT]&[FUNCTION]&[TRANSACTION CODE]),
            #"Added Custom5" = Table.AddColumn(#"Added Custom6", "Cond", each List.Contains(_[String],[CONTAIN_STR])),
            #"Filtered Rows1" = Table.SelectRows(#"Added Custom5", each ([Cond] = false)),
    
            /*Here the code for filtering ends and final transformations occur */
    
            #"Removed Columns4" = Table.RemoveColumns(#"Filtered Rows1",{"String", "CONTAIN_STR", "Cond"}),
            #"Merged Queries1" = Table.NestedJoin(#"Removed Columns4", {"POC"}, tbl_setup_exotics, {"Exotic_PoC"}, "tbl_setup_exotics", JoinKind.LeftOuter),
            #"Expanded tbl_setup_exotics" = Table.ExpandTableColumn(#"Merged Queries1", "tbl_setup_exotics", {"Exotic_PoC"}, {"Exotic_PoC"}),
            #"Replaced Value2" = Table.ReplaceValue(#"Expanded tbl_setup_exotics",null, "Non Exotic",Replacer.ReplaceValue,{"Exotic_PoC"}),
            #"Removed Errors" = Table.RemoveRowsWithErrors(#"Replaced Value2", {"EXP.SETTL.DATE", "RECEPTION DATE"}),
            #"Changed Type" = Table.TransformColumnTypes(#"Removed Errors",{{"EXP.SETTL.DATE", type date}, {"RECEPTION DATE", type date}}),
            #"Added Custom7" = Table.AddColumn(#"Changed Type", "RD", each (if [Exotic_PoC] <> "Non Exotic" then Date.AddDays([RECEPTION DATE],1)else [RECEPTION DATE])),
            #"Filtered Rows2" = Table.AddColumn(#"Added Custom7", "LB" , each if [RD]>=[EXP.SETTL.DATE] then "Late" else "Not"),
            #"Added Custom9" = Table.AddColumn(#"Filtered Rows2", "DAYS_LATE", each [RD]-[EXP.SETTL.DATE]),
            #"Inserted Year" = Table.AddColumn(#"Added Custom9", "Year", each Date.Year([EXP.SETTL.DATE]), Int64.Type),
            #"Inserted Month" = Table.AddColumn(#"Inserted Year", "Month", each Date.Month([EXP.SETTL.DATE]), Int64.Type),
            #"Changed Type1" = Table.TransformColumnTypes(#"Inserted Month",{{"19A AMOUNT", type number}}),
            #"Grouped Rows" = Table.Group(#"Changed Type1", {"Year", "Month", "POC", "19A CURRENCY CODE", "DAYS_LATE", "LB"}, {{"Count", each Table.RowCount(_), type number}, {"Countervalue", each List.Sum([19A AMOUNT]), type text}, {"ISIN", each Text.Combine([35B ISIN],";"), type text}, {"INSTR.ID", each Text.Combine([INSTR.ID], ";"), type text}}),
            #"Merged Queries3" = Table.NestedJoin(#"Grouped Rows", {"Year", "Month", "19A CURRENCY CODE"}, q_Xrates, {"Year", "Month", "Currency"}, "q_Xrates", JoinKind.LeftOuter),
            #"Expanded q_Xrates" = Table.ExpandTableColumn(#"Merged Queries3", "q_Xrates", {"Rate"}, {"Rate"}),
            #"Replaced Value3" = Table.ReplaceValue(#"Expanded q_Xrates",null,1,Replacer.ReplaceValue,{"Rate"}),
            #"Added Col" = Table.AddColumn(#"Replaced Value3", "CV", each [Countervalue]/[Rate]),
            #"Remove Countervalue" = Table.RemoveColumns(#"Added Col", {"Countervalue"})
        in
            #"Remove Countervalue"
    

If I were to do this, I would write a function that compiles the filter condition table into a function, and then apply it with Table.SelectRows.

    // Compile the condition table into a function that can be applied in row filtering.
    filterCondition = compileFilterConditionTable(tbl_filtering_string),
    
    #"Filtered Rows" = Table.SelectRows(#"Table after Preceding Steps", filterCondition)
    
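Doesn't this make the steps easier to follow?

Below is sample code for a function that compiles the condition table into a logical function. I am not sure whether it works for your case, since I do not fully understand the requirements.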

    compileFilterConditionTable =
    
        let compileFilterConditionTable = (filterConditionTable as table) as function =>
                let recordConditions = List.Transform(
                        Table.ToRecords(filterConditionTable),
                        compileFilterConditionRecord)
                in applyCombine(recordConditions, List.AnyTrue),
    
            compileFilterConditionRecord = (cond as record) as function =>
                let fieldNameValues = List.Transform(
                        Record.FieldNames(cond),
                        each [Name = _, Value = Record.Field(cond, Name)]
                    ),
                    fieldConditions = List.Transform(fieldNameValues, compileFieldCondition)
                in applyCombine(fieldConditions, List.AllTrue),
    
            compileFieldCondition = (fieldNameValue as record) as function =>
                let name = fieldNameValue[Name],
                    value = fieldNameValue[Value]
                in
                    if value = "*" then (record as record) as logical => true
                    else (record as record) as logical => Record.Field(record, name) = value,
    
            applyCombine = (functions as list, combiner as function) as function =>
                (value) => combiner(List.Transform(functions, (f) => f(value)))
    
        in compileFilterConditionTable
    

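In any case, M is a functional programming language, so it helps to think and write in a functional way. Break the whole logic into small parts, so that each part is easy enough to understand. Write your code as small reusable functions and combine them to build the whole.

Thank you, Kosuke Sakai. First, I apologize for not replying sooner. I really could not understand your code, as it is very abstract and I am more used to OOP (VBA, Python). I accepted it as the answer. It is not exactly what I needed (basically your fn returns the "true" values and I need the false ones), but it was really useful. I made some adjustments to where "Table.Buffer" is placed, and the query now runs in about 5 minutes instead of 60+. Would you always recommend this approach? Or do you think multiple conditions (50+) could also be filtered some other way? Thank you.

@MarekKlu Thank you for the comment! There is surely a way to do it, but honestly I have no better idea than what I showed above.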
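The two adjustments mentioned in the comment, keeping the rows that match no condition and buffering the condition data, might look like the sketch below. It is self-contained and uses a compact matcher rather than the compile functions above. The sample tables are hypothetical, and where exactly Table.Buffer was placed in the real query is not stated, so buffering the condition table here is an assumption.

```powerquery
let
    // Hypothetical sample data standing in for the transformed CSV rows.
    Data = #table(
        {"POC", "CLIENT", "FUNCTION", "TRANSACTION CODE"},
        {
            {"11", "000123", "NEW", "AC"},
            {"47", "000001", "NEW", "XX"},
            {"52", "000777", "NEW", "AC"}
        }),
    // Hypothetical condition table, buffered so it is read only once.
    Conditions = Table.Buffer(#table(
        {"POC", "CLIENT", "FUNCTION", "TRANSACTION CODE"},
        {
            {"11", "*", "NEW", "AC"},
            {"47", "000001", "NEW", "*"}
        })),
    ConditionRecords = List.Buffer(Table.ToRecords(Conditions)),
    // A row matches a condition when every field is "*" or equal to the row value.
    MatchesCondition = (row as record, cond as record) as logical =>
        List.AllTrue(
            List.Transform(
                Record.FieldNames(cond),
                (name) =>
                    Record.Field(cond, name) = "*"
                        or Record.Field(cond, name) = Record.Field(row, name))),
    // A row matches the table when it matches at least one condition.
    MatchesAny = (row as record) as logical =>
        List.AnyTrue(
            List.Transform(ConditionRecords, (cond) => MatchesCondition(row, cond))),
    // Negate the predicate to keep only rows that match no condition.
    Kept = Table.SelectRows(Data, each not MatchesAny(_))
in
    Kept
```

With this sample data only the row with POC 52 remains, since the other two each match a condition. The condition table must contain only columns that also exist in the data, because every condition field is looked up by name in the row.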