Converting repeated data from rows to columns in Excel
I have a basic housing dataset in the following format:

Existing data format: [image]

The format is identical for every record and repeats for hundreds of properties. I would like to convert it into a table like the example below:

Property type   Price   Location   Region   Additional data   Area
House   252000   London   Kensington   4500 sq m
...   ...   ...   ...   ...
etc.
Based on my screenshot below, in Excel 365 I used the following formulas:
C2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,1,4)),": ","</s><s>")&"</s></t>","//s[last()]")
D2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,2,4)),": ","</s><s>")&"</s></t>","//s[last()]")
E2=FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,3,4)),",","</s><s>"),":","</s><s>")&"</s></t>","//s[2]")
F2=FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,3,4)),",","</s><s>"),":","</s><s>")&"</s></t>","//s[last()-1]")
H2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,4,4)),": ","</s><s>")&"</s></t>","//s[last()]")
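Basically, =ROW(A1)+(ROW(A1)-1)*3, filled down, generates the row-number sequence 1, 5, 9, ..., and INDEX($A:$A,ROW($A1)+(ROW($A1)-1)*3) returns the values from column A at those rows. SEQUENCE(COUNTA($A:$A)/4,1,1,4) in the formulas above produces the same sequence as a single spilled array; changing its third argument to 2, 3 or 4 picks the second, third or fourth line of each four-line block. FILTERXML() then returns the value selected by the XPath argument.

To learn how FILTERXML() works, you can read the article by JvdV. It is an excellent article for FILTERXML() lovers.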
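To see what each piece contributes, the parts can be tried one at a time in scratch cells. This is only a sketch: the text `Property type: House` is a hypothetical cell value (the actual contents of column A are not shown in the question), and FILTERXML is not available in Excel for Mac or Excel for the web. Note also that COUNTA($A:$A)/4 in the formulas above assumes every property occupies exactly four non-empty rows in column A.

```
=SEQUENCE(3,1,1,4)
=ROW(A1)+(ROW(A1)-1)*3
="<t><s>"&SUBSTITUTE("Property type: House",": ","</s><s>")&"</s></t>"
=FILTERXML("<t><s>Property type</s><s>House</s></t>","//s[last()]")
```

The first formula spills 1, 5, 9. The second returns the same three numbers when filled down three rows, which is the non-spilling equivalent. The third builds the string `<t><s>Property type</s><s>House</s></t>`, and the fourth picks the last `s` node from that string and returns House.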
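You can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel.
- Select some cell in your original table
- Data => Get & Transform => From Table/Range
- When the PQ UI opens, navigate to Home => Advanced Editor
- Make a note of the table name in line 2 of the code
- Replace the existing code with the M code below
- Change the table name in line 2 of the pasted code to your "real" table name
- Examine the comments, and also the Applied Steps window, to better understand the algorithm and steps

Note: the fnPivotAll function is a custom function that creates a non-aggregated pivot table where there are multiple values per pivot column. From the UI, add it as a New Query from Blank Query, paste its M code in place of what is there, and name the query fnPivotAll so that the main query can call it.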
Comment on the question: What is your Excel version?

M code (for the main query)
let
    //Read in data
    //Change table name in next line to your actual table name
    Source = Excel.CurrentWorkbook(){[Name="Table1_2"]}[Content],
    //Split by comma into new rows
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1",
        Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),
        let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
    //Remove the blank rows
    #"Filtered Rows" = Table.SelectRows(#"Split Column by Delimiter", each ([Column1] <> "" and [Column1] <> " ")),
    //Split by the rightmost colon only into new columns
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Filtered Rows", "Column1",
        Splitter.SplitTextByEachDelimiter({":"}, QuoteStyle.Csv, true), {"Column1.1", "Column1.2"}),
    //Split by the remaining colon into new rows
    // So as to have empty rows under "Additional data"
    //Then Trim the columns to remove leading/trailing spaces
    #"Split Column by Delimiter2" = Table.ExpandListColumn(Table.TransformColumns(#"Split Column by Delimiter1", {{"Column1.1", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1.1"),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Column1.1", type text}, {"Column1.2", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Column1.1", Text.Trim, type text}, {"Column1.2", Text.Trim, type text}}),
    //Create new column processing "Additional Data" to show a blank
    // and Price to just show the numeric value, splitting from "EUR"
    #"Added Custom" = Table.AddColumn(#"Trimmed Text", "Custom", each if [Column1.1] = "Additional data" then " "
        else if [Column1.1] = "Price" then Text.Split([Column1.2]," "){1} else [Column1.2]),
    //Remove unneeded column
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Column1.2"}),
    //non-aggregated pivot
    pivot = fnPivotAll(#"Removed Columns","Column1.1","Custom"),
    //set data types (frequently a good idea in PQ)
    #"Changed Type1" = Table.TransformColumnTypes(pivot,{
        {"Property type", type text},
        {"Location", type text},
        {"region", type text},
        {"Additional data", type text},
        {"Area", type text},
        {"Price", Currency.Type}})
in
    #"Changed Type1"
//Custom function fnPivotAll: paste into a separate blank query named fnPivotAll
//credit: Cam Wallace https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/
(Source as table,
    ColToPivot as text,
    ColForValues as text)=>
let
    PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
    #"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
    TableFromRecordOfLists = (rec as record, fieldnames as list) =>
        let
            PartialRecord = Record.SelectFields(rec,fieldnames),
            RecordToList = Record.ToList(PartialRecord),
            Table = Table.FromColumns(RecordToList,fieldnames)
        in
            Table,
    #"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
    #"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
    #"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
    #"Expanded Values"