xmlstarlet: extract the value of a child HTML element that is repeated with different values
Goal: use xmlstarlet to extract the values of a child element that is repeated with different values, and export them as CSV.
Data:
<?xml version="1.0" encoding="UTF-8"?>
<Library export_date="2020-01-15">
<Book id="1001">
<Title>Book 1</Title>
<Date value="2019-05-16"/>
<Author value="Name 1"/>
<Author value="Name 2"/>
<Author value="Name 3"/>
<Author value="Name 4"/>
<Author value="Name 5"/>
<Author value="Name 6"/>
<Author value="Name 7"/>
<Author value="Name 8"/>
<Author value="Name 9"/>
<Author value="Name 10"/>
<Author value="Name 11"/>
<Author value="Name 12"/>
<Author value="Name 13"/>
</Book>
</Library>
Command:
xmlstarlet \
sel -T -t -m /Library/Book \
-v "concat('&quot;','Title','&quot;,&quot;',Author/@value,'&quot;')" \
-n library_books.xml \
> output.csv
# Remove xmlstarlet quotation bypass, convert to actual quotation
sed -i .bak 's|&quot;|\"|g' output.csv
CSV output:
"Title","Name 1"
Desired CSV output:
"Title","Name 1; Name 2; Name 3; Name 4; Name 5; Name 6; Name 7; Name 8; Name 9; Name 10; Name 11; Name 12; Name 13"
Optional desired CSV output (values on new lines):
"Title","Name 1
Name 2
Name 3
Name 4
Name 5
Name 6
Name 7
Name 8
Name 9
Name 10
Name 11
Name 12
Name 13"
Try the following on your system:
xmlstarlet sel -t -v "concat('Title: ',//Title, ' ')" -n -v "//Author/@value" min_library_books.xml > output.csv
My output is your second option.
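For context on why this command behaves differently from the one in the question: in XPath 1.0, a node-set passed to a string function such as concat() is converted with string(), which keeps only the first node in document order. That explains the single "Name 1" in the question's output. A bare -v on the node-set, as far as I know, prints every matching value on its own line in xmlstarlet 1.x. A minimal sketch, assuming a hypothetical trimmed file named authors_demo.xml:

```shell
# Hypothetical trimmed input (three authors) used only for this demonstration.
cat > authors_demo.xml <<'EOF'
<Library>
  <Book id="1001">
    <Title>Book 1</Title>
    <Author value="Name 1"/>
    <Author value="Name 2"/>
    <Author value="Name 3"/>
  </Book>
</Library>
EOF

# concat() stringifies the node-set, so only the first Author value survives.
xmlstarlet sel -T -t \
  -v "concat('first: ', /Library/Book/Author/@value)" -n \
  authors_demo.xml

# A bare -v on the same node-set should print one value per line.
xmlstarlet sel -T -t \
  -v "/Library/Book/Author/@value" -n \
  authors_demo.xml
```

So joining the values with a custom separator needs an explicit loop over Author (a nested -m) rather than a single concat() call.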
Unfortunately, this output is not correct: the appended Author element values are not wrapped in quotes ("Name 1 Name 2…"), so the CSV is broken when opened.
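One way to get the quoted, semicolon-joined field might be to emit the quotes with -o and loop over Author with a nested -m, adding the separator only between items. This is a sketch, not a tested answer from the thread. It assumes the file name library_books.xml from the question (trimmed here to three authors) and prints the Title element's text with -v Title rather than the literal string 'Title' used in the question:

```shell
# Sample input from the question, trimmed to three authors for brevity.
cat > library_books.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<Library export_date="2020-01-15">
  <Book id="1001">
    <Title>Book 1</Title>
    <Date value="2019-05-16"/>
    <Author value="Name 1"/>
    <Author value="Name 2"/>
    <Author value="Name 3"/>
  </Book>
</Library>
EOF

# For each Book: open quote, title, then every Author value joined by "; ".
# -i ... -b adds the separator after every author except the last one;
# the second -b closes the inner -m Author loop.
xmlstarlet sel -T -t \
  -m /Library/Book \
    -o '"' -v Title -o '","' \
    -m Author \
      -v @value \
      -i 'position() != last()' -o '; ' -b \
    -b \
    -o '"' -n \
  library_books.xml > output.csv

cat output.csv
```

With the trimmed input I would expect a single line, "Book 1","Name 1; Name 2; Name 3". Replacing -o '; ' with -n should give the newline-separated variant. Note that values containing a double quote would still need the quote doubled to remain valid CSV, which this sketch does not handle.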