MediaWiki: How to get the text of a specific section through the Wikipedia API
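I want to extract only one specific section from a Wikipedia page. For example, I want to extract the text of the "Parts" section of the Wikipedia article "House". The resulting text would be:
Many houses have several large rooms ..... sections of the home (including in more recent eras a garage).
Fetching the whole text of an article is straightforward, but how do I get the text of a specific section?
Do you need the wikitext, or the parser-generated HTML converted to plain text? The examples below use the "Layout" section (the third section of the article; any other section index works the same way).
To retrieve the parsed HTML of a specific section, use the parse API, either in the API sandbox or as a direct API request, and pass the section index. If you want the wikitext of that section instead, use the wikitext prop rather than the text prop.
To find out which section has which index, query the sections prop without giving any section index.
So, as a complete example of retrieving the text of the Layout section using only the API, you can query the section list and then request the wikitext of section 3. The two responses look like this: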
{
"parse": {
"title": "House",
"pageid": 13590,
"sections": [
{
"toclevel": 1,
"level": "2",
"line": "Etymology",
"number": "1",
"index": "1",
"fromtitle": "House",
"byteoffset": 3549,
"anchor": "Etymology"
},
{
"toclevel": 1,
"level": "2",
"line": "Elements",
"number": "2",
"index": "2",
"fromtitle": "House",
"byteoffset": 4960,
"anchor": "Elements"
},
{
"toclevel": 2,
"level": "3",
"line": "Layout",
"number": "2.1",
"index": "3",
"fromtitle": "House",
"byteoffset": 4976,
"anchor": "Layout"
},
{
"toclevel": 2,
"level": "3",
"line": "Parts",
"number": "2.2",
"index": "4",
"fromtitle": "House",
"byteoffset": 6432,
"anchor": "Parts"
},
{
"toclevel": 2,
"level": "3",
"line": "History of the interior",
"number": "2.3",
"index": "5",
"fromtitle": "House",
"byteoffset": 7539,
"anchor": "History_of_the_interior"
},
{
"toclevel": 3,
"level": "4",
"line": "Communal rooms",
"number": "2.3.1",
"index": "6",
"fromtitle": "House",
"byteoffset": 8786,
"anchor": "Communal_rooms"
},
{
"toclevel": 3,
"level": "4",
"line": "Interconnecting rooms",
"number": "2.3.2",
"index": "7",
"fromtitle": "House",
"byteoffset": 9736,
"anchor": "Interconnecting_rooms"
},
{
"toclevel": 3,
"level": "4",
"line": "Corridor",
"number": "2.3.3",
"index": "8",
"fromtitle": "House",
"byteoffset": 11126,
"anchor": "Corridor"
},
{
"toclevel": 3,
"level": "4",
"line": "Employment-free house",
"number": "2.3.4",
"index": "9",
"fromtitle": "House",
"byteoffset": 13092,
"anchor": "Employment-free_house"
},
{
"toclevel": 2,
"level": "3",
"line": "Work location, technology and doctors",
"number": "2.4",
"index": "10",
"fromtitle": "House",
"byteoffset": 15969,
"anchor": "Work_location,_technology_and_doctors"
},
{
"toclevel": 3,
"level": "4",
"line": "Technology and privacy",
"number": "2.4.1",
"index": "11",
"fromtitle": "House",
"byteoffset": 17291,
"anchor": "Technology_and_privacy"
},
{
"toclevel": 1,
"level": "2",
"line": "Construction",
"number": "3",
"index": "12",
"fromtitle": "House",
"byteoffset": 18782,
"anchor": "Construction"
},
{
"toclevel": 2,
"level": "3",
"line": "Energy efficiency",
"number": "3.1",
"index": "13",
"fromtitle": "House",
"byteoffset": 21899,
"anchor": "Energy_efficiency"
},
{
"toclevel": 2,
"level": "3",
"line": "Earthquake protection",
"number": "3.2",
"index": "14",
"fromtitle": "House",
"byteoffset": 23057,
"anchor": "Earthquake_protection"
},
{
"toclevel": 1,
"level": "2",
"line": "Found materials",
"number": "4",
"index": "15",
"fromtitle": "House",
"byteoffset": 25172,
"anchor": "Found_materials"
},
{
"toclevel": 1,
"level": "2",
"line": "Legal issues",
"number": "5",
"index": "16",
"fromtitle": "House",
"byteoffset": 26235,
"anchor": "Legal_issues"
},
{
"toclevel": 2,
"level": "3",
"line": "United Kingdom",
"number": "5.1",
"index": "17",
"fromtitle": "House",
"byteoffset": 26644,
"anchor": "United_Kingdom"
},
{
"toclevel": 1,
"level": "2",
"line": "Identifying houses",
"number": "6",
"index": "18",
"fromtitle": "House",
"byteoffset": 26922,
"anchor": "Identifying_houses"
},
{
"toclevel": 1,
"level": "2",
"line": "Animal houses",
"number": "7",
"index": "19",
"fromtitle": "House",
"byteoffset": 27397,
"anchor": "Animal_houses"
},
{
"toclevel": 1,
"level": "2",
"line": "Houses and symbolism",
"number": "8",
"index": "20",
"fromtitle": "House",
"byteoffset": 27826,
"anchor": "Houses_and_symbolism"
},
{
"toclevel": 1,
"level": "2",
"line": "See also",
"number": "9",
"index": "21",
"fromtitle": "House",
"byteoffset": 28620,
"anchor": "See_also"
},
{
"toclevel": 1,
"level": "2",
"line": "References",
"number": "10",
"index": "22",
"fromtitle": "House",
"byteoffset": 29690,
"anchor": "References"
},
{
"toclevel": 1,
"level": "2",
"line": "External links",
"number": "11",
"index": "23",
"fromtitle": "House",
"byteoffset": 29720,
"anchor": "External_links"
}
]
}
}
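The lookup step can be sketched in Python as follows. This is only an illustration, not part of the original answer: the function name `find_section_index` is hypothetical, and `sample` is a trimmed copy of the response above.

```python
import json


def find_section_index(parse_response, heading):
    """Return the "index" of the first section whose "line" equals heading, or None."""
    for section in parse_response.get("parse", {}).get("sections", []):
        if section.get("line") == heading:
            return section.get("index")
    return None


# Trimmed copy of the sections response shown above (only two entries kept).
sample = json.loads("""
{"parse": {"title": "House", "pageid": 13590, "sections": [
  {"toclevel": 1, "level": "2", "line": "Elements", "number": "2", "index": "2"},
  {"toclevel": 2, "level": "3", "line": "Layout", "number": "2.1", "index": "3"}
]}}
""")

print(find_section_index(sample, "Layout"))  # prints 3
```

Note that the API returns the index as a string ("3"), so it can be passed back unchanged as the section parameter of the next request.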
Requesting section 3 (Layout) with the wikitext prop returns:
{
"parse": {
"title": "House",
"pageid": 13590,
"wikitext": {
"*": "=== Layout ===\n[[File:Gingerbread House Essex CT.jpg|thumb|Example of an early [[Victorian architecture|Victorian]] \"Gingerbread House\" in [[Connecticut]], United States, built in 1855]]\n\nIdeally, [[architect]]s of houses design [[room]]s to meet the needs of the people who will live in the house. [[Feng shui]], originally a [[China|Chinese]] method of moving houses according to such factors as rain and micro-climates, has recently expanded its scope to address the design of interior spaces, with a view to promoting harmonious effects on the people living inside the house, although no actual effect has ever been demonstrated. Feng shui can also mean the \"aura\" in or around a dwelling, making it comparable to the [[real estate|real-estate]] sales concept of \"indoor-outdoor flow\".\n\nThe [[square footage]] of a house in the United States reports the area of \"living space\", excluding the garage and other non-living spaces. The \"square metres\" figure of a house in Europe <!-- including Malta ? --> reports the area of the walls enclosing the home, and thus includes any attached garage and non-living spaces.<ref>{{Cite book|title=Land Management: Challenges and Strategies (First Edition)|last=Iyyer|first=Chaitanya|publisher=Global India Publications Pvt Ltd|year=2009|isbn=978-9380228488|location=|pages=}}</ref>{{Citation needed|date=February 2007}} The number of floors or levels making up the house can affect the square footage of a home."
}
}
}
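Putting both requests together, a minimal Python sketch might look like the one below. The endpoint URL (English Wikipedia), the User-Agent string, and the helper names `build_parse_url`, `fetch_json` and `get_section_wikitext` are assumptions made for illustration. The parameters action=parse, page, prop, section and format are the ones the parse API accepts, and the `"*"` key matches the response format shown above.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the English Wikipedia API. Other wikis use their own api.php URL.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_parse_url(page, prop, section=None):
    """Build an action=parse request URL; section is optional."""
    params = {"action": "parse", "page": page, "prop": prop, "format": "json"}
    if section is not None:
        params["section"] = str(section)
    return API_URL + "?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Fetch a URL and decode the JSON body (requires network access)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "section-example/0.1 (illustrative script)"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def get_section_wikitext(page, heading):
    """Look up the section index by heading, then request that section's wikitext."""
    sections = fetch_json(build_parse_url(page, "sections"))["parse"]["sections"]
    for section in sections:
        if section["line"] == heading:
            data = fetch_json(build_parse_url(page, "wikitext", section["index"]))
            return data["parse"]["wikitext"]["*"]
    return None  # no section with that heading
```

With network access, `get_section_wikitext("House", "Layout")` would return the wikitext of whatever the Layout section currently contains. Passing "text" instead of "wikitext" as the prop gives the parsed HTML, which is then found under the "text" key of the response.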
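Background:
The concept of sections within a page is not (yet) integrated into revisions. A revision is "just" the content of the whole page plus additional metadata (for example, in multiple other slots), whereas sections are part of the content, which is only one slot of the revision. That is why the revisions query API can only give you the whole text. The page has to be parsed to find out what its sections are: sections are a wikitext concept, so the parser has to be involved.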
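To illustrate why a parsing step is needed, here is a deliberately naive Python sketch that splits whole-page wikitext on heading lines. It is not how the MediaWiki parser works: it ignores templates, comments, nowiki blocks and HTML headings, and it cuts each body at the next heading of any level, whereas the API's section parameter also returns the subsections. The name `split_sections` and the sample text are made up for the example.

```python
import re

# A heading line: 2 to 6 "=" signs, a title, and the same number of "=" signs.
HEADING = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def split_sections(wikitext):
    """Return a list of (level, title, body) tuples, one per heading found."""
    matches = list(HEADING.finditer(wikitext))
    sections = []
    for i, match in enumerate(matches):
        # The body runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(wikitext)
        body = wikitext[match.end():end].strip()
        sections.append((len(match.group(1)), match.group(2), body))
    return sections


sample_wikitext = (
    "Lead paragraph.\n"
    "== Elements ==\n"
    "=== Layout ===\n"
    "Ideally, architects design rooms.\n"
    "=== Parts ===\n"
    "Many houses have several large rooms.\n"
)

for level, title, body in split_sections(sample_wikitext):
    print(level, title, repr(body))
```

For anything beyond simple pages, asking the parse API for the section list is the more reliable route, since it applies the real parser's rules.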