Python: extract data such as phone, email, address, and ID from an Excel cell
Tags: python, excel, vba, pandas
I have data like the following in a single Excel cell:
56. MEMBER ID 2100343-219
ZAHID BROTHERS
MONTGOMERY BAZAR FAISALABAD TEL : 041-2646252
MOBILE : 0300-0321-9663180 FAX :
E-MAIL :
REP : HAJI MUHAMMAD ABID
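I am looking for ideas on how to extract each detail and build a proper Excel sheet from it. I would prefer pandas, but any workable solution is acceptable.
Edit
I used the following code to extract the required information. It relies on the font of each row and on tag names in column 2 (generated separately).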
Sub convert()
    ' Folds the line-per-row data on Sheet1 into one row per member on Sheet2.
    ' Column 1 holds the text, column 2 holds the tag ("TEL ", "MOBILE ", ...).
    On Error Resume Next
    Dim x As Long
    Dim i As Long
    Dim addressString
    i = 1
    For x = 2 To 37093
        If Sheet1.Cells(x, 1).Font.Name = "Arial Bold" Then
            ' A bold row is a member name: start a new output row
            i = i + 1
            Sheet2.Cells(i, 1) = Sheet1.Cells(x - 1, 1)
            Sheet2.Cells(i, 2) = Sheet1.Cells(x, 1)
        Else
            If Sheet1.Cells(x, 2) = "TEL " Then Sheet2.Cells(i, 3) = " " & Sheet1.Cells(x, 1)
            If Sheet1.Cells(x, 2) = "MOBILE " Then Sheet2.Cells(i, 4) = " " & Sheet1.Cells(x, 1)
            If Sheet1.Cells(x, 2) = "FAX " Then Sheet2.Cells(i, 5) = " " & Sheet1.Cells(x, 1)
            If Sheet1.Cells(x, 2) = "E-MAIL " Then Sheet2.Cells(i, 6) = " " & Sheet1.Cells(x, 1)
            If Sheet1.Cells(x, 2) = "REP " Then Sheet2.Cells(i, 7) = " " & Sheet1.Cells(x, 1)
            ' Untagged rows are appended to column 8
            If Sheet1.Cells(x, 2) = "" Then Sheet2.Cells(i, 8) = Sheet2.Cells(i, 8) & " " & Sheet1.Cells(x, 1)
        End If
    Next x
    'TEL
    'MOBILE
    'FAX
    'E-MAIL
    'REP
End Sub
Please try the following function:
Function ExtractDataFromCell(x As String) As Variant
    ' Returns a 4-element array: member ID, telephone, mobile, representative.
    ' Assumes the cell holds the six linefeed-separated lines shown in the question.
    Dim arr As Variant, arrfin(3) As String
    Dim start As Long, length As Long

    arr = Split(x, vbLf)
    If UBound(arr) < 5 Then
        ' Not the expected layout: return four empty strings
        ExtractDataFromCell = arrfin
        Exit Function
    End If

    ' Line 1: the member ID is the text after the last space
    arrfin(0) = Mid(arr(0), InStrRev(arr(0), " ") + 1)
    ' Line 3: the telephone number is the text after the last space
    arrfin(1) = Mid(arr(2), InStrRev(arr(2), " ") + 1)
    ' Line 4: the mobile number sits between the first ":" and " FAX"
    start = InStr(arr(3), ":") + 1
    length = InStr(arr(3), " FAX") - start
    If start > 1 And length > 0 Then arrfin(2) = Trim(Mid(arr(3), start, length))
    ' Line 6: the representative is everything after the first ":"
    start = InStr(arr(5), ":")
    If start > 0 Then arrfin(3) = Trim(Mid(arr(5), start + 1))

    ExtractDataFromCell = arrfin
End Function
It can be called like this:
Sub testExtractData()
    Dim arr As Variant
    arr = ExtractDataFromCell(ActiveCell.Value)
    Debug.Print "MembID: " & arr(0)
    Debug.Print "Tel: " & arr(1)
    Debug.Print "Mob: " & arr(2)
    Debug.Print "REP: " & arr(3)
End Sub
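Since the question prefers Python, here is a minimal sketch of the same per-cell extraction using regular expressions instead of fixed line positions. It assumes the labels are exactly those in the sample cell (TEL, MOBILE, FAX, E-MAIL, REP), each followed by a colon; the names `extract_fields`, `LABELS` and `SAMPLE` are illustrative and not part of any answer above.

```python
import re

# The sample cell from the question, with linefeeds between the six lines.
SAMPLE = (
    "56. MEMBER ID 2100343-219\n"
    "ZAHID BROTHERS\n"
    "MONTGOMERY BAZAR FAISALABAD TEL : 041-2646252\n"
    "MOBILE : 0300-0321-9663180 FAX :\n"
    "E-MAIL :\n"
    "REP : HAJI MUHAMMAD ABID"
)

# Assumption: these are the only labels, and each is followed by a colon.
LABELS = ["TEL", "MOBILE", "FAX", "E-MAIL", "REP"]
LABEL_RE = re.compile(r"\b(" + "|".join(re.escape(lbl) for lbl in LABELS) + r")\s*:")


def extract_fields(cell_text):
    """Return a dict with the member ID, name, address and labelled values."""
    fields = {}
    member = re.search(r"MEMBER ID\s+(\S+)", cell_text)
    fields["MEMBER ID"] = member.group(1) if member else ""

    matches = list(LABEL_RE.finditer(cell_text))

    # Everything before the first label: member line, name line, address lines.
    head = cell_text[: matches[0].start()] if matches else cell_text
    head_lines = [ln.strip() for ln in head.splitlines() if ln.strip()]
    fields["NAME"] = head_lines[1] if len(head_lines) > 1 else ""
    fields["ADDRESS"] = " ".join(head_lines[2:])

    # Each labelled value runs until the next label or the end of the text.
    for pos, m in enumerate(matches):
        end = matches[pos + 1].start() if pos + 1 < len(matches) else len(cell_text)
        fields[m.group(1)] = cell_text[m.end():end].strip()
    return fields
```

Because the values are located by label rather than by line number, an empty or missing field does not shift the others. With pandas installed, a list of these dicts can be passed to `pandas.DataFrame(...)` and written out with `DataFrame.to_excel`.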
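Assuming your data is in a fixed format, with six lines separated by linefeeds, and that it follows the layout you have shown, you can do this with Power Query (available in Excel 2010+).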
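Algorithm
- Split the column into new rows by the linefeed (lf) delimiter
- Add an Index column, then an Integer/Divide column (6) to obtain a sequence of numbers on which we can group the original rows (the sequence will look like {0,0,0,0,0,0,1,1,1,1,1,1,2,...})
- Group the original data by the Integer/Divide column
- Extract each element, using various split functions and indexing to return what is needed; this relies on each element, and each lf, being in the same position as in the example shown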
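The steps above can be sketched in plain Python as follows. This is not the Power Query M code, only an illustration of the same idea: number the split lines, integer-divide the index by 6, group on the result, then pick fields by position. The names `group_records`, `parse_record` and `LINES_PER_RECORD` are hypothetical, and the sketch assumes every record has exactly six lines.

```python
from itertools import groupby

LINES_PER_RECORD = 6  # assumption from the answer: six lines per record

SAMPLE = (
    "56. MEMBER ID 2100343-219\n"
    "ZAHID BROTHERS\n"
    "MONTGOMERY BAZAR FAISALABAD TEL : 041-2646252\n"
    "MOBILE : 0300-0321-9663180 FAX :\n"
    "E-MAIL :\n"
    "REP : HAJI MUHAMMAD ABID"
)


def group_records(column_cells):
    """Split cells on linefeeds, index the lines, and group them by index // 6."""
    lines = [line for cell in column_cells for line in cell.split("\n")]
    indexed = list(enumerate(lines))  # plays the role of the Index column
    records = []
    # index // 6 plays the role of the Integer/Divide column
    for _, chunk in groupby(indexed, key=lambda pair: pair[0] // LINES_PER_RECORD):
        records.append([line for _, line in chunk])
    return records


def parse_record(rec):
    """Pick fields by line position; only valid for the six-line layout."""
    return {
        "member_id": rec[0].rsplit(" ", 1)[-1],
        "name": rec[1].strip(),
        "address": rec[2].split(" TEL")[0].strip(),
        "tel": rec[2].partition(":")[2].strip(),
        "mobile": rec[3].partition(":")[2].split("FAX")[0].strip(),
        "fax": rec[3].partition("FAX")[2].lstrip(" :").strip(),
        "email": rec[4].partition(":")[2].strip(),
        "rep": rec[5].partition(":")[2].strip(),
    }
```

A call such as `[parse_record(r) for r in group_records(cells)]` yields one dict per record. Like the Power Query approach, this breaks if a record has more or fewer than six lines, so it is only suitable when the layout is truly fixed.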
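M Code
If you place your cursor in the data column and choose Data --> Get & Transform --> From Table/Range, Excel will create a table and label the column Column1. If you then paste the M code into the Advanced Editor in PQ and change the table name in line 2 to whatever name was assigned to your data, the query should work. You can step through the Applied Steps area to see what is happening.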
I made many attempts to solve this in a simple way, and came up with the solution below.
Sub convert()
    ' Folds the line-per-row data on Sheet1 into one row per member on Sheet2.
    ' Column 1 holds the text, column 2 holds the tag ("TEL ", "MOBILE ", ...).
    On Error Resume Next
    Dim x As Long
    Dim i As Long
    Dim addressString
    i = 1
    For x = 2 To 37093
        If Sheet1.Cells(x, 1).Font.Name = "Arial Bold" Then
            ' A bold row is a member name: start a new output row
            i = i + 1
            Sheet2.Cells(i, 1) = Sheet1.Cells(x - 1, 1)
            Sheet2.Cells(i, 2) = Sheet1.Cells(x, 1)
        Else
            If Sheet1.Cells(x, 2) = "TEL " Then Sheet2.Cells(i, 3) = " " & Sheet1.Cells(x, 1)
            If Sheet1.Cells(x, 2) = "MOBILE " Then Sheet2.Cells(i, 4) = " " & Sheet1.Cells(x, 1)
            If Sheet1.Cells(x, 2) = "FAX " Then Sheet2.Cells(i, 5) = " " & Sheet1.Cells(x, 1)
            If Sheet1.Cells(x, 2) = "E-MAIL " Then Sheet2.Cells(i, 6) = " " & Sheet1.Cells(x, 1)
            If Sheet1.Cells(x, 2) = "REP " Then Sheet2.Cells(i, 7) = " " & Sheet1.Cells(x, 1)
            ' Untagged rows are appended to column 8
            If Sheet1.Cells(x, 2) = "" Then Sheet2.Cells(i, 8) = Sheet2.Cells(i, 8) & " " & Sheet1.Cells(x, 1)
        End If
    Next x
    'TEL
    'MOBILE
    'FAX
    'E-MAIL
    'REP
End Sub
Sub setit()
    ' If a row is "Arial Bold", set the row directly below it to plain "Arial"
    Dim x As Long
    For x = 2 To 37093
        If Sheet1.Cells(x, 1).Font.Name = "Arial Bold" Then Sheet1.Cells(x + 1, 1).Font.Name = "Arial"
    Next x
End Sub
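I set the font of the "name" rows to one single font and put the category type on each row. For example, if a row holds "TEL", column B must state "TEL" for it, and likewise for the other categories. The code above worked for me.

@Husnain This is a fairly simple task and many answers already exist on Stack Overflow. Can you tell us what you have tried?

=REPLACE(MAX(IFERROR(MID(SUBSTITUTE(A19," ",""),ROW(INDIRECT("1:"&LEN(SUBSTITUTE(A19," ","")))),8)+0,"")),5,,"-") to extract the phone number

@Husnain Iqbal: Did you find time to test the function above?

I am trying. I am a newbie.

@HusnainIqbal: Then the simplest way to test it is the following. Copy the code above (the function and the sub) into a standard module. Select the cell containing the string you want to extract from, then run testExtractData(). You can see the result in the Immediate Window (in the VBE, press Ctrl+G). After testing, you can call the function inside a loop. If needed, once you confirm that the function works as you require, I can show you how to do that...

Thanks for your help. The code generates an error at the following line: strMob = Mid(arr(i), start + 1, length):

@Husnain Iqbal: Please test the code on a cell whose content is exactly what you posted in the question. You can simply copy it from the question... If the code works there, please post the cell string for which it returns the error. I will try to adapt the code to handle both cases, if that is possible, of course... You should post all the string combinations that can occur...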
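For comparison, the tagged-row approach of convert() can be sketched in Python. This is an illustration only, not a translation the asker posted. It assumes each input row is a tuple of (text, tag, is_name_row), where is_name_row stands in for the "Arial Bold" font test; `fold_rows` and `TAG_TO_FIELD` are hypothetical names, and the second member in the usage data is made up for the example.

```python
# Tags as used in column 2 of the sheet, mapped to output field names.
TAG_TO_FIELD = {
    "TEL": "tel",
    "MOBILE": "mobile",
    "FAX": "fax",
    "E-MAIL": "email",
    "REP": "rep",
}


def fold_rows(rows):
    """Fold (text, tag, is_name_row) tuples into one dict per member."""
    records = []
    previous_text = ""
    for text, tag, is_name_row in rows:
        if is_name_row:
            # The row just before a name row is the member ID line. If it was
            # already collected as untagged text of the previous member, drop it.
            if records and records[-1]["other"][-1:] == [previous_text.strip()]:
                records[-1]["other"].pop()
            records.append({"member_line": previous_text, "name": text, "other": []})
        elif records:
            field = TAG_TO_FIELD.get(tag.strip())
            if field:
                records[-1][field] = text.strip()
            elif not tag.strip():
                # Untagged rows (for example the address) are kept in order.
                records[-1]["other"].append(text.strip())
        previous_text = text
    return records


ROWS = [
    ("56. MEMBER ID 2100343-219", "", False),
    ("ZAHID BROTHERS", "", True),
    ("MONTGOMERY BAZAR FAISALABAD", "", False),
    ("041-2646252", "TEL ", False),
    ("0300-0321-9663180", "MOBILE ", False),
    ("", "FAX ", False),
    ("", "E-MAIL ", False),
    ("HAJI MUHAMMAD ABID", "REP ", False),
    # Hypothetical second member, invented for this example only.
    ("57. MEMBER ID 0000000-000", "", False),
    ("EXAMPLE TRADERS", "", True),
]
```

Unlike the VBA macro, this sketch removes the next member's ID line from the previous member's untagged text. With pandas installed, `pandas.DataFrame(fold_rows(ROWS))` would turn the result into a table that can be written with `DataFrame.to_excel`.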