Linux kernel add_timer causes kernel stack dump with multiple PCI boards


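We use FPGA cards with a PCI Express driver to move data via a DMA engine. This all works with one card in a machine, but fails with two. As an initial investigation, I have narrowed the error down to the add_timer function, which is used to set up a polling mechanism. When the driver module is added with insmod, a stack trace is generated, apparently because the polling timer routine is the same for both instances. The code has been reduced to: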

static int  dat_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
{
    struct timer_list * timer = &poll_timer;
    int i;

    /* Start polling routine */
    log_normal(KERN_INFO "DEBUG ADD TIMER: Starting poll routine with %x\n", pdev);
    init_timer(timer);

    // random number added so that expires value is different for both instances of timer
    get_random_bytes(&i, 1);
    timer->expires=jiffies+HZ+i;
    timer->data=(unsigned long) pdev;
    timer->function = poll_routine;

    log_verbose("DEBUG ADD TIMER: Timer expires %x\n", timer->expires);
    log_verbose("DEBUG ADD TIMER: Timer data %x\n", timer->data);
    log_verbose("DEBUG ADD TIMER: Timer function %x\n", timer->function);

    // ***** THIS IS WHERE STACK TRACE OCCURS (WHEN CALLED FOR SECOND TIME)
    add_timer(timer);

    log_verbose("DEBUG ADD TIMER: Value of HZ is %d\n", HZ);
    log_verbose("DEBUG ADD TIMER: End of probe\n");

    return 0;
}
The stack trace generated:
list_add corruption. prev->next should be next (ffffffff81f76228), but was (null). (prev=ffffffffa050a3c0).

list_add double add: new=ffffffffa050a3c0, prev=ffffffffa050a3c0, next=ffffffff81f76228.

Looking at the printk statements, it is apparent that add_timer is trying to add the same routine to the linked list. Is this correct?

DEBUG ADD TIMER: Timer expires fffd9cd3
DEBUG ADD TIMER: Timer data 6c0ac000
DEBUG ADD TIMER: Timer function **a0508150**
DEBUG ADD TIMER: Value of HZ is 1000
DEBUG ADD TIMER: End of probe
DEBUG ADD TIMER: Starting poll routine with 6c0ad000
DEBUG ADD TIMER: Timer expires fffd9c7d
DEBUG ADD TIMER: Timer data 6c0ad000
DEBUG ADD TIMER: Timer function **a0508150**
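So my question is: how should I configure the timers for multiple instances of the same driver? (Assuming that is what happens when multiple boards are plugged into the machine.)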
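
One possible reading of the two warnings, offered as a sketch rather than a confirmed diagnosis: in the "double add" message, new and prev are the same address (ffffffffa050a3c0), which suggests that the node being queued a second time is the single poll_timer object shared by both probe calls, not the shared poll_routine function. The userspace C model below mimics that situation. It is not kernel code: fake_timer, fake_probe, checked_list_add_tail, demo_two_boards and the flag names are all hypothetical, and the checks are only loosely modelled on the messages in the trace. Unlike the kernel in the trace, which warned and carried on, this model refuses to link the node when a check fails.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define LIST_ADD_OK      0
#define LIST_CORRUPT     1  /* neighbours no longer point at each other */
#define LIST_DOUBLE_ADD  2  /* node is already one of its own neighbours */

/* Userspace stand-in for the kernel's doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical stand-in for a timer: only the list linkage matters here. */
struct fake_timer {
    struct list_head entry;
    unsigned long data;
};

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/*
 * Append 'new' before 'head' (i.e. at the tail) after running sanity
 * checks similar to the ones named in the warnings. Returns a bitmask
 * of failed checks; the node is linked only when the mask is 0.
 */
int checked_list_add_tail(struct list_head *new, struct list_head *head)
{
    struct list_head *prev = head->prev;
    struct list_head *next = head;
    int result = LIST_ADD_OK;

    if (next->prev != prev || prev->next != next) {
        fprintf(stderr, "list_add corruption (prev=%p)\n", (void *)prev);
        result |= LIST_CORRUPT;
    }
    if (new == prev || new == next) {
        fprintf(stderr, "list_add double add (new=%p)\n", (void *)new);
        result |= LIST_DOUBLE_ADD;
    }
    if (result != LIST_ADD_OK)
        return result;

    new->next = next;
    new->prev = prev;
    prev->next = new;
    next->prev = new;
    return LIST_ADD_OK;
}

/* Models one probe call: reset the timer's linkage, then queue it. */
int fake_probe(struct list_head *pending, struct fake_timer *t,
               unsigned long data)
{
    t->entry.next = NULL;  /* assumption: re-initialising clears the link */
    t->data = data;
    return checked_list_add_tail(&t->entry, pending);
}

/*
 * Probes two boards. With share_timer set, both probes reuse one timer
 * object; otherwise each board gets its own. Returns the result of the
 * second probe.
 */
int demo_two_boards(bool share_timer)
{
    struct list_head pending;
    struct fake_timer timers[2];
    struct fake_timer *second = share_timer ? &timers[0] : &timers[1];

    list_init(&pending);
    fake_probe(&pending, &timers[0], 0x6c0ac000UL);
    return fake_probe(&pending, second, 0x6c0ad000UL);
}
```

In this model, demo_two_boards(true) fails both checks on the second probe, while demo_two_boards(false) succeeds because each board queues its own node. If the same reasoning carries over to the driver, the analogous change would be one struct timer_list per device (for example, embedded in a per-device structure allocated in dat_probe) instead of a single shared poll_timer; that is an assumption, not something the trace alone proves.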

Full stack trace:

DEBUG ADD TIMER: Inserting driver into kernel.
DEBUG ADD TIMER: Starting poll routine with 6c0ac000
DEBUG ADD TIMER: Timer expires fffd9cd3
DEBUG ADD TIMER: Timer data 6c0ac000
DEBUG ADD TIMER: Timer function a0508150
DEBUG ADD TIMER: Value of HZ is 1000
DEBUG ADD TIMER: End of probe
DEBUG ADD TIMER: Starting poll routine with 6c0ad000
DEBUG ADD TIMER: Timer expires fffd9c7d
DEBUG ADD TIMER: Timer data 6c0ad000
DEBUG ADD TIMER: Timer function a0508150
------------[ cut here ]------------
WARNING: CPU: 0 PID: 2201 at lib/list_debug.c:33 __list_add+0xa0/0xd0()
list_add corruption. prev->next should be next (ffffffff81f76228), but was           (null). (prev=ffffffffa050a3c0).
Modules linked in: xdma_v7(POE+) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller crc32c_intel eeepc_wmi ghash_clmulni_intel asus_wmi ftdi_sio iTCO_wdt snd_hda_codec sparse_keymap raid0 iTCO_vendor_support
 snd_hda_core rfkill sb_edac ipmi_ssif video mxm_wmi edac_core snd_hwdep mei_me snd_seq snd_seq_device ipmi_msghandler snd_pcm mei acpi_pad tpm_infineon lpc_ich mfd_core snd_timer tpm_tis shpchp tpm snd soundcore i2c_i801 wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc ast drm_kms_helper ttm drm igb serio_raw ptp pps_core dca i2c_algo_bit
CPU: 0 PID: 2201 Comm: insmod Tainted: P           OE   4.1.8-100.fc21.x86_64 #1
Hardware name: ASUSTeK COMPUTER INC. Z10PE-D8 WS/Z10PE-D8 WS, BIOS 1001 03/17/2015
 0000000000000000 00000000ec73155d ffff880457123928 ffffffff81792065
 0000000000000000 ffff880457123980 ffff880457123968 ffffffff810a163a
 0000000000000246 ffffffffa050a3c0 ffffffff81f76228 ffffffffa050a3c0
Call Trace:
 [<ffffffff81792065>] dump_stack+0x45/0x57
 [<ffffffff810a163a>] warn_slowpath_common+0x8a/0xc0
 [<ffffffff810a16c5>] warn_slowpath_fmt+0x55/0x70
 [<ffffffff810f8250>] ? vprintk_emit+0x3b0/0x560
 [<ffffffff813c7c30>] __list_add+0xa0/0xd0
 [<ffffffff81108412>] __internal_add_timer+0xb2/0x130
 [<ffffffff811084bf>] internal_add_timer+0x2f/0xb0
 [<ffffffff8110a1ca>] mod_timer+0x12a/0x210
 [<ffffffff8110a2c8>] add_timer+0x18/0x30
 [<ffffffffa050810f>] dat_probe+0xbf/0x100 [xdma_v7]
 [<ffffffff813f6da5>] local_pci_probe+0x45/0xa0
 [<ffffffff812a8da2>] ? sysfs_do_create_link_sd.isra.2+0x72/0xc0
 [<ffffffff813f8109>] pci_device_probe+0xf9/0x150
 [<ffffffff814e7e59>] driver_probe_device+0x209/0x4b0
 [<ffffffff814e81db>] __driver_attach+0x9b/0xa0
 [<ffffffff814e8140>] ? __device_attach+0x40/0x40
 [<ffffffff814e5973>] bus_for_each_dev+0x73/0xc0
 [<ffffffff814e772e>] driver_attach+0x1e/0x20
 [<ffffffff814e72e0>] bus_add_driver+0x180/0x250
 [<ffffffffa000a000>] ? 0xffffffffa000a000
 [<ffffffff814e89d4>] driver_register+0x64/0xf0
 [<ffffffff813f662c>] __pci_register_driver+0x4c/0x50
 [<ffffffffa000a02c>] dat_init+0x2c/0x1000 [xdma_v7]
 [<ffffffff81002148>] do_one_initcall+0xd8/0x210
 [<ffffffff812094f9>] ? kmem_cache_alloc_trace+0x1a9/0x230
 [<ffffffff817911bc>] ? do_init_module+0x28/0x1cc
 [<ffffffff817911f5>] do_init_module+0x61/0x1cc
 [<ffffffff811270bb>] load_module+0x20db/0x2550
 [<ffffffff81122990>] ? store_uevent+0x70/0x70
 [<ffffffff8122e860>] ? kernel_read+0x50/0x80
 [<ffffffff81127766>] SyS_finit_module+0xa6/0xe0
 [<ffffffff8179892e>] system_call_fastpath+0x12/0x71
---[ end trace 340e5d7ba2d89081 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 2201 at lib/list_debug.c:36 __list_add+0xcb/0xd0()
list_add double add: new=ffffffffa050a3c0, prev=ffffffffa050a3c0, next=ffffffff81f76228.
Modules linked in: xdma_v7(POE+) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller crc32c_intel eeepc_wmi ghash_clmulni_intel asus_wmi ftdi_sio iTCO_wdt snd_hda_codec sparse_keymap raid0 iTCO_vendor_support
 snd_hda_core rfkill sb_edac ipmi_ssif video mxm_wmi edac_core snd_hwdep mei_me snd_seq snd_seq_device ipmi_msghandler snd_pcm mei acpi_pad tpm_infineon lpc_ich mfd_core snd_timer tpm_tis shpchp tpm snd soundcore i2c_i801 wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc ast drm_kms_helper ttm drm igb serio_raw ptp pps_core dca i2c_algo_bit
CPU: 0 PID: 2201 Comm: insmod Tainted: P        W  OE   4.1.8-100.fc21.x86_64 #1
Hardware name: ASUSTeK COMPUTER INC. Z10PE-D8 WS/Z10PE-D8 WS, BIOS 1001 03/17/2015
 0000000000000000 00000000ec73155d ffff880457123928 ffffffff81792065
 0000000000000000 ffff880457123980 ffff880457123968 ffffffff810a163a
 0000000000000246 ffffffffa050a3c0 ffffffff81f76228 ffffffffa050a3c0
Call Trace:
 [<ffffffff81792065>] dump_stack+0x45/0x57
 [<ffffffff810a163a>] warn_slowpath_common+0x8a/0xc0
 [<ffffffff810a16c5>] warn_slowpath_fmt+0x55/0x70
 [<ffffffff810f8250>] ? vprintk_emit+0x3b0/0x560
 [<ffffffff813c7c5b>] __list_add+0xcb/0xd0
 [<ffffffff81108412>] __internal_add_timer+0xb2/0x130
 [<ffffffff811084bf>] internal_add_timer+0x2f/0xb0
 [<ffffffff8110a1ca>] mod_timer+0x12a/0x210
 [<ffffffff8110a2c8>] add_timer+0x18/0x30
 [<ffffffffa050810f>] dat_probe+0xbf/0x100 [xdma_v7]
 [<ffffffff813f6da5>] local_pci_probe+0x45/0xa0
 [<ffffffff812a8da2>] ? sysfs_do_create_link_sd.isra.2+0x72/0xc0
 [<ffffffff813f8109>] pci_device_probe+0xf9/0x150
 [<ffffffff814e7e59>] driver_probe_device+0x209/0x4b0
 [<ffffffff814e81db>] __driver_attach+0x9b/0xa0
 [<ffffffff814e8140>] ? __device_attach+0x40/0x40
 [<ffffffff814e5973>] bus_for_each_dev+0x73/0xc0
 [<ffffffff814e772e>] driver_attach+0x1e/0x20
 [<ffffffff814e72e0>] bus_add_driver+0x180/0x250
 [<ffffffffa000a000>] ? 0xffffffffa000a000
 [<ffffffff814e89d4>] driver_register+0x64/0xf0
 [<ffffffff813f662c>] __pci_register_driver+0x4c/0x50
 [<ffffffffa000a02c>] dat_init+0x2c/0x1000 [xdma_v7]
 [<ffffffff81002148>] do_one_initcall+0xd8/0x210
 [<ffffffff812094f9>] ? kmem_cache_alloc_trace+0x1a9/0x230
 [<ffffffff817911bc>] ? do_init_module+0x28/0x1cc
 [<ffffffff817911f5>] do_init_module+0x61/0x1cc
 [<ffffffff811270bb>] load_module+0x20db/0x2550
 [<ffffffff81122990>] ? store_uevent+0x70/0x70
 [<ffffffff8122e860>] ? kernel_read+0x50/0x80
 [<ffffffff81127766>] SyS_finit_module+0xa6/0xe0
 [<ffffffff8179892e>] system_call_fastpath+0x12/0x71
---[ end trace 340e5d7ba2d89082 ]---
DEBUG ADD TIMER: Value of HZ is 1000
DEBUG ADD TIMER: End of probe