
I am backporting a driver called GTP from the 4.9 Linux kernel to version 3.18. I fixed the errors in the driver's source code and compiled the kernel successfully. I am testing the kernel with that driver on Ubuntu Server 14.04. There is also a netlink library (libgtpnl) to control the driver. Running modprobe gtp works without problems, and so does using the library to control some functions of the driver. However, strangely, when I enable forwarding (echo 1 > /proc/sys/net/ipv4/ip_forward), the kernel somehow halts. After that I am also unable to run many network commands (ifconfig, iproute, etc.).

The dmesg output shows:

[   33.032315] gtp: GTP module loaded (pdp ctx size 88 bytes)
[   62.269996] general protection fault: 0000 [#1] SMP 
[   62.270028] Modules linked in: gtp(E) udp_tunnel(E) pci_stub(E) vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nfnetlink_queue(E) nfnetlink_log(E) nfnetlink(E) coretemp(E) crct10dif_pclmul(E) ppdev(E) crc32_pclmul(E) ghash_clmulni_intel(E) vmw_balloon(E) btusb(E) bluetooth(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) microcode(E) serio_raw(E) snd_ens1371(E) snd_ac97_codec(E) ac97_bus(E) gameport(E) snd_rawmidi(E) snd_seq_device(E) snd_pcm(E) snd_timer(E) snd(E) soundcore(E) vmwgfx(E) ttm(E) drm_kms_helper(E) drm(E) shpchp(E) vmw_vmci(E) i2c_piix4(E) parport_pc(E) mac_hid(E) lp(E) parport(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) ahci(E) libahci(E) e1000(E) mptspi(E) mptscsih(E) mptbase(E) floppy(E) vmw_pvscsi(E) vmxnet3(E) pata_acpi(E)
[   62.271080] CPU: 1 PID: 1323 Comm: bash Tainted: G           OE  3.18.20-bondingv4 #7
[   62.271121] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
[   62.271293] task: ffff8800772c9920 ti: ffff880079390000 task.ti: ffff880079390000
[   62.271329] RIP: 0010:[<ffffffff81669a5d>]  [<ffffffff81669a5d>] dev_disable_lro+0x2d/0x130
[   62.271465] RSP: 0018:ffff880079393df8  EFLAGS: 00010206
[   62.271492] RAX: 0000000000200000 RBX: 0002000000000000 RCX: 0000000000000000
[   62.271595] RDX: ffffffff81ce6d98 RSI: 0000000000000000 RDI: ffff8800795ed000
[   62.271630] RBP: ffff880079393e18 R08: 0000000000000000 R09: ffff88007e4361a0
[   62.271664] R10: ffffffff8165266f R11: ffffea0001dd5480 R12: ffffffff81ce2e00
[   62.271703] R13: ffffffff81ce2ec8 R14: ffff880079393f50 R15: ffff8800795ed000
[   62.271798] FS:  00007fcfa02e8740(0000) GS:ffff88007e420000(0000) knlGS:0000000000000000
[   62.271838] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   62.271866] CR2: 00007ffe943a5ac8 CR3: 00000000782e6000 CR4: 00000000001407e0
[   62.271964] Stack:
[   62.271978]  ffffffff81ce2e00 0000000000000001 ffffffff81ce2e00 ffffffff81ce2ec8
[   62.272078]  ffff880079393e68 ffffffff816da5a2 0000000000000000 0000000000000002
[   62.272774]  ffffffff81ce8b40 0000000000000001 ffffffffffffffea ffffffff81ce8b40
[   62.273379] Call Trace:
[   62.274042]  [<ffffffff816da5a2>] devinet_sysctl_forward+0x1c2/0x1f0
[   62.274940]  [<ffffffff8124bf23>] proc_sys_call_handler+0xb3/0xc0
[   62.275504]  [<ffffffff8124bf44>] proc_sys_write+0x14/0x20
[   62.276157]  [<ffffffff811db457>] vfs_write+0xb7/0x1f0
[   62.276795]  [<ffffffff811dbed6>] SyS_write+0x46/0xb0
[   62.277394]  [<ffffffff81786914>] ? int_check_syscall_exit_work+0x34/0x3d
[   62.278080]  [<ffffffff8178668d>] system_call_fastpath+0x16/0x1b
[   62.278753] Code: 44 00 00 55 48 89 e5 41 55 41 54 53 48 89 fb 48 83 ec 08 8b 87 0c 02 00 00 a8 01 75 39 a9 00 00 20 00 74 07 48 8b 9b f0 08 00 00 <48> 81 a3 e0 00 00 00 ff 7f ff ff 48 89 df e8 80 ff ff ff f6 83 
[   62.282203] RIP  [<ffffffff81669a5d>] dev_disable_lro+0x2d/0x130
[   62.282925]  RSP <ffff880079393df8>
[   62.283664] ---[ end trace f2a66d8a2b1df752 ]---
[   62.289214] init: tty2 main process ended, respawning

There are some answers that relate this to kernel taint, since the driver seems to have no license. But it does have one; I don't know why the kernel didn't recognize it. I couldn't find how the driver uses dev_disable_lro, and I couldn't get any further. Any help identifying my problem would be appreciated. Thanks.

  • `I couldn't find how the driver uses dev_disable_lro.` - The backtrace shown **doesn't involve your driver** at all; probably your driver corrupted memory earlier. Debugging that is out of the scope of Stack Overflow (debugging should be performed on your PC, and we have no access to it). The function `devinet_sysctl_forward`, which calls `dev_disable_lro`, is implemented in [net/ipv4/devinet.c](http://elixir.free-electrons.com/linux/latest/source/net/ipv4/devinet.c#L2120). – Tsyvarev Jul 28 '17 at 16:16
  • Firstly, sorry for the late response. I have been trying to debug the module. I also want to ask whether this is related to the module signature. I am still debugging the code, but do you think it would help if I set `CONFIG_MODULE_SIG` and `CONFIG_MODULE_SIG_ALL` to n? @Tsyvarev – firatsonmez Aug 03 '17 at 12:06
  • Your problem is a "general protection fault". It is very **unlikely** to be related to signing. – Tsyvarev Aug 03 '17 at 17:56
  • Sorry for the late update. The problem was a private flag, `IFF_NO_QUEUE`, which does not even exist in the 3.18 kernel. It was set to a bit (1<<21) that corresponds to `IFF_MACVLAN` in 3.18. So the kernel halted when calling `dev_disable_lro` twice, as it sees my driver as a MACVLAN device, too. – firatsonmez Oct 11 '17 at 07:07
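
The bit collision described in the last comment can be illustrated with a small user-space sketch. This is not the kernel or driver code: `struct fake_net_device`, `looks_like_macvlan`, and the macro names are hypothetical stand-ins, and the only values taken from the thread are that bit 21 means `IFF_MACVLAN` in 3.18 and that the backported `IFF_NO_QUEUE` was set to the same bit. The resulting mask, 0x200000, also matches the RAX value in the oops above.

```c
#include <stdio.h>

/* Per the thread: in 3.18, bit 21 of the private flags means IFF_MACVLAN. */
#define IFF_MACVLAN_3_18      (1u << 21)

/* Hypothetical backport definition: 3.18 has no IFF_NO_QUEUE,
 * so the bit was chosen by hand and happens to collide. */
#define BACKPORT_IFF_NO_QUEUE (1u << 21)

/* Simplified stand-in for struct net_device; only the field used here. */
struct fake_net_device {
    unsigned int priv_flags;
};

/* Stand-in for the kind of flag test used to detect a macvlan device. */
static int looks_like_macvlan(const struct fake_net_device *dev)
{
    return (dev->priv_flags & IFF_MACVLAN_3_18) != 0;
}

/* Prints how a device carrying only the backported flag is classified. */
static void report(const struct fake_net_device *dev)
{
    printf("priv_flags = 0x%x, treated as macvlan: %s\n",
           dev->priv_flags, looks_like_macvlan(dev) ? "yes" : "no");
}
```

Calling `report` on a device whose `priv_flags` contains only `BACKPORT_IFF_NO_QUEUE` classifies it as a macvlan device. In the real kernel, a device misidentified this way would have its private data interpreted as macvlan data when looking up the lower device, which is consistent with the bogus pointer in RBX at the faulting instruction. One possible fix, assuming the 3.18 tree has a free bit, is to give the backported flag a value that no existing flag uses, or to drop the flag from the backport entirely.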
