How do I debug a kernel module that hits a NULL pointer dereference?
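I have a custom kernel module compiled from this patch, which adds support for the Logitech G19 keyboard among other G-series devices. It compiles fine against the master branch of Ubuntu's Maverick kernel tree (2.6.35). I can boot and load the module, but I run into a very strange situation: as soon as the module is loaded (whether at boot or via modprobe), I get a black screen and my console locks up.
The odd part is that it does not lock up the whole system, only the current console session. I can SSH into the box and get a terminal and a session. I can type, and I can even run one command and see its output. Then the next prompt is drawn and the session locks up immediately.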
In dmesg I see a NULL pointer dereference, with the following stack trace:

[ 956.215836] input: Logitech G19 Gaming Keyboard as /devices/pci0000:00/0000:00:1d.7/usb1/1-2/1-2.1/1-2.1.2/1-2.1.2:1.1/input/input5
[ 956.216023] hid-g19 0003:046D:C229.0004: input,hiddev97,hidraw3: USB HID v1.11 Keypad [Logitech G19 Gaming Keyboard] on usb-0000:00:1d.7-2.1.2/input1
[ 956.216065] input: Logitech G19 as /devices/pci0000:00/0000:00:1d.7/usb1/1-2/1-2.1/1-2.1.2/1-2.1.2:1.1/input/input6
[ 956.216128] Registered led device: g19_97:orange:m1
[ 956.216146] Registered led device: g19_97:orange:m2
[ 956.216178] Registered led device: g19_97:orange:m3
[ 956.216198] Registered led device: g19_97:red:mr
[ 956.216216] Registered led device: g19_97:red:bl
[ 956.216235] Registered led device: g19_97:green:bl
[ 956.216259] Registered led device: g19_97:blue:bl
[ 956.216872] Console: switching to colour frame buffer device 40x30
[ 956.216899] BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
[ 956.216903] IP: [<ffffffffa040b21b>] sys_imageblit+0x21b/0x4ec [sysimgblt]
[ 956.216911] PGD 273554067 PUD 2726ca067 PMD 0
[ 956.216914] Oops: 0000 [#1] SMP
[ 956.216917] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb1/1-2/1-2.1/1-2.1.2/1-2.1.2:1.1/usb/hiddev1/uevent
[ 956.216921] CPU 5
[ 956.216922] Modules linked in: hid_g19(+) led_class hid_gfb fb_sys_fops sysimgblt sysfillrect syscopyarea btrfs zlib_deflate crc32c libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs snd_hda_codec_atihdmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device ioatdma snd i5000_edac soundcore snd_page_alloc psmouse edac_core i5k_amb shpchp serio_raw dca ppdev parport_pc lp parport usbhid hid floppy e1000e
[ 956.216953]
[ 956.216956] Pid: 3147, comm: modprobe Not tainted 2.6.35-26-generic #46 DSBF-DE/System Product Name
[ 956.216959] RIP: 0010:[<ffffffffa040b21b>] [<ffffffffa040b21b>] sys_imageblit+0x21b/0x4ec [sysimgblt]
[ 956.216963] RSP: 0018:ffff8802766db738 EFLAGS: 00010246
[ 956.216965] RAX: 0000000000000000 RBX: ffff880273e71000 RCX: ffff880272e93b40
[ 956.216968] RDX: 0000000000000007 RSI: 0000000000000010 RDI: ffff880272e93b40
[ 956.216970] RBP: ffff8802766db7d8 R08: 0000000000000000 R09: ffff880272e93b98
[ 956.216972] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 956.216974] R13: 0000000000000010 R14: 0000000000000008 R15: ffff8802766db8c8
[ 956.216977] FS: 00007fcae7725700(0000) GS:ffff880001f40000(0000) knlGS:0000000000000000
[ 956.216979] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 956.216981] CR2: 000000000000001c CR3: 000000026ba26000 CR4: 00000000000006e0
[ 956.216983] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 956.216986] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 956.216988] Process modprobe (pid: 3147, threadinfo ffff8802766da000, task ffff8802696a16e0)
[ 956.216990] Stack:
[ 956.216991] ffff8802766db778 ffffffff810746ae ffff8802766db700 ffff88026b2cadc0
[ 956.216994] <0> ffff8802766db778 ffffffff812beef9 ffff8802f66db947 ffff8802766db94f
[ 956.216998] <0> ffff8802766db848 00000000812bf22e ffff880272e93b40 ffffffff812feb40
[ 956.217001] Call Trace:
[ 956.217011] [<ffffffff810746ae>] ? send_signal+0x3e/0x90
[ 956.217018] [<ffffffff812beef9>] ? put_dec+0x59/0x60
[ 956.217023] [<ffffffff812feb40>] ? fbcon_resize+0xd0/0x230
[ 956.217027] [<ffffffffa04175da>] gfb_fb_imageblit+0x1a/0x30 [hid_gfb]
[ 956.217031] [<ffffffff813051b9>] soft_cursor+0x1c9/0x270
[ 956.217034] [<ffffffff81304e8b>] bit_cursor+0x65b/0x6c0
[ 956.217037] [<ffffffff812c1796>] ? vsnprintf+0x316/0x5a0
[ 956.217043] [<ffffffff81061045>] ? try_acquire_console_sem+0x15/0x60
[ 956.217046] [<ffffffff81300ca8>] fbcon_cursor+0x1d8/0x340
[ 956.217049] [<ffffffff81304830>] ? bit_cursor+0x0/0x6c0
[ 956.217054] [<ffffffff81368139>] hide_cursor+0x29/0x90
[ 956.217057] [<ffffffff8136b078>] redraw_screen+0x148/0x240
[ 956.217060] [<ffffffff8136b42e>] bind_con_driver+0x2be/0x3b0
[ 956.217063] [<ffffffff8136b569>] take_over_console+0x49/0x70
[ 956.217066] [<ffffffff812ff7fb>] fbcon_takeover+0x5b/0xb0
[ 956.217069] [<ffffffff81303ca5>] fbcon_event_notify+0x5c5/0x650
[ 956.217076] [<ffffffff8158e7f6>] notifier_call_chain+0x56/0x80
[ 956.217080] [<ffffffff8108510a>] __blocking_notifier_call_chain+0x5a/0x80
[ 956.217084] [<ffffffff81085146>] blocking_notifier_call_chain+0x16/0x20
[ 956.217089] [<ffffffff812f366b>] fb_notifier_call_chain+0x1b/0x20
[ 956.217092] [<ffffffff812f4c8c>] register_framebuffer+0x1ec/0x2e0
[ 956.217098] [<ffffffff814084f8>] ? usb_init_urb+0x28/0x40
[ 956.217101] [<ffffffffa041790f>] gfb_probe+0x21f/0x4f0 [hid_gfb]
[ 956.217107] [<ffffffffa0425778>] g19_probe+0x558/0xedc [hid_g19]
[ 956.217115] [<ffffffff811c059c>] ? sysfs_do_create_link+0xec/0x210
[ 956.217128] [<ffffffffa00330c7>] hid_device_probe+0x77/0xf0 [hid]
[ 956.217131] [<ffffffff81388aa2>] ? driver_sysfs_add+0x62/0x90
[ 956.217134] [<ffffffff81388bc8>] really_probe+0x68/0x190
[ 956.217138] [<ffffffff81388d35>] driver_probe_device+0x45/0x70
[ 956.217140] [<ffffffff81388dfb>] __driver_attach+0x9b/0xa0
[ 956.217143] [<ffffffff81388d60>] ? __driver_attach+0x0/0xa0
[ 956.217146] [<ffffffff81388008>] bus_for_each_dev+0x68/0x90
[ 956.217149] [<ffffffff81388a3e>] driver_attach+0x1e/0x20
[ 956.217151] [<ffffffff813882fe>] bus_add_driver+0xde/0x280
[ 956.217154] [<ffffffff81389140>] driver_register+0x80/0x150
[ 956.217157] [<ffffffff8158e7f6>] ? notifier_call_chain+0x56/0x80
[ 956.217161] [<ffffffffa042a000>] ? g19_init+0x0/0x20 [hid_g19]
[ 956.217166] [<ffffffffa0032913>] __hid_register_driver+0x53/0x90 [hid]
[ 956.217169] [<ffffffff81085115>] ? __blocking_notifier_call_chain+0x65/0x80
[ 956.217173] [<ffffffffa042a01e>] g19_init+0x1e/0x20 [hid_g19]
[ 956.217178] [<ffffffff8100204c>] do_one_initcall+0x3c/0x1a0
[ 956.217184] [<ffffffff8109bd9b>] sys_init_module+0xbb/0x200
[ 956.217192] [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b
[ 956.217195] Code: 83 e1 fc 48 89 4d c8 eb d3 8b 83 14 01 00 00 83 f8 04 74 09 83 f8 02 0f 85 7b 01 00 00 48 8b 4d b0 48 8b 83 00 04 00 00 8b 51 10 <44> 8b 04 90 8b 51 14 8b 3c 90 44 8b 4d ac 45 85 c9 75 16 41 b9
[ 956.217218] RIP [<ffffffffa040b21b>] sys_imageblit+0x21b/0x4ec [sysimgblt]
[ 956.217221] RSP <ffff8802766db738>
[ 956.217223] CR2: 000000000000001c
[ 956.217227] ---[ end trace 95d6c6d6913ccc79 ]---
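Can anyone point me in the right direction for debugging this?
The stack trace leads me to believe that the culprit is not the hid-g19 driver but the hid-gfb driver, which creates a framebuffer for the LCD on the keyboard. That makes sense, since it is my display/console that locks up, but digging into the kernel code has not really gotten me anywhere. A large part of it is assembly and macros.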
The last function in the stack trace that involves my new code is gfb_fb_imageblit. The entire body of that function is:

    struct gfb_data *par = info->par;

    sys_imageblit(info, image);
    gfb_fb_update(par);
Am I reading the stack trace wrong? Am I missing something? Any tips on how to debug this?
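First of all, have you tried debugging the module itself? See whether you can load it in gdb; it may point you straight at the line that uses the variable in question (or close to it).
Oh, and you might find this article useful.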
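It can also help to read the oops registers before reaching for a debugger. In the trace above, CR2 (the faulting address) is 0x1c, RAX is 0 and RDX is 7, and the bytes at the marked position in the Code: line (`<44> 8b 04 90`) appear to decode to a 4-byte load from RAX + RDX*4. That pattern is consistent with indexing a NULL array of 32-bit entries at index 7, which in sys_imageblit would plausibly be the `info->pseudo_palette` color lookup. That last step is an inference from the trace, not something confirmed in this thread. The arithmetic can be sketched in userspace C (the helper name `fault_address` is made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: the address touched by base[index] when each
 * element is elem_size bytes wide. With base == 0 this is the small
 * non-zero address that the oops reports in CR2.
 */
uintptr_t fault_address(uintptr_t base, size_t index, size_t elem_size)
{
    assert(elem_size > 0);
    return base + (uintptr_t)(index * elem_size);
}
```

Calling `fault_address(0, 7, sizeof(uint32_t))` reproduces the 0x1c from the oops, so a small non-zero CR2 usually means "NULL base pointer plus an offset" rather than a wild pointer.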
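I'm one of the authors of that patch, sorry it's so buggy :)
In general, to track down a NULL pointer like this, I just insert printks until I find the pointer that is NULL (= 0), and then I read the source until I figure out why.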
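A minimal userspace sketch of that printk approach, assuming made-up names throughout: `struct demo_info` is a stand-in and not the real `fb_info`, and `fprintf(stderr, ...)` takes the place of `printk(KERN_DEBUG ...)`, which is only available inside the kernel. The idea is to log each suspect pointer with its name and location until one turns out to be NULL:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's state; not the real fb_info. */
struct demo_info {
    void *par;
    void *pseudo_palette;
};

/* Logs a pointer's name and value, tagged with the calling function and line. */
#define TRACE_PTR(p) trace_ptr(#p, (const void *)(p), __func__, __LINE__)

/* Returns 1 if ptr is NULL, 0 otherwise. In a module this would call printk. */
int trace_ptr(const char *name, const void *ptr, const char *func, int line)
{
    fprintf(stderr, "%s:%d: %s = %p\n", func, line, name, ptr);
    return ptr == NULL;
}

/* Traces every suspect field and returns how many of them are NULL. */
int count_null_fields(const struct demo_info *info)
{
    int nulls = 0;

    assert(info != NULL);
    nulls += TRACE_PTR(info->par);
    nulls += TRACE_PTR(info->pseudo_palette);
    return nulls;
}
```

In the real module the equivalent trace lines would go right before the call to sys_imageblit, so the last message in dmesg before the oops shows which pointer was NULL.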
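In this case, though, I know that you have to disable the framebuffer console, otherwise you run into this nasty bug, which is only triggered when the console is visible. Or it could be the bug that is triggered when you unplug the keyboard and the module still tries to write to the now-invalid buffer.
You should take a look at the new code on GitHub. I'm trying to clean it up so that it is easier to compile against arbitrary kernels, and a lot of bugs have been fixed.
Also, come visit us on IRC: #lg4l on freenode.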