Debugging Segmentation Faults (core dumped) on Linux
During programming contests I often run into segmentation faults. Until now I tracked them down by printing messages with printf, a clumsy approach that makes locating the bug slow, so today I tried debugging a segfault with gdb. A segmentation fault (core dumped) is usually caused by a bad array index or an out-of-bounds array access, and anyone programming on Linux is likely to hit one sooner or later.

The GDB debugging workflow

While reading up on the topic I found several resources that explain how to work with core dumps:

- How to get a core dump for a segfault on Linux
- How to analyze a program's core dump file with gdb?
- Core dump file analysis [duplicate]
- Debugging with GDB
- Linux Applications Debugging Techniques/Core files

What is a segmentation fault

Segmentation fault (core dumped) generally means that the program tried to access a memory address it is not allowed to access. Typical causes are:

- dereferencing a null pointer (the system does not allow access to the memory at address 0);
- dereferencing a pointer that lies outside the memory the process is allowed to access;
- in a C++ program, an object's vtable pointer (the pointer to its table of virtual function pointers) has been corrupted and points to the wrong place, so the program tries to execute code at an address without execute permission;
- misaligned memory access, which can also fault on some architectures (there it is often reported as SIGBUS rather than SIGSEGV).

Debugging with the simple tool valgrind

valgrind can trace heap and stack information of a running program. It has to be installed first with sudo apt-get install valgrind. The faulty binary is then traced with valgrind -v followed by the name of the executable. Below is the output for my segfaulting program. Note that in this particular run the program was started without its input arguments, so it only printed its usage message and exited, and Valgrind reported 0 errors.

    $ valgrind -v ./bin/CodeCraft-2019
    ==19578== Memcheck, a memory error detector
    ==19578== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
    ==19578== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
    ==19578== Command: ./bin/CodeCraft-2019
    ==19578==
    --19578-- Valgrind options:
    --19578-- -v
    --19578-- Contents of /proc/version:
    --19578-- Linux version 4.15.0-46-generic (buildd@lgw01-amd64-038) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #49-Ubuntu SMP Wed Feb 6 09:33:07 UTC 2019
    --19578--
    --19578-- Arch and hwcaps: AMD64, LittleEndian, amd64-cx16-rdtscp-sse3-avx
    --19578-- Page sizes: currently 4096, max supported 4096
    --19578-- Valgrind library directory: /usr/lib/valgrind
    --19578-- Reading syms from /usrdata/applications/huawei2019/03-28-01-coredump/bin/CodeCraft-2019
    --19578-- Reading syms from /lib/x86_64-linux-gnu/ld-2.27.so
    --19578-- Considering /lib/x86_64-linux-gnu/ld-2.27.so ..
    --19578-- .. CRC mismatch (computed 1b7c895e wanted 2943108a)
    --19578-- Considering /usr/lib/debug/lib/x86_64-linux-gnu/ld-2.27.so ..
    --19578-- .. CRC is valid
    --19578-- Reading syms from /usr/lib/valgrind/memcheck-amd64-linux
    --19578-- Considering /usr/lib/valgrind/memcheck-amd64-linux ..
    --19578-- .. CRC mismatch (computed c25f395c wanted 0a9602a8)
    --19578-- object doesn't have a symbol table
    --19578-- object doesn't have a dynamic symbol table
    --19578-- Scheduler: using generic scheduler lock implementation.
    --19578-- Reading suppressions file: /usr/lib/valgrind/default.supp
    ==19578== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-19578-by-jl-on-???
    ==19578== embedded gdbserver: writing to /tmp/vgdb-pipe-to-vgdb-from-19578-by-jl-on-???
    ==19578== embedded gdbserver: shared mem /tmp/vgdb-pipe-shared-mem-vgdb-19578-by-jl-on-???
    ==19578==
    ==19578== TO CONTROL THIS PROCESS USING vgdb (which you probably
    ==19578== don't want to do, unless you know exactly what you're doing,
    ==19578== or are doing some strange experiment):
    ==19578== /usr/lib/valgrind/../../bin/vgdb --pid=19578 ...command...
    ==19578==
    ==19578== TO DEBUG THIS PROCESS USING GDB: start GDB like this
    ==19578== /path/to/gdb ./bin/CodeCraft-2019
    ==19578== and then give GDB the following command
    ==19578== target remote | /usr/lib/valgrind/../../bin/vgdb --pid=19578
    ==19578== --pid is optional if only one valgrind process is running
    ==19578==
    --19578-- REDIR: 0x401f2f0 (ld-linux-x86-64.so.2:strlen) redirected to 0x58060901 (???)
    --19578-- REDIR: 0x401f0d0 (ld-linux-x86-64.so.2:index) redirected to 0x5806091b (???)
    --19578-- Reading syms from /usr/lib/valgrind/vgpreload_core-amd64-linux.so
    --19578-- Considering /usr/lib/valgrind/vgpreload_core-amd64-linux.so ..
    --19578-- .. CRC mismatch (computed 4b63d83e wanted 670599e6)
    --19578-- object doesn't have a symbol table
    --19578-- Reading syms from /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so
    --19578-- Considering /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so ..
    --19578-- .. CRC mismatch (computed a4b37bee wanted 8ad4dc94)
    --19578-- object doesn't have a symbol table
    ==19578== WARNING: new redirection conflicts with existing -- ignoring it
    --19578-- old: 0x0401f2f0 (strlen ) R-> (0000.0) 0x58060901 ???
    --19578-- new: 0x0401f2f0 (strlen ) R-> (2007.0) 0x04c32db0 strlen
    --19578-- REDIR: 0x401d360 (ld-linux-x86-64.so.2:strcmp) redirected to 0x4c33ee0 (strcmp)
    --19578-- REDIR: 0x401f830 (ld-linux-x86-64.so.2:mempcpy) redirected to 0x4c374f0 (mempcpy)
    --19578-- Reading syms from /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
    --19578-- object doesn't have a symbol table
    --19578-- Reading syms from /lib/x86_64-linux-gnu/libgcc_s.so.1
    --19578-- object doesn't have a symbol table
    --19578-- Reading syms from /lib/x86_64-linux-gnu/libc-2.27.so
    --19578-- Considering /lib/x86_64-linux-gnu/libc-2.27.so ..
    --19578-- .. CRC mismatch (computed b1c74187 wanted 042cc048)
    --19578-- Considering /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.27.so ..
    --19578-- .. CRC is valid
    --19578-- Reading syms from /lib/x86_64-linux-gnu/libm-2.27.so
    --19578-- Considering /lib/x86_64-linux-gnu/libm-2.27.so ..
    --19578-- .. CRC mismatch (computed 7feae033 wanted b29b2508)
    --19578-- Considering /usr/lib/debug/lib/x86_64-linux-gnu/libm-2.27.so ..
    --19578-- .. CRC is valid
    --19578-- REDIR: 0x547bc70 (libc.so.6:memmove) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547ad40 (libc.so.6:strncpy) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547bf50 (libc.so.6:strcasecmp) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547a790 (libc.so.6:strcat) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547ad70 (libc.so.6:rindex) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547d7c0 (libc.so.6:rawmemchr) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547bde0 (libc.so.6:mempcpy) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547bc10 (libc.so.6:bcmp) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547ad00 (libc.so.6:strncmp) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547a800 (libc.so.6:strcmp) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547bd40 (libc.so.6:memset) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x54990f0 (libc.so.6:wcschr) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547aca0 (libc.so.6:strnlen) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547a870 (libc.so.6:strcspn) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547bfa0 (libc.so.6:strncasecmp) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547a840 (libc.so.6:strcpy) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547c0e0 (libc.so.6:memcpy@@GLIBC_2.14) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547ada0 (libc.so.6:strpbrk) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547a7c0 (libc.so.6:index) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547ac70 (libc.so.6:strlen) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x54856c0 (libc.so.6:memrchr) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547bff0 (libc.so.6:strcasecmp_l) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547bbe0 (libc.so.6:memchr) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x5499eb0 (libc.so.6:wcslen) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547b050 (libc.so.6:strspn) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547bf20 (libc.so.6:stpncpy) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547bef0 (libc.so.6:stpcpy) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547d7f0 (libc.so.6:strchrnul) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x547c040 (libc.so.6:strncasecmp_l) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
    --19578-- REDIR: 0x548e330 (libc.so.6:__strrchr_sse2) redirected to 0x4c32790 (__strrchr_sse2)
    --19578-- REDIR: 0x5474070 (libc.so.6:malloc) redirected to 0x4c2faa0 (malloc)
    --19578-- REDIR: 0x548e620 (libc.so.6:__strlen_sse2) redirected to 0x4c32d30 (__strlen_sse2)
    --19578-- REDIR: 0x556cfc0 (libc.so.6:__memcmp_sse4_1) redirected to 0x4c35d50 (__memcmp_sse4_1)
    --19578-- REDIR: 0x5486e70 (libc.so.6:__strcmp_sse2_unaligned) redirected to 0x4c33da0 (strcmp)
    Begin
    --19578-- REDIR: 0x5498440 (libc.so.6:__mempcpy_sse2_unaligned) redirected to 0x4c37130 (mempcpy)
    please input args: carPath, roadPath, crossPath, answerPath
    --19578-- REDIR: 0x5498870 (libc.so.6:__memset_sse2_unaligned) redirected to 0x4c365d0 (memset)
    --19578-- REDIR: 0x5474950 (libc.so.6:free) redirected to 0x4c30cd0 (free)
    ==19578==
    ==19578== HEAP SUMMARY:
    ==19578== in use at exit: 0 bytes in 0 blocks
    ==19578== total heap usage: 2 allocs, 2 frees, 73,728 bytes allocated
    ==19578==
    ==19578== All heap blocks were freed -- no leaks are possible
    ==19578==
    ==19578== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
    ==19578== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

How to obtain a core dump file

A core dump file is a copy of the program's memory at the moment it crashed. With this file you can debug the program and find where the bug occurred. When a program hits a segmentation fault, the Linux kernel writes a core dump file to disk, depending on how the system is configured.

On Linux, ulimit sets the resource limits of the current shell and of the processes started from it. The change is temporary: it is lost when the shell session ends, and therefore also after a reboot.

- ulimit -c sets the maximum size of a core file, in blocks;
- ulimit -a shows the current resource limits;
- ulimit -c unlimited removes the size limit on core files.

Reasons why no core file is produced:

- there is not enough free space to write it;
- core file creation has been disabled;
- the process has no write permission in the directory where the file would be written.

The command sudo sysctl -w kernel.core_pattern=/tmp/core-%e.%p.%h.%t sets the name and location of the core files the kernel produces. With this pattern they are placed in /tmp, and the name contains the executable name (%e), the PID (%p), the host name (%h) and a timestamp (%t). From then on, when a program segfaults, the Linux kernel automatically saves a core file in /tmp. cat /proc/PID/limits also shows the core file size limit of a given process.

kernel.core_pattern determines where core dump files are placed. It is a kernel parameter that can be inspected and changed with sysctl: sysctl -a lists all kernel parameters, and sysctl kernel.core_pattern shows the value of kernel.core_pattern.

Backtracing the generated core file with GDB

The command gdb -c my_core_file opens the core file named my_core_file. Passing the executable as well (gdb path/to/executable -c my_core_file) lets gdb read the program's symbols. Debugging the core dump of my program gave the following:

    $ sudo gdb -c /tmp/core-CodeCraft-2019.23637.jl.1554030516
    GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
    Copyright (C) 2018 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:.
    Find the GDB manual and other documentation resources online at:.
    For help, type "help".
    Type "apropos word" to search for commands related to "word".
    [New LWP 23637]
    Core was generated by `./bin/CodeCraft-2019 ../1-map-training-1/car.txt ../1-map-training-1/road.txt .'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    (gdb)
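The name of the core file opened above follows directly from the kernel.core_pattern template set earlier. As a rough illustration, the sketch below expands the four specifiers used in this article (%e, %p, %h, %t) the way the kernel would. The function name expand_core_pattern is hypothetical and only these specifiers plus %% are handled; the kernel supports more.

```python
def expand_core_pattern(pattern, exe_name, pid, hostname, timestamp):
    """Expand %e, %p, %h, %t and %% in a core_pattern template.

    Hypothetical helper for illustration only. Unknown specifiers and a
    single trailing '%' are dropped, as core(5) describes.
    """
    values = {
        "e": exe_name,        # executable name
        "p": str(pid),        # PID of the dumped process
        "h": hostname,        # host name
        "t": str(timestamp),  # time of dump, seconds since the Epoch
        "%": "%",             # literal percent sign
    }
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "%":
            out.append(ch)
            i += 1
        elif i + 1 < len(pattern):
            out.append(values.get(pattern[i + 1], ""))
            i += 2
        else:
            i += 1  # lone '%' at the end of the template is dropped
    return "".join(out)


# Values taken from the core file name used in this article.
core_name = expand_core_pattern(
    "/tmp/core-%e.%p.%h.%t", "CodeCraft-2019", 23637, "jl", 1554030516
)
print(core_name)
```

Reading the name backwards is just as useful: from /tmp/core-CodeCraft-2019.23637.jl.1554030516 you can tell which executable crashed, under which PID, on which host, and when.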
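The GDB startup output shows that the program received a SIGSEGV signal while it was running. This signal means that the process made an invalid memory reference, in other words a segmentation fault occurred. From here, run bt in gdb (repeatedly if needed) to find the source line where the segfault happened and what actually caused it. bt stands for backtrace and prints the call stack.

Commonly used gdb commands:

- gdb -p PID (or attach PID inside gdb): debug a process that is already running;
- br: set a breakpoint, e.g. br filename:line_num or br namespace::classname::func_name;
- n: step over; s: step into;
- finish: run until the current function returns;
- list: print the next 10 lines of source after the current position;
- list line_number: print the source lines around line_number;
- info locals: print the local variables of the current function;
- p var: print the value of a variable;
- info breakpoints: list all breakpoints;
- delete breakpoints: delete all breakpoints;
- delete breakpoints id: delete the breakpoint numbered id;
- disable/enable breakpoints id: disable or enable a breakpoint;
- break ... if ...: conditional breakpoint.

When I ran bt on my program, the output was full of question marks. This happens because gdb has not loaded the symbol information for the program and its libraries; the program has to be compiled with the -g option:

    (gdb) bt

Inside gdb, symbols can be loaded with symbol-file followed by the path of the file that contains them (here, the shared library). See also: the shared library search path used when debugging with gdb.
The ldd command lists the shared libraries a binary depends on. Use set solib-search-path to tell gdb where to look for those libraries.

    Backtrace stopped: Cannot access memory at address 0x195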
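Two facts from the earlier sections can also be checked from a script: the core file size limit that ulimit -c reports, and the fact that a crashing process is reported as terminated by SIGSEGV. The following is a minimal sketch using Python's standard resource, signal and subprocess modules on Linux. The function names are hypothetical, and the child process sends SIGSEGV to itself instead of making a real invalid memory access, which is an assumption made to keep the demo harmless.

```python
import resource
import signal
import subprocess
import sys


def core_limit():
    """Return the (soft, hard) core file size limits of this process.

    The soft value is the one `ulimit -c` reports (ulimit counts in
    blocks, this API in bytes); resource.RLIM_INFINITY means unlimited.
    """
    return resource.getrlimit(resource.RLIMIT_CORE)


def raise_core_limit():
    """Raise the soft limit to the hard limit.

    This matches `ulimit -c unlimited` only when the hard limit itself
    is unlimited; an unprivileged process cannot go above the hard limit.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def run_crashing_child():
    """Run a child that dies from SIGSEGV; return the terminating signal.

    The child first sets its own core limit to 0 so this demo does not
    leave a core file behind, then sends SIGSEGV to itself.
    """
    child_code = (
        "import os, resource, signal;"
        "resource.setrlimit(resource.RLIMIT_CORE, (0, 0));"
        "os.kill(os.getpid(), signal.SIGSEGV)"
    )
    proc = subprocess.run([sys.executable, "-c", child_code])
    # subprocess reports death by signal N as return code -N.
    return -proc.returncode if proc.returncode < 0 else None


if __name__ == "__main__":
    print("core limit (soft, hard):", core_limit())
    print("child terminated by signal:", run_crashing_child())
```

If the soft limit printed here is 0, no core file will be written for a real crash until the limit is raised, regardless of what kernel.core_pattern says.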
The final gdb session looked like this (parts of the output were lost):

    [New LWP 5070]
    Core was generated by `8, 6238, 6768, 6414, 5857, 6219, 6774, 5642, 5099, 6080)
    (gdb) frame 0
    flags=1, slen=, format=0x55e0aef3a657 ", %d", args=args@entry=0x7ffcb0e28c00) at vsnprintf_chk.c:66
    66 in vsnprintf_chk.c
    (gdb) frame 1
    format=) at snprintf_chk.c:34
    34 snprintf_chk.c: No such file or directory.
    (gdb) frame 2
    (gdb) frame 3
    (gdb) frame 4
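The output above shows that the segmentation fault occurred in the writeResult function. Frames 0 and 1 are inside glibc's checked snprintf code (vsnprintf_chk.c and snprintf_chk.c) with the format string ", %d", so the crash presumably happened in an snprintf call made from writeResult.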
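The interactive session above can also be scripted, which is handy when the same core file has to be inspected repeatedly. The sketch below only builds a gdb command line that loads the executable together with the core file, runs the given gdb commands in batch mode, and exits. It relies on gdb's -batch and -ex options; the helper names gdb_batch_command and run_gdb_batch are hypothetical, and the paths are the ones used in this article.

```python
import subprocess


def gdb_batch_command(executable, core_file, commands=("bt",)):
    """Build the argv for a non-interactive gdb run on a core file."""
    argv = ["gdb", "-batch"]
    for cmd in commands:
        argv += ["-ex", cmd]  # each -ex runs one gdb command, in order
    # Passing the executable lets gdb resolve symbols for the core file.
    argv += [executable, core_file]
    return argv


def run_gdb_batch(executable, core_file, commands=("bt",)):
    """Run gdb and return its combined output (requires gdb to be installed)."""
    result = subprocess.run(
        gdb_batch_command(executable, core_file, commands),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


argv = gdb_batch_command(
    "./bin/CodeCraft-2019",
    "/tmp/core-CodeCraft-2019.23637.jl.1554030516",
    commands=("bt", "frame 2", "info locals"),
)
print(" ".join(argv))
```

Any gdb command from the list earlier in the article can be passed through commands; run_gdb_batch is left uncalled here because it needs gdb and a real core file.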
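Use thread apply all bt full to print the full backtrace, including local variables, of every thread. The GDB commands that mattered most in this session were bt, frame N and thread apply all bt full.

Reposted from: http://napxl.baihongyu.com/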