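Memory Out-of-Bounds Detection Tool: How Electric Fence Works and How to Use It
I. Background
In C, the freedom to manipulate memory and pointers directly makes out-of-bounds accesses easy to introduce and hard to debug. An out-of-bounds access usually does not raise an exception immediately. The failure only surfaces later, when other code uses the corrupted memory and reads bad data, which can hang or crash the program. The coredump produced at that point carries no useful information about the original overrun, so the function that caused it cannot be identified. Without tooling, this kind of bug is very hard to analyze. Common out-of-bounds detection tools are: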
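Tool | How it works | Output | Detection scope | Memory overhead | CPU overhead |
---|---|---|---|---|---|
valgrind | Runs the program in a virtual-machine-like environment, intercepts memory reads and writes, and checks them | Log output showing the code location of the out-of-bounds access | Heap (Memcheck does not bounds-check global or stack arrays) | High | High |
kasan/asan | Red zones per variable: a red zone is inserted before and after each variable, and checking code is inserted at memory accesses | Log output showing the code location of the out-of-bounds access | Globals, stack, heap | High (shadow memory adds about 1/8) | High |
electric fence | Red zones per page: relies on the Linux page-fault mechanism, so touching a red zone produces a coredump | Coredump file; the process exits immediately (fail-fast) | Heap | High (several times the original) | Low |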
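valgrind detects overruns by running the program in a virtual-machine-like environment. kasan and electric fence are both red-zone based: a red zone is placed next to each object, and an access that lands in it is reported. kasan inserts code at compile time and uses shadow memory to record which addresses may be accessed, with a red zone of suitable length before and after each variable. At run time, the code inserted before each load or store (ldr/str) checks the permission bits in shadow memory, and if the address belongs to a red zone, a log message reports the overrun.
electric fence instead makes clever use of Linux page-based memory management. Each heap allocation gets at least one page of memory (4 KB), plus one extra virtual page that serves as the red zone. An access that crosses into the red zone triggers a page fault in the kernel. The page-fault handler checks the access permissions, finds the page inaccessible, and sends SIGSEGV (segmentation fault) to the process, which produces a coredump file.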
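II. How Electric Fence Works
1. Intercepting malloc with same-named functions
efence redefines malloc, free, and related functions so that they replace the C library functions of the same name. It relies on symbol resolution order: a statically linked definition takes precedence over a dynamically linked one, and among shared libraries the rule is first come, first served, so the function from whichever .so is loaded first is used. It is therefore enough either to link efence statically or to make sure its shared library is loaded before libc.so.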
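2. Red-zone memory layout
- Memory layout
Every malloc allocates at least one page and reserves one extra virtual page as the red zone. The user buffer is placed at the tail of page0, directly in front of the red zone, so an access past the end of the buffer lands in the red zone and triggers a segmentation fault. Because Linux pages memory in on demand, physical memory is committed only when a page is actually read or written, and the red-zone page is never touched. The physical cost of a small malloc is therefore one page (4 KB).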
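The layout arithmetic can be sketched as a small standalone function. This is only an illustration that mirrors the size calculation in efence's memalign for the default mode (red zone after the buffer); the `ef_layout` type and the `round_up` and `ef_layout_for` names are hypothetical and do not exist in efence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: where a request ends up inside its mapping. */
typedef struct {
    size_t internal_size; /* bytes reserved, including the dead page     */
    size_t dead_offset;   /* offset of the inaccessible red-zone page    */
    size_t user_offset;   /* offset of the pointer returned to the user  */
} ef_layout;

/* Round n up to the next multiple of `multiple`. */
static size_t round_up(size_t n, size_t multiple)
{
    size_t slack = n % multiple;
    return slack ? n + (multiple - slack) : n;
}

/*
 * Layout for the default mode (red zone placed after the buffer).
 * The request is first padded to the alignment, then one page is added
 * for the red zone and the total is rounded up to whole pages.
 */
ef_layout ef_layout_for(size_t user_size, size_t alignment, size_t page_size)
{
    ef_layout layout;

    assert(alignment > 0 && page_size > 0);

    user_size = round_up(user_size, alignment);
    layout.internal_size = round_up(user_size + page_size, page_size);
    layout.dead_offset = layout.internal_size - page_size;
    layout.user_offset = layout.dead_offset - user_size;
    return layout;
}
```

With a 4096-byte page and an alignment of 4, a 10-byte request is padded to 12 bytes and placed at offset 4084 of an 8192-byte mapping. The padding also means that an overrun of one or two bytes past a 10-byte buffer stays inside the accessible page and goes undetected, which is why efence exposes the alignment through EF_ALIGNMENT.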
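- Allocating memory with mmap
Memory is obtained with mmap. efence maps the /dev/zero device, which yields private zero-filled pages equivalent to an anonymous mapping. The Page_Create function is: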
void *
Page_Create(size_t size)
{
    static int  devZeroFd = -1;
    caddr_t     allocation;

    if ( devZeroFd == -1 ) {
        devZeroFd = open("/dev/zero", O_RDWR);
        if ( devZeroFd < 0 )
            EF_Exit("open() on /dev/zero failed: %s",
                stringErrorReport());
    }

    /*
     * In this version, "startAddr" is a _hint_, not a demand.
     * When the memory I map here is contiguous with other
     * mappings, the allocator can coalesce the memory from two
     * or more mappings into one large contiguous chunk, and thus
     * might be able to find a fit that would not otherwise have
     * been possible. I could _force_ it to be contiguous by using
     * the MMAP_FIXED flag, but I don't want to stomp on memory mappings
     * generated by other software, etc.
     */
    allocation = (caddr_t) mmap(
        startAddr,
        size,
        PROT_READ|PROT_WRITE,
        MAP_PRIVATE,
        devZeroFd,
        0);

    startAddr = allocation + size;

    if ( allocation == (caddr_t)-1 )
        EF_Exit("mmap() failed: %s", stringErrorReport());

    return (void *)allocation;
}
- Setting the red zone with mprotect
When memory is allocated, the mprotect system call marks the red zone as inaccessible.
void
Page_AllowAccess(void * address, size_t size)
{
    if ( mprotect((caddr_t)address, size, PROT_READ|PROT_WRITE) < 0 )
        mprotectFailed();
}

void
Page_DenyAccess(void * address, size_t size)
{
    if ( mprotect((caddr_t)address, size, PROT_NONE) < 0 )
        mprotectFailed();
}
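The combination of mmap and mprotect can be tried in isolation. The following is a minimal sketch, not efence's code: it uses MAP_ANONYMOUS instead of /dev/zero, skips all slot bookkeeping, and the `guarded_alloc` and `signal_from_write` names are hypothetical. The write is performed in a child process so that the resulting signal can be observed without killing the caller.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns `size` bytes whose last byte sits directly
 * in front of a PROT_NONE page. Returns NULL on failure. The memory is
 * never released; this is a demonstration only.
 */
char *guarded_alloc(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t live;
    char *base;

    assert(page > 0);

    /* Round the accessible part up to whole pages (at least one). */
    live = ((size + page - 1) / page) * page;
    if (live == 0)
        live = page;

    base = mmap(NULL, live + page, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* The final page becomes the red zone. */
    if (mprotect(base + live, page, PROT_NONE) != 0) {
        munmap(base, live + page);
        return NULL;
    }
    return base + live - size;
}

/*
 * Writes one byte at buf[index] of a guarded buffer in a child process.
 * Returns the signal that killed the child, 0 if it exited normally,
 * or -1 if fork/waitpid failed.
 */
int signal_from_write(size_t size, size_t index)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        volatile char *buf = guarded_alloc(size);
        if (buf == NULL)
            _exit(2);
        buf[index] = 'x'; /* faults when index reaches the red zone */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux, `signal_from_write(16, 15)` should report a normal exit, while `signal_from_write(16, 16)` should report SIGSEGV, because the first byte past the buffer is the first byte of the protected page.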
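- Slot bookkeeping structure
A slot records the bookkeeping data for each memory block: the address returned to the user, the real internal address, the sizes, and so on. free uses it to match the address it is given.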
struct _Slot {
    void *      userAddress;
    void *      internalAddress;
    size_t      userSize;
    size_t      internalSize;
    Mode        mode;
};
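How free can use this structure to match an address is sketched below. It is a simplified illustration, not efence's implementation: the `Mode` enumerators are restated from the names that appear in the efence listings in this article (their numeric values are an assumption), and `find_slot_for_user_address` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed enumerators, taken from the names used in the efence listings. */
typedef enum { NOT_IN_USE = 0, FREE, ALLOCATED, INTERNAL_USE } Mode;

typedef struct {
    void  *userAddress;     /* pointer handed to the caller         */
    void  *internalAddress; /* start of the underlying mapping      */
    size_t userSize;        /* size the caller asked for            */
    size_t internalSize;    /* size of the mapping, incl. red zone  */
    Mode   mode;
} Slot;

/*
 * Linear search for the slot whose userAddress equals `address`.
 * Returns NULL when the pointer was never handed out, which is how a
 * free() of a bogus pointer can be detected.
 */
Slot *find_slot_for_user_address(Slot *slots, size_t count, const void *address)
{
    size_t i;

    assert(slots != NULL || count == 0);

    for (i = 0; i < count; i++) {
        if (slots[i].userAddress == address)
            return &slots[i];
    }
    return NULL;
}
```

Only an exact match counts, so a pointer into the middle of a block is treated as unknown.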
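During initialization, efence uses mmap to allocate a 1 MB region. The first page of that region holds the slot table, and the rest is recorded as one free block. Because efence redefines malloc, it can only obtain memory through mmap; calling malloc here would recurse.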
#define MEMORY_CREATION_SIZE 1024 * 1024

/*
 * initialize sets up the memory allocation arena and the run-time
 * configuration information.
 */
static void
initialize(void)
{
    size_t  size = MEMORY_CREATION_SIZE;
    size_t  slack;
    char *  string;
    Slot *  slot;

    if ( EF_DISABLE_BANNER == -1 ) {
        if ( (string = getenv("EF_DISABLE_BANNER")) != 0 )
            EF_DISABLE_BANNER = atoi(string);
        else
            EF_DISABLE_BANNER = 0;
    }

    if ( EF_DISABLE_BANNER == 0 )
        EF_Print(version);

    /*
     * Import the user's environment specification of the default
     * alignment for malloc(). We want that alignment to be under
     * user control, since smaller alignment lets us catch more bugs,
     * however some software will break if malloc() returns a buffer
     * that is not word-aligned.
     *
     * I would like
     * alignment to be zero so that we could catch all one-byte
     * overruns, however if malloc() is asked to allocate an odd-size
     * buffer and returns an address that is not word-aligned, or whose
     * size is not a multiple of the word size, software breaks.
     * This was the case with the Sun string-handling routines,
     * which can do word fetches up to three bytes beyond the end of a
     * string. I handle this problem in part by providing
     * byte-reference-only versions of the string library functions, but
     * there are other functions that break, too. Some in X Windows, one
     * in Sam Leffler's TIFF library, and doubtless many others.
     */
    if ( EF_ALIGNMENT == -1 ) {
        if ( (string = getenv("EF_ALIGNMENT")) != 0 )
            EF_ALIGNMENT = (size_t)atoi(string);
        else
            EF_ALIGNMENT = sizeof(int);
    }

    /*
     * See if the user wants to protect the address space below a buffer,
     * rather than that above a buffer.
     */
    if ( EF_PROTECT_BELOW == -1 ) {
        if ( (string = getenv("EF_PROTECT_BELOW")) != 0 )
            EF_PROTECT_BELOW = (atoi(string) != 0);
        else
            EF_PROTECT_BELOW = 0;
    }

    /*
     * See if the user wants to protect memory that has been freed until
     * the program exits, rather than until it is re-allocated.
     */
    if ( EF_PROTECT_FREE == -1 ) {
        if ( (string = getenv("EF_PROTECT_FREE")) != 0 )
            EF_PROTECT_FREE = (atoi(string) != 0);
        else
            EF_PROTECT_FREE = 0;
    }

    /*
     * See if the user wants to allow malloc(0).
     */
    if ( EF_ALLOW_MALLOC_0 == -1 ) {
        if ( (string = getenv("EF_ALLOW_MALLOC_0")) != 0 )
            EF_ALLOW_MALLOC_0 = (atoi(string) != 0);
        else
            EF_ALLOW_MALLOC_0 = 0;
    }

    /*
     * See if the user wants us to wipe out freed memory.
     */
    if ( EF_FREE_WIPES == -1 ) {
        if ( (string = getenv("EF_FREE_WIPES")) != 0 )
            EF_FREE_WIPES = (atoi(string) != 0);
        else
            EF_FREE_WIPES = 0;
    }

    /*
     * Get the run-time configuration of the virtual memory page size.
     */
    bytesPerPage = Page_Size();

    /*
     * Figure out how many Slot structures to allocate at one time.
     */
    slotCount = slotsPerPage = bytesPerPage / sizeof(Slot);
    allocationListSize = bytesPerPage;

    if ( allocationListSize > size )
        size = allocationListSize;

    if ( (slack = size % bytesPerPage) != 0 )
        size += bytesPerPage - slack;

    /*
     * Allocate memory, and break it up into two malloc buffers. The
     * first buffer will be used for Slot structures, the second will
     * be marked free.
     */
    slot = allocationList = (Slot *)Page_Create(size);
    memset((char *)allocationList, 0, allocationListSize);

    slot[0].internalSize = slot[0].userSize = allocationListSize;
    slot[0].internalAddress = slot[0].userAddress = allocationList;
    slot[0].mode = INTERNAL_USE;
    if ( size > allocationListSize ) {
        slot[1].internalAddress = slot[1].userAddress
         = ((char *)slot[0].internalAddress) + slot[0].internalSize;
        slot[1].internalSize
         = slot[1].userSize = size - slot[0].internalSize;
        slot[1].mode = FREE;
    }

    /*
     * Deny access to the free page, so that we will detect any software
     * that treads upon free memory.
     */
    Page_DenyAccess(slot[1].internalAddress, slot[1].internalSize);

    /*
     * Account for the two slot structures that we've used.
     */
    unUsedSlots = slotCount - 2;

    if ( EF_DISABLE_BANNER == 0 )
        EF_Print(enabled);
}
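- Allocation
The allocation function memalign first searches the slot list for a free block that is large enough, and calls Page_Create to map more memory only when none is found. It then uses Page_DenyAccess to set up the red zone and records the block in a slot.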
extern C_LINKAGE void *
memalign(size_t alignment, size_t userSize)
{
    register Slot * slot;
    register size_t count;
    Slot *          fullSlot = 0;
    Slot *          emptySlots[2];
    size_t          internalSize;
    size_t          slack;
    char *          address;

    if ( allocationList == 0 )
        initialize();

    if ( userSize == 0 && !EF_ALLOW_MALLOC_0 )
        EF_Abort("Allocating 0 bytes, probably a bug.");

    /*
     * If EF_PROTECT_BELOW is set, all addresses returned by malloc()
     * and company will be page-aligned.
     */
    if ( !EF_PROTECT_BELOW && alignment > 1 ) {
        if ( (slack = userSize % alignment) != 0 )
            userSize += alignment - slack;
    }

    /*
     * The internal size of the buffer is rounded up to the next page-size
     * boudary, and then we add another page's worth of memory for the
     * dead page.
     */
    internalSize = userSize + bytesPerPage;
    if ( (slack = internalSize % bytesPerPage) != 0 )
        internalSize += bytesPerPage - slack;

    /*
     * These will hold the addresses of two empty Slot structures, that
     * can be used to hold information for any memory I create, and any
     * memory that I mark free.
     */
    emptySlots[0] = 0;
    emptySlots[1] = 0;

    /*
     * The internal memory used by the allocator is currently
     * inaccessable, so that errant programs won't scrawl on the
     * allocator's arena. I'll un-protect it here so that I can make
     * a new allocation. I'll re-protect it before I return.
     */
    if ( !noAllocationListProtection )
        Page_AllowAccess(allocationList, allocationListSize);

    /*
     * If I'm running out of empty slots, create some more before
     * I don't have enough slots left to make an allocation.
     */
    if ( !internalUse && unUsedSlots < 7 ) {
        allocateMoreSlots();
    }

    /*
     * Iterate through all of the slot structures. Attempt to find a slot
     * containing free memory of the exact right size. Accept a slot with
     * more memory than we want, if the exact right size is not available.
     * Find two slot structures that are not in use. We will need one if
     * we split a buffer into free and allocated parts, and the second if
     * we have to create new memory and mark it as free.
     *
     */
    for ( slot = allocationList, count = slotCount ; count > 0; count-- ) {
        if ( slot->mode == FREE
         && slot->internalSize >= internalSize ) {
            if ( !fullSlot
             || slot->internalSize < fullSlot->internalSize ) {
                fullSlot = slot;
                if ( slot->internalSize == internalSize
                 && emptySlots[0] )
                    break;  /* All done, */
            }
        }
        else if ( slot->mode == NOT_IN_USE ) {
            if ( !emptySlots[0] )
                emptySlots[0] = slot;
            else if ( !emptySlots[1] )
                emptySlots[1] = slot;
            else if ( fullSlot
             && fullSlot->internalSize == internalSize )
                break;  /* All done. */
        }
        slot++;
    }
    if ( !emptySlots[0] )
        internalError();

    if ( !fullSlot ) {
        /*
         * I get here if I haven't been able to find a free buffer
         * with all of the memory I need. I'll have to create more
         * memory. I'll mark it all as free, and then split it into
         * free and allocated portions later.
         */
        size_t  chunkSize = MEMORY_CREATION_SIZE;

        if ( !emptySlots[1] )
            internalError();

        if ( chunkSize < internalSize )
            chunkSize = internalSize;

        if ( (slack = chunkSize % bytesPerPage) != 0 )
            chunkSize += bytesPerPage - slack;

        /* Use up one of the empty slots to make the full slot. */
        fullSlot = emptySlots[0];
        emptySlots[0] = emptySlots[1];
        fullSlot->internalAddress = Page_Create(chunkSize);
        fullSlot->internalSize = chunkSize;
        fullSlot->mode = FREE;
        unUsedSlots--;
    }

    /*
     * If I'm allocating memory for the allocator's own data structures,
     * mark it INTERNAL_USE so that no errant software will be able to
     * free it.
     */
    if ( internalUse )
        fullSlot->mode = INTERNAL_USE;
    else
        fullSlot->mode = ALLOCATED;

    /*
     * If the buffer I've found is larger than I need, split it into
     * an allocated buffer with the exact amount of memory I need, and
     * a free buffer containing the surplus memory.
     */
    if ( fullSlot->internalSize > internalSize ) {
        emptySlots[0]->internalSize
         = fullSlot->internalSize - internalSize;
        emptySlots[0]->internalAddress
         = ((char *)fullSlot->internalAddress) + internalSize;
        emptySlots[0]->mode = FREE;
        fullSlot->internalSize = internalSize;
        unUsedSlots--;
    }

    if ( !EF_PROTECT_BELOW ) {
        /*
         * Arrange the buffer so that it is followed by an inaccessable
         * memory page. A buffer overrun that touches that page will
         * cause a segmentation fault.
         */
        address = (char *)fullSlot->internalAddress;

        /* Set up the "live" page. */
        if ( internalSize - bytesPerPage > 0 )
            Page_AllowAccess(
                fullSlot->internalAddress,
                internalSize - bytesPerPage);

        address += internalSize - bytesPerPage;

        /* Set up the "dead" page. */
        Page_DenyAccess(address, bytesPerPage);

        /* Figure out what address to give the user. */
        address -= userSize;
    }
    else {  /* EF_PROTECT_BELOW != 0 */
        /*
         * Arrange the buffer so that it is preceded by an inaccessable
         * memory page. A buffer underrun that touches that page will
         * cause a segmentation fault.
         */
        address = (char *)fullSlot->internalAddress;

        /* Set up the "dead" page. */
        Page_DenyAccess(address, bytesPerPage);

        address += bytesPerPage;

        /* Set up the "live" page. */
        if ( internalSize - bytesPerPage > 0 )
            Page_AllowAccess(address, internalSize - bytesPerPage);
    }

    fullSlot->userAddress = address;
    fullSlot->userSize = userSize;

    /*
     * Make the pool's internal memory inaccessable, so that the program
     * being debugged can't stomp on it.
     */
    if ( !internalUse )
        Page_DenyAccess(allocationList, allocationListSize);

    return address;
}
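The slot search in memalign is a best-fit scan: among the free blocks that are large enough it keeps the smallest, and it stops early on an exact match. A simplified sketch of that policy follows. It leaves out the collection of empty slots that the real loop performs in the same pass, the `Mode` enumerator values are assumed, and `find_best_fit` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed enumerators, taken from the names used in the efence listings. */
typedef enum { NOT_IN_USE = 0, FREE, ALLOCATED, INTERNAL_USE } Mode;

typedef struct {
    void  *userAddress;
    void  *internalAddress;
    size_t userSize;
    size_t internalSize;
    Mode   mode;
} Slot;

/*
 * Returns the smallest FREE slot with at least `needed` bytes,
 * or NULL when no free slot is large enough (the caller would then
 * have to map more memory).
 */
Slot *find_best_fit(Slot *slots, size_t count, size_t needed)
{
    Slot *best = NULL;
    size_t i;

    assert(slots != NULL || count == 0);

    for (i = 0; i < count; i++) {
        Slot *slot = &slots[i];

        if (slot->mode != FREE || slot->internalSize < needed)
            continue;
        if (best == NULL || slot->internalSize < best->internalSize)
            best = slot;
        if (best->internalSize == needed)
            break; /* an exact fit cannot be improved on */
    }
    return best;
}
```

When the chosen block is larger than needed, memalign splits it and records the surplus as a new free slot, as the listing above shows.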
3. References
https://elinux.org/Electric_Fence
https://linux.die.net/man/3/efence
Source code:
https://github.com/kallisti5/ElectricFence
III. Using Electric Fence
1) Build
Both dynamic and static linking are supported. Linking against the efence library is all that is required.
gcc -g demo.c -lefence -o demo
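The article does not show demo.c itself. As an illustration only, a function like the following could form the core of such a test program. The name `copy_unchecked` is hypothetical, and the missing bounds check is the deliberate bug that efence is expected to catch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copies `text` (including its terminating NUL) into a fresh heap buffer
 * of buf_size bytes, then frees the buffer. It deliberately does NOT check
 * that the text fits. Returns the number of bytes written, or 0 if malloc
 * failed.
 */
size_t copy_unchecked(const char *text, size_t buf_size)
{
    size_t needed;
    size_t i;
    volatile char *buf; /* volatile keeps the stores from being optimized out */

    assert(text != NULL && buf_size > 0);

    needed = strlen(text) + 1;
    buf = malloc(buf_size);
    if (buf == NULL)
        return 0;

    for (i = 0; i < needed; i++)
        buf[i] = text[i]; /* out of bounds once i reaches buf_size */

    free((void *)buf);
    return needed;
}
```

Called from main as `copy_unchecked("0123456789abcdef", 16)`, the loop writes 17 bytes into a 16-byte buffer. With the normal C library this usually goes unnoticed. With efence linked in and the default alignment, the 17th byte should land on the red-zone page and the process should stop with SIGSEGV at the store, so the coredump points at the offending line.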
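2) Run
A statically linked program needs no special handling at run time. With dynamic linking, libefence.so must be loaded first, ahead of libc.so, so that malloc and the other functions resolve to the efence versions and the interception takes effect. To do that, set the LD_PRELOAD environment variable before running the program.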
export LD_PRELOAD=/lib/libefence.so
3) Enable coredumps
ulimit -c unlimited
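IV. Caveats
1) The number of VMAs may need to be raised, because every memory block occupies a VMA. Use the following command to raise the limit: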
sysctl -w vm.max_map_count=262144
2) Memory usage is high. With frequent small allocations it can grow by dozens of times, so check that enough memory is available.