Fixmap
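Main uses of fixmap
The fixmap region is a range of address space whose virtual addresses are fixed at compile time. It is mainly used to serve mapping requests made before the dynamic mapping subsystems (such as vmap) are ready, for example the temporary mapping of a console device in early boot. It also has runtime uses, such as patching kernel code or updating page tables.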
1) io device early mapping
- To allow mapping of devices at early boot time when the ioremap() function cannot be used, two methods are provided:
- Static mapping: a device is mapped to a slot whose address is fixed at compile time (e.g. early console).
- Temporary mapping: during early boot, while regular device mapping (ioremap) is not yet available, a device can be temporarily mapped into the fixmap virtual address space using the early_ioremap() function.
- Up to 7 such mappings can exist at a time, each covering up to 256K.
2) Kernel mapping of highmem physical memory
- For the ARM32 kernel, highmem physical pages are mapped into the part of the fixmap virtual address area reserved for each CPU.
- stack-based kmap_atomic
- The index slot used to be fixed by CPU ID and usage type; it is now managed per CPU with push/pop, in the same way as a stack, with KM_TYPE_NR (ARM=20) slots per CPU.
- That is, 20 mapping slots are assigned to each CPU.
- Note: mm: stack based kmap_atomic()
- The kmap_atomic() function maps a highmem page to the fixmap area corresponding to the current CPU. If the page is already mapped in the kmap area, that existing virtual address is returned.
- ZONE_HIGHMEM pages have to be mapped and unmapped repeatedly through one of several kernel mapping methods, so access is slower than for ZONE_NORMAL, which is always pre-mapped. User space is different: each task has a very large user address space (1G, 2G, or 3G depending on the setting) mapped for it.
- 64-bit systems have a very large virtual address space, allowing them to map all physical memory within the system. Therefore, there is no need to use highmem in these cases.
3) Kernel code changes
- When changing the read-only kernel code, the fixmap virtual address area is temporarily used.
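The stack-like slot handling described for kmap_atomic() can be sketched in user space as follows. This is only a model under stated assumptions, not the kernel implementation: NR_CPUS, kmap_depth, kmap_slot_push() and kmap_slot_pop() are hypothetical names for this sketch, and the returned value is an index relative to the start of the KMAP slots, not a virtual address.

```c
#define KM_TYPE_NR 20   /* slots per CPU on ARM32, as stated in the text */
#define NR_CPUS    4    /* hypothetical CPU count, assumption of this sketch */

/* Per-CPU stack depth: how many kmap_atomic slots are currently in use. */
static int kmap_depth[NR_CPUS];

/* Push: reserve the next free slot of this CPU.
 * Returns the slot index relative to the KMAP area, or -1 if all 20 are used. */
static int kmap_slot_push(int cpu)
{
    if (kmap_depth[cpu] >= KM_TYPE_NR)
        return -1;
    return cpu * KM_TYPE_NR + kmap_depth[cpu]++;
}

/* Pop: release the most recently reserved slot of this CPU.
 * Returns the released slot index, or -1 if nothing was mapped. */
static int kmap_slot_pop(int cpu)
{
    if (kmap_depth[cpu] == 0)
        return -1;
    return cpu * KM_TYPE_NR + --kmap_depth[cpu];
}
```

Because slots are released in the reverse order of acquisition, nested mappings on one CPU never collide, and CPUs never share a slot.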
Comparison of the mapping methods:
- vmap
- Maps multiple pages for long periods into the fairly large vmalloc address space.
- ARM32
- 240M space
- ARM64
- The size depends on CONFIG_ARM64_VA_BITS. It is almost half of the kernel VA space, excluding the vmemmap, pci io, fixmap, kimage, and module image areas, etc.
- For example, with CONFIG_ARM64_VA_BITS=39 the kernel VA size is 512G, of which the vmalloc space is about 246G.
- kmap
- Maps pages into the kmap address space for a limited period. Once established, the mapping is kept even if the task is scheduled out and another task runs.
- ARM32
- 2M space
- fixmap
- Maps highmem pages into the fixmap address space for extremely short periods. It does not sleep and can therefore be used in interrupt context.
- The mapping must be removed before the task is scheduled out and replaced by another task.
- The other slots, such as IO areas, are fixed at boot time.
- ARM32
- The space was expanded from 2M to 3M.
- ARM64
- It varies depending on the kernel version and kernel options, but currently it is about 6M.
Fixmap address space
- The fixmap virtual address space is located and sized differently depending on the architecture.
- ARM32
- Currently, it uses 3M of virtual address space, with 768 pages indexed from high to low (0 to 767).
- It uses a 3M area from FIXADDR_START(0xffc0_0000) to FIXADDR_END(0xfff0_0000).
- Previously, a 2M area was used from 0xffc0_0000 to 0xffe0_0000.
- Note: ARM: expand fixmap region to 3MB
- FIXADDR_TOP is FIXADDR_END - PAGE_SIZE (4K).
- Index 0 corresponds to FIXADDR_TOP (0xffef_f000), and the virtual address decreases as the index number increases.
- Index numbers can be specified from 0 to a maximum of 0x2ff (767).
- ARM64
- It uses a virtual address space of about 6M and uses index slots from the highest address down.
Fixmap index slot
- The virtual address is fixed according to the fixmap index slot number.
- Example) ARM32
- index=0 -> vaddr=0xffef_f000 (FIXADDR_TOP)
- index=1 -> vaddr=0xffef_e000
- …
- index=767 -> vaddr=0xffc0_0000 (FIXADDR_START)
Map one page starting at physical address 0x4000_0000 to fixmap slot index 1.
- set_fixmap(1, 0x4000_0000)
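The index-to-address rule for ARM32 can be checked with a small user-space sketch. It reuses the FIXADDR_END value and the 4K page size given above; arm32_fix_to_virt() is a hypothetical helper name for this sketch and only mirrors the arithmetic, it does not create any mapping.

```c
#include <stdint.h>

#define PAGE_SHIFT   12                                   /* 4K pages */
#define FIXADDR_END  0xfff00000u                          /* ARM32 value from the text */
#define FIXADDR_TOP  (FIXADDR_END - (1u << PAGE_SHIFT))   /* 0xffeff000 */

/* Slot index -> virtual address: index 0 is at the top, addresses grow downwards. */
static uint32_t arm32_fix_to_virt(uint32_t idx)
{
    return FIXADDR_TOP - (idx << PAGE_SHIFT);
}
```

With these values, index 1 (the slot used by the set_fixmap() call above) resolves to 0xffef_e000, and index 767 resolves to FIXADDR_START (0xffc0_0000).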
Fixmap slot classification
Fixmap slot types differ considerably between architectures and kernel versions. Some common types are listed below:
- HOLE
- Not used for ARM32.
- For ARM64, one slot is provided for debugging purposes and is currently unused as a spare entry page.
- Added in kernel v3.19-rc1.
- Note: arm64: Add FIX_HOLE to permanent fixed addresses
- FDT
- Not used for ARM32.
- For ARM64, it provides a slot that covers 4M of device tree (FDT).
- FDT is up to 2M, but since it uses 2M unit align, it requires an area of up to 4M.
- Added in kernel v4.2-rc1.
- Note: arm64: use fixmap region for permanent FDT mapping
- EARLYCON
- Uses one index slot to map a serial device for console input/output (early console) before regular mapping is available.
- This area was added in kernel v4.3-rc1 in August 2015.
- Note: ARM: 8415/1: early fixmap support for earlycon
- KMAP
- This is the space used to map highmem physical memory, which exists only on 32-bit systems. The total number of index slots depends on the number of CPUs (NR_CPUS).
- Present since fixmap was first introduced.
- For ARM32, 20 index slots are given per CPU.
- The number increased from 16 to 20.
- For x86_32, 41 index slots are given per CPU.
- TEXT_POKE
- Features such as kprobes and static keys modify read-only kernel code. The code to be patched is briefly mapped through one or two slots here and then modified.
- For ARM32, two slots are used.
- For ARM64, one slot is used.
- APEI_GHES
- Currently, when using the GHES driver on ARM64 and X86_64, two slots are given.
- The GHES driver can run in irq context, where sleeping is not allowed. Because ioremap_page_range() may sleep, the driver maps through fixmap instead.
- Added in kernel v4.15-rc.
- Note: ACPI / APEI: Replace ioremap_page_range() with fixmap
- The SDEI slots were added in kernel v5.1.
- Note: firmware: arm_sdei: Add ACPI GHES registration helper
- ENTRY_TRAMP
- For security, KASLR (Kernel Address Space Layout Randomization) was adopted to hide the kernel location from user space. In addition, KPTI (Kernel Page Table Isolation, aka KAISER) prepares a separate top-level page table (pgd) so that the kernel area cannot be accessed from user space, and the entry trampoline used with it is mapped in these ENTRY_TRAMP pages. Kernels using this option must flush the TLB every time they switch between the kernel and the user, resulting in a performance degradation of about 5%. Future CPUs are expected to separate access to the kernel space from the user space in hardware, so that performance does not suffer even without this option.
- Added in kernel v4.16-rc1.
- KAISER: hiding the kernel from user space | LWN.net
- The current state of kernel page-table isolation | LWN.net
- arm64: mm: Map entry trampoline into trampoline and kernel page tables
- arm64: kaslr: Put kernel vectors address in separate data page
- BTMAPS
- This is where devices are temporarily mapped and used through early_ioremap() at early boot time when regular ioremap() cannot be used.
- You can have up to 7 mappings, each using 256K.
- Added in kernel v3.15-rc1.
- Note: arm64: add early_ioremap support
- FIX_PTE, FIX_PMD, FIX_PUD, FIX_PGD
- Used to access page table pages when kernel page tables are created or modified at runtime, so that each update is applied atomically and without problems for the TLB. A total of 4 index slots is used, one per table level.
- Kernel page tables are mapped read-only, so this area is used whenever a page table entry is modified.
- Added in kernel v4.6-rc1.
- Note: arm64: mm: add functions to walk tables in fixmap
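The slot counts behind the FDT and BTMAPS entries follow from simple arithmetic. The sketch below assumes 4K pages and a 2M maximum FDT size, as stated above; FIX_FDT_SLOTS and print_slot_budget() are hypothetical names introduced only for this illustration.

```c
#include <stdio.h>

#define SZ_4K    0x00001000UL
#define SZ_256K  0x00040000UL
#define SZ_2M    0x00200000UL

#define PAGE_SIZE         SZ_4K                         /* assumption: 4K pages */
#define MAX_FDT_SIZE      SZ_2M                         /* FDT is at most 2M */

#define FIX_FDT_SIZE      (MAX_FDT_SIZE + SZ_2M)        /* 4M window for 2M alignment */
#define FIX_FDT_SLOTS     (FIX_FDT_SIZE / PAGE_SIZE)    /* 1024 page-sized slots */

#define NR_FIX_BTMAPS     (SZ_256K / PAGE_SIZE)         /* 64 pages per early_ioremap() mapping */
#define FIX_BTMAPS_SLOTS  7                             /* up to 7 concurrent mappings */
#define TOTAL_FIX_BTMAPS  (NR_FIX_BTMAPS * FIX_BTMAPS_SLOTS)   /* 448 slots */

/* Print how many page-sized fixmap slots each area consumes. */
static void print_slot_budget(void)
{
    printf("FDT window    : %lu slots\n", FIX_FDT_SLOTS);
    printf("per BTMAP     : %lu slots\n", NR_FIX_BTMAPS);
    printf("all BTMAPS    : %lu slots\n", TOTAL_FIX_BTMAPS);
}
```

The FDT window and the BTMAPS area therefore account for most of the fixmap space, while the remaining slot types need only a handful of pages.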
enum fixed_addresses
arch/arm64/include/asm/fixmap.h
/*
* Here we define all the compile-time 'special' virtual
* addresses. The point is to have a constant address at
* compile time, but to set the physical address only
* in the boot process.
*
* Each enum increment in these 'compile-time allocated'
* memory buffers is page-sized. Use set_fixmap(idx,phys)
* to associate physical memory with a fixmap index.
*/
enum fixed_addresses {
FIX_HOLE,
/*
* Reserve a virtual window for the FDT that is 2 MB larger than the
* maximum supported size, and put it at the top of the fixmap region.
* The additional space ensures that any FDT that does not exceed
* MAX_FDT_SIZE can be mapped regardless of whether it crosses any
* 2 MB alignment boundaries.
*
* Keep this at the top so it remains 2 MB aligned.
*/
#define FIX_FDT_SIZE (MAX_FDT_SIZE + SZ_2M)
FIX_FDT_END,
FIX_FDT = FIX_FDT_END + FIX_FDT_SIZE / PAGE_SIZE - 1,
FIX_EARLYCON_MEM_BASE,
FIX_TEXT_POKE0,
#ifdef CONFIG_ACPI_APEI_GHES
/* Used for GHES mapping from assorted contexts */
FIX_APEI_GHES_IRQ,
FIX_APEI_GHES_SEA,
#ifdef CONFIG_ARM_SDE_INTERFACE
FIX_APEI_GHES_SDEI_NORMAL,
FIX_APEI_GHES_SDEI_CRITICAL,
#endif
#endif /* CONFIG_ACPI_APEI_GHES */
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
#ifdef CONFIG_RELOCATABLE
FIX_ENTRY_TRAMP_TEXT4, /* one extra slot for the data page */
#endif
FIX_ENTRY_TRAMP_TEXT3,
FIX_ENTRY_TRAMP_TEXT2,
FIX_ENTRY_TRAMP_TEXT1,
#define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT1))
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
__end_of_permanent_fixed_addresses,
/*
* Temporary boot-time mappings, used by early_ioremap(),
* before ioremap() is functional.
*/
#define NR_FIX_BTMAPS (SZ_256K / PAGE_SIZE)
#define FIX_BTMAPS_SLOTS 7
#define TOTAL_FIX_BTMAPS (NR_FIX_BTMAPS * FIX_BTMAPS_SLOTS)
FIX_BTMAP_END = __end_of_permanent_fixed_addresses,
FIX_BTMAP_BEGIN = FIX_BTMAP_END + TOTAL_FIX_BTMAPS - 1,
/*
* Used for kernel page table creation, so unmapped memory may be used
* for tables.
*/
FIX_PTE,
FIX_PMD,
FIX_PUD,
FIX_PGD,
__end_of_fixed_addresses
};
Fixmap is divided into two main regions:
- __end_of_permanent_fixed_addresses
- This is a permanent mapping space that is not unmapped after booting.
- __end_of_fixed_addresses
- Marks the end of fixmap. The slots after __end_of_permanent_fixed_addresses, up to this marker, can be mapped and unmapped as needed.
The fixmap blocks on the Arm64 architecture are shown in the following figure:
early_fixmap_init()
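To support fixmap, three pages are statically defined at compile time to serve as fixmap page tables: bm_pud[], bm_pmd[], and bm_pte[]. As the comment below notes, the early_fixmap_init() function mainly initializes the page table entries of bm_pud[] and bm_pmd[].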
/*
* The p*d_populate functions call virt_to_phys implicitly so they can't be used
* directly on kernel symbols (bm_p*d). This function is called too early to use
* lm_alias so __p*d_populate functions must be used to populate with the
* physical address from __pa_symbol.
*/
void __init early_fixmap_init(void)
Mapping two virtual spaces in the kernel image
- The kernel image (kimage) is mapped into two virtual spaces at boot time.
- 1) linear mapping virtual space
- This is the space where the entire DRAM, including the part where the kernel is loaded, is mapped.
- Example) 0xffff_0000_0000_0000~ (4K, 4-level page)
- 2) kimage virtual space
- This is a space where only the kernel image is mapped.
- Example) 0xffff_8000_1000_0000~ (4K, 4-level page, KASLR=n)
Virtual and physical address translation APIs
- The following APIs convert between 1) lm (linear mapping) virtual addresses and physical addresses.
- Usage restrictions
- Available after linear mapping is completed via the paging_init() function.
- With the CONFIG_DEBUG_VIRTUAL kernel option, a warning message is printed if an address outside the linear mapping is passed.
- virt_to_phys(), __virt_to_phys(), __pa()
- phys_to_virt(), __phys_to_virt(), __va()
- lm_alias()
- Converts a kernel symbol virtual address to its lm virtual address.
- The following APIs convert between 2) kernel symbol virtual addresses and physical addresses.
- Usage restrictions
- None
- __kimg_to_phys()
- __phys_to_kimg()
- __pa_symbol()
- Converts kernel symbol virtual addresses to physical addresses.
- The following APIs convert between 1) lm (linear mapping) virtual addresses and pages.
- Usage restrictions
- Available after vmemmap, which consists of an array of page descriptors, is activated.
- virt_to_page()
- page_to_virt()
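The relationship between the two virtual spaces and these conversion helpers can be modelled in user space. The sketch below is a simplified model, not the kernel code: the four layout constants are hypothetical values picked for illustration (linear map at 0xffff_0000_0000_0000, kimage at 0xffff_8000_1000_0000), and the model_*() functions only mirror the arithmetic of the corresponding kernel helpers.

```c
#include <stdint.h>

/* Hypothetical layout, chosen only for this sketch. */
#define PAGE_OFFSET   0xffff000000000000ULL  /* start of the linear mapping */
#define PHYS_OFFSET   0x0000000040000000ULL  /* physical start of DRAM */
#define KIMAGE_VADDR  0xffff800010000000ULL  /* kimage virtual base (KASLR=n) */
#define KIMAGE_PADDR  0x0000000040200000ULL  /* physical load address of the image */

/* Offset between kimage virtual addresses and physical addresses. */
static const uint64_t kimage_voffset = KIMAGE_VADDR - KIMAGE_PADDR;

/* 1) linear mapping <-> physical */
static uint64_t model_virt_to_phys(uint64_t lm_va)
{
    return lm_va - PAGE_OFFSET + PHYS_OFFSET;
}

static uint64_t model_phys_to_virt(uint64_t pa)
{
    return pa - PHYS_OFFSET + PAGE_OFFSET;
}

/* 2) kernel symbol (kimage) -> physical */
static uint64_t model_pa_symbol(uint64_t sym_va)
{
    return sym_va - kimage_voffset;
}

/* kernel symbol -> its alias in the linear mapping */
static uint64_t model_lm_alias(uint64_t sym_va)
{
    return model_phys_to_virt(model_pa_symbol(sym_va));
}
```

The model also shows why the p*d_populate functions cannot be applied directly to kernel symbols such as bm_pmd[]: subtracting PAGE_OFFSET from a kimage address does not yield the symbol's physical address, so the kimage-aware conversion has to be used instead.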
The following figure shows how page tables at each stage are activated for fixmap.
- When using 4K pages and VA_BITS=48, a 4-level page table is used. On ARM64 the p4d level is folded into pgd, so the p4d table that follows pgd is the pgd itself (p4d = pgd).
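Note:
The early_fixmap_init() function maps the 2M range starting at FIXADDR_START (the range covered by one pmd entry, that is, by the 512 pte entries of one last-level table). This range includes the BTMAPS space needed by early_ioremap() and the 4 pages from FIX_PGD to FIX_PTE.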
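To see why one pmd entry covers exactly 2M, it helps to split a virtual address into its per-level table indices. The following is a minimal sketch assuming 4K pages and a 4-level table (VA_BITS=48); the helper names are hypothetical and written for user space, they are not the kernel's pgd_index()/pte_index() macros.

```c
#include <stdint.h>

#define PAGE_SHIFT    12   /* 4K pages */
#define PMD_SHIFT     21   /* one pmd entry covers 2M */
#define PUD_SHIFT     30   /* one pud entry covers 1G */
#define PGDIR_SHIFT   39   /* one pgd entry covers 512G */
#define PTRS_PER_TBL  512  /* 9 index bits per level */

static unsigned int pgd_idx(uint64_t va) { return (va >> PGDIR_SHIFT) & (PTRS_PER_TBL - 1); }
static unsigned int pud_idx(uint64_t va) { return (va >> PUD_SHIFT)   & (PTRS_PER_TBL - 1); }
static unsigned int pmd_idx(uint64_t va) { return (va >> PMD_SHIFT)   & (PTRS_PER_TBL - 1); }
static unsigned int pte_idx(uint64_t va) { return (va >> PAGE_SHIFT)  & (PTRS_PER_TBL - 1); }

/* Returns 1 if both addresses are translated through the same last-level (pte) table,
 * i.e. they lie in the same 2M-aligned range. */
static int same_pte_table(uint64_t a, uint64_t b)
{
    return (a >> PMD_SHIFT) == (b >> PMD_SHIFT);
}
```

Two fixmap addresses for which same_pte_table() returns 1 can both be served by the single bm_pte[] page, which is why the slots needed during early boot must fall within one 2M range.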
Main Fixmap API
set_fixmap()
Maps the page containing physical address @phys to the requested fixmap index @idx, using the normal kernel page mapping attributes.
- ARM32
- L_PTE_YOUNG | L_PTE_PRESENT | L_PTE_XN | L_PTE_DIRTY | L_PTE_MT_WRITEBACK
- ARM64
- Use FIXMAP_PAGE_NORMAL -> PAGE_KERNEL -> __pgprot(PROT_NORMAL) property
- PTE_TYPE_PAGE | PTE_AF | PTE_SHARED | PTE_MAYBE_NG | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX( MT_NORMAL )
set_fixmap_offset()
Creates a temporary mapping through fixmap and returns the virtual address, including the offset of @phys within the page.
/* Return a pointer with offset calculated */
#define __set_fixmap_offset(idx, phys, flags) \
({ \
unsigned long ________addr; \
__set_fixmap(idx, phys, flags); \
________addr = fix_to_virt(idx) + ((phys) & (PAGE_SIZE - 1)); \
________addr; \
})
#define set_fixmap_offset(idx, phys) \
__set_fixmap_offset(idx, phys, FIXMAP_PAGE_NORMAL)
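The address returned by set_fixmap_offset() is the slot's virtual address plus the offset of the physical address within its page. A user-space sketch of just that calculation follows; the FIXADDR_TOP value is a hypothetical placeholder, the model_*() names are assumptions of this sketch, and no page table entry is installed.

```c
#include <stdint.h>

#define PAGE_SHIFT   12
#define PAGE_SIZE    (1ULL << PAGE_SHIFT)
#define FIXADDR_TOP  0xffffff0000000000ULL   /* hypothetical value for illustration */

/* Same arithmetic as __fix_to_virt(): slots grow downwards from FIXADDR_TOP. */
static uint64_t model_fix_to_virt(unsigned int idx)
{
    return FIXADDR_TOP - ((uint64_t)idx << PAGE_SHIFT);
}

/* Models only the return value of __set_fixmap_offset(); the real macro first
 * calls __set_fixmap() to install the mapping for the page containing phys. */
static uint64_t model_set_fixmap_offset(unsigned int idx, uint64_t phys)
{
    return model_fix_to_virt(idx) + (phys & (PAGE_SIZE - 1));
}
```

This is convenient when the object of interest, such as a table entry or a device register, does not start on a page boundary: the caller receives a pointer to the object itself rather than to the start of the mapped page.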
clear_fixmap()
Unmaps the physical page mapped at the requested fixmap index @idx by setting the attribute to FIXMAP_PAGE_CLEAR (0).
set_fixmap_nocache()
Maps the page containing physical address @phys to the requested fixmap index @idx without caching, setting the attribute to FIXMAP_PAGE_NOCACHE.
set_fixmap_offset_nocache()
#define set_fixmap_offset_nocache(idx, phys) \
__set_fixmap_offset(idx, phys, FIXMAP_PAGE_NOCACHE)
set_fixmap_io()
Maps the page containing physical address @phys to the requested fixmap index @idx, setting the attribute to FIXMAP_PAGE_IO.
fix_to_virt()
virt_to_fix()
#define __fix_to_virt(x) (FIXADDR_TOP - ((x) << PAGE_SHIFT))
#define __virt_to_fix(x) ((FIXADDR_TOP - ((x)&PAGE_MASK)) >> PAGE_SHIFT)
#ifndef __ASSEMBLY__
/*
* 'index to address' translation. If anyone tries to use the idx
* directly without translation, we catch the bug with a NULL-deference
* kernel oops. Illegal ranges of incoming indices are caught too.
*/
static __always_inline unsigned long fix_to_virt(const unsigned int idx)
{
BUILD_BUG_ON(idx >= __end_of_fixed_addresses);
return __fix_to_virt(idx);
}
static inline unsigned long virt_to_fix(const unsigned long vaddr)
{
BUG_ON(vaddr >= FIXADDR_TOP || vaddr < FIXADDR_START);
return __virt_to_fix(vaddr);
}
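The two conversions are inverses of each other, which can be checked in user space. The sketch below uses the ARM32 values quoted earlier (FIXADDR_START 0xffc0_0000, FIXADDR_TOP 0xffef_f000) and hypothetical model_*() names; where the kernel would trigger BUG_ON(), the model returns -1 instead.

```c
#include <stdint.h>

#define PAGE_SHIFT     12
#define PAGE_MASK      (~((1u << PAGE_SHIFT) - 1))
#define FIXADDR_START  0xffc00000u   /* ARM32 values from the text */
#define FIXADDR_TOP    0xffeff000u

/* Index -> virtual address, same arithmetic as __fix_to_virt(). */
static uint32_t model_fix_to_virt(uint32_t idx)
{
    return FIXADDR_TOP - (idx << PAGE_SHIFT);
}

/* Virtual address -> index, same arithmetic and range check as virt_to_fix().
 * Returns -1 where the kernel version would hit BUG_ON(). */
static int model_virt_to_fix(uint32_t vaddr)
{
    if (vaddr >= FIXADDR_TOP || vaddr < FIXADDR_START)
        return -1;
    return (int)((FIXADDR_TOP - (vaddr & PAGE_MASK)) >> PAGE_SHIFT);
}
```

Note that the range check as written rejects FIXADDR_TOP itself, so the address of index 0 cannot be converted back, and any offset within a page is masked off before the index is computed.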