
Kernel FPU save/restore mechanism explained

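FPU stands for floating-point unit. On Intel platforms the FPU has evolved through several generations, up to today's AVX-512; below we refer to all of them collectively as the FPU.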

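Technology  Register width  Registers                   Instruction characteristics
x87         80-bit (stack)  8 ST                        Stack-based operation, high-precision floating point
MMX         64-bit          8 MM                        Integer SIMD, reuses the x87 FPU registers
SSE         128-bit         8 XMM (16 in 64-bit mode)   Floating-point SIMD, separate registers, instruction extensions
AVX         256-bit         16 YMM                      Three-operand instructions (FMA and GATHER arrived with the later FMA3/AVX2 extensions)
AVX-512     512-bit         32 ZMM                      Mask operations, complex logic instructions, subset extensions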

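From a kernel developer's point of view, these technologies are no different from an ordinary mov rdi, rdx: they are instructions operating on registers. We know that a context switch has to save the running program's context, which includes the ip, rsp, the general-purpose registers and so on. So how are the FPU registers saved and switched?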

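Before digging into FPU context switching, we first need to know that the kernel cannot use the FPU implicitly. In other words, the vast majority of kernel code does not use the FPU at all, and the few places that need it must explicitly call kernel_fpu_begin()/kernel_fpu_end(), for example raid6 and crypto. There are two reasons: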

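  • The Linux kernel runs on a huge range of platforms and many of them have no FPU, so generic kernel code cannot assume one is present;
  • Saving and restoring the FPU context is expensive: the SSE registers total 128 bytes (8 × 16), AVX 512 bytes (16 × 32) and AVX-512 2048 bytes (32 × 64), so using the FPU only pays off when it is needed to process a large amount of data.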
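
The byte counts in the second reason are simply register count times register width. A minimal sketch of that arithmetic follows; the struct and function names are made up for illustration, and the counts are the ones the article uses (8 XMM, 16 YMM, 32 ZMM). It ignores everything else a real save area holds, such as the x87 state, MXCSR, the XSAVE header and the AVX-512 opmask registers.

```c
#include <assert.h>

/* One row per register file; counts and widths follow the article's figures. */
struct reg_file {
    const char *name;
    unsigned int nr_regs;   /* number of architectural registers */
    unsigned int bits;      /* width of each register in bits */
};

static const struct reg_file reg_files[] = {
    { "SSE (XMM)",      8, 128 },
    { "AVX (YMM)",     16, 256 },
    { "AVX-512 (ZMM)", 32, 512 },
};

/* Bytes of register data copied by one save (or one restore). */
static unsigned int reg_file_bytes(const struct reg_file *rf)
{
    assert(rf->bits % 8 == 0);
    return rf->nr_regs * (rf->bits / 8);
}
```

Each context switch pays this cost twice in the worst case, once to save the outgoing task and once to load the incoming one, which is why the kernel tries hard to skip either half.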

Precisely because saving and restoring the FPU context is expensive, the kernel handles it very carefully. Early kernels used a mechanism called Lazy FPU; see:

LWN: Really lazy fpu https://lwn.net/Articles/391972/

Currently fpu management is only lazy in one direction.  When we switch into
a task, we may avoid loading the fpu state in the hope that the task will
never use it.  If we guess right we save an fpu load/save cycle; if not,
a Device not Available exception will remind us to load the fpu.

However, in the other direction, fpu management is eager.  When we switch out
of an fpu-using task, we always save its fpu state.

We will not trace its code here, because the feature was later disabled due to the security issue CVE-2018-3665:

Wikipedia: Lazy FP state restore https://en.wikipedia.org/wiki/Lazy_FP_state_restore

Red Hat: Lazy FPU Save/Restore (CVE-2018-3665) https://access.redhat.com/solutions/3485131

Lazy save/restore of FPU/SSE/AVX States:

Modern processors employ numerous techniques to improve system performance. One such technique is to defer save and restore of certain CPU context states on task switch. Today, processors come equipped with a dedicated Floating Point Unit (FPU) to perform high precision floating-point operations used in scientific, engineering and/or graphics applications. The FPU maintains its own context state in its data registers, status registers, as well as control and opcode registers.

A task/context switch occurs when a user application calls a kernel function or when a process is preempted to schedule the next one in the queue. Upon a task switch, the processor saves its current execution context (various registers, instruction and stack pointers, etc.) and loads the context of the new process. While doing so, it can defer restoring of FPU/SSE context state, because not all applications use the Floating Point Unit (FPU). If the newly scheduled process does not use Floating-Point (FP) instructions, it does not need to save/restore FPU context state. This can save precious execution cycles and improves performance.

Under the lazy restore scheme, during task switch, the first FP instruction executed by a process generates a “Device not Available (DNA)” exception; the DNA exception handler then saves the current FPU context into the old task’s state save area and loads the new FPU context for the current process. In other words, loading of the FPU state is deferred until an FP instruction is invoked by the current task - Lazy FPU restore.

Recent processors include processor extensions (“XSAVEOPT”) that implement FPU restore in hardware more efficiently, giving the performance benefits of lazy FPU without having to rely on the DNA exception. On these processors, Red Hat Enterprise Linux 7 is already using eager FPU restore, and is therefore not vulnerable. In practice, the FPU registers are usually involved in block memory copies and string operations such that lazy FPU restore does not benefit performance sensibly even on older processors.

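To summarize: once the XSAVEOPT instruction was introduced, the hardware itself can skip saving state components that are still in their initial state or have not been modified since the last restore, so FPU save/restore became cheap. At the same time, the lazy FPU restore mechanism carries a security risk, so the kernel switched to eager FPU restore.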
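
Whether a given CPU offers XSAVEOPT, and how large its XSAVE area is, can be read from CPUID leaf 0DH. The following is a user-space sketch, not kernel code: struct xsave_info and query_xsave_info() are names invented here, and it assumes a GCC or Clang toolchain that ships <cpuid.h> with __get_cpuid_count(). On non-x86 builds it reports failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

struct xsave_info {
    bool     has_xsaveopt;  /* CPUID.(EAX=0DH,ECX=1):EAX bit 0 */
    uint32_t enabled_size;  /* CPUID.(EAX=0DH,ECX=0):EBX, bytes for enabled features */
    uint32_t max_size;      /* CPUID.(EAX=0DH,ECX=0):ECX, bytes for all supported features */
};

/* Returns true if leaf 0DH could be queried, false otherwise. */
static bool query_xsave_info(struct xsave_info *info)
{
    assert(info != NULL);
    info->has_xsaveopt = false;
    info->enabled_size = 0;
    info->max_size = 0;
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* Sub-leaf 0: sizes of the XSAVE area. */
    if (!__get_cpuid_count(0x0d, 0, &eax, &ebx, &ecx, &edx))
        return false;
    info->enabled_size = ebx;
    info->max_size = ecx;

    /* Sub-leaf 1: XSAVE instruction variants, bit 0 is XSAVEOPT. */
    if (!__get_cpuid_count(0x0d, 1, &eax, &ebx, &ecx, &edx))
        return false;
    info->has_xsaveopt = (eax & 1u) != 0;
    return true;
#else
    return false;
#endif
}
```

On a machine with AVX-512 enabled, enabled_size is typically well above the 2048 bytes of ZMM data, because the area also holds the legacy region, the header and the opmask registers.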

Next, let's look at the implementation in kernel 5.14:

__switch_to()
  -> switch_fpu_prepare()
---
    if (cpu_feature_enabled(X86_FEATURE_FPU) &&
        !(current->flags & PF_KTHREAD)) {
        save_fpregs_to_fpstate(old_fpu);
        ...
    }
---

__switch_to()
  -> switch_fpu_finish()
---
    if (cpu_feature_enabled(X86_FEATURE_FPU))
        set_thread_flag(TIF_NEED_FPU_LOAD);
---

arch_exit_to_user_mode_prepare()
---
    if (unlikely(ti_work & _TIF_NEED_FPU_LOAD))
        switch_fpu_return();
---

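When a task is scheduled back in, its FPU state is not restored immediately; the restore is deferred until just before the task returns to user mode. This is safe because the kernel does not use the FPU directly.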
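
The save-eagerly, restore-late pattern can be modelled in user space. This is a toy sketch, not kernel code: every toy_* name is invented, the "registers" are a small byte array, and need_fpu_load stands in for TIF_NEED_FPU_LOAD. The check that skips saving an outgoing task whose flag is still set is not in the excerpt above; it is added here as an assumption so that a task that never reached user mode does not overwrite its saved state with stale registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define FPREGS_BYTES 64   /* toy size, not a real XSAVE area */

struct toy_task {
    unsigned char fpstate[FPREGS_BYTES]; /* in-memory copy of the FPU registers */
    bool need_fpu_load;                  /* stands in for TIF_NEED_FPU_LOAD */
    bool kthread;                        /* stands in for PF_KTHREAD */
};

struct toy_cpu {
    unsigned char fpregs[FPREGS_BYTES];  /* the "hardware" registers */
    struct toy_task *current;
    unsigned int restores;               /* number of restores actually performed */
};

/* Context switch: save the outgoing state now, only flag the incoming task. */
static void toy_switch_to(struct toy_cpu *cpu, struct toy_task *next)
{
    struct toy_task *prev = cpu->current;

    /* Save only if the registers still hold prev's state (assumption, see lead-in). */
    if (prev && !prev->kthread && !prev->need_fpu_load)
        memcpy(prev->fpstate, cpu->fpregs, FPREGS_BYTES);

    next->need_fpu_load = true;
    cpu->current = next;
}

/* Return to user mode: the only place where the registers are reloaded. */
static void toy_exit_to_user(struct toy_cpu *cpu)
{
    struct toy_task *cur = cpu->current;

    assert(cur != NULL);
    if (cur->need_fpu_load) {
        memcpy(cpu->fpregs, cur->fpstate, FPREGS_BYTES);
        cur->need_fpu_load = false;
        cpu->restores++;
    }
}
```

A task that is switched in and out again without ever reaching toy_exit_to_user() costs no restore at all, which is the saving the deferred load is after.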

If kernel code does want to use the FPU, it has to call kernel_fpu_begin() and kernel_fpu_end():

kernel_fpu_begin()
  -> kernel_fpu_begin_mask()
---
    preempt_disable();
    ...
    if (!(current->flags & PF_KTHREAD) &&
        !test_thread_flag(TIF_NEED_FPU_LOAD)) {
        set_thread_flag(TIF_NEED_FPU_LOAD);
        save_fpregs_to_fpstate(&current->thread.fpu);
    }
    __cpu_invalidate_fpregs_state();
    ...
---

kernel_fpu_end()
---
    ...
    preempt_enable();
---

All this does is save the current task's FPU state and set TIF_NEED_FPU_LOAD; afterwards, before the task returns to user mode, the state is loaded again.
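
The effect of the flag check in kernel_fpu_begin_mask() is that the user state is saved at most once, no matter how many kernel FPU sections run before the task goes back to user mode. A toy user-space model of that behaviour is sketched below. All toy_* names are invented, a single int stands in for the whole register file, and the per-CPU "FPU in use by the kernel" tracking that the real kernel also does is left out.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_ctx {
    int  preempt_count;    /* > 0 means preemption is disabled */
    bool kthread;          /* stands in for PF_KTHREAD */
    bool need_fpu_load;    /* stands in for TIF_NEED_FPU_LOAD */
    int  hw_reg;           /* the "hardware" FPU register */
    int  saved;            /* in-memory fpstate of the current task */
    unsigned int saves;    /* number of saves actually performed */
};

static void toy_kernel_fpu_begin(struct toy_ctx *c)
{
    c->preempt_count++;    /* no context switch while the kernel owns the FPU */

    /* Save the user state only if the registers still hold it. */
    if (!c->kthread && !c->need_fpu_load) {
        c->need_fpu_load = true;
        c->saved = c->hw_reg;
        c->saves++;
    }
}

static void toy_kernel_fpu_end(struct toy_ctx *c)
{
    assert(c->preempt_count > 0);
    c->preempt_count--;    /* registers are left clobbered on purpose */
}

/* Return to user mode: reload the user state if anything clobbered it. */
static void toy_exit_to_user(struct toy_ctx *c)
{
    if (c->need_fpu_load) {
        c->hw_reg = c->saved;
        c->need_fpu_load = false;
    }
}
```

Note that toy_kernel_fpu_end() restores nothing: the reload is left to the return-to-user path, exactly as in the context-switch case.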

How does KVM handle the FPU? First we need to know who uses the FPU while KVM is running, namely the guest, KVM and QEMU:

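  • guest -> kvm: KVM runs in host kernel mode and does not use the FPU directly, so nothing needs to be saved on VM-exit;
  • guest -> kvm -> qemu: QEMU runs in user mode and does use the FPU, so an fpstate save/restore is needed on this path.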

fpu_swap_kvm_fpstate()
---
    if (enter_guest) {
        fpu->__task_fpstate = cur_fps;
        fpu->fpstate = guest_fps;
        guest_fps->in_use = true;
    } else {
        guest_fps->in_use = false;
        fpu->fpstate = fpu->__task_fpstate;
        fpu->__task_fpstate = NULL;
    }

    cur_fps = fpu->fpstate;

    if (!cur_fps->is_confidential) {
        /* Includes XFD update */
        restore_fpregs_from_fpstate(cur_fps, XFEATURE_MASK_FPSTATE);
    }
---

kvm_arch_vcpu_ioctl_run()
---
    vcpu_load(vcpu);
    ...
    kvm_load_guest_fpu(vcpu);
      -> fpu_swap_kvm_fpstate(&vcpu->arch.guest_fpu, true);
    ...
    vcpu_run()
    ...
    kvm_put_guest_fpu(vcpu);
      -> fpu_swap_kvm_fpstate(&vcpu->arch.guest_fpu, false);
    ...
---
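
The swap is a pointer exchange between two save areas plus one register reload. The sketch below models it in user space. The toy_* names are invented and one int stands in for the register file. The excerpt above omits the step that saves the live registers into the currently active area before the pointers are swapped; the toy adds it as an assumption so that state round-trips.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_fpstate {
    int  regs;      /* saved register contents */
    bool in_use;
};

struct toy_fpu {
    struct toy_fpstate *fpstate;       /* area currently backing the task */
    struct toy_fpstate *task_fpstate;  /* parked QEMU state while in guest mode */
};

static int toy_hw_reg;                 /* the "hardware" register */

static void toy_swap_kvm_fpstate(struct toy_fpu *fpu,
                                 struct toy_fpstate *guest, bool enter_guest)
{
    /* Assumed step: save the live registers into the active area first. */
    fpu->fpstate->regs = toy_hw_reg;

    if (enter_guest) {
        fpu->task_fpstate = fpu->fpstate;   /* park the QEMU state */
        fpu->fpstate = guest;
        guest->in_use = true;
    } else {
        assert(fpu->task_fpstate != NULL);
        guest->in_use = false;
        fpu->fpstate = fpu->task_fpstate;   /* bring the QEMU state back */
        fpu->task_fpstate = NULL;
    }

    /* Load whichever area is now active. */
    toy_hw_reg = fpu->fpstate->regs;
}
```

Because fpu->fpstate always points at the area that matches the registers, the ordinary context-switch code needs no KVM-specific case: it saves to and loads from whatever fpu->fpstate currently is.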

In addition, the KVM kernel path contains many points where the task may be scheduled out, for example:

  • vcpu_block()
  • the preemption points in xfer_to_guest_mode_work()
  • kvm_faultin_pfn(), which can end up in get_user_pages()
  • and so on

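Every time the task is scheduled out, the FPU registers may end up holding another task's state, so the following two places check for it: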

kvm_fpu_get()
---
    ...
    if (test_thread_flag(TIF_NEED_FPU_LOAD))
        switch_fpu_return();
    ...
---

vcpu_enter_guest()
---
    ...
    if (test_thread_flag(TIF_NEED_FPU_LOAD))
        switch_fpu_return();
    ...
    exit_fastpath = static_call(kvm_x86_run)(vcpu);
    ...
---

Here:

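  • kvm_fpu_get() is typically used during instruction emulation, when FPU-related registers have to be accessed;
  • vcpu_enter_guest() performs the check right before entering the guest.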