[Convex Optimization] Fractional Programming
$$
\begin{aligned}
\mathcal{P}1:\min_{\boldsymbol{c},\boldsymbol{f},\boldsymbol{W},\boldsymbol{\nu}}\ & \alpha \\
\text{s.t. }\ & \beta\geq\Gamma, \\
& |\nu_{m}c_{m}|^{2}+W_{m,m}\leq P,\quad m\in\mathcal{M}, \\
& \boldsymbol{W}\succeq 0, \\
& \nu_{m}\in\{0,1\},\quad m\in\mathcal{M},
\end{aligned}
$$

where

$$
\alpha=\sum_{m=1}^{M}\nu_m^2-\frac{\boldsymbol{f}^\dagger\left(\sum_{m=1}^{M}c_m\nu_m\boldsymbol{h}_m\right)\left(\sum_{m=1}^{M}c_m\nu_m\boldsymbol{h}_m\right)^\dagger\boldsymbol{f}}{\boldsymbol{f}^\dagger\left(\sum_{m=1}^{M}|c_m|^2\nu_m^2\boldsymbol{h}_m\boldsymbol{h}_m^\dagger+\boldsymbol{H}\boldsymbol{W}\boldsymbol{H}^\dagger+\boldsymbol{A}\right)\boldsymbol{f}}
$$

$$
\beta=\sum_{m=1}^{M}\nu_{m}^{2}-\frac{\left|\sum_{m=1}^{M}c_{m}\nu_{m}g_{m}\right|^{2}}{\sum_{m=1}^{M}|c_{m}|^{2}\nu_{m}^{2}|g_{m}|^{2}+\boldsymbol{g}^{\dagger}\boldsymbol{W}\boldsymbol{g}+P_{\mathrm{J}}|g_{\mathrm{J}}|^{2}+\sigma_{\mathrm{e}}^{2}}
$$
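As a sanity check on the two expressions above, the following NumPy sketch evaluates $\alpha$ and $\beta$ for given variables. It assumes that $\boldsymbol{H}=[\boldsymbol{h}_1,\ldots,\boldsymbol{h}_M]$ stacks the $\boldsymbol{h}_m$ as columns and that $\boldsymbol{g}=[g_1,\ldots,g_M]^\top$; the function name `alpha_beta` and the argument `jam` (standing for $P_{\mathrm{J}}|g_{\mathrm{J}}|^2$) are illustrative and not from the original post.

```python
import numpy as np

def alpha_beta(c, nu, f, W, H, A, g, jam, sigma_e2):
    """Evaluate alpha and beta as written above.

    Assumptions (not from the post): H has the h_m as columns, g stacks the g_m,
    and `jam` is the scalar P_J * |g_J|^2.
    """
    w = np.abs(c) ** 2 * nu ** 2                # |c_m|^2 nu_m^2
    s = H @ (c * nu)                            # sum_m c_m nu_m h_m
    num_a = abs(np.vdot(f, s)) ** 2             # f^H s s^H f
    R = (H * w) @ H.conj().T + H @ W @ H.conj().T + A
    den_a = np.real(np.vdot(f, R @ f))          # f^H R f
    alpha = np.sum(nu ** 2) - num_a / den_a

    num_b = abs(np.sum(c * nu * g)) ** 2        # |sum_m c_m nu_m g_m|^2
    den_b = np.sum(w * np.abs(g) ** 2) + np.real(np.vdot(g, W @ g)) + jam + sigma_e2
    beta = np.sum(nu ** 2) - num_b / den_b
    return float(alpha), float(beta)

# Scalar example that can be checked by hand: alpha = beta = 1 - 1/2
one = np.array([1.0 + 0j])
a_val, b_val = alpha_beta(one, np.array([1.0]), one, np.zeros((1, 1)),
                          np.ones((1, 1)), np.eye(1), one, 0.0, 1.0)
```

Note that $\alpha$ is invariant to the scaling of $\boldsymbol{f}$, since $\boldsymbol{f}$ appears quadratically in both the numerator and the denominator.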
Define an auxiliary variable $\tilde{\boldsymbol{\nu}}$ to handle the discrete integer variable $\boldsymbol{\nu}$:
$$
\begin{aligned}
\mathcal{P}2:\min_{\boldsymbol{f},\boldsymbol{c},\boldsymbol{W},\tilde{\boldsymbol{\nu}},\boldsymbol{\nu}}\ & \alpha \\
\text{s.t. }\ & \beta\geq\Gamma, \\
& \tilde{\nu}_m=\nu_m,\quad m\in\mathcal{M}, \\
& \nu_m\left(1-\tilde{\nu}_m\right)=0,\quad m\in\mathcal{M}, \\
& c_m\left(1-\nu_m\right)=0,\quad m\in\mathcal{M}, \\
& \boldsymbol{W}\succeq 0, \\
& \nu_{m}\in\{0,1\},\quad m\in\mathcal{M},
\end{aligned}
$$
Next, penalty dual decomposition (PDD) is used to handle the optimization problem above. First, define the augmented Lagrangian function
$$
\begin{aligned}
\mathcal{A}=\alpha+\psi={}&\alpha+\boldsymbol{\kappa}^{\top}(\boldsymbol{\nu}-\tilde{\boldsymbol{\nu}})+\sum_{m=1}^{M}\varpi_m\left(\nu_m\left(1-\tilde{\nu}_m\right)\right)+\sum_{m=1}^{M}\eta_m\left(c_m\left(1-\nu_m\right)\right)+\frac{1}{2\varrho}\|\boldsymbol{\nu}-\tilde{\boldsymbol{\nu}}\|^2 \\
&+\frac{1}{2\varrho}\sum_{m=1}^{M}\left[\left(\nu_m\left(1-\tilde{\nu}_m\right)\right)^2+\left|c_m\left(1-\nu_m\right)\right|^2\right]
\end{aligned}
$$
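The penalty part $\psi$ of the augmented Lagrangian can be evaluated directly from the three constraint residuals. The sketch below is illustrative only: the function name `penalty_term` is hypothetical, and taking the real part of $\sum_m \eta_m c_m(1-\nu_m)$ for complex $c_m$ is an assumption made here so that $\psi$ is real-valued.

```python
import numpy as np

def penalty_term(nu, nu_t, c, kappa, varpi, eta, rho):
    """psi = dual (linear) terms + quadratic penalties scaled by 1/(2*rho)."""
    r1 = nu - nu_t               # residual of  nu_t = nu
    r2 = nu * (1.0 - nu_t)       # residual of  nu_m (1 - nu_t_m) = 0
    r3 = c * (1.0 - nu)          # residual of  c_m (1 - nu_m) = 0
    # Assumption: the real part is taken because c may be complex.
    linear = kappa @ r1 + varpi @ r2 + np.real(eta @ r3)
    quad = (np.sum(r1 ** 2) + np.sum(r2 ** 2) + np.sum(np.abs(r3) ** 2)) / (2.0 * rho)
    return float(linear + quad)

# Feasible point: all residuals vanish, so psi is zero regardless of the duals.
nu_ok = np.array([1.0, 0.0, 1.0])
c_ok = np.array([1 + 1j, 0, 2], dtype=complex)
duals = np.array([0.3, -0.2, 0.7])
psi_feasible = penalty_term(nu_ok, nu_ok, c_ok, duals, duals, duals, rho=0.5)

# Infeasible point with zero duals: only the quadratic penalty remains.
zeros = np.zeros(2)
psi_violated = penalty_term(np.array([1.0, 0.0]), zeros,
                            np.array([0, 1], dtype=complex),
                            zeros, zeros, zeros, rho=0.5)
```

At any feasible point $\psi=0$, so $\mathcal{A}$ coincides with $\alpha$ there; a smaller $\varrho$ penalizes violations more heavily.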
where $\boldsymbol{\kappa}$, $\boldsymbol{\varpi}$, and $\boldsymbol{\eta}$ are the dual variables and $\varrho$ is the penalty coefficient. We therefore solve the following problem:
$$
\begin{aligned}
\mathcal{P}3:\min_{\boldsymbol{f},\boldsymbol{c},\boldsymbol{W},\tilde{\boldsymbol{\nu}},\boldsymbol{\nu},\boldsymbol{\kappa},\varpi,\varrho,\eta}\ & \mathcal{A}, \\
\text{s.t. }\ & |\nu_{m}c_{m}|^{2}+W_{m,m}\leq P,\quad m\in\mathcal{M}, \\
& \boldsymbol{W}\succeq 0, \\
& \beta\geq\Gamma
\end{aligned}
$$
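In penalty dual decomposition, the outer loop updates the dual variables $\boldsymbol{\kappa}$, $\boldsymbol{\varpi}$, $\boldsymbol{\eta}$ and the penalty coefficient $\varrho$, while the inner loop updates the primal variables $\boldsymbol{f}$, $\boldsymbol{c}$, $\boldsymbol{W}$, $\tilde{\boldsymbol{\nu}}$, $\boldsymbol{\nu}$ one block at a time, as described next.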
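To make the two-loop structure concrete, here is a minimal sketch of PDD on a toy problem, $\min (x-1)^2+(y-2)^2$ subject to $x-y=0$, which is not the problem in this post. The tolerance schedule, the shrink factor, and the function name `pdd_toy` are illustrative assumptions; the closed-form block updates come from setting the derivative of the toy augmented Lagrangian to zero.

```python
def pdd_toy(outer_iters=60, inner_sweeps=50):
    """Toy PDD: min (x-1)^2 + (y-2)^2  s.t.  x - y = 0 (solution x = y = 1.5)."""
    x = y = 0.0
    lam, rho = 0.0, 1.0        # dual variable and penalty coefficient
    tol, shrink = 1.0, 0.5     # constraint tolerance and penalty shrink factor
    for _ in range(outer_iters):
        a = 1.0 / rho
        # Inner loop: block coordinate descent on the augmented Lagrangian
        # (x-1)^2 + (y-2)^2 + lam*(x-y) + (x-y)^2 / (2*rho)
        for _ in range(inner_sweeps):
            x = (2.0 - lam + a * y) / (2.0 + a)   # exact minimizer over x
            y = (4.0 + lam + a * x) / (2.0 + a)   # exact minimizer over y
        h = x - y                                  # constraint violation
        # Outer loop: dual update if the violation is small enough,
        # otherwise tighten the penalty by shrinking rho.
        if abs(h) <= tol:
            lam += h / rho
        else:
            rho *= shrink
        tol *= 0.8
    return x, y, lam, rho

x_star, y_star, lam_star, _ = pdd_toy()
```

In the actual problem the inner loop cycles over the blocks $\boldsymbol{f}$, $\boldsymbol{c}$, $\boldsymbol{W}$, $\tilde{\boldsymbol{\nu}}$, $\boldsymbol{\nu}$ instead of the two scalars used here.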
Given $\boldsymbol{c}, \boldsymbol{W}, \tilde{\boldsymbol{\nu}}, \boldsymbol{\nu}$, finding the optimal $\boldsymbol{f}$ is equivalent to solving the following problem:

$$
\mathcal{P}3.1:\max_{\boldsymbol{f}}\ \frac{\boldsymbol{f}^{\dagger}\left(\sum_{m=1}^{M}c_m\boldsymbol{h}_m\right)\left(\sum_{m=1}^{M}c_m\boldsymbol{h}_m\right)^{\dagger}\boldsymbol{f}}{\boldsymbol{f}^{\dagger}\left(\sum_{m=1}^{M}\left|c_m\right|^2\boldsymbol{h}_m\boldsymbol{h}_m^{\dagger}+\boldsymbol{F}\right)\boldsymbol{f}},
$$
where $\boldsymbol{F}=\boldsymbol{H}\boldsymbol{W}\boldsymbol{H}^{\dagger}+\boldsymbol{A}$. The objective is a generalized Rayleigh quotient, so the optimal $\boldsymbol{f}$ (unique up to scaling) is

$$
\boldsymbol{f}=\left(\sum_{m=1}^{M}\left|c_m\right|^2\boldsymbol{h}_m\boldsymbol{h}_m^{\dagger}+\boldsymbol{F}\right)^{-1}\sum_{m=1}^{M}c_m\boldsymbol{h}_m
$$
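The closed form can be checked numerically. The sketch below uses random data and replaces $\boldsymbol{F}$ with the identity as a stand-in for $\boldsymbol{H}\boldsymbol{W}\boldsymbol{H}^{\dagger}+\boldsymbol{A}$, which is assumed to be Hermitian positive definite; the dimensions and the helper name `rayleigh` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 4, 3
H = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))  # columns h_m
c = rng.standard_normal(M) + 1j * rng.standard_normal(M)
F = np.eye(N)  # stand-in for H W H^H + A (assumed Hermitian positive definite)

def rayleigh(f, s, R):
    """Generalized Rayleigh quotient |f^H s|^2 / (f^H R f)."""
    return abs(np.vdot(f, s)) ** 2 / np.real(np.vdot(f, R @ f))

s = H @ c                                    # sum_m c_m h_m
R = (H * np.abs(c) ** 2) @ H.conj().T + F    # sum_m |c_m|^2 h_m h_m^H + F
f_opt = np.linalg.solve(R, s)                # f = R^{-1} s, without forming the inverse
best = rayleigh(f_opt, s, R)                 # equals s^H R^{-1} s
```

Using `np.linalg.solve` rather than an explicit matrix inverse is a numerical design choice; any nonzero scalar multiple of `f_opt` attains the same quotient.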
Given $\{\boldsymbol{f}, \boldsymbol{W}, \tilde{\boldsymbol{\nu}}, \boldsymbol{\nu}\}$, updating $\boldsymbol{c}$ reduces to a fractional program. Applying the quadratic transform and dropping the terms that do not depend on $\boldsymbol{c}$, $\mathcal{P}3$ can be recast as the following optimization problem, where $t$ is an auxiliary variable.
$$
\begin{array}{rl}
\mathcal{P}3.2:\max_{\boldsymbol{c},t} & 2\Re\left\{t^*\boldsymbol{f}^{\dagger}\boldsymbol{H}\boldsymbol{c}\right\}-|t|^2\left(\boldsymbol{c}^{\dagger}\boldsymbol{D}\boldsymbol{c}+\zeta\right)-\phi \\
\text{s.t.} & \boldsymbol{c}^{\dagger}\boldsymbol{g}\boldsymbol{g}^{\dagger}\boldsymbol{c}-\left(\mathbf{1}^{\top}\boldsymbol{\nu}-\Gamma\right)\left(\boldsymbol{c}^{\dagger}\boldsymbol{B}\boldsymbol{c}+\varsigma\right)\leq 0 \\
& |\nu_{m}c_{m}|^{2}+W_{m,m}\leq P,\quad m\in\mathcal{M},
\end{array}
$$
where

$$
\begin{aligned}
& \phi\triangleq(\boldsymbol{\eta}\odot(\mathbf{1}-\boldsymbol{\nu}))^{\top}\boldsymbol{c}+\frac{1}{2\varrho}\boldsymbol{c}^{\dagger}\operatorname{diag}((\mathbf{1}-\boldsymbol{\nu})\odot(\mathbf{1}-\boldsymbol{\nu}))\boldsymbol{c} \\
& \zeta\triangleq\boldsymbol{f}^{\dagger}\left(\boldsymbol{H}\boldsymbol{W}\boldsymbol{H}^{\dagger}+\boldsymbol{A}\right)\boldsymbol{f} \\
& \varsigma\triangleq\boldsymbol{g}^{\dagger}\boldsymbol{W}\boldsymbol{g}+P_{\mathrm{J}}\left|g_{\mathrm{J}}\right|^2+\sigma_{\mathrm{e}}^2 \\
& \boldsymbol{B}\triangleq\operatorname{diag}\left(\left[\left|g_1\right|^2,\ldots,\left|g_M\right|^2\right]^{\top}\right) \\
& \boldsymbol{D}\triangleq\operatorname{diag}\left(\left[\boldsymbol{f}^{\dagger}\boldsymbol{h}_1\boldsymbol{h}_1^{\dagger}\boldsymbol{f},\ldots,\boldsymbol{f}^{\dagger}\boldsymbol{h}_M\boldsymbol{h}_M^{\dagger}\boldsymbol{f}\right]^{\top}\right),
\end{aligned}
$$
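The role of the auxiliary variable $t$ can be seen by fixing $\boldsymbol{c}$: the objective is then a concave quadratic in $t$, maximized at $t=\boldsymbol{f}^{\dagger}\boldsymbol{H}\boldsymbol{c}/(\boldsymbol{c}^{\dagger}\boldsymbol{D}\boldsymbol{c}+\zeta)$, and substituting this value recovers the original ratio $|\boldsymbol{f}^{\dagger}\boldsymbol{H}\boldsymbol{c}|^2/(\boldsymbol{c}^{\dagger}\boldsymbol{D}\boldsymbol{c}+\zeta)$. The sketch below verifies this on random data; it omits the penalty term $\phi$ (which does not depend on $t$), uses an arbitrary positive `zeta` as a stand-in, and the helper names are hypothetical.

```python
import numpy as np

def qt_objective(t, a, b):
    """Quadratic-transform surrogate 2*Re{conj(t)*a} - |t|^2 * b."""
    return 2.0 * np.real(np.conj(t) * a) - abs(t) ** 2 * b

def qt_optimal_t(a, b):
    """Maximizer of the surrogate over t for fixed a and b > 0."""
    return a / b

rng = np.random.default_rng(1)
N, M = 4, 3
H = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))  # columns h_m
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)
c = rng.standard_normal(M) + 1j * rng.standard_normal(M)
zeta = 1.0  # hypothetical stand-in for f^H (H W H^H + A) f, assumed positive

a = np.vdot(f, H @ c)                       # f^H H c
d = np.abs(H.conj().T @ f) ** 2             # diagonal of D: |h_m^H f|^2
b = float(np.sum(d * np.abs(c) ** 2) + zeta)  # c^H D c + zeta

t_opt = qt_optimal_t(a, b)
ratio = abs(a) ** 2 / b                     # original fractional objective
surrogate = qt_objective(t_opt, a, b)       # matches ratio at the optimal t
```

Alternating between this closed-form update of $t$ and a maximization over $\boldsymbol{c}$ with $t$ fixed is the usual way the quadratic transform is applied.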