Muvera模型理论保证的证明
👉原文章
👉模型原理总结
👉完整的证明过程,包括了所有证明的细节
文章目录
- 1. \textbf{1. } 1. 定理 2.1 \textbf{2.1} 2.1证明的思路
- 1.0. \textbf{1.0. } 1.0. 定理 2.1 \textbf{2.1} 2.1的主要内容
- 1.1. \textbf{1.1. } 1.1. 无投影无重复时 FDE \textbf{FDE} FDE相似度的上界
- 1.2. \textbf{1.2. } 1.2. 加上一定维度的投影后原上界近似成立
- 1.3. \textbf{1.3. } 1.3. 已给出无重复 FDE \textbf{FDE} FDE上界👉还需证明什么才能给出无重复 FDE \textbf{FDE} FDE的上下界
- 1.4. \textbf{1.4. } 1.4. 无投影无重复时 FDE \textbf{FDE} FDE相似度的子下界
- 1.5. \textbf{1.5. } 1.5. 加上一定维度的投影后原子下界近似成立
- 2. \textbf{2. } 2. 定理 2.2 \textbf{2.2} 2.2证明的思路
1. \textbf{1. } 1. 定理 2.1 \textbf{2.1} 2.1证明的思路
1.0. \textbf{1.0. } 1.0. 定理 2.1 \textbf{2.1} 2.1的主要内容
👉前提 1 1 1:设定 ∀ ε , δ > 0 \forall{}\varepsilon,\delta\text{>}0 ∀ε,δ>0(其中 ε ≤ 1 2 \varepsilon{}\text{≤}\cfrac{1}{2} ε≤21且 δ ≤ ε \delta{≤}\varepsilon δ≤ε),给定单位向量集 P , Q ⊆ R d P,Q\text{⊆}\mathbb{R}^d P,Q⊆Rd并满足 m = ∣ Q ∣ + ∣ P ∣ m\text{=}|Q|\text{+}|P| m=∣Q∣+∣P∣
👉前提 2 2 2:选择参数 k sim = O ( 1 ε log ( m δ ) ) k_{\text{sim}}\text{=}O\left(\cfrac{1}{\varepsilon}\log{\left(\cfrac{m}{\delta}\right)}\right) ksim=O(ε1log(δm)), d proj = O ( 1 ε 2 log ( m ε δ ) ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^2}\log{\left(\cfrac{m}{\varepsilon\delta}\right)}\right) dproj=O(ε21log(εδm)), R reps = 1 R_{\text{reps}}\text{=}1 Rreps=1
👉结论: Chamfer ( Q , P ) – ε ≤FDE ( Q , P ) ≤Chamfer ( Q , P ) + ε \text{Chamfer}(Q,P)\text{–}\varepsilon\text{≤}\text{FDE}(Q,P)\text{≤}\text{Chamfer}(Q,P)\text{+}\varepsilon Chamfer(Q,P)–ε≤FDE(Q,P)≤Chamfer(Q,P)+ε以 Pr≥ 1 – δ \text{Pr}\text{≥}1\text{–}\delta Pr≥1–δ概率成立,即给出了 FDE \textbf{FDE} FDE相似度的上下界
1.1. \textbf{1.1. } 1.1. 无投影无重复时 FDE \textbf{FDE} FDE相似度的上界
➡️ Chamfer \text{Chamfer} Chamfer后期交互的本质:一套针对 q i q_i qi向量的映射策略
![]()
- f ( q i ) f(q_i) f(qi)操作为将 q i q_i qi映射到某一个 p q i ∈ P = { p 1 , p 2 , . . . , p m } p_{q_i}\text{∈}P\text{=}\{p_1,p_2,...,p_m\} pqi∈P={p1,p2,...,pm}向量
- 在 Chamfer \text{Chamfer} Chamfer后期交互中 f ( q i ) = p q i = arg p ∈ P MaxSim ( q i , P ) f(q_i)\text{=}p_{q_i}\text{=}\arg\limits_{p\text{∈}P}\text{MaxSim}(q_i,P) f(qi)=pqi=p∈PargMaxSim(qi,P),即遍历 p q i ∈ P = { p 1 , p 2 , . . . , p m } p_{q_i}\text{∈}P\text{=}\{p_1,p_2,...,p_m\} pqi∈P={p1,p2,...,pm},最终选取使内积 ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}}\rangle ⟨qi,pqi⟩最大的 p q i p_{q_i} pqi
- 最终合并所有的内积 ∑ i =1 n ⟨ q i , p q i ⟩ \displaystyle{\sum_{i\text{=1}}^n}\langle{q_i,p_{q_i}}\rangle i=1∑n⟨qi,pqi⟩是为 Chamfer \text{Chamfer} Chamfer相似度,即 ∑ i =1 n ⟨ q i , p q i ⟩ =Chamfer ( Q , P ) = ⟨ Q , P ′ ⟩ \displaystyle{\sum_{i\text{=1}}^n}\langle{q_i,p_{q_i}}\rangle\text{=}\text{Chamfer}(Q,P)\text{=}\langle{Q,P^\prime}\rangle i=1∑n⟨qi,pqi⟩=Chamfer(Q,P)=⟨Q,P′⟩
➡️ Muvera \text{Muvera} Muvera分桶的本质:另一套针对 q i q_i qi向量的映射策略
![]()
- 对于单个桶 Bucket-k \text{Bucket-k} Bucket-k中有 ⟨ q ⃗ ( k ) , p ⃗ ( k ) ⟩ = ⟨ ∑ k i q k i , p ⃗ ( k ) ⟩ = ∑ k i ⟨ q k i , p ⃗ ( k ) ⟩ \langle{\vec{q}_{(k)},\vec{p}_{(k)}}\rangle\text{=}\left\langle{\displaystyle\sum_{k_i}{}q_{k_i},\vec{p}_{(k)}}\right\rangle\text{=}\displaystyle\sum_{k_i}{}\left\langle{q_{k_i},\vec{p}_{(k)}}\right\rangle ⟨q(k),p(k)⟩=⟨ki∑qki,p(k)⟩=ki∑⟨qki,p(k)⟩,相当于为每个落入该桶 k k k的 q k i q_{k_i} qki映射了一个 p ⃗ ( k ) \vec{p}_{(k)} p(k)
- 合并 B B B个桶的结果有 ∑ k = 1 B ⟨ q ⃗ ( k ) , p ⃗ ( k ) ⟩ = ∑ q i ∈ Q ⟨ q i , p ⃗ ( k q i ) ⟩ \displaystyle\sum_{k\text{=}1}^{B}\langle{\vec{q}_{(k)},\vec{p}_{(k)}}\rangle\text{=}\displaystyle\sum_{q_i\text{∈}Q}{}\left\langle{q_{i},\vec{p}_{(k_{q_i})}}\right\rangle k=1∑B⟨q(k),p(k)⟩=qi∈Q∑⟨qi,p(kqi)⟩,相当于为每个 q i ∈ Q q_i\text{∈}Q qi∈Q都映射一个 p ⃗ ( k q i ) = p q i ∗ ∈ { p ⃗ ( 1 ) , . . . , p ⃗ ( B ) } \vec{p}_{(k_{q_i})}\text{=}p_{q_i}^*\text{∈}\{\vec{p}_{(1)},...,\vec{p}_{(B)}\} p(kqi)=pqi∗∈{p(1),...,p(B)}
➡️对两种映射方案的对比:有 ⟨ q i , p q i ∗ ⟩ ≤ ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}^*}\rangle\text{≤}\langle{q_i,p_{q_i}}\rangle ⟨qi,pqi∗⟩≤⟨qi,pqi⟩
![]()
- p q i p_{q_i} pqi的得到有且只有一种情况,就是 p q i = arg max p ∈ P ⟨ q i , p ⟩ = arg p ∈ P MaxSim ( q i , P ) \displaystyle{}p_{q_i}\text{=}\arg\max_{p\text{∈}P}\langle{q_i,p}\rangle\text{=}\arg\limits_{p\text{∈}P}\text{MaxSim}(q_i,P) pqi=argp∈Pmax⟨qi,p⟩=p∈PargMaxSim(qi,P)
- p q i ∗ p_{q_i}^* pqi∗的得到分为三种情况,即 Case-0/Case-n \text{Case-0/Case-n} Case-0/Case-n(其中视 Case-0 \text{Case-0} Case-0为 Case-n \text{Case-n} Case-n种 n=1 \text{n=1} n=1的情形)
- Case-0 \text{Case-0} Case-0时,假设 q i q_i qi所落入的是桶 k k k,则选取离桶 k k k编码最近的一点 p ^ k \hat{p}_{k} p^k,即 p q i ∗ = p ⃗ ( k ) = p ^ k p_{q_i}^*\text{=}\displaystyle{}\vec{p}_{(k)}\text{=}\hat{p}_{k} pqi∗=p(k)=p^k
- Case-n \text{Case-n} Case-n时,假设 q i q_i qi所落入的是桶 k k k,则 p q i ∗ = p ⃗ ( k ) = ∑ φ ( p j ) = k 1 ∣ P ∩ φ – 1 ( k ) ∣ p j p_{q_i}^*\text{=}\displaystyle{}\vec{p}_{(k)}\text{=}\displaystyle{}\sum_{\varphi(p_j)\text{=}k}\cfrac{1}{\left|P\text{∩}\varphi^{–1}(k)\right|}p_j pqi∗=p(k)=φ(pj)=k∑∣P∩φ–1(k)∣1pj,相当于桶 k k k种所有 p p p向量的质心
- 定义集合 Pdt= { ⟨ q i , p 1 ⟩ , ⟨ q i , p 2 ⟩ , . . . , ⟨ q i , p m ⟩ } \text{Pdt=}\{\langle{q_i,p_{1}}\rangle,\langle{q_i,p_{2}}\rangle,...,\langle{q_i,p_{m}}\rangle\} Pdt={⟨qi,p1⟩,⟨qi,p2⟩,...,⟨qi,pm⟩},根据 p q i p_{q_i} pqi的定义可知 ⟨ q i , p q 1 ⟩ \langle{q_i,p_{q_1}}\rangle ⟨qi,pq1⟩是 Pdt \text{Pdt} Pdt集合的最大值
- Case-0 \text{Case-0} Case-0时,显然有 ⟨ q i , p q i ∗ ⟩ = ⟨ q i , p ^ k ⟩ ∈Pdt \langle{q_i,p_{q_i}^*}\rangle\text{=}\langle{q_i,\hat{p}_{k}}\rangle\text{∈Pdt} ⟨qi,pqi∗⟩=⟨qi,p^k⟩∈Pdt,因此 ⟨ q i , p q i ∗ ⟩ ≤ ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}^*}\rangle\text{≤}\langle{q_i,p_{q_i}}\rangle ⟨qi,pqi∗⟩≤⟨qi,pqi⟩即小于集合最大值
- Case-n \text{Case-n} Case-n时,对每个 p j p_j pj都有 ⟨ q i , p q i ∗ ⟩ = ⟨ q i , ∑ φ ( p j ) = k 1 ∣ P ∩ φ – 1 ( k ) ∣ p j ⟩ = ∑ φ ( p j ) = k 1 ∣ P ∩ φ – 1 ( k ) ∣ ⟨ q i , p j ⟩ \langle{q_i,p_{q_i}^*}\rangle\text{=}\left\langle{q_i},\displaystyle{}\sum_{\varphi(p_j)\text{=}k}\cfrac{1}{\left|P\text{∩}\varphi^{–1}(k)\right|}p_j\right\rangle\text{=}\displaystyle{}\sum_{\varphi(p_j)\text{=}k}\cfrac{1}{\left|P\text{∩}\varphi^{–1}(k)\right|}\langle{q_i,p_j}\rangle ⟨qi,pqi∗⟩=⟨qi,φ(pj)=k∑∣P∩φ–1(k)∣1pj⟩=φ(pj)=k∑∣P∩φ–1(k)∣1⟨qi,pj⟩
- 集合 Pdt \text{Pdt} Pdt中 ∑ φ ( p j ) = k 1 ∣ P ∩ φ – 1 ( k ) ∣ ⟨ q i , p j ⟩ \displaystyle{}\sum_{\varphi(p_j)\text{=}k}\cfrac{1}{\left|P\text{∩}\varphi^{–1}(k)\right|}\langle{q_i,p_j}\rangle φ(pj)=k∑∣P∩φ–1(k)∣1⟨qi,pj⟩是子集均值, ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}}\rangle ⟨qi,pqi⟩是全集最大值,后者必然更大
- 所以 ⟨ q i , p q i ∗ ⟩ = ∑ φ ( p j ) = k 1 ∣ P ∩ φ – 1 ( k ) ∣ ⟨ q i , p j ⟩ ≤ ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}^*}\rangle\text{=}\displaystyle{}\sum_{\varphi(p_j)\text{=}k}\cfrac{1}{\left|P\text{∩}\varphi^{–1}(k)\right|}\langle{q_i,p_j}\rangle\text{≤}\langle{q_i,p_{q_i}}\rangle ⟨qi,pqi∗⟩=φ(pj)=k∑∣P∩φ–1(k)∣1⟨qi,pj⟩≤⟨qi,pqi⟩
➡️两种相似评分的对比
![]()
- Chamfer \text{Chamfer} Chamfer相似度:根据定义为 Chamfer ( Q , P ) = ∑ i =1 n ⟨ q i , p q i ⟩ = ⟨ Q , P ⟩ \text{Chamfer}(Q,P)\text{=}\displaystyle{\sum_{i\text{=1}}^n}\langle{q_i,p_{q_i}}\rangle\text{=}\langle{Q,P}\rangle Chamfer(Q,P)=i=1∑n⟨qi,pqi⟩=⟨Q,P⟩
- FDE \text{FDE} FDE相似度:根据定义没经过投影和重复的 FDE \text{FDE} FDE相似度可写作 FDE ( Q , P ) = ∑ k = 1 B ⟨ q ⃗ ( k ) , p ⃗ ( k ) ⟩ = ∑ i =1 n ⟨ q i , p q i ∗ ⟩ = ⟨ Q , P ′ ⟩ \text{FDE}(Q,P)\text{=}\displaystyle\sum_{k\text{=}1}^{B}\langle{\vec{q}_{(k)},\vec{p}_{(k)}}\rangle\text{=}\sum_{i\text{=1}}^n\langle{q_i,p_{q_i}^*}\rangle\text{=}\langle{Q,P^\prime}\rangle FDE(Q,P)=k=1∑B⟨q(k),p(k)⟩=i=1∑n⟨qi,pqi∗⟩=⟨Q,P′⟩
- 已证了 ⟨ q i , p q i ∗ ⟩ ≤ ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}^*}\rangle\text{≤}\langle{q_i,p_{q_i}}\rangle ⟨qi,pqi∗⟩≤⟨qi,pqi⟩故 FDE ( Q , P ) ≤Chamfer ( Q , P ) \text{FDE}(Q,P)\text{≤}\text{Chamfer}(Q,P) FDE(Q,P)≤Chamfer(Q,P)
1.2. \textbf{1.2. } 1.2. 加上一定维度的投影后原上界近似成立
➡️ Muvera \text{Muvera} Muvera分桶 + \text{+} +投影的过程
![]()
- 同样首先相当于为每个 q i ∈ Q q_i\text{∈}Q qi∈Q都映射一个 p ⃗ ( k q i ) = p q i ∗ ∈ { p ⃗ ( 1 ) , . . . , p ⃗ ( B ) } \vec{p}_{(k_{q_i})}\text{=}p_{q_i}^*\text{∈}\{\vec{p}_{(1)},...,\vec{p}_{(B)}\} p(kqi)=pqi∗∈{p(1),...,p(B)},但是后续参与交互(内积)的从原始向量变为了投影向量
- 所以原有的 FDE ( Q , P ) \text{FDE}(Q,P) FDE(Q,P)为直接将内积 ⟨ q i , p q i ∗ ⟩ \left\langle{q_i,p^*_{q_i}}\right\rangle ⟨qi,pqi∗⟩累加,现在变成将各自投影后的内积 ⟨ ψ ( q i ) , ψ ( p q i ∗ ) ⟩ \left\langle{\psi(q_i),\psi\left(p^*_{q_i}\right)}\right\rangle ⟨ψ(qi),ψ(pqi∗)⟩累加
- FDE \text{FDE} FDE相似度(无投影):可写作 FDE orgn ( Q , P ) = ∑ k = 1 B ⟨ q ⃗ ( k ) , p ⃗ ( k ) ⟩ = ∑ i =1 n ⟨ q i , p q i ∗ ⟩ \text{FDE}_{\text{orgn}}(Q,P)\text{=}\displaystyle\sum_{k\text{=}1}^{B}\langle{\vec{q}_{(k)},\vec{p}_{(k)}}\rangle\text{=}\sum_{i\text{=1}}^n\langle{q_i,p_{q_i}^*}\rangle FDEorgn(Q,P)=k=1∑B⟨q(k),p(k)⟩=i=1∑n⟨qi,pqi∗⟩,并且已证明 FDE orgn ( Q , P ) ≤Chamfer ( Q , P ) \text{FDE}_{\text{orgn}}(Q,P)\text{≤}\text{Chamfer}(Q,P) FDEorgn(Q,P)≤Chamfer(Q,P)
- FDE \text{FDE} FDE相似度(无投影):可写作 FDE proj ( Q , P ) = ∑ k = 1 B ⟨ ψ ( q ⃗ ( k ) ) , ψ ( p ⃗ ( k ) ) ⟩ = ∑ i =1 n ⟨ ψ ( q i ) , ψ ( p q i ∗ ) ⟩ \text{FDE}_{\text{proj}}(Q,P)\text{=}\displaystyle\sum_{k\text{=}1}^{B}\left\langle{\psi\left(\vec{q}_{(k)}\right),\psi\left(\vec{p}_{(k)}\right)}\right\rangle\text{=}\sum_{i\text{=1}}^n\left\langle{\psi(q_i),\psi\left(p^*_{q_i}\right)}\right\rangle FDEproj(Q,P)=k=1∑B⟨ψ(q(k)),ψ(p(k))⟩=i=1∑n⟨ψ(qi),ψ(pqi∗)⟩
➡️事实 A2 \text{A2} A2表明了:给定 d proj = O ( 1 ε 2 log m δ ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\cfrac{m}{\delta}\right) dproj=O(ε21logδm)则 ⟨ x , y ⟩ – ε ≤ ⟨ ψ ( x ) , ψ ( y ) ⟩ ≤ ⟨ x , y ⟩ + ε \langle{}x,y\rangle\text{–}\varepsilon\text{≤}\langle\psi(x),\psi(y)\rangle{}\text{≤}\langle{}x,y\rangle\text{+}\varepsilon ⟨x,y⟩–ε≤⟨ψ(x),ψ(y)⟩≤⟨x,y⟩+ε以 Pr≥ 1 – δ \text{Pr}\text{≥}1\text{–}\delta Pr≥1–δ概率成立
- 将 d proj = O ( 1 ε 2 log m δ ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\cfrac{m}{\delta}\right) dproj=O(ε21logδm)作为前提引入,代入结论有 ∑ i =1 n ⟨ q i , p q i ∗ ⟩ – ∣ Q ∣ ε ≤ ∑ i =1 n ⟨ ψ ( q i ) , ψ ( p q i ∗ ) ⟩ ≤ ∑ i =1 n ⟨ q i , p q i ∗ ⟩ + ∣ Q ∣ ε \displaystyle\sum_{i\text{=1}}^n\langle{q_i,p_{q_i}^*}\rangle\text{–}|Q|\varepsilon\text{≤}\sum_{i\text{=1}}^n\left\langle{\psi(q_i),\psi\left(p^*_{q_i}\right)}\right\rangle\text{≤}\sum_{i\text{=1}}^n\langle{q_i,p_{q_i}^*}\rangle\text{+}|Q|\varepsilon i=1∑n⟨qi,pqi∗⟩–∣Q∣ε≤i=1∑n⟨ψ(qi),ψ(pqi∗)⟩≤i=1∑n⟨qi,pqi∗⟩+∣Q∣ε
- 概率调整为,以 Pr≥ ( 1 – δ ) n ≥ 1 – ∣ Q ∣ δ \text{Pr}\text{≥}(1\text{–}\delta)^n\text{≥}1\text{–}|Q|\delta Pr≥(1–δ)n≥1–∣Q∣δ概率成立
- 也可写作 FDE orgn ( Q , P ) – ∣ Q ∣ ε ≤ FDE proj ( Q , P ) ≤ FDE orgn ( Q , P ) + ∣ Q ∣ ε \displaystyle\text{FDE}_{\text{orgn}}(Q,P)\text{–}|Q|\varepsilon\text{≤}\text{FDE}_{\text{proj}}(Q,P)\text{≤}\text{FDE}_{\text{orgn}}(Q,P)\text{+}|Q|\varepsilon FDEorgn(Q,P)–∣Q∣ε≤FDEproj(Q,P)≤FDEorgn(Q,P)+∣Q∣ε<\mark>
- 稍作调整,则有 FDE proj ( Q , P ) ≤ FDE orgn ( Q , P ) + ∣ Q ∣ ε ≤Chamfer ( Q , P ) + ∣ Q ∣ ε \text{FDE}_{\text{proj}}(Q,P)\text{≤}\text{FDE}_{\text{orgn}}(Q,P)\text{+}|Q|\varepsilon\text{≤}\text{Chamfer}(Q,P)\text{+}|Q|\varepsilon FDEproj(Q,P)≤FDEorgn(Q,P)+∣Q∣ε≤Chamfer(Q,P)+∣Q∣ε,以 Pr≥ 1 – ∣ Q ∣ δ \text{Pr}\text{≥}1\text{–}|Q|\delta Pr≥1–∣Q∣δ概率成立
- 再按常数 Q Q Q的比例调整 δ \delta δ(注意 ε \varepsilon ε为与后续统一故不调整),则有 FDE proj ( Q , P ) ≤Chamfer ( Q , P ) + ∣ Q ∣ ε \text{FDE}_{\text{proj}}(Q,P)\text{≤}\text{Chamfer}(Q,P)\text{+}|Q|\varepsilon FDEproj(Q,P)≤Chamfer(Q,P)+∣Q∣ε,以 Pr≥ 1 – δ \text{Pr}\text{≥}1\text{–}\delta Pr≥1–δ概率成立
1.3. \textbf{1.3. } 1.3. 已给出无重复 FDE \textbf{FDE} FDE上界👉还需证明什么才能给出无重复 FDE \textbf{FDE} FDE的上下界
➡️不妨假设需要证明的是 Chamfer ( Q , P ) – ∣ Q ∣ ε ≤FDE ( Q , P ) ≤Chamfer ( Q , P ) + ∣ Q ∣ ε \text{Chamfer}(Q,P)\text{–}|Q|\varepsilon\text{≤}\text{FDE}(Q,P)\text{≤}\text{Chamfer}(Q,P)\text{+}|Q|\varepsilon Chamfer(Q,P)–∣Q∣ε≤FDE(Q,P)≤Chamfer(Q,P)+∣Q∣ε以 Pr≥ 1 – δ \text{Pr}\text{≥}1\text{–}\delta Pr≥1–δ概率成立
- 令下界成立为事件 F L / F_L/ FL/上界成立为事件 F U / F_U/ FU/上下界都成立为事件 F F F,原问题变为要证明 Pr [ F ] ≥ 1 – δ \text{Pr}[F]\text{≥}1\text{–}\delta Pr[F]≥1–δ
- 不难得到当 ε ≤ 1 2 \varepsilon{}\text{≤}\cfrac{1}{2} ε≤21时,同时满足 Pr [ F L ] ≥ 1 – ε δ \text{Pr}[F_L]\text{≥}1\text{–}\varepsilon\delta Pr[FL]≥1–εδ和 Pr [ F U ] ≥ 1 – ε δ \text{Pr}[F_U]\text{≥}1\text{–}\varepsilon\delta Pr[FU]≥1–εδ,可推导出 Pr [ F ] ≥ 1 – δ \text{Pr}[F]\text{≥}1\text{–}\delta Pr[F]≥1–δ
- 由已证明的上界结论可知设定 d proj = O ( 1 ε 2 log ( m ε δ ) ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^2}\log{\left(\cfrac{m}{\varepsilon\delta}\right)}\right) dproj=O(ε21log(εδm))则有 Pr [ F U ] ≥ 1 – ε δ \text{Pr}[F_U]\text{≥}1\text{–}\varepsilon\delta Pr[FU]≥1–εδ
- 所以增加前提 d proj = O ( 1 ε 2 log ( m ε δ ) ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^2}\log{\left(\cfrac{m}{\varepsilon\delta}\right)}\right) dproj=O(ε21log(εδm))以及 ε ≤ 1 2 \varepsilon{}\text{≤}\cfrac{1}{2} ε≤21后,原问题变为要证明 Pr [ F L ] ≥ 1 – ε δ \text{Pr}\left[F_{L}\right]\text{≥}1\text{–}\varepsilon\delta Pr[FL]≥1–εδ
➡️对 Chamfer ( Q , P ) – ∣ Q ∣ ε ≤FDE ( Q , P ) \text{Chamfer}(Q,P)\text{–}|Q|\varepsilon\text{≤}\text{FDE}(Q,P) Chamfer(Q,P)–∣Q∣ε≤FDE(Q,P)的进一步变换
![]()
- 根据 Chamfer \text{Chamfer} Chamfer的定义有 Chamfer ( Q , P ) = ∑ q i ∈ Q MaxSim ( q i , P ) \text{Chamfer}(Q,P)\text{=}\displaystyle{}\sum_{q_i\text{∈}Q}\text{MaxSim}(q_i,P) Chamfer(Q,P)=qi∈Q∑MaxSim(qi,P)
- 又由于 FDE \text{FDE} FDE对 q i q_i qi的处理式独立线性的,所以 FDE ( Q , P ) = ∑ q i ∈ Q FDE ( q i , P ) \text{FDE}(Q,P)\text{=}\displaystyle{}\sum_{q_i\text{∈}Q}\text{FDE}(q_i,P) FDE(Q,P)=qi∈Q∑FDE(qi,P)
- 所以原问题变为需证 ∑ q i ∈ Q ( MaxSim ( q i , P ) – ε ) ≤ ∑ q i ∈ Q FDE ( q i , P ) \displaystyle{}\sum_{q_i\text{∈}Q}\left(\text{MaxSim}(q_i,P)\text{–}\varepsilon\right)\text{≤}\sum_{q_i\text{∈}Q}\text{FDE}(q_i,P) qi∈Q∑(MaxSim(qi,P)–ε)≤qi∈Q∑FDE(qi,P),以 Pr≥ 1 – ε δ \text{Pr}\text{≥}1\text{–}\varepsilon\delta Pr≥1–εδ概率成立
- 去掉求和号后,原问题变为需证 MaxSim ( q i , P ) – ε ≤FDE ( q i , P ) \text{MaxSim}(q_i,P)\text{–}\varepsilon\text{≤}\text{FDE}(q_i,P) MaxSim(qi,P)–ε≤FDE(qi,P),以 Pr≥ 1 – ε δ ∣ Q ∣ \text{Pr}\text{≥}1–\cfrac{\varepsilon\delta}{|Q|} Pr≥1–∣Q∣εδ概率成立
➡️后续称 FDE ( q i , P ) \text{FDE}(q_i,P) FDE(qi,P)的上下界为 FDE \text{FDE} FDE的子上下界
1.4. \textbf{1.4. } 1.4. 无投影无重复时 FDE \textbf{FDE} FDE相似度的子下界
➡️结合之前的分析,对于 FDE ( q i , P ) = ⟨ q i , p q i ∗ ⟩ \displaystyle\text{FDE}(q_i,P)\text{=}\left\langle{q_i,p_{q_i}^*}\right\rangle FDE(qi,P)=⟨qi,pqi∗⟩,在无投影时总存在一个子集 S ⊆ P S\text{⊆}P S⊆P,使得 p q i ∗ = 1 ∣ S ∣ ∑ p j ∈ S p j = S center p_{q_i}^*\text{=}\displaystyle{}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}p_j\text{=}S_{\text{center}} pqi∗=∣S∣1pj∈S∑pj=Scenter
![]()
- 当单个 q i q_i qi落入的桶属于 Case-0 \text{Case-0} Case-0情形时, S S S是单个点 p ^ k = arg min p j ∈ P ∥ φ ( q i ) – φ ( p j ) ∥ 0 \displaystyle{}\hat{p}_k\text{=}\arg{}\min_{p_j\text{∈}P}\|\varphi(q_i)–\varphi(p_j)\|_{0} p^k=argpj∈Pmin∥φ(qi)–φ(pj)∥0,也就是离桶编码 φ ( q i ) \varphi(q_i) φ(qi)海明距离最近的点
- 当单个 q i q_i qi落入的桶属于 Case-n \text{Case-n} Case-n情形时, S S S元素需满足 φ ( q i ) = φ ( p j ) \varphi(q_i)\text{=}\varphi(p_j) φ(qi)=φ(pj),也就是所有落入该桶的 q j q_j qj点
➡️对事实 A4 \text{A4} A4给定 k sim = O ( 1 ε log ( m δ ) ) k_{\text{sim}}\text{=}O\left(\cfrac{1}{\varepsilon}\log{\left(\cfrac{m}{\delta}\right)}\right) ksim=O(ε1log(δm))时 ∀ p j ∈ P \forall{p_j}\text{∈}P ∀pj∈P有 ∣ ∥ φ ( q i ) – φ ( p j ) ∥ 0 k s i m – θ ( q i , p j ) π ∣ ≤ ε \left|\cfrac{\|\varphi(q_i)–\varphi(p_j)\|_0}{k_{\mathrm{sim}}}\text{–}\cfrac{\theta(q_i,p_j)}{\pi}\right|\text{≤}\sqrt{\varepsilon} ksim∥φ(qi)–φ(pj)∥0–πθ(qi,pj) ≤ε以 Pr> 1 – ε δ m 2 \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2} Pr>1–m2εδ概率成立
![]()
- 对于 q i {q_i} qi和 p q 1 ∗ p^*_{q_1} pq1∗还有 p q 1 p_{q_1} pq1,总是会有 ∥ φ ( q i ) – φ ( p q 1 ∗ ) ∥ 0 k s i m ≤ ∥ φ ( q i ) – φ ( p q 1 ) ∥ 0 k s i m \cfrac{\|\varphi(q_i)–\varphi(p^*_{q_1})\|_0}{k_{\mathrm{sim}}}\text{≤}\cfrac{\|\varphi(q_i)–\varphi(p_{q_1})\|_0}{k_{\mathrm{sim}}} ksim∥φ(qi)–φ(pq1∗)∥0≤ksim∥φ(qi)–φ(pq1)∥0
- 对于 Case-0 \text{Case-0} Case-0情形, p q i ∗ p^*_{q_i} pqi∗为离编号为 φ ( q i ) \varphi(q_i) φ(qi)的桶海明距离最近的 p j p_j pj向量,所以 ∥ φ ( q i ) – φ ( p q i ∗ ) ∥ 0 ≤ ∥ φ ( q i ) – φ ( p q i ) ∥ 0 {\|\varphi(q_i)–\varphi(p^*_{q_i})\|_0}\text{≤}{\|\varphi(q_i)–\varphi(p_{q_i})\|_0} ∥φ(qi)–φ(pqi∗)∥0≤∥φ(qi)–φ(pqi)∥0
- 对于 Case-n \text{Case-n} Case-n情形, p q i ∗ p^*_{q_i} pqi∗为与 q i q_i qi同桶的所有 p j p_j pj向量的质心,即 S = { p j ∈ P ∣ φ ( q i ) = φ ( p j ) } S\text{=}\left\{p_j\text{∈}P|{\varphi}(q_i)\text{=}\varphi(p_j)\right\} S={pj∈P∣φ(qi)=φ(pj)}的质心 S center S_{\text{center}} Scenter
- 回忆 SimHash \text{SimHash} SimHash分桶的本质,就是用 k sim k_\text{sim} ksim个法平面将空间切割为 2 k sim 2^{k_\text{sim}} 2ksim个子空间
- 所有 p j ∈ S p_j\text{∈}S pj∈S都落入同一桶即在同一子空间,则其质心 p q i ∗ = S center p^*_{q_i}\text{=}S_{\text{center}} pqi∗=Scenter也一定在该子空间中,所以 φ ( q i ) = φ ( p j ) = φ ( p q i ∗ ) \varphi(q_i)\text{=}\varphi(p_j)\text{=}\varphi(p^*_{q_i}) φ(qi)=φ(pj)=φ(pqi∗)
- 所以 ∥ φ ( q i ) – φ ( p q i ∗ ) ∥ 0 = 0 ≤ ∥ φ ( q i ) – φ ( p q i ) ∥ 0 {\|\varphi(q_i)–\varphi(p^*_{q_i})\|_0}\text{=}0\text{≤}{\|\varphi(q_i)–\varphi(p_{q_i})\|_0} ∥φ(qi)–φ(pqi∗)∥0=0≤∥φ(qi)–φ(pqi)∥0
- p q 1 p_{q_1} pq1和 p q 1 ∗ p^*_{q_1} pq1∗都代入事实 A4 \text{A4} A4,则 θ ( q i , p q 1 ∗ ) π – ε ≤ ∥ φ ( q i ) – φ ( p q 1 ∗ ) ∥ 0 k s i m ≤ ∥ φ ( q i ) – φ ( p q i ) ∥ 0 k s i m ≤ θ ( q i , p q i ) π + ε \cfrac{\theta({q_i,p^*_{q_1}})}{\pi}–\sqrt{\varepsilon}\text{≤}\cfrac{\|\varphi(q_i)–\varphi(p^*_{q_1})\|_0}{k_{\mathrm{sim}}}\text{≤}\cfrac{\|\varphi(q_i)–\varphi(p_{q_i})\|_0}{k_{\mathrm{sim}}}\text{≤}\cfrac{\theta({q_i,p_{q_i}})}{\pi}\text{+}\sqrt{\varepsilon} πθ(qi,pq1∗)–ε≤ksim∥φ(qi)–φ(pq1∗)∥0≤ksim∥φ(qi)–φ(pqi)∥0≤πθ(qi,pqi)+ε以 Pr> 1 – ε δ m 2 \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2} Pr>1–m2εδ概率成立
- 因此 ∣ θ ( q i , p q 1 ∗ ) – θ ( q i , p q i ) ∣ = O ( ε ) \left|\theta(q_i,p^*_{q_1})–\theta({q_i,p_{q_i}})\right|\text{=}O\left(\sqrt{\varepsilon}\right) θ(qi,pq1∗)–θ(qi,pqi) =O(ε)以 Pr> 1 – ε δ m 2 \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2} Pr>1–m2εδ概率成立
- 上式可得 ∣ cos θ ( q i , p q 1 ∗ ) – cos θ ( q i , p q i ) ∣ < O ( ε ) \left|\cos\theta(q_i,p^*_{q_1})–\cos\theta({q_i,p_{q_i}})\right|\text{<}O(\varepsilon) cosθ(qi,pq1∗)–cosθ(qi,pqi) <O(ε)以 Pr> 1 – ε δ m 2 \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2} Pr>1–m2εδ成立,其中 { cos θ ( q i , p q 1 ∗ ) = ⟨ q i , p q 1 ∗ ⟩ =FDE ( q i , P ) cos θ ( q i , p q i ) = ⟨ q i , p q i ⟩ =MaxSim ( q i , P ) \begin{cases}\cos\theta(q_i,p^*_{q_1})\text{=}\langle{q_i,p^*_{q_1}}\rangle\text{=}\text{FDE}(q_i,P)\\\\\cos\theta({q_i,p_{q_i}})\text{=}\langle{q_i,p_{q_i}}\rangle\text{=}\text{MaxSim}(q_i,P)\end{cases} ⎩ ⎨ ⎧cosθ(qi,pq1∗)=⟨qi,pq1∗⟩=FDE(qi,P)cosθ(qi,pqi)=⟨qi,pqi⟩=MaxSim(qi,P)
- 所以 MaxSim ( q i , P ) – O ( ε ) ≤FDE ( q i , P ) ≤MaxSim ( q i , P ) + O ( ε ) \text{MaxSim}(q_i,P)–O(\varepsilon)\text{≤}\text{FDE}(q_i,P)\text{≤}\text{MaxSim}(q_i,P)\text{+}O(\varepsilon) MaxSim(qi,P)–O(ε)≤FDE(qi,P)≤MaxSim(qi,P)+O(ε) 以 Pr> 1 – ε δ m 2 \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2} Pr>1–m2εδ概率成立
➡️事实 A4 \text{A4} A4给出了 FDE ( q i , P ) \text{FDE}(q_i,P) FDE(qi,P)的子上下界
- 取结论的左半边 MaxSim ( q i , P ) – O ( ε ) ≤FDE ( q i , P ) \text{MaxSim}(q_i,P)–O(\varepsilon)\text{≤}\text{FDE}(q_i,P) MaxSim(qi,P)–O(ε)≤FDE(qi,P)并给概率放水 Pr> 1 – ε δ m 2 > 1 – ε δ ∣ Q ∣ \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2}\text{>}1–\cfrac{\varepsilon\delta}{|Q|} Pr>1–m2εδ>1–∣Q∣εδ,便得到了一个更弱的结论
- 再对 O ( ε ) O(\varepsilon) O(ε)进行常数缩放为 ε \varepsilon ε,该更弱的结论就变成了所要证的 FDE \text{FDE} FDE相似度的子下界
1.5. \textbf{1.5. } 1.5. 加上一定维度的投影后原子下界近似成立
➡️投影前后的对比:用 FDE orgn \text{FDE}_{\text{orgn}} FDEorgn和 FDE proj \text{FDE}_{\text{proj}} FDEproj表示区分
![]()
- 投影前为 FDE orgn ( q i , P ) = ⟨ q i , p q i ∗ ⟩ = ⟨ q i , 1 ∣ S ∣ ∑ p j ∈ S p j ⟩ = 1 ∣ S ∣ ∑ p j ∈ S ⟨ q i , p j ⟩ \displaystyle{}\text{FDE}_{\text{orgn}}(q_i,P)\text{=}\left\langle{q_i,p_{q_i}^*}\right\rangle\text{=}\left\langle{}q_i,\cfrac{1}{|S|}\sum_{p_j\text{∈}S}p_j\right\rangle{}\text{=}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}{\langle{}q_i,p_j\rangle} FDEorgn(qi,P)=⟨qi,pqi∗⟩=⟨qi,∣S∣1pj∈S∑pj⟩=∣S∣1pj∈S∑⟨qi,pj⟩
- 投影后为 FDE proj ( q i , P ) = ⟨ ψ ( q i ) , ψ ( p q i ∗ ) ⟩ = ⟨ ψ ( q i ) , ψ ( 1 ∣ S ∣ ∑ p j ∈ S p j ) ⟩ = 1 ∣ S ∣ ∑ p j ∈ S ⟨ ψ ( q i ) , ψ ( p j ) ⟩ \displaystyle{}\text{FDE}_{\text{proj}}(q_i,P)\text{=}\left\langle{\psi(q_i),\psi\left(p_{q_i}^*\right)}\right\rangle\text{=}\left\langle{}{\psi}(q_i),{\psi}\left(\cfrac{1}{|S|}\sum_{p_j\text{∈}S}p_j\right)\right\rangle{}\text{=}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}\left\langle{\psi(q_i),\psi\left(p_j\right)}\right\rangle FDEproj(qi,P)=⟨ψ(qi),ψ(pqi∗)⟩=⟨ψ(qi),ψ ∣S∣1pj∈S∑pj ⟩=∣S∣1pj∈S∑⟨ψ(qi),ψ(pj)⟩
➡️应用事实 A2 \text{A2} A2:用 ε δ ∣ Q ∣ \cfrac{\varepsilon\delta}{|Q|} ∣Q∣εδ替代 δ \delta δ以设定 d proj = O ( 1 ε 2 log ( ∣ Q ∣ ε δ ) ) = O ( 1 ε 2 log ( m ε δ ) ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\left(\cfrac{|Q|}{\varepsilon\delta}\right)\right)\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\left(\cfrac{m}{\varepsilon\delta}\right)\right) dproj=O(ε21log(εδ∣Q∣))=O(ε21log(εδm))
- 则有内积的偏差 ⟨ x , y ⟩ – ε ≤ ⟨ ψ ( x ) , ψ ( y ) ⟩ ≤ ⟨ x , y ⟩ + ε \langle{}x,y\rangle\text{–}\varepsilon\text{≤}\langle\psi(x),\psi(y)\rangle{}\text{≤}\langle{}x,y\rangle\text{+}\varepsilon ⟨x,y⟩–ε≤⟨ψ(x),ψ(y)⟩≤⟨x,y⟩+ε以 Pr≥ 1 – ε δ ∣ Q ∣ \text{Pr}\text{≥}1\text{–}\cfrac{\varepsilon\delta}{|Q|} Pr≥1–∣Q∣εδ概率成立概率成立
- 将事实 A2 \text{A2} A2代回则有 1 ∣ S ∣ ∑ p j ∈ S ⟨ q i , p j ⟩ – ε ≤ 1 ∣ S ∣ ∑ p j ∈ S ⟨ ψ ( q i ) , ψ p j ⟩ ≤ 1 ∣ S ∣ ∑ p j ∈ S ⟨ q i , p j ⟩ + ε \displaystyle{}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}{\langle{}q_i,p_j\rangle}\text{–}\varepsilon\text{≤}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}\left\langle{\psi(q_i),\psi{}p_j}\right\rangle\text{≤}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}{\langle{}q_i,p_j\rangle}\text{+}\varepsilon ∣S∣1pj∈S∑⟨qi,pj⟩–ε≤∣S∣1pj∈S∑⟨ψ(qi),ψpj⟩≤∣S∣1pj∈S∑⟨qi,pj⟩+ε,以 Pr≥ 1 – ε δ ∣ Q ∣ \text{Pr}\text{≥}1\text{–}\cfrac{\varepsilon\delta}{|Q|} Pr≥1–∣Q∣εδ概率成立概率成立
- 其中 FDE orgn ( q i , P ) = 1 ∣ S ∣ ∑ p j ∈ S ⟨ q i , p j ⟩ \displaystyle{}\text{FDE}_{\text{orgn}}(q_i,P)\text{=}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}{\langle{}q_i,p_j\rangle} FDEorgn(qi,P)=∣S∣1pj∈S∑⟨qi,pj⟩,以及 FDE proj ( q i , P ) = 1 ∣ S ∣ ∑ p j ∈ S ⟨ ψ ( q i ) , ψ ( p j ) ⟩ \displaystyle{}\text{FDE}_{\text{proj}}(q_i,P)\text{=}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}\left\langle{\psi(q_i),\psi\left(p_j\right)}\right\rangle FDEproj(qi,P)=∣S∣1pj∈S∑⟨ψ(qi),ψ(pj)⟩
- 所以 FDE orgn ( q i , P ) – ε ≤ FDE proj ( q i , P ) ≤ FDE orgn ( q i , P ) + ε \displaystyle{}\text{FDE}_{\text{orgn}}(q_i,P)\text{–}\varepsilon\text{≤}\text{FDE}_{\text{proj}}(q_i,P)\text{≤}\text{FDE}_{\text{orgn}}(q_i,P)\text{+}\varepsilon FDEorgn(qi,P)–ε≤FDEproj(qi,P)≤FDEorgn(qi,P)+ε,以 Pr≥ 1 – ε δ ∣ Q ∣ \text{Pr}\text{≥}1\text{–}\cfrac{\varepsilon\delta}{|Q|} Pr≥1–∣Q∣εδ概率成立概率成立
- 已证 MaxSim ( q i , P ) – O ( ε ) ≤ FDE proj ( q i , P ) ≤MaxSim ( q i , P ) + O ( ε ) \text{MaxSim}(q_i,P)–O(\varepsilon)\text{≤}\text{FDE}_{\text{proj}}(q_i,P)\text{≤}\text{MaxSim}(q_i,P)\text{+}O(\varepsilon) MaxSim(qi,P)–O(ε)≤FDEproj(qi,P)≤MaxSim(qi,P)+O(ε)以 Pr≥ 1 – ε δ ∣ Q ∣ \text{Pr}\text{≥}1–\cfrac{\varepsilon\delta}{|Q|} Pr≥1–∣Q∣εδ概率成立
- 代入得 FDE proj ( q i , P ) ≥ FDE orgn ( q i , P ) – ε ≥ MaxSim ( q i ) – ε – O ( ε ) \displaystyle\text{FDE}_{\text{proj}}(q_i,P)\text{≥}\text{FDE}_{\text{orgn}}(q_i,P)\text{–}\varepsilon\text{≥}\displaystyle{}\text{MaxSim}(q_i)\text{–}\varepsilon–O(\varepsilon) FDEproj(qi,P)≥FDEorgn(qi,P)–ε≥MaxSim(qi)–ε–O(ε)
- 对 ε \varepsilon ε进行常数因子缩放后,即证毕
2. \textbf{2. } 2. 定理 2.2 \textbf{2.2} 2.2证明的思路
2.0. \textbf{2.0. } 2.0. 定理 2.2 \textbf{2.2} 2.2的主要内容
👉条件 1 1 1:给定单个查询 Q Q Q以及多个段落 P = { P 1 , … , P n } P\text{=}\left\{P_{1},\ldots,P_{n}\right\} P={P1,…,Pn}并且 Q , ∀ P i ⊆ R d Q,\forall{}P_i\text{⊆}\mathbb{R}^{d} Q,∀Pi⊆Rd,并令 m = ∣ Q ∣ + max i ∈ [ n ] ∣ P i ∣ \displaystyle{}m\text{=}|Q|\text{+}\max_{i\text{∈}[n]}\left|P_{i}\right| m=∣Q∣+i∈[n]max∣Pi∣
👉条件 2 2 2:给定 ∀ ε > 0 \forall\varepsilon\text{>}0 ∀ε>0,设置参数 k sim = O ( log m ε ) , d proj = O ( 1 ε 2 log ( m ε ) ) , R reps = O ( 1 ε 2 log n ) k_{\text{sim}}\text{=}O\left(\cfrac{\log{m}}{\varepsilon}\right),d_{\text {proj}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\left(\cfrac{m}{\varepsilon}\right)\right),R_{\text{reps}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log{n}\right) ksim=O(εlogm),dproj=O(ε21log(εm)),Rreps=O(ε21logn)
👉条件 3 3 3:令 i ∗ = arg max i ∈ [ n ] FDE ( Q , P i ) \displaystyle{}i^{*}\text{=}\arg\max_{i\text{∈}[n]}\text{FDE}(Q,P_i) i∗=argi∈[n]maxFDE(Q,Pi),即 P i ∗ P_{i^*} Pi∗是通过 Muvera \text{Muvera} Muvera方法找到的,与查询 Q Q Q最相似的段落
👉结论 1 1 1: 1 ∣ Q ∣ Chamfer ( Q , P i ∗ ) ≥ max i ∈ [ n ] 1 ∣ Q ∣ Chamfer ( Q , P i ) – ε \displaystyle{}\cfrac{1}{|Q|}\text{Chamfer}\left(Q, P_{i^{*}}\right)\text{≥}\max_{i\text{∈}[n]}\cfrac{1}{|Q|}\text{Chamfer}\left(Q, P_{i}\right)–\varepsilon ∣Q∣1Chamfer(Q,Pi∗)≥i∈[n]max∣Q∣1Chamfer(Q,Pi)–ε以 Pr= 1 – 1 poly ( n ) \text{Pr=}1\text{–}\cfrac{1}{\text{poly}(n)} Pr=1–poly(n)1概率成立
2.1. \textbf{2.1. } 2.1. 定理 2.2 \textbf{2.2} 2.2的主要内容
➡️在考虑重复 R reps R_{\text{reps}} Rreps次的情况下,对于每个重复 k ∈ [ R reps ] k\text{∈}\left[R_{\text{reps}}\right] k∈[Rreps],设定每次重复对最终相似度的贡献为 FDE k ( Q , P α ) \text{FDE}^k(Q,P_\alpha) FDEk(Q,Pα)
- 对于最终相似度,有 FDE ( Q , P α ) = ∑ k = 1 R reps FDE k ( Q , P α ) \text{FDE}(Q,P_\alpha)\text{=}\displaystyle\sum_{k\text{=}1}^{R_{\text{reps}}}\text{FDE}^k(Q,P_\alpha) FDE(Q,Pα)=k=1∑RrepsFDEk(Q,Pα),不妨设定随机变量 X k = 1 ∣ Q ∣ FDE k ( Q , P α ) X_k\text{=}\cfrac{1}{|Q|}\text{FDE}^k(Q,P_\alpha) Xk=∣Q∣1FDEk(Q,Pα)
➡️最关键的一步在于,对 X k X_k Xk尝试运用 Chernoff \text{Chernoff} Chernoff界限,即 ∀ X i ∈ [ a , b ] \forall{X_i}\text{∈}[a,b] ∀Xi∈[a,b]有 Pr [ ∣ 1 R ∑ i = 1 R X i – μ ∣ ≥ ε ] ≤ 2 e ( – 2 R ε 2 ( b – a ) 2 ) \displaystyle{}\text{Pr}\left[\left|\frac{1}{R}\sum_{i=1}^RX_i–\mu\right|\text{≥}\varepsilon\right]\text{≤}2e^{\left(–\frac{2R\varepsilon^2}{(b–a)^2}\right)} Pr[ R1i=1∑RXi–μ ≥ε]≤2e(–(b–a)22Rε2)
- 对 Chernoff \text{Chernoff} Chernoff界限中参数的确定
- 将上式中 R R R视作 R reps R_{\text{reps}} Rreps,并将 R reps = O ( 1 ε 2 log n ) R_{\text{reps}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log{n}\right) Rreps=O(ε21logn)作为前提引入
- 对于 μ \mu μ即均值,根据定理 2.1 \text{2.1} 2.1引入前提 k sim = O ( log m ε ) , d proj = O ( 1 ε 2 log ( m ε ) ) k_{\text{sim}}\text{=}O\left(\cfrac{\log{m}}{\varepsilon}\right),d_{\text {proj}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\left(\cfrac{m}{\varepsilon}\right)\right) ksim=O(εlogm),dproj=O(ε21log(εm))后,有 E [ X k ] ∈ 1 ∣ Q ∣ Chamfer ( Q , P α ) ± ε \mathbb{E}[X_k]\text{∈}\cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha)\text{±}\varepsilon E[Xk]∈∣Q∣1Chamfer(Q,Pα)±ε
- 对于 [ a , b ] [a,b] [a,b]即 X k X_k Xk的范围,不难得到 X k ∈ [ – m , m ] X_k\text{∈}[–m,m] Xk∈[–m,m]
- 将以上参数套用到 Chernoff \text{Chernoff} Chernoff界限则有
- 概率:以 Pr≥ 1 – 2 e ( – R reps ε 2 2 m 2 ) \text{Pr}\text{≥}1–2e^{\left(–\frac{R_{\text{reps}}\varepsilon^2}{2m^2}\right)} Pr≥1–2e(–2m2Rrepsε2)概率成立,可以进一步转化为 Pr≥ 1 – 2 e ( – R reps ε 2 2 m 2 ) ≥ 1 – 2 n C = 1 – 1 poly ( n ) \text{Pr}\text{≥}1–2e^{\left(–\frac{R_{\text{reps}}\varepsilon^2}{2m^2}\right)}\text{≥}1–\cfrac{2}{n^C}\text{=}1\text{–}\cfrac{1}{\text{poly}(n)} Pr≥1–2e(–2m2Rrepsε2)≥1–nC2=1–poly(n)1
- 事件: ∣ ∑ k = 1 R reps X k R reps – 1 ∣ Q ∣ Chamfer ( Q , P α ) ∣ ≤ 2 ε \displaystyle{}\left|\sum_{k\text{=}1}^{R_{\text{reps}}}\cfrac{X_k}{R_{\text{reps}}}–\cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha)\right|\text{≤}2\varepsilon k=1∑RrepsRrepsXk–∣Q∣1Chamfer(Q,Pα) ≤2ε
- 代入 FED \text{FED} FED最相似文档 P α ∗ P_{\alpha^*} Pα∗后,有 1 ∣ Q ∣ Chamfer ( Q , P α ∗ ) ≥ 1 ∣ Q ∣ R reps FDE ( Q , P α ∗ ) – 2 ε = max α ∈ [ n ] 1 ∣ Q ∣ R reps FDE ( Q , P α ) – 2 ε \cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha^*)\text{≥}\cfrac{1}{|Q|R_{\text{reps}}}\text{FDE}(Q,P_\alpha^*)–2\varepsilon\text{=}\displaystyle{}\max_{\alpha\text{∈}[n]}\cfrac{1}{|Q|R_{\text{reps}}}\text{FDE}(Q,P_\alpha)–2\varepsilon ∣Q∣1Chamfer(Q,Pα∗)≥∣Q∣Rreps1FDE(Q,Pα∗)–2ε=α∈[n]max∣Q∣Rreps1FDE(Q,Pα)–2ε
- 根据定理 2.1 \text{2.1} 2.1又可以知道 max α ∈ [ n ] 1 ∣ Q ∣ R reps FDE ( Q , P α ) ≥ max α ∈ [ n ] 1 ∣ Q ∣ Chamfer ( Q , P α ) – ε \displaystyle{}\max_{\alpha\text{∈}[n]}\cfrac{1}{|Q|R_{\text{reps}}}\text{FDE}(Q,P_\alpha)\text{≥}\max_{\alpha\text{∈}[n]}\cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha)\text{–}\varepsilon α∈[n]max∣Q∣Rreps1FDE(Q,Pα)≥α∈[n]max∣Q∣1Chamfer(Q,Pα)–ε
- 所以最终原事件可转化为 1 ∣ Q ∣ Chamfer ( Q , P α ∗ ) ≥ max α ∈ [ n ] 1 ∣ Q ∣ Chamfer ( Q , P α ) – 3 ε \displaystyle{}\cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha^*)\text{≥}\max_{\alpha\text{∈}[n]}\cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha)\text{–}3\varepsilon ∣Q∣1Chamfer(Q,Pα∗)≥α∈[n]max∣Q∣1Chamfer(Q,Pα)–3ε,对 ε \varepsilon ε做常数倍变换即证毕