当前位置：首页 > news >正文

深入解析Guava RateLimiter限流机制

news 2025/9/10 6:47:00

`RateLimiter`

Guava 的 RateLimiter 是一个设计精巧、功能强大且易于使用的速率限制工具。

它通过预先计算并保留下一个可用时间点的核心思想，实现了平滑的流量控制。
它提供了突发（Bursty）和预热（Warming Up）两种模式，以适应不同的应用场景。
它的 API 设计简洁明了，acquire 用于阻塞获取，tryAcquire 用于非阻塞尝试。
内部通过 SleepingStopwatch 抽象和 synchronized 机制，实现了可测试性和线程安全。

从RateLimiter.java 的类注释中可以看到它的核心思想：

// ... existing code ...
/*** A rate limiter. Conceptually, a rate limiter distributes permits at a configurable rate. Each* {@link #acquire()} blocks if necessary until a permit is available, and then takes it. Once* acquired, permits need not be released.** <p>{@code RateLimiter} is safe for concurrent use: It will restrict the total rate of calls from* all threads. Note, however, that it does not guarantee fairness.** <p>Rate limiters are often used to restrict the rate at which some physical or logical resource* is accessed. This is in contrast to {@link java.util.concurrent.Semaphore} which restricts the* number of concurrent accesses instead of the rate (note though that concurrency and rate are* closely related, e.g. see <a href="http://en.wikipedia.org/wiki/Little%27s_law">Little's* Law</a>).
// ... existing code ...*/

核心特性:

速率控制: 与 Semaphore 控制并发数不同，RateLimiter 控制的是单位时间内的请求数（QPS，Queries Per Second）。
线程安全: 可以在多线程环境中安全使用，它会限制所有线程的总体速率。
平滑输出: RateLimiter 会平滑地分发许可，确保请求在时间上均匀分布，而不是在每秒开始时集中处理所有请求。
支持预热: 可以配置一个预热期（Warm-up Period），在此期间，分发许可的速率会逐渐增加，直到达到稳定速率。这对于需要“启动时间”的系统（如缓存、连接池）非常有用。

核心概念与设计思想

RateLimiter 的设计精髓在于它不记录“上一次请求是什么时候”，而是记录“下一次请求可以从什么时候开始”。这个设计思想在 SmoothRateLimiter.java 的注释中有详细阐述。

// ... existing code ...* This has important consequences: it means that the RateLimiter doesn't remember the time of the* _last_ request, but it remembers the (expected) time of the _next_ request. This also enables* us to tell immediately (see tryAcquire(timeout)) whether a particular timeout is enough to get
// ... existing code ...

RateLimiter 的实现可以看作是令牌桶算法的一种变体。它维护了一个“下一次允许执行的时间点”。

当一个请求 acquire(N) 到达时，RateLimiter 会计算出满足这 N 个许可需要等待的时间。
然后，它将“下一次允许执行的时间点”向未来推移这段时间。
如果计算出的“下-个可执行时间点”在当前时间之后，那么当前请求就需要阻塞等待。

如果 RateLimiter 长时间处于空闲状态，它会“积攒”一些许可。当一个新请求到来时，可以立即消费这些积攒的许可而无需等待，从而实现“突发”处理（Burst）。

SmoothRateLimiter.java 的注释解释了这一点：

// ... existing code ...* To deal with such scenarios, we add an extra dimension, that of "past underutilization",* modeled by "storedPermits" variable. This variable is zero when there is no underutilization,* and it can grow up to maxStoredPermits, for sufficiently large underutilization. So, the* requested permits, by an invocation acquire(permits), are served from:** - stored permits (if available)** - fresh permits (for any remaining permits)
// ... existing code ...

默认情况下，RateLimiter.create(double permitsPerSecond) 创建的限流器可以存储最多1秒的许可。

预热 (Warm-up)

对于某些系统，突然的高并发可能会导致性能问题（例如，缓存需要加载数据，数据库连接池需要创建连接）。RateLimiter 的预热功能就是为了解决这个问题。

在预热模式下，速率从一个较低的“冷”速率开始，经过一个预热期，逐渐线性增长到配置的稳定速率。如果限流器空闲时间超过预热期，它会重新回到“冷”状态。

create 方法的重载版本支持创建带预热的限流器：

// ... existing code ...public static RateLimiter create(double permitsPerSecond, Duration warmupPeriod) {return create(permitsPerSecond, toNanosSaturated(warmupPeriod), TimeUnit.NANOSECONDS);}// ... existing code ...@SuppressWarnings("GoodTime") // should accept a java.time.Durationpublic static RateLimiter create(double permitsPerSecond, long warmupPeriod, TimeUnit unit) {checkArgument(warmupPeriod >= 0, "warmupPeriod must not be negative: %s", warmupPeriod);return create(permitsPerSecond, warmupPeriod, unit, 3.0, SleepingStopwatch.createFromSystemTimer());}
// ... existing code ...

这里的 coldFactor (默认为3.0) 表示冷启动时的速率是稳定速率的 1/coldFactor。

入口：`RateLimiter.create()` 工厂方法

RateLimiter 提供了两种主要的限流策略，通过不同的 create 方法创建：

平滑突发（Smooth Bursty）: RateLimiter.create(double permitsPerSecond)

这是最常用的模式。它会创建一个 SmoothBursty 实例。
这种模式下，限流器会以一个稳定的速率生成许可，并且允许将一段时间内未使用的许可“存储”起来，用于应对未来的突发请求。默认情况下，它可以存储1秒钟的许可。

// ... existing code ...public static RateLimiter create(double permitsPerSecond) {/** The default RateLimiter configuration can save the unused permits of up to one second. This* is to avoid unnecessary stalls in situations like this: A RateLimiter of 1qps, and 4 threads,* all calling acquire() at these moments:** T0 at 0 seconds* T1 at 1.05 seconds* T2 at 2 seconds* T3 at 3 seconds** Due to the slight delay of T1, T2 would have to sleep till 2.05 seconds, and T3 would also* have to sleep till 3.05 seconds.*/return create(permitsPerSecond, SleepingStopwatch.createFromSystemTimer());}@VisibleForTestingstatic RateLimiter create(double permitsPerSecond, SleepingStopwatch stopwatch) {RateLimiter rateLimiter = new SmoothBursty(stopwatch, 1.0 /* maxBurstSeconds */);rateLimiter.setRate(permitsPerSecond);return rateLimiter;}
// ... existing code ...

平滑预热（Smooth Warming Up）: RateLimiter.create(double permitsPerSecond, Duration warmupPeriod)

这种模式会创建一个 SmoothWarmingUp 实例。
它适用于那些需要“热身”的系统。限流器启动时，会以一个较低的速率分发许可，然后在指定的 warmupPeriod 内逐渐提升到稳定速率。

// ... existing code ...public static RateLimiter create(double permitsPerSecond, Duration warmupPeriod) {return create(permitsPerSecond, toNanosSaturated(warmupPeriod), TimeUnit.NANOSECONDS);}// ... existing code ...@SuppressWarnings("GoodTime") // should accept a java.time.Durationpublic static RateLimiter create(double permitsPerSecond, long warmupPeriod, TimeUnit unit) {checkArgument(warmupPeriod >= 0, "warmupPeriod must not be negative: %s", warmupPeriod);return create(permitsPerSecond, warmupPeriod, unit, 3.0, SleepingStopwatch.createFromSystemTimer());}@VisibleForTestingstatic RateLimiter create(double permitsPerSecond,long warmupPeriod,TimeUnit unit,double coldFactor,SleepingStopwatch stopwatch) {RateLimiter rateLimiter = new SmoothWarmingUp(stopwatch, warmupPeriod, unit, coldFactor);rateLimiter.setRate(permitsPerSecond);return rateLimiter;}
// ... existing code ...

现在我们有了 RateLimiter 的实例，接下来看如何使用它。

核心阻塞接口：`acquire()`

acquire() 是最核心的阻塞方法，它会获取一个或多个许可，如果许可不够，则会阻塞当前线程直到满足条件。

我们来跟踪 acquire(5) 的调用路径：

acquire(int permits): 这是我们调用的入口。它的职责很简单：计算需要等待的时间，然后进行休眠。

// ... existing code ...@CanIgnoreReturnValuepublic double acquire(int permits) {long microsToWait = reserve(permits);stopwatch.sleepMicrosUninterruptibly(microsToWait);return 1.0 * microsToWait / SECONDS.toMicros(1L);}
// ... existing code ...

reserve(int permits): 这一步进入了同步区，确保线程安全。它的核心是调用 reserveAndGetWaitLength 来完成实际的计算。

stopwatch.readMicros() 的作用是获取当前时间戳，单位是微秒（microseconds）。

stopwatch 对象: 它是 RateLimiter 内部的一个 SleepingStopwatch 实例。这个类是 Guava 对时间源的一个抽象，它既能提供当前时间，也能执行休眠操作。在生产代码中，它通常由 SleepingStopwatch.createFromSystemTimer() 创建，内部包装了一个 Guava 的 Stopwatch 对象。

// ... existing code ...final long reserve(int permits) {checkPermits(permits);synchronized (mutex()) {return reserveAndGetWaitLength(permits, stopwatch.readMicros());}}
// ... existing code ...

reserveAndGetWaitLength(int permits, long nowMicros): 这个方法是连接上层API和下层具体实现策略的桥梁。它调用了抽象方法 reserveEarliestAvailable。

// ... existing code ...final long reserveAndGetWaitLength(int permits, long nowMicros) {long momentAvailable = reserveEarliestAvailable(permits, nowMicros);return max(momentAvailable - nowMicros, 0);}
// ... existing code ...

这里的 momentAvailable 是一个时间戳，代表“在什么时候，这批许可将完全可用”。用这个未来的时间戳减去当前时间，就得到了需要等待的时长。

reserveEarliestAvailable(int permits, long nowMicros): 这是 SmoothRateLimiter 中定义的抽象方法，由 SmoothBursty 和 SmoothWarmingUp 提供具体实现。这是所有限流策略的核心。它负责：
这个方法的具体实现我们稍后在分析 SmoothBursty 时再看。
- 根据当前时间和空闲时间，更新内部存储的许可数 (storedPermits)。
- 计算获取 permits 个许可需要消耗多少存储的许可，以及多少新生成的许可。
- 计算并返回下一个许可可用的时间点 (nextFreeTicketMicros)。

至此，acquire() 的调用链就清晰了：它通过层层调用，最终在同步块内，由具体的策略子类计算出需要等待的时间，然后返回到顶层进行休眠。

`mutex()` 方法分析

这段代码是 RateLimiter 内部用于获取锁对象的关键方法。

// ... existing code ...// Can't be initialized in the constructor because mocks don't call the constructor.private volatile @Nullable Object mutexDoNotUseDirectly;private Object mutex() {Object mutex = mutexDoNotUseDirectly;if (mutex == null) {synchronized (this) {mutex = mutexDoNotUseDirectly;if (mutex == null) {mutexDoNotUseDirectly = mutex = new Object();}}}return mutex;}RateLimiter(SleepingStopwatch stopwatch) {
// ... existing code ...

这个方法采用了非常经典的 双重检查锁定（Double-Checked Locking, DCL） 设计模式。它的目的是 延迟初始化（Lazy Initialization） 一个对象，并且保证这个过程是线程安全的。

mutexDoNotUseDirectly 字段的含义

命名含义: 这个名字本身就是一种强烈的警告和约定："不要直接使用这个字段"。它告诉 Guava 库的开发者，任何时候需要锁对象，都应该去调用 mutex() 方法，而不是直接访问 mutexDoNotUseDirectly 字段。直接访问可能会得到一个 null 值（如果还未初始化），从而导致 NullPointerException。
volatile 关键字: 这个字段被 volatile 修饰至关重要。
- 可见性: 确保当一个线程修改了 mutexDoNotUseDirectly 的值（即完成初始化）之后，这个变化能立刻被其他线程看到。
- 防止指令重排序: new Object() 这个操作在底层并不是原子的，它大致可以分为三步：a. 分配内存空间；b. 初始化对象；c. 将内存地址赋值给引用。如果没有 volatile，JVM 可能会为了优化而重排序指令，比如先执行c再执行b。这样，另一个线程可能会在第一次检查时看到一个非 null 但尚未完全初始化的对象，从而导致潜在的错误。volatile 禁止了这种重排序，保证了 DCL 的正确性。

为什么不直接在构造函数里初始化 mutex 呢？比如 private final Object mutex = new Object();。

代码中的注释给出了答案： // Can't be initialized in the constructor because mocks don't call the constructor. “不能在构造函数中初始化，因为 mock 对象不会调用构造函数。”

在单元测试中，经常会使用 Mockito 这样的框架来创建类的“模拟对象”（mock）。这些框架创建对象时，通常会绕过类的构造函数。如果 mutex 在构造函数中初始化，那么一个被 mock 出来的 RateLimiter 实例的 mutex 字段就会是 null。当测试代码调用到需要同步的方法（如 acquire()）时，就会因为对 null 对象加锁而抛出 NullPointerException。

通过 mutex() 方法进行延迟初始化，可以保证即使是 mock 对象，在第一次需要锁的时候，也能安全地创建一个锁对象，从而让测试得以顺利进行。

mutex() 方法和 mutexDoNotUseDirectly 字段共同构成了一个线程安全的、延迟初始化的锁对象获取机制。

mutex(): 是获取锁对象的唯一正确入口，它使用双重检查锁定模式保证了线程安全和高性能。
mutexDoNotUseDirectly: 是锁对象的实际存储字段，其命名和 volatile 关键字都是为了保证该模式能被正确、安全地使用。其根本目的是为了兼容单元测试中的 mock 场景。

核心非阻塞接口：`tryAcquire()`

tryAcquire() 尝试获取许可，但它不会无限期阻塞。如果等待时间超过了指定的 timeout，它会立刻返回 false。

调用流程分析：`tryAcquire(int permits, long timeout, TimeUnit unit)`

tryAcquire(...): 这是入口。它同样会进入同步块。

// ... existing code ...public boolean tryAcquire(int permits, long timeout, TimeUnit unit) {long timeoutMicros = max(unit.toMicros(timeout), 0);checkPermits(permits);long microsToWait;synchronized (mutex()) {long nowMicros = stopwatch.readMicros();if (!canAcquire(nowMicros, timeoutMicros)) {return false;} else {microsToWait = reserveAndGetWaitLength(permits, nowMicros);}}stopwatch.sleepMicrosUninterruptibly(microsToWait);return true;}
// ... existing code ...

关键区别：canAcquire(long nowMicros, long timeoutMicros) tryAcquire 与 acquire 的最大不同在于，它会先调用 canAcquire 进行一次“预检”。
```
// ... existing code ...private boolean canAcquire(long nowMicros, long timeoutMicros) {return queryEarliestAvailable(nowMicros) - timeoutMicros <= nowMicros;}
// ... existing code ...
```
canAcquire 调用了另一个抽象方法 queryEarliestAvailable。这个方法只查询下一个可用时间点，但不会更新任何内部状态（如 storedPermits 或 nextFreeTicketMicros）。
canAcquire 的逻辑是：下一个可用时间点 - 超时时长 <= 当前时间。如果这个条件成立，意味着在超时时间内一定能拿到许可。
如果 canAcquire 返回 true，那么后续流程就和 acquire 完全一样了：调用 reserveAndGetWaitLength 真正地去预定许可并计算等待时间，然后休眠。如果返回 false，则直接退出，不进行任何等待。

`SmoothBursty` 实现剖析

现在我们来看 acquire 和 tryAcquire 最终调用的核心逻辑在 SmoothBursty 中是如何实现的。

SmoothBursty 的策略很简单：

有一个桶，容量为 maxPermits（等于 maxBurstSeconds * permitsPerSecond）。
系统会以 stableIntervalMicros (等于 1/permitsPerSecond 秒) 的间隔向桶里放一个许可。
当请求到来时，优先从桶里拿。如果桶里够，则无需等待。
如果桶里不够，除了拿光桶里的所有许可，还需要等待新许可的生成。

它的 reserveEarliestAvailable 方法就体现了这个逻辑：

调用 resync(nowMicros)：根据离上次请求的空闲时间，计算这段时间应该生成多少新许可，并加入到 storedPermits（桶）里，但不能超过 maxPermits。
计算本次请求需要从桶里拿多少 (permitsToTakeFromStored)，以及需要等待多少新生成的 (freshPermits)。
计算消耗桶内许可的时间 (microsToWait)，这部分是0，因为是现成的。
计算等待新许可生成的时间，即 freshPermits * stableIntervalMicros。
将总的等待时间加到 nextFreeTicketMicros 上，更新下一个可用时间点。
返回更新后的 nextFreeTicketMicros。

SmoothWarmingUp 的实现与此类似，但计算等待时间的公式更复杂，它是一个梯形的面积，代表了速率从低到高变化的过程，这里就不展开了。

总结

通过以上分析，我们可以清晰地看到 RateLimiter 的调用脉络：

create() -> 创建一个具体策略的 RateLimiter 实例 (SmoothBursty 或 SmoothWarmingUp)。
acquire() -> reserve() -> synchronized -> reserveAndGetWaitLength() -> reserveEarliestAvailable() (具体策略实现) -> sleep().
tryAcquire() -> synchronized -> canAcquire() (预检) -> reserveAndGetWaitLength() -> reserveEarliestAvailable() (具体策略实现) -> sleep().

整个设计的核心在于将同步控制、时间计算和具体限流策略清晰地分离开来，使得代码结构清晰，易于理解和扩展。

图像解读

这段注释的核心是这个 ASCII 码画的函数图：

         ^ throttling (获取单个许可的耗时)|cold  +                  /
interval |                 /.|                / .|               /  .   ← "warmup period" (预热期) 是梯形部分的面积|              /   .|             /    .|            /     .|           /      .stable +----------/  WARM .
interval |          .   UP  .|          . PERIOD.|          .       .0 +----------+-------+--------------→ storedPermits (存储的许可数)0 thresholdPermits maxPermits

X轴 (storedPermits): 代表当前存储了多少个“空闲”许可。越往右，说明限流器空闲时间越长，系统越“冷”。
Y轴 (throttling): 代表在当前 storedPermits 状态下，获取一个许可需要付出的时间成本。Y值越高，代表获取许可越慢。
stableInterval: 稳定速率下的时间成本。比如速率是5 QPS，那么 stableInterval 就是200ms。这是系统“热”起来之后的正常成本。
coldInterval: 最冷状态下的时间成本。它通常是 stableInterval 的几倍（由 coldFactor 定义，默认为3）。
thresholdPermits: 阈值。当存储的许可数低于这个值时，获取许可的成本是固定的 stableInterval。
maxPermits: 最大可存储的许可数。
函数图像: 这条折线代表了“许可单价”随“库存量”变化的规律。
- 库存少时 (0 到 thresholdPermits): 单价恒定为 stableInterval。
- 库存多时 (thresholdPermits 到 maxPermits): 单价从 stableInterval 线性增长到 coldInterval。库存越多，单价越贵。

核心逻辑：面积 = 时间

这个设计的核心思想是：获取一批许可的总耗时，等于函数图像下相应区间的面积。

获取 K 个许可: 假设当前有 X 个存储许可，现在要获取 K 个。这会消耗掉 K 个存储许可，状态从 X 移动到 X-K。
总耗时: 这次操作需要等待的时间，就等于函数图像在 [X-K, X] 这个区间下的面积（积分）。

这就是为什么注释里反复强调“integral”（积分）和“area”（面积）。

整个限速过程的说明

结合这个模型，我们来完整地描述一下 SmoothWarmingUp 的限速过程：

系统空闲（向右移动）:
- 当没有请求时，RateLimiter 处于空闲状态。
- storedPermits 会随着时间的推移而增加，就像给令牌桶里添加令牌一样。
- 状态在X轴上从左向右移动。
- 注释中第4点提到，这个移动速度是固定的，从0移动到 maxPermits 正好花费一个 warmupPeriod 的时间。
请求到来（向左移动，计算面积）:
- 一个 acquire(K) 请求到来。
- 系统首先会消耗 storedPermits。状态在X轴上从右向左移动 K 个单位。
- 计算等待时间:
  - 预热区 (梯形部分): 如果消耗的许可在 thresholdPermits 右侧，那么等待时间就是这部分梯形面积。因为这部分线段是倾斜的，Y值（单价）较高，所以算出来的面积（总时间）也较大。
  - 稳定区 (矩形部分): 如果消耗的许可继续向左，进入了 thresholdPermits 左侧，那么等待时间就是这部分矩形面积。因为这部分Y值（单价）是固定的 stableInterval，所以成本是可预测的。
- 总等待时间 = 预热区消耗的面积 + 稳定区消耗的面积 + 新生成许可的耗时（如果 storedPermits 不够用）。
连续请求（系统“变热”）:
- 当请求持续不断地到来，storedPermits 会被迅速消耗光，状态点会一直停留在X轴的0点附近。
- 此时，所有请求都相当于在稳定区获取许可，或者直接获取新生成的许可。
- 获取每个许可的成本都稳定在 stableInterval。
- 系统达到了它的最大稳定速率，即完成了“预热”。

数学公式的意义

注释最后一部分的数学公式，是为了根据用户设定的 warmupPeriod（预热时长）来反推出 thresholdPermits 和 maxPermits 这两个内部参数应该设为多少。

thresholdPermits = 0.5 * warmupPeriod / stableInterval: 这个公式确保了，如果系统满负荷运行，从“阈值”状态消耗到“空”状态，正好花费 warmupPeriod / 2 的时间。
maxPermits =
thresholdPermits + 2 * warmupPeriod / (stableInterval + coldInterval)
: 这个公式确保了，如果系统满负荷运行，从“最冷”状态消耗到“阈值”状态，正好花费 warmupPeriod 的时间。

这些推导保证了 RateLimiter 的行为和用户指定的 warmupPeriod 参数是精确匹配的。

总结

SmoothWarmingUp 通过一个巧妙的“成本函数”（即那张图），将“系统空闲度”(storedPermits)和“获取许可的成本”(throttling)关联起来。

空闲时: 积累 storedPermits，系统变“冷”，获取许可的单价变高。
繁忙时: 消耗 storedPermits，系统变“热”，获取许可的单价降低并稳定下来。
总耗时: 通过计算函数下的面积，实现了平滑的速率变化，完美地模拟了需要“预热”的场景。

`SmoothRateLimiter`

它作为 RateLimiter 的一个抽象子类，承接了父类的抽象方法，并构建了一套通用的“平滑”限流模型。然后，它又定义了新的抽象方法，将具体的策略差异（突发 vs 预热）交由它的子类 SmoothBursty 和 SmoothWarmingUp 去实现。

我们来继续这个流程分析，看看 SmoothRateLimiter 是如何承上启下的。

`SmoothRateLimiter` 的角色：实现通用平滑限流逻辑

当 RateLimiter 的 acquire() -> reserve() -> reserveAndGetWaitLength() 调用链最终来到 reserveEarliestAvailable() 时，执行的正是 SmoothRateLimiter 中实现的这个核心方法。

// ... existing code ...@Overridefinal long reserveEarliestAvailable(int requiredPermits, long nowMicros) {resync(nowMicros);long returnValue = nextFreeTicketMicros;double storedPermitsToSpend = min(requiredPermits, this.storedPermits);double freshPermits = requiredPermits - storedPermitsToSpend;long waitMicros =storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend)+ (long) (freshPermits * stableIntervalMicros);this.nextFreeTicketMicros = LongMath.saturatedAdd(nextFreeTicketMicros, waitMicros);this.storedPermits -= storedPermitsToSpend;return returnValue;}/** Updates {@code storedPermits} and {@code nextFreeTicketMicros} based on the current time. */void resync(long nowMicros) {// if nextFreeTicket is in the past, resync to nowif (nowMicros > nextFreeTicketMicros) {double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();storedPermits = min(maxPermits, storedPermits + newPermits);nextFreeTicketMicros = nowMicros;}}/*** Returns the number of microseconds during cool down that we have to wait to get a new permit.*/abstract double coolDownIntervalMicros();
// ... existing code ...

这个方法的流程是所有平滑限流策略的通用逻辑：

resync(nowMicros): 状态同步。这是非常关键的一步。它会检查当前时间 nowMicros 和记录的“下一个可用时间点” nextFreeTicketMicros。如果当前时间已经晚于 nextFreeTicketMicros，说明限流器在过去一段时间是空闲的。resync 会计算出这段空闲时间里本应生成多少个新的许可，并将它们添加到 storedPermits（存储的许可）中，但不会超过 maxPermits。
long returnValue = nextFreeTicketMicros;: 记录下当前（同步后）的“下一个可用时间点”。这个时间点就是本次请求可以被处理的最早时间。这个值最终会被返回给上层。
划分许可来源:
- double storedPermitsToSpend = min(requiredPermits, this.storedPermits); 计算本次请求需要消耗多少个存储的许可。
- double freshPermits = requiredPermits - storedPermitsToSpend; 计算还需要多少个新生成的许可。
计算总等待时间 waitMicros: 这是整个算法的核心，它由两部分组成：
- storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend): 计算消耗存储的许可需要等待多长时间。这是一个新的抽象方法，SmoothRateLimiter 把这个计算的责任交给了它的具体子类。这正是不同策略（突发 vs 预热）产生差异的地方。
- (long) (freshPermits * stableIntervalMicros): 计算获取新生成的许可需要等待多长时间。这个逻辑是固定的：新许可的成本就是 数量 * 稳定速率的时间间隔。
更新状态:
- this.nextFreeTicketMicros = LongMath.saturatedAdd(nextFreeTicketMicros, waitMicros); 将“下一个可用时间点”向未来推进 waitMicros。这意味着下一个请求需要等待更长的时间，从而实现了“为当前请求付费”的逻辑。
- this.storedPermits -= storedPermitsToSpend; 减去被消耗的存储许可。
return returnValue;: 返回在第2步记录的时间点。上层的 reserveAndGetWaitLength 方法会用这个返回的时间点减去当前时间，得到最终需要休眠的时间。

策略分化：`storedPermitsToWaitTime` 的不同实现

SmoothRateLimiter 通过定义新的抽象方法 storedPermitsToWaitTime，让子类来决定如何处理“存储的许可”。

1. `SmoothBursty`（平滑突发）的实现

SmoothBursty 的策略是：存储的许可可以被立即使用，没有任何时间成本。这使得系统在空闲后可以应对突发流量。

// ... existing code ...static final class SmoothBursty extends SmoothRateLimiter {
// ... existing code ...@Overridelong storedPermitsToWaitTime(double storedPermits, double permitsToTake) {return 0L;}
// ... existing code ...}

它的实现极其简单，直接返回 0L。这意味着，只要 storedPermits 足够，acquire() 调用就不会产生任何等待。

2. `SmoothWarmingUp`（平滑预热）的实现

SmoothWarmingUp 的策略是：系统在“冷却”（空闲）状态下，获取许可的成本会更高。随着请求的增多，系统逐渐“预热”，获取许可的成本才慢慢降低到稳定水平。

// ... existing code ...static final class SmoothWarmingUp extends SmoothRateLimiter {
// ... existing code ...@Overridelong storedPermitsToWaitTime(double storedPermits, double permitsToTake) {double availablePermitsAboveThreshold = storedPermits - thresholdPermits;long micros = 0;// measuring the integral on the right part of the function (the climbing line)if (availablePermitsAboveThreshold > 0.0) {double permitsAboveThresholdToTake = min(availablePermitsAboveThreshold, permitsToTake);// ... 计算梯形面积 ...double length =permitsToTime(availablePermitsAboveThreshold)+ permitsToTime(availablePermitsAboveThreshold - permitsAboveThresholdToTake);micros = (long) (permitsAboveThresholdToTake * length / 2.0);permitsToTake -= permitsAboveThresholdToTake;}// measuring the integral on the left part of the function (the horizontal line)micros += (long) (stableIntervalMicros * permitsToTake);return micros;}
// ... existing code ...}

它的实现要复杂得多，它通过计算一个梯形面积（积分）来得出等待时间。storedPermits 越多（系统越冷），单位许可的成本（图形的高度）就越高，等待时间就越长。

完整流程回顾

RateLimiter.acquire(N) 被调用。
进入 RateLimiter.reserve(N)，获取锁。
调用 RateLimiter.reserveAndGetWaitLength(N, now)。
该方法调用 SmoothRateLimiter.reserveEarliestAvailable(N, now)。
reserveEarliestAvailable 方法执行通用逻辑：
- resync() 同步状态，补充 storedPermits。
- 计算需要消耗的 storedPermits 和 freshPermits。
- 调用 storedPermitsToWaitTime() 计算消耗存储许可的成本。
  - 如果是 SmoothBursty，成本为 0。
  - 如果是 SmoothWarmingUp，成本根据预热函数计算。
- 计算消耗新许可的成本 freshPermits * stableIntervalMicros。
- 将总成本加到 nextFreeTicketMicros 上，并更新 storedPermits。
- 返回调用前的 nextFreeTicketMicros。
reserveAndGetWaitLength 拿到返回值，减去 now，得到需要休眠的总时间 microsToWait。
acquire 方法调用 stopwatch.sleepMicrosUninterruptibly(microsToWait)，线程阻塞。
休眠结束后，请求完成。