小架构step系列10:日志热更新
1 概述
日志主要是为定位问题使用的,日志一般分为5个级别:ERROR、WARN、INFO、DEBUG、TRACE,越往ERROR的方向问题越严重,越往TRACE的方向日志越详细、日志量越多,定位问题肯定是日志越详细越有帮助,但日志越详细其占用的磁盘空间越大,量过大也影响日志的检索性能,所以需要在中间做个平衡。
生产环境偏向只打印ERROR和WARN级别的,最多到INFO级别,这样大部分问题都能够得到定位。当有些问题比较棘手时,需要更详细的日志,就要配置为DEBUG级别、甚至TRACE级别,这种情况比较少,如果因为这种情况而需要重启则不划算,所以最好日志的级别能够动态修改。当然,还有其它情况,也可能需要改其它配置,比如增加一些日志的配置或者调整日志格式等。logback支持的热更新功能,刚好可以满足这类修改而不希望重启的场景。
2 原理
2.1 初始化
为了热更新,需要提前准备一些内容。第一个要准备的就是节点和处理Action之间的映射关系。加载logback.xml文件时,文件内容会被解析为一个个Event对象,每种Event对象会对应一个解析Action类,这些解析类和节点的关系是在初始化的时候指定好的,比如<configuration>节点对应于ConfigurationAction解析类。
// logback.xml文件会被解析为一个个Event对象,每种Event对象会对应一个解析Action类,
// 这些解析类是在初始化的时候指定好的,比如<configuration>节点对应于ConfigurationAction解析类
// 源码位置:ch.qos.logback.classic.joran.JoranConfigurator
public void addInstanceRules(RuleStore rs) {super.addInstanceRules(rs); // 父类JoranConfiguratorBase也初始化了一部分rs.addRule(new ElementSelector("configuration"), new ConfigurationAction());rs.addRule(new ElementSelector("configuration/contextName"), new ContextNameAction());rs.addRule(new ElementSelector("configuration/contextListener"), new LoggerContextListenerAction());rs.addRule(new ElementSelector("configuration/insertFromJNDI"), new InsertFromJNDIAction());rs.addRule(new ElementSelector("configuration/evaluator"), new EvaluatorAction());rs.addRule(new ElementSelector("configuration/appender/sift"), new SiftAction());rs.addRule(new ElementSelector("configuration/appender/sift/*"), new NOPAction());rs.addRule(new ElementSelector("configuration/logger"), new LoggerAction());rs.addRule(new ElementSelector("configuration/logger/level"), new LevelAction());rs.addRule(new ElementSelector("configuration/root"), new RootLoggerAction());rs.addRule(new ElementSelector("configuration/root/level"), new LevelAction());rs.addRule(new ElementSelector("configuration/logger/appender-ref"), new AppenderRefAction());rs.addRule(new ElementSelector("configuration/root/appender-ref"), new AppenderRefAction());rs.addRule(new ElementSelector("*/if"), new IfAction());rs.addRule(new ElementSelector("*/if/then"), new ThenAction());rs.addRule(new ElementSelector("*/if/then/*"), new NOPAction());rs.addRule(new ElementSelector("*/if/else"), new ElseAction());rs.addRule(new ElementSelector("*/if/else/*"), new NOPAction());if (PlatformInfo.hasJMXObjectName()) {rs.addRule(new ElementSelector("configuration/jmxConfigurator"), new JMXConfiguratorAction());}rs.addRule(new ElementSelector("configuration/include"), new IncludeAction());rs.addRule(new ElementSelector("configuration/consolePlugin"), new ConsolePluginAction());rs.addRule(new ElementSelector("configuration/receiver"), new ReceiverAction());
}
// 源码位置:ch.qos.logback.core.joran.JoranConfiguratorBase
protected void addInstanceRules(RuleStore rs) {rs.addRule(new ElementSelector("configuration/variable"), new PropertyAction());rs.addRule(new ElementSelector("configuration/property"), new PropertyAction());rs.addRule(new ElementSelector("configuration/substitutionProperty"), new PropertyAction());rs.addRule(new ElementSelector("configuration/timestamp"), new TimestampAction());rs.addRule(new ElementSelector("configuration/shutdownHook"), new ShutdownHookAction());rs.addRule(new ElementSelector("configuration/define"), new DefinePropertyAction());// the contextProperty pattern is deprecated. It is undocumented// and will be dropped in future versions of logbackrs.addRule(new ElementSelector("configuration/contextProperty"), new ContextPropertyAction());rs.addRule(new ElementSelector("configuration/conversionRule"), new ConversionRuleAction());rs.addRule(new ElementSelector("configuration/statusListener"), new StatusListenerAction());rs.addRule(new ElementSelector("configuration/appender"), new AppenderAction<E>());rs.addRule(new ElementSelector("configuration/appender/appender-ref"), new AppenderRefAction<E>());rs.addRule(new ElementSelector("configuration/newRule"), new NewRuleAction());rs.addRule(new ElementSelector("*/param"), new ParamAction(getBeanDescriptionCache()));
}
第二要准备的是记住文件当前的情况,这样才知道后面文件是否有更新,有更新才需要重新加载。
// 在初始化的时候,会调用doConfigure()接口,记住logback.xml的路径和文件的上次修改时间
// 源码位置:ch.qos.logback.core.joran.GenericConfigurator
public final void doConfigure(URL url) throws JoranException {InputStream in = null;try {// 1. 把信息记录到Context里informContextOfURLUsedForConfiguration(getContext(), url);URLConnection urlConnection = url.openConnection();urlConnection.setUseCaches(false);in = urlConnection.getInputStream();doConfigure(in, url.toExternalForm());}// 省略其它代码
}// 源码位置:ch.qos.logback.core.joran.GenericConfigurator
public static void informContextOfURLUsedForConfiguration(Context context, URL url) {// 2. 对Url设置WatchConfigurationWatchListUtil.setMainWatchURL(context, url);
}
// 源码位置:ch.qos.logback.core.joran.util.ConfigurationWatchListUtil
public static void setMainWatchURL(Context context, URL url) {// 3. 在Context中维护ConfigurationWatchList对象,先获取,如果没有则创建,保持单例ConfigurationWatchList cwl = getConfigurationWatchList(context);if (cwl == null) {cwl = new ConfigurationWatchList();cwl.setContext(context);context.putObject(CoreConstants.CONFIGURATION_WATCH_LIST, cwl);} else {cwl.clear();}// 4. 记住logback.xml的路径cwl.setMainURL(url);
}
// 源码位置:ch.qos.logback.core.joran.spi.ConfigurationWatchList
public void setMainURL(URL mainURL) {// 5. 记住logback.xml的路径this.mainURL = mainURL;if (mainURL != null)addAsFileToWatch(mainURL);
}
private void addAsFileToWatch(URL url) {File file = convertToFile(url);if (file != null) {// 6. 维护两个List:一个是要热更新的文件List(fileWatchList),另外一个是对应文件的上次修改时间fileWatchList.add(file); // 记住logback.xml文件lastModifiedList.add(file.lastModified()); // 记住logback.xml文件上次的修改时间}
}
2.2 配置文件热更新流程
热更新的流程主要是解析<configuration>节点中的scan属性和scanPeriod属性,前者是热更新的开关,开了才需要热更新,后者是扫描周期,按周期检查配置文件是否有修改,如果修改了就启动线程进行异步更新。
// 在加载logback.xml配置时,会调用EventPlayer来解析
// 源码位置:ch.qos.logback.core.joran.spi.EventPlayer
public void play(List<SaxEvent> aSaxEventList) {eventList = aSaxEventList;SaxEvent se;// eventList是包含logback.xml配置文件的所有节点,用Event来表示// Event大致有StartEvent、BodyEvent、EndEvent// 每个节点都有StartEvent和EndEvent,比如<property name="log.file.path" value="logs" />,<property>是StartEvent,</property>是EndEvent// BodyEvent则体现在节点内容上,比如 <Pattern>${log.msg.pattern}</Pattern>中的${log.msg.pattern}是BodyEvent// 顺序则和xml文件的顺序一致,比如第一个节点StartEvent在最前面,EndEvent则在最后面。for (currentIndex = 0; currentIndex < eventList.size(); currentIndex++) {se = eventList.get(currentIndex);if (se instanceof StartEvent) {// 1. 调用Interpreter的startElement()解析StartEventinterpreter.startElement((StartEvent) se);interpreter.getInterpretationContext().fireInPlay(se);}if (se instanceof BodyEvent) {interpreter.getInterpretationContext().fireInPlay(se);interpreter.characters((BodyEvent) se);}if (se instanceof EndEvent) {interpreter.getInterpretationContext().fireInPlay(se);interpreter.endElement((EndEvent) se);}}
}// 源码位置:ch.qos.logback.core.joran.spi.Interpreter
public void startElement(StartEvent se) {setDocumentLocator(se.getLocator());// 2. 主要解析的内容是namespace、节点名(localName, qName)、属性attributesstartElement(se.namespaceURI, se.localName, se.qName, se.attributes);
}
private void startElement(String namespaceURI, String localName, String qName, Attributes atts) {String tagName = getTagName(localName, qName);elementPath.push(tagName);if (skip != null) {pushEmptyActionList();return;}// 根据节点名称获取上面初始化的Action,elementPath是节点名称的串接,// 比如第一级节点是<configuration>,第二级节点是<property>,那么elementPath是configuration/propertyList<Action> applicableActionList = getApplicableActionList(elementPath, atts);if (applicableActionList != null) {actionListStack.add(applicableActionList);// 3. 执行Action解析节点信息callBeginAction(applicableActionList, tagName, atts);} else {pushEmptyActionList();String errMsg = "no applicable action for [" + tagName + "], current ElementPath is [" + elementPath + "]";cai.addError(errMsg);}
}
void callBeginAction(List<Action> applicableActionList, String tagName, Attributes atts) {if (applicableActionList == null) {return;}Iterator<Action> i = applicableActionList.iterator();while (i.hasNext()) {Action action = (Action) i.next();try {// 4. 遍历与此节点有关的每个Action,执行Action的统一接口begin()解析节点信息// 在此只关注ConfigurationAction的begin()方法,热更新的信息配置在这个节点上action.begin(interpretationContext, tagName, atts);} catch (ActionException e) {skip = elementPath.duplicate();cai.addError("ActionException in Action for tag [" + tagName + "]", e);} catch (RuntimeException e) {skip = elementPath.duplicate();cai.addError("RuntimeException in Action for tag [" + tagName + "]", e);}}
}// 源码位置:ch.qos.logback.classic.joran.action.ConfigurationAction
public void begin(InterpretationContext ic, String name, Attributes attributes) {this.threshold = System.currentTimeMillis();String debugAttrib = this.getSystemProperty("logback.debug");if (debugAttrib == null) {debugAttrib = ic.subst(attributes.getValue("debug"));}if (!OptionHelper.isEmpty(debugAttrib) && !debugAttrib.equalsIgnoreCase("false") && !debugAttrib.equalsIgnoreCase("null")) {StatusListenerConfigHelper.addOnConsoleListenerInstance(this.context, new OnConsoleStatusListener());} else {this.addInfo("debug attribute not set");}// 5. 处理<configuration>节点的scan属性this.processScanAttrib(ic, attributes);LoggerContext lc = (LoggerContext)this.context;boolean packagingData = OptionHelper.toBoolean(ic.subst(attributes.getValue("packagingData")), false);lc.setPackagingDataEnabled(packagingData);if (EnvUtil.isGroovyAvailable()) {ContextUtil contextUtil = new ContextUtil(this.context);contextUtil.addGroovyPackages(lc.getFrameworkPackages());}ic.pushObject(this.getContext());
}
void processScanAttrib(InterpretationContext ic, Attributes attributes) {// 6. 取出<configuration>节点scan属性值,如果配置了scan属性且值不为false,则认为需要热更新logback.xmlString scanAttrib = ic.subst(attributes.getValue("scan"));if (!OptionHelper.isEmpty(scanAttrib) && !"false".equalsIgnoreCase(scanAttrib)) {// 取出线程池ScheduledExecutorService scheduledExecutorService = this.context.getScheduledExecutorService();// 获取logback.xml的URL路径URL mainURL = ConfigurationWatchListUtil.getMainWatchURL(this.context);if (mainURL == null) {this.addWarn("Due to missing top level configuration file, reconfiguration on change (configuration file scanning) cannot be done.");return;}// 初始化Task,实现了Runnable接口,用作线程运行ReconfigureOnChangeTask rocTask = new ReconfigureOnChangeTask();rocTask.setContext(this.context);this.context.putObject("RECONFIGURE_ON_CHANGE_TASK", rocTask);// 7. 把扫描周期处理为DurationString scanPeriodAttrib = ic.subst(attributes.getValue("scanPeriod"));Duration duration = this.getDuration(scanAttrib, scanPeriodAttrib);if (duration == null) {return;}this.addInfo("Will scan for changes in [" + mainURL + "] ");this.addInfo("Setting ReconfigureOnChangeTask scanning period to " + duration);ScheduledFuture<?> scheduledFuture = scheduledExecutorService.scheduleAtFixedRate(rocTask, duration.getMilliseconds(), duration.getMilliseconds(), TimeUnit.MILLISECONDS);this.context.addScheduledFuture(scheduledFuture);}}// 源码位置:ch.qos.logback.classic.joran.action.ConfigurationAction
private Duration getDuration(String scanAttrib, String scanPeriodAttrib) {Duration duration = null;if (!OptionHelper.isEmpty(scanPeriodAttrib)) {try {// 8. 把扫描周期属性值处理为Duration对象duration = Duration.valueOf(scanPeriodAttrib);} catch (NumberFormatException var5) {this.addError("Error while converting [" + scanAttrib + "] to long", var5);}}return duration;
}// 源码位置:ch.qos.logback.core.util.Duration
private final static String DOUBLE_PART = "([0-9]*(.[0-9]+)?)"; // 匹配数字
private final static String UNIT_PART = "(|milli(second)?|second(e)?|minute|hour|day)s?"; // 匹配时间单位(毫秒、秒、分、时、天)
private static final Pattern DURATION_PATTERN = Pattern.compile(DOUBLE_PART + "\\s*" + UNIT_PART, Pattern.CASE_INSENSITIVE);
public static Duration valueOf(String durationStr) {// 9. 用正则表达式匹配扫描周期// 正则表达式主要是匹配两段,第一段是数字,第二段是时间单位(单位必须以s结尾),中间可以有空格Matcher matcher = DURATION_PATTERN.matcher(durationStr);if (matcher.matches()) {String doubleStr = matcher.group(DOUBLE_GROUP);String unitStr = matcher.group(UNIT_GROUP);double doubleValue = Double.valueOf(doubleStr);// 10. 根据不同的时间单位,都把时间转成毫秒数if (unitStr.equalsIgnoreCase("milli") || unitStr.equalsIgnoreCase("millisecond") || unitStr.length() == 0) {return buildByMilliseconds(doubleValue);} else if (unitStr.equalsIgnoreCase("second") || unitStr.equalsIgnoreCase("seconde")) {return buildBySeconds(doubleValue);} else if (unitStr.equalsIgnoreCase("minute")) {return buildByMinutes(doubleValue);} else if (unitStr.equalsIgnoreCase("hour")) {return buildByHours(doubleValue);} else if (unitStr.equalsIgnoreCase("day")) {return buildByDays(doubleValue);} else {throw new IllegalStateException("Unexpected " + unitStr);}} else {throw new IllegalArgumentException("String value [" + durationStr + "] is not in the expected format.");}
}
public static Duration buildBySeconds(double value) {// 11. 以时间单位是秒为例,用1000乘以数值得到毫秒数,SECONDS_COEFFICIENT = 1000return new Duration((long) (SECONDS_COEFFICIENT * value));
}// 回到ConfigurationAction,使用扫描周期
// 源码位置:ch.qos.logback.classic.joran.action.ConfigurationAction
void processScanAttrib(InterpretationContext ic, Attributes attributes) {// 6. 取出<configuration>节点scan属性值,如果配置了scan属性且值不为false,则认为需要热更新logback.xmlString scanAttrib = ic.subst(attributes.getValue("scan"));if (!OptionHelper.isEmpty(scanAttrib) && !"false".equalsIgnoreCase(scanAttrib)) {// 取出线程池ScheduledExecutorService scheduledExecutorService = this.context.getScheduledExecutorService();// 获取logback.xml的URL路径URL mainURL = ConfigurationWatchListUtil.getMainWatchURL(this.context);if (mainURL == null) {this.addWarn("Due to missing top level configuration file, reconfiguration on change (configuration file scanning) cannot be done.");return;}// 初始化Task,实现了Runnable接口,用作线程运行ReconfigureOnChangeTask rocTask = new ReconfigureOnChangeTask();rocTask.setContext(this.context);this.context.putObject("RECONFIGURE_ON_CHANGE_TASK", rocTask);// 7. 把扫描周期处理为DurationString scanPeriodAttrib = ic.subst(attributes.getValue("scanPeriod"));Duration duration = this.getDuration(scanAttrib, scanPeriodAttrib);if (duration == null) {return;}this.addInfo("Will scan for changes in [" + mainURL + "] ");this.addInfo("Setting ReconfigureOnChangeTask scanning period to " + duration);// 12. 用线程池按扫描周期进行定时触发ReconfigureOnChangeTaskScheduledFuture<?> scheduledFuture = scheduledExecutorService.scheduleAtFixedRate(rocTask, duration.getMilliseconds(), duration.getMilliseconds(), TimeUnit.MILLISECONDS);this.context.addScheduledFuture(scheduledFuture);}}// 源码位置:ch.qos.logback.classic.joran.ReconfigureOnChangeTask
public void run() {fireEnteredRunMethod();// 13. 获取要热更新的logback.xml文件路径ConfigurationWatchList configurationWatchList = ConfigurationWatchListUtil.getConfigurationWatchList(context);if (configurationWatchList == null) {addWarn("Empty ConfigurationWatchList in context");return;}List<File> filesToWatch = configurationWatchList.getCopyOfFileWatchList();if (filesToWatch == null || filesToWatch.isEmpty()) {addInfo("Empty watch file list. Disabling ");return;}// 14. 检查文件是否有更新,configurationWatchList为ConfigurationWatchListif (!configurationWatchList.changeDetected()) {return;}fireChangeDetected();URL mainConfigurationURL = configurationWatchList.getMainURL();addInfo(DETECTED_CHANGE_IN_CONFIGURATION_FILES);addInfo(CoreConstants.RESET_MSG_PREFIX + "named [" + context.getName() + "]");LoggerContext lc = (LoggerContext) context;if (mainConfigurationURL.toString().endsWith("xml")) {performXMLConfiguration(lc, mainConfigurationURL);} fireDoneReconfiguring();
}// 源码位置:ch.qos.logback.core.joran.spi.ConfigurationWatchList
public boolean changeDetected() {// 15. 检查文件是否有更新// 在初始化的时候,记录了两个List,一个是要热更新的文件List(fileWatchList),另外一个是对应文件的上次修改时间// 用两个List应该是为了支持多个日志配置文件,两个List的元素一一对应(等长)int len = fileWatchList.size();for (int i = 0; i < len; i++) {long lastModified = lastModifiedList.get(i);File file = fileWatchList.get(i);// 16. 从文件List取出文件并取到上次修改时间,从上次修改时间List取出对应文件的上次修改时间,如果两个时间不一致,说明文件更新了if (lastModified != file.lastModified()) {return true;}}return false;
}// 回到ReconfigureOnChangeTask的run(),继续处理定期更新
// 源码位置:ch.qos.logback.classic.joran.ReconfigureOnChangeTask
public void run() {fireEnteredRunMethod();// 13. 获取要热更新的logback.xml文件路径ConfigurationWatchList configurationWatchList = ConfigurationWatchListUtil.getConfigurationWatchList(context);if (configurationWatchList == null) {addWarn("Empty ConfigurationWatchList in context");return;}List<File> filesToWatch = configurationWatchList.getCopyOfFileWatchList();if (filesToWatch == null || filesToWatch.isEmpty()) {addInfo("Empty watch file list. Disabling ");return;}// 14. 检查文件是否有更新,configurationWatchList为ConfigurationWatchListif (!configurationWatchList.changeDetected()) {return;}fireChangeDetected();// 17. 取出logback.xml文件URLURL mainConfigurationURL = configurationWatchList.getMainURL();addInfo(DETECTED_CHANGE_IN_CONFIGURATION_FILES);addInfo(CoreConstants.RESET_MSG_PREFIX + "named [" + context.getName() + "]");LoggerContext lc = (LoggerContext) context;// 18. 调performXMLConfiguration()接口重新解析配置文件if (mainConfigurationURL.toString().endsWith("xml")) {performXMLConfiguration(lc, mainConfigurationURL);} fireDoneReconfiguring();
}// 源码位置:ch.qos.logback.classic.joran.ReconfigureOnChangeTask
private void performXMLConfiguration(LoggerContext lc, URL mainConfigurationURL) {JoranConfigurator jc = new JoranConfigurator();jc.setContext(context);StatusUtil statusUtil = new StatusUtil(context);List<SaxEvent> eventList = jc.recallSafeConfiguration();URL mainURL = ConfigurationWatchListUtil.getMainWatchURL(context);lc.reset();long threshold = System.currentTimeMillis();try {// 19. 调JoranConfigurator的doConfigure(),跟一开始的初始化一样// 一路看下来,热更新是通过比较配置文件的上次修改时间来确定配置文件是否有更新的,但并没有看到刷新上次修改时间的地方,// 这是因为doConfigure()流程会把这些值都重新初始化一遍,也就是当文件修改之后,重新初始化的时候会再次记住了上次修改时间,// 当定时线程再次去检查的时候,会拿到文件的新的上次修改时间,和初始化时取到的开始时间对比。jc.doConfigure(mainConfigurationURL);if (statusUtil.hasXMLParsingErrors(threshold)) {fallbackConfiguration(lc, eventList, mainURL);}} catch (JoranException e) {fallbackConfiguration(lc, eventList, mainURL);}
}
3 架构一小步
在logback.xml文件中配置热更新,扫描周期是30秒。如果需要改周期,注意要满足格式:数字+时间单位+s,单位支持:milli、millisecond、second、minute、hour、day,分别表示:毫秒、毫秒、秒、分钟、小时、天,数字和时间单位之间可以有空格,注意别漏掉了最后的s,这个s应该是英文复数的意思,即多秒、多分钟等。
<configuration scan="true" scanPeriod="30 seconds">