While debugging my Elasticsearch instance, I noticed a curious issue: logs would vanish consistently at midnight. No logs appeared between 23:40:00 and 00:00:05, leaving an unexplained gap. This guide walks through the debugging process, root cause identification, and a simple fix.
Initial Investigation: Where Did the Logs Go?
At first glance, the following possibilities seemed likely:
- Log Rotation: Elasticsearch rotates its logs at midnight. Could this process be causing the missing lines?
- Marvel Indices: Marvel creates daily indices at midnight. Could this interfere with log generation?
Neither explained the issue upon closer inspection, so I dug deeper.
The Real Culprit: Log4j and DailyRollingFileAppender
The issue turned out to be related to Log4j. Elasticsearch uses Log4j for logging, but instead of a traditional log4j.properties file it reads a YAML configuration (logging.yml) that it translates into Log4j settings at startup. Reviewing that configuration revealed the culprit: DailyRollingFileAppender.
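For reference, the offending block in my logging.yml looked roughly like the sketch below; the exact paths, date pattern, and layout may differ in your installation, so treat it as illustrative rather than a copy of the stock file:

file:
  type: dailyRollingFile
  file: ${path.logs}/${cluster.name}.log
  # rolled once per day at midnight, driven by this date pattern
  datePattern: "'.'yyyy-MM-dd"
  layout:
    type: pattern
    conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"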
What’s Wrong with DailyRollingFileAppender?
The DailyRollingFileAppender class extends Log4j’s FileAppender and rolls the underlying log file at a user-chosen frequency (controlled by its datePattern). Apache’s own documentation warns that the class has known problems:
- Data Loss: Events written while the file is being rolled may never reach disk.
- Synchronization Issues: Concurrent writes during the rollover can leave gaps between the old and new files.
This behavior is well-documented in the Apache DailyRollingFileAppender documentation.
Root Cause: Why Were Logs Missing?
The missing logs were a direct result of using DailyRollingFileAppender, which failed to properly handle log rotation at midnight. This caused gaps in logging during the critical period when the file was being rolled over.
The Fix: Switch to RollingFileAppender
To resolve this, I replaced DailyRollingFileAppender with RollingFileAppender, which rolls logs based on file size rather than a specific time. This eliminates the synchronization issues associated with the daily rolling behavior.
Updated YAML Configuration
Here’s how I updated the configuration:
file:
  type: rollingFile
  file: ${path.logs}/${cluster.name}.log
  maxFileSize: 100MB
  maxBackupIndex: 10
  layout:
    type: pattern
    conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"
Key Changes:
- Type: Changed from dailyRollingFile to rollingFile.
- File Size Limit: Set maxFileSize to 100MB.
- Backup: Retain up to 10 backup log files.
- Removed Date Pattern: Eliminated the problematic datePattern field used by DailyRollingFileAppender.
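For completeness, here is a minimal sketch of where this block sits in config/logging.yml, assuming the stock Elasticsearch 1.x layout in which the file appender hangs off the top-level appender key and is referenced by the root logger; adjust names and log levels to your own install:

es.logger.level: INFO
rootLogger: ${es.logger.level}, console, file

appender:
  # size-based rolling replaces the date-based dailyRollingFile appender
  file:
    type: rollingFile
    file: ${path.logs}/${cluster.name}.log
    maxFileSize: 100MB
    maxBackupIndex: 10
    layout:
      type: pattern
      conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"

Depending on your Elasticsearch version, you may need to restart the node for the logging change to take effect.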
Happy Ending: Logs Restored
After implementing the fix, Elasticsearch logs stopped disappearing. Interestingly, further investigation revealed that the midnight gap was also related to Marvel rolling over to a new day, which added brief latency while the new daily indices, with their shards and replicas, were created.
Lessons Learned
- Understand Your Tools: Familiarity with Log4j’s appenders helped identify the issue quickly.
- Avoid Deprecated Features: DailyRollingFileAppender is prone to issues—switch to RollingFileAppender for modern setups.
- Analyze Related Systems: The Marvel index creation provided additional context for the midnight timing.
Conclusion
Debugging missing Elasticsearch logs required diving into the logging configuration and understanding how appenders handle file rolling. By switching to RollingFileAppender, I resolved the synchronization issues and restored the missing logs.
If you’re experiencing similar issues, check your logging configuration and avoid using DailyRollingFileAppender in favor of RollingFileAppender. This can save hours of debugging in the future.
For more insights, explore Log4j Appender Documentation.
Also, to learn how to clean data coming into Elasticsearch, see Cleaning Elasticsearch Data Before Indexing.
