Details
Description
SlowPeersReport is generated from the SampleStat between two dn, and it is exposed on nn's JMX like this:
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
In each period, MutableRollingAverages calls rollOverAvgs(), which generates a SumAndCount object based on SampleStat and stores it in a LinkedBlockingDeque<SumAndCount>. The deque is used to generate the SlowPeersReport, and its old members are not removed until the queue is full. However, if dn1 doesn't send any packet to dn2 during the last 36*300_000 ms, the deque fills up with copies of the same old member, because the last SampleStat never changes. I think these old SampleStats should be treated as expired and ignored when generating a new SlowPeersReport.
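To make the problem concrete, here is a minimal, self-contained sketch of the behavior described above. It is not the Hadoop code and not the actual patch: the class `StaleSampleSketch`, its methods, the `lastSampleTimeMs` field and the `validityMs` parameter are all hypothetical names chosen for illustration. It only assumes what the description states: each rollover re-enqueues the last stat when no new sample arrived, and the proposed fix is to skip entries that are too old when aggregating.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.OptionalDouble;

/** Hypothetical, simplified model of the rolling window; not Hadoop's implementation. */
class StaleSampleSketch {
    /** Snapshot of one period, stamped with the time of its most recent sample. */
    static final class SumAndCount {
        final double sum;
        final long count;
        final long lastSampleTimeMs; // assumption: a timestamp is kept per snapshot

        SumAndCount(double sum, long count, long lastSampleTimeMs) {
            this.sum = sum;
            this.count = count;
            this.lastSampleTimeMs = lastSampleTimeMs;
        }
    }

    private final Deque<SumAndCount> window = new ArrayDeque<>();
    private final int capacity;
    private double intervalSum;
    private long intervalCount;
    private long lastSampleTimeMs;
    private SumAndCount last; // last non-empty snapshot, reused when nothing changed

    StaleSampleSketch(int capacity) {
        this.capacity = capacity;
    }

    void addSample(double latencyMs, long nowMs) {
        intervalSum += latencyMs;
        intervalCount++;
        lastSampleTimeMs = nowMs;
    }

    /** End of a period: enqueue the newest stat, or the old one again if no sample arrived. */
    void rollOver() {
        if (intervalCount > 0) {
            last = new SumAndCount(intervalSum, intervalCount, lastSampleTimeMs);
            intervalSum = 0;
            intervalCount = 0;
        }
        if (last == null) {
            return; // nothing has ever been recorded
        }
        if (window.size() == capacity) {
            window.removeFirst(); // old members leave only when the queue is full
        }
        window.addLast(last);
    }

    int size() {
        return window.size();
    }

    /** Average over the window, ignoring snapshots older than validityMs. */
    OptionalDouble average(long nowMs, long validityMs) {
        double totalSum = 0;
        long totalCount = 0;
        for (SumAndCount sc : window) {
            if (nowMs - sc.lastSampleTimeMs > validityMs) {
                continue; // expired: dn1 has not talked to dn2 recently
            }
            totalSum += sc.sum;
            totalCount += sc.count;
        }
        return totalCount == 0
            ? OptionalDouble.empty()
            : OptionalDouble.of(totalSum / totalCount);
    }
}
```

With a capacity of 36 and a single slow sample followed by 36 idle rollovers, the window in this sketch holds 36 copies of the same old snapshot. Aggregating without an expiry check still reports the old latency, while passing a finite `validityMs` yields an empty result, so dn2 would no longer be reported as slow on stale data.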
Attachments
Issue Links
- relates to
  - HADOOP-17495 Backport HADOOP-16947 "Stale record should be remove when MutableRollingAverages generating aggregate data." to branch 2.10 (Resolved)
  - YARN-10217 Expired SampleStat should ignore when generating SlowPeersReport (Patch Available)