Hadoop HDFS / HDFS-16689

Standby NameNode crashes when transitioning to Active with in-progress tailer


Details

    • Reviewed

    Description

      The Standby NameNode crashes when transitioning to Active with an in-progress tailer. The error message looks like this:

      Caused by: java.lang.IllegalStateException: Cannot start writing at txid X when there is a stream available for read: ByteStringEditLog[X, Y], ByteStringEditLog[X, 0]
      	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:344)
      	at org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.openForWrite(FSEditLogAsync.java:113)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1423)
      	at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:2132)
      	... 36 more
      

      After tracing, I found a critical bug in EditLogTailer#catchupDuringFailover() when DFS_HA_TAILEDITS_INPROGRESS_KEY is true. catchupDuringFailover() tries to replay all missed edits from the JournalNodes with onlyDurableTxns=true, so it may fail to replay any edits when some JournalNodes are abnormal.

      Reproduce method, suppose:

      • There are 2 NameNodes, NN0 and NN1, whose states are Active and Standby respectively, and 3 JournalNodes, JN0, JN1 and JN2.
      • NN0 tries to sync 3 edits to the JNs starting at txid 3, but only syncs them successfully to JN1 and JN2. JN0 is abnormal at that moment (for example GC, bad network, or a restart).
      • NN1's lastAppliedTxId is 2, and at this moment we try to fail over the Active role from NN0 to NN1.
      • When NN1 selects input streams with fromTxnId=3 and onlyDurableTxns=true, it only gets two responses, from JN0 and JN1, with txn counts of 0 and 3 respectively. JN2 is abnormal at that moment (for example GC, bad network, or a restart).
      • NN1 cannot replay any edits with fromTxnId=3 from the JournalNodes, because maxAllowedTxns is 0.

      So I think the Standby NameNode should run catchupDuringFailover() with onlyDurableTxns=false, so that it can replay all missed edits from the JournalNodes.
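
The scenario above can be worked through with a small, self-contained sketch. It is a simplified model, not the Hadoop code: the class `DurableTxnSketch` and the helper `maxAllowedTxns` are hypothetical names. The quorum rule it applies is an assumption about how onlyDurableTxns limits replay: with onlyDurableTxns=true, only the txn count that a majority of all JournalNodes has confirmed is allowed, and otherwise the highest reported count is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Simplified model of the reproduction scenario; not the actual Hadoop implementation. */
public class DurableTxnSketch {

  /**
   * Hypothetical helper (assumed rule, see the lead-in above).
   *
   * @param responseCounts  txn counts reported by the JournalNodes that answered
   * @param majoritySize    quorum size, e.g. 2 for 3 JournalNodes
   * @param onlyDurableTxns if true, allow only txns confirmed by a majority
   * @return the number of txns the Standby may replay starting at fromTxnId
   */
  public static int maxAllowedTxns(List<Integer> responseCounts,
                                   int majoritySize,
                                   boolean onlyDurableTxns) {
    if (responseCounts.size() < majoritySize) {
      throw new IllegalArgumentException("fewer responses than a majority");
    }
    List<Integer> sorted = new ArrayList<>(responseCounts);
    Collections.sort(sorted);
    if (!onlyDurableTxns) {
      // Non-durable mode: trust the highest count any responder reported.
      return sorted.get(sorted.size() - 1);
    }
    // Durable mode: the largest count that at least majoritySize responders reached.
    return sorted.get(sorted.size() - majoritySize);
  }

  public static void main(String[] args) {
    // JN0 answered with 0 txns, JN1 with 3 txns, JN2 did not answer.
    List<Integer> counts = Arrays.asList(0, 3);
    int majority = 2; // majority of 3 JournalNodes

    System.out.println("onlyDurableTxns=true  -> "
        + maxAllowedTxns(counts, majority, true));
    System.out.println("onlyDurableTxns=false -> "
        + maxAllowedTxns(counts, majority, false));
  }
}
```

Under these assumptions the durable case yields 0, so nothing from txid 3 onward is replayed, while the non-durable case yields 3, which covers the edits NN0 wrote to JN1 and JN2.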


          People

            xuzq_zander ZanderXu

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 50m
