    conncheck: fix the component failed transition · b75ce5f3
    Fabrice Bellet authored
    This patch fixes the transition of a component from connecting to
    failed, which previously occurred through the propagation of the
    keep_timer_going variable and the final call to
    priv_update_check_list_failed_components() after the global agent
    timer was stopped.
    
    Previously, the code almost never entered the failed state, because
    the timer kept going as long as there were not enough nominated pairs
    and as long as there were valid pairs not yet nominated, even if all
    pair timers had expired.
    
    The definition of the Failed state of a conncheck list is somewhat
    contradictory in the spec, depending on whether you read:
    
     * sect 5.7.4. "Computing States",
     "Failed:  In this state, the ICE checks have not completed successfully
     for this media stream."
    
     or
    
     * sect 7.1.3.3. "Check List and Timer State Updates",
     "If all of the pairs in the check list are now either in the Failed or
     Succeeded state: If there is not a pair in the valid list for each
     component of the media stream, the state of the check list is set to
     Failed."
    
    Our understanding of the ICE spec is that the proper way to enter the
    failed state is instead when no conncheck list has in-progress pairs
    any longer: all pairs are in state succeeded, discovered, or failed,
    no timer is still running, and we have no hope that the conncheck list
    changes again, unless an unexpected STUN packet arrives later. The case
    where all pairs are in frozen state is special and is handled
    separately (sect 7.1.3.3).
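
The rule above can be sketched in a few lines of C. This is only an
illustration of the interpretation described in this message, not the code
from the patch: the PairState enum, the Pair struct, and both function names
are hypothetical and do not match libnice's internal types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pair states; illustrative names, not libnice's enums. */
typedef enum {
  PAIR_FROZEN,
  PAIR_WAITING,
  PAIR_IN_PROGRESS,
  PAIR_SUCCEEDED,
  PAIR_DISCOVERED,
  PAIR_FAILED
} PairState;

/* Hypothetical, simplified candidate pair. */
typedef struct {
  unsigned component_id;
  PairState state;
  bool valid;   /* pair is in the valid list */
} Pair;

/* True when every pair is succeeded, discovered, or failed, i.e. no
 * pair can change state again without an unexpected incoming packet.
 * Frozen and waiting pairs make the list "not settled"; the all-frozen
 * case is assumed to be handled elsewhere. */
static bool
check_list_is_settled (const Pair *pairs, size_t n)
{
  assert (pairs != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    if (pairs[i].state != PAIR_SUCCEEDED &&
        pairs[i].state != PAIR_DISCOVERED &&
        pairs[i].state != PAIR_FAILED)
      return false;
  }
  return true;
}

/* True when the list is settled and the component has no valid pair. */
static bool
component_should_fail (const Pair *pairs, size_t n, unsigned component_id)
{
  if (!check_list_is_settled (pairs, n))
    return false;
  for (size_t i = 0; i < n; i++) {
    if (pairs[i].component_id == component_id && pairs[i].valid)
      return false;
  }
  return true;
}
```

With this sketch, a component is reported as failed only once nothing is
left in progress, rather than as soon as one of its pairs fails.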
    
    A special grace delay is added before declaring a component in state
    Failed. This delay is not part of the RFC; it aims to limit the cases
    where a conncheck list is reactivated just after it has been declared
    failed, causing a user-visible transition from connecting to failed,
    and back from failed to connecting again. This is also required by the
    test suite, which counts exactly the number of times each state is
    entered, and doesn't expect these transient failed states to happen
    (frequent due to the nature of the test suite, less frequent in real
    life).
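
One way to picture such a grace delay is a small debounce helper. The sketch
below is an assumption about the general technique, not the mechanism used in
the patch: the FailGrace struct, fail_grace_update(), and the millisecond
clock are all hypothetical, and the actual delay value is not stated here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical debounce state for one component. */
typedef struct {
  bool pending;               /* failure condition currently observed */
  uint64_t pending_since_ms;  /* when it was first observed */
  uint64_t grace_ms;          /* how long it must persist */
} FailGrace;

/* Call on every timer tick with the current failure condition.
 * Returns true only once the condition has held for grace_ms without
 * interruption; any reactivation of the check list resets the delay,
 * so a short-lived failure is never signalled. */
static bool
fail_grace_update (FailGrace *g, bool failing, uint64_t now_ms)
{
  assert (g != NULL);
  if (!failing) {
    g->pending = false;
    return false;
  }
  if (!g->pending) {
    g->pending = true;
    g->pending_since_ms = now_ms;
  }
  return now_ms - g->pending_since_ms >= g->grace_ms;
}
```

The caller would signal the failed state only when this helper returns true,
which hides connecting to failed to connecting flips that are shorter than
the grace period.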
    
    Differential Revision: https://phabricator.freedesktop.org/D1111