    drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX · c358ebf5
    Lyude Paul authored
    While I had thought I had fixed this issue in:
    
    commit 342406e4 ("drm/nouveau/i2c: Disable i2c bus access after
    ->fini()")
    
    It turns out that while I did fix the error messages I was seeing on my
    P50 when trying to access i2c busses with the GPU in runtime suspend, I
    accidentally had missed one important detail that was mentioned on the
    bug report this commit was supposed to fix: that the CPU would only lock
    up when trying to access i2c busses _on connected devices_ _while the
    GPU is not in runtime suspend_. Whoops. That definitely explains why I
    was not able to get my machine to hang with i2c bus interactions until
    now, as plugging my P50 into its dock with an HDMI monitor connected
    allowed me to finally reproduce this locally.
    
    Now that I have managed to reproduce this issue properly, it looks like
    the problem is much simpler than it looks. It turns out that some
    connected devices, such as MST laptop docks, will actually ACK i2c reads
    even if no data was actually read:
    
    [  275.063043] nouveau 0000:01:00.0: i2c: aux 000a: 1: 0000004c 1
    [  275.063447] nouveau 0000:01:00.0: i2c: aux 000a: 00 01101000 10040000
    [  275.063759] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000001
    [  275.064024] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
    [  275.064285] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
    [  275.064594] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
    
    Because we don't handle the situation of i2c ack without any data, we
    end up entering an infinite loop in nvkm_i2c_aux_i2c_xfer() since the
    value of cnt always remains at 0. This finally properly explains how
    this could result in a CPU hang like the ones observed in the
    aforementioned commit.
    
    So, fix this by retrying transactions if no data is written or received,
    and give up and fail the transaction if we continue to not write or
    receive any data after 32 retries.
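
The bounded retry described above can be sketched in plain userspace C. This is only an illustration of the idea, not the actual patch: `bounded_read()`, `xfer_fn`, `fake_xfer()`, `struct fake_dev`, the 16-byte chunk size, and the choice of `-EIO` as the failure code are all assumptions made for this example. Only the limit of 32 retries comes from the text above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EMPTY_RETRIES 32 /* retry limit named in the commit message */
#define MAX_CHUNK 16         /* assumed per-transaction chunk size */

/* Hypothetical transport callback: returns <0 on error; on success it
 * stores the number of bytes actually transferred in *cnt (may be 0). */
typedef int (*xfer_fn)(void *ctx, uint8_t *buf, uint8_t *cnt);

/* Read len bytes, giving up if a chunk makes no progress after
 * MAX_EMPTY_RETRIES attempts instead of spinning forever. */
static int bounded_read(xfer_fn xfer, void *ctx, uint8_t *buf, size_t len)
{
    size_t remaining = len;

    while (remaining) {
        uint8_t cnt = 0;
        int retries;

        /* Retry only while the device ACKs without moving any data. */
        for (retries = 0; retries < MAX_EMPTY_RETRIES && !cnt; retries++) {
            int ret;

            cnt = remaining < MAX_CHUNK ? (uint8_t)remaining : MAX_CHUNK;
            ret = xfer(ctx, buf, &cnt);
            if (ret < 0)
                return ret; /* hard errors are reported immediately */
        }
        if (!cnt)
            return -EIO; /* assumed error code for "no data after retries" */

        buf += cnt;
        remaining -= cnt;
    }
    return 0;
}

/* Stub device for illustration: ACKs every read, but delivers no data
 * for the first empty_acks calls. */
struct fake_dev {
    int empty_acks;
    int calls;
};

static int fake_xfer(void *ctx, uint8_t *buf, uint8_t *cnt)
{
    struct fake_dev *dev = ctx;
    uint8_t i;

    dev->calls++;
    if (dev->empty_acks > 0) {
        dev->empty_acks--;
        *cnt = 0; /* ACK with zero bytes, as in the log above */
        return 0;
    }
    for (i = 0; i < *cnt; i++)
        buf[i] = 0xA5;
    return 0;
}
```

With a stub that never delivers data, `bounded_read()` returns an error after 32 attempts rather than hanging. With a stub that only ACKs empty a few times, the read completes once data arrives.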
    
    Signed-off-by: Lyude Paul <lyude@redhat.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>