24 Dec, 2019

    • of: Rework and simplify phandle cache to use a fixed size · 90dc0d1c
      Rob Herring authored
      The phandle cache was added to speed up of_find_node_by_phandle() by
      avoiding walking the whole DT to find a matching phandle. The
      implementation has several shortcomings:
      
        - The cache is designed to work on a linear set of phandle values.
          This is true for dtc-generated DTs, but not for other cases such
          as Power.
        - The cache isn't enabled until of_core_init() and a typical system
          may see hundreds of calls to of_find_node_by_phandle() before that
          point.
        - The cache is freed and re-allocated when the number of phandles
          changes.
        - It takes a raw spinlock around a memory allocation which breaks on
          RT.
      
      Change the implementation to a fixed size and use hash_32() as the
      cache index. This greatly simplifies the implementation: it avoids
      any re-allocation of the cache and any need to take a reference on
      nodes in the cache. The only place cache entries are removed is
      of_detach_node().
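
      As a minimal sketch (abbreviated; devtree_lock, for_each_of_allnodes()
      and the OF_DETACHED flag are the existing drivers/of machinery, and the
      table size matches the 128 entries discussed below), the lookup path
      becomes:

        #include <linux/hash.h>
        #include <linux/of.h>

        #define OF_PHANDLE_CACHE_BITS   7
        #define OF_PHANDLE_CACHE_SZ     BIT(OF_PHANDLE_CACHE_BITS)  /* 128 */

        /* Fixed-size table: no allocation, no per-entry node reference. */
        static struct device_node *phandle_cache[OF_PHANDLE_CACHE_SZ];

        struct device_node *of_find_node_by_phandle(phandle handle)
        {
                struct device_node *np = NULL;
                unsigned long flags;
                u32 idx;

                if (!handle)
                        return NULL;

                idx = hash_32(handle, OF_PHANDLE_CACHE_BITS);

                raw_spin_lock_irqsave(&devtree_lock, flags);

                /* Hit only if the cached node's phandle still matches. */
                if (phandle_cache[idx] && handle == phandle_cache[idx]->phandle)
                        np = phandle_cache[idx];

                if (!np) {
                        /* Miss: walk all nodes, then remember the result. */
                        for_each_of_allnodes(np)
                                if (np->phandle == handle &&
                                    !of_node_check_flag(np, OF_DETACHED)) {
                                        phandle_cache[idx] = np;
                                        break;
                                }
                }

                of_node_get(np);        /* of_node_get() tolerates NULL */
                raw_spin_unlock_irqrestore(&devtree_lock, flags);
                return np;
        }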
      
      Using hash_32() removes any assumption about phandle values, improving
      the hit rate for non-linear phandle values. For linear values,
      hash_32() yields roughly a 10% collision rate. The chance of thrashing
      on colliding values seems low.
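
      To illustrate, a small userspace program can estimate the linear-value
      collision rate. The constant and shift mirror hash_32() in
      include/linux/hash.h; the phandle count N is an assumption, and the
      rate varies with it:

        /* Userspace estimate of hash_32() collisions for linear phandles. */
        #include <stdio.h>
        #include <stdint.h>

        #define GOLDEN_RATIO_32 0x61C88647u     /* from include/linux/hash.h */
        #define CACHE_BITS 7                    /* 128-entry cache */
        #define CACHE_SZ (1u << CACHE_BITS)

        static uint32_t hash_32(uint32_t val, unsigned int bits)
        {
                return (val * GOLDEN_RATIO_32) >> (32 - bits);
        }

        int main(void)
        {
                /* Assumption: dtc assigns phandles linearly as 1..N. */
                unsigned int N = CACHE_SZ;
                unsigned int used[CACHE_SZ] = { 0 };
                unsigned int collisions = 0;

                for (uint32_t ph = 1; ph <= N; ph++)
                        if (used[hash_32(ph, CACHE_BITS)]++)
                                collisions++;

                printf("%u linear phandles -> %u collisions (%.1f%%)\n",
                       N, collisions, 100.0 * collisions / N);
                return 0;
        }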
      
      To compare performance, I used an RK3399 board, which is a pretty
      typical system. I found that just measuring boot time, as done
      previously, is noisy and may be impacted by other things. Bringing up
      secondary cores also interferes with measuring, so I booted with
      'nr_cpus=1'. With no caching, calls to of_find_node_by_phandle() take
      about 20124 us for 1248 calls. There are an additional 288 calls before
      timekeeping is up. Using the average time per hit/miss with the cache,
      those 288 calls work out to 690 us (277 hits / 11 misses) with a
      128-entry cache, and 13319 us with no cache or an uninitialized cache.
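
      For reference, a hypothetical way to collect such per-call timings (not
      part of this patch; ktime_get() is only usable once timekeeping is up,
      which is why the first 288 calls need the average-based estimate):

        #include <linux/ktime.h>
        #include <linux/of.h>

        static u64 phandle_lookup_ns;
        static unsigned int phandle_lookup_calls;

        /* Wrapper that accumulates total time spent in lookups. */
        static struct device_node *timed_find_node_by_phandle(phandle handle)
        {
                ktime_t start = ktime_get();
                struct device_node *np = of_find_node_by_phandle(handle);

                phandle_lookup_ns += ktime_to_ns(ktime_sub(ktime_get(), start));
                phandle_lookup_calls++;
                return np;
        }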
      
      Comparing the three implementations, the time spent in
      of_find_node_by_phandle() is (the parenthesized '+' figures are the
      estimated cost of the 288 calls made before timekeeping is up):
      
      no cache:        20124 us (+ 13319 us)
      128 entry cache:  5134 us (+ 690 us)
      current cache:     819 us (+ 13319 us)
      
      We could move the allocation of the cache earlier to improve the
      current cache, but that just further complicates things: the allocation
      has to happen after slab is up, so we can't do it while unflattening
      the DT (which uses memblock).
      Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: Frank Rowand <frowand.list@gmail.com>
      Tested-by: Frank Rowand <frowand.list@gmail.com>
      Signed-off-by: Rob Herring <robh@kernel.org>