- May 07, 2021
-
-
Yury Norov authored
Sync implementation with the kernel and move the macro from tools/include/linux/bitmap.h to tools/include/asm-generic/bitsperlong.h Link: https://lkml.kernel.org/r/20210401003153.97325-7-yury.norov@gmail.com Signed-off-by:
Yury Norov <yury.norov@gmail.com> Acked-by:
Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Sterba <dsterba@suse.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jianpeng Ma <jianpeng.ma@intel.com> Cc: Joe Perches <joe@perches.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Yury Norov authored
Kernel version generates better code. Link: https://lkml.kernel.org/r/20210401003153.97325-4-yury.norov@gmail.com Signed-off-by:
Yury Norov <yury.norov@gmail.com> Acked-by:
Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Sterba <dsterba@suse.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jianpeng Ma <jianpeng.ma@intel.com> Cc: Joe Perches <joe@perches.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Yury Norov authored
Some functions in tools/include/linux/bitmap.h declare nbits as int. In the kernel nbits is declared as unsigned int. Link: https://lkml.kernel.org/r/20210401003153.97325-3-yury.norov@gmail.com Signed-off-by:
Yury Norov <yury.norov@gmail.com> Acked-by:
Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Sterba <dsterba@suse.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jianpeng Ma <jianpeng.ma@intel.com> Cc: Joe Perches <joe@perches.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Yury Norov authored
Patch series "lib/find_bit: fast path for small bitmaps", v6. Bitmap operations are much simpler and faster in case of small bitmaps which fit into a single word. In linux/bitmap.c we have a machinery that allows compiler to replace actual function call with a few instructions if bitmaps passed into the function are small and their size is known at compile time. find_*_bit() API lacks this functionality; but users will benefit from it a lot. One important example is cpumask subsystem when NR_CPUS <= BITS_PER_LONG. This patch (of 12): GENMASK(h, l) may be passed with unsigned types. In such case, type-limits warning is generated for example in case of GENMASK(h, 0). Link: https://lkml.kernel.org/r/20210401003153.97325-1-yury.norov@gmail.com Link: https://lkml.kernel.org/r/20210401003153.97325-2-yury.norov@gmail.com Signed-off-by:
Yury Norov <yury.norov@gmail.com> Acked-by:
Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Sterba <dsterba@suse.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jianpeng Ma <jianpeng.ma@intel.com> Cc: Joe Perches <joe@perches.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Alexey Dobriyan authored
Test that /proc instance mounted with mount -t proc -o subset=pid contains only ".", "..", "self", "thread-self" and pid directories. Note: Currently "subset=pid" doesn't return "." and ".." via readdir. This must be a bug. Link: https://lkml.kernel.org/r/YFYZZ7WGaZlsnChS@localhost.localdomain Signed-off-by:
Alexey Dobriyan <adobriyan@gmail.com> Acked-by:
Alexey Gladkov <gladkov.alexey@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Alexey Dobriyan authored
Now that proc_ops are separate from file_operations and other operations it easy to check all instances to have ->proc_lseek hook and remove check in main code. Note: nonseekable_open() files naturally don't require ->proc_lseek. Garbage collect pde_lseek() function. [adobriyan@gmail.com: smoke test lseek()] Link: https://lkml.kernel.org/r/YG4OIhChOrVTPgdN@localhost.localdomain Link: https://lkml.kernel.org/r/YFYX0Bzwxlc7aBa/@localhost.localdomain Signed-off-by:
Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- May 05, 2021
-
-
Pavel Tatashin authored
When pages are pinned they can be faulted in userland and migrated, and they can be faulted right in kernel without migration. In either case, the pinned pages must end-up being pinnable (not movable). Add a new test to gup_test, to help verify that the gup/pup (get_user_pages() / pin_user_pages()) behavior with respect to pinnable and movable pages is reasonable and correct. Specifically, provide a way to: 1) Verify that only "pinnable" pages are pinned. This is checked automatically for you. 2) Verify that gup/pup performance is reasonable. This requires comparing benchmarks between doing gup/pup on pages that have been pre-faulted in from user space, vs. doing gup/pup on pages that are not faulted in until gup/pup time (via FOLL_TOUCH). This decision is controlled with the new -z command line option. Link: https://lkml.kernel.org/r/20210215161349.246722-15-pasha.tatashin@soleen.com Signed-off-by:
Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by:
John Hubbard <jhubbard@nvidia.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sashal@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Pavel Tatashin authored
In gup_test both gup_flags and test_flags use the same flags field. This is broken. Farther, in the actual gup_test.c all the passed gup_flags are erased and unconditionally replaced with FOLL_WRITE. Which means that test_flags are ignored, and code like this always performs pin dump test: 155 if (gup->flags & GUP_TEST_FLAG_DUMP_PAGES_USE_PIN) 156 nr = pin_user_pages(addr, nr, gup->flags, 157 pages + i, NULL); 158 else 159 nr = get_user_pages(addr, nr, gup->flags, 160 pages + i, NULL); 161 break; Add a new test_flags field, to allow raw gup_flags to work. Add a new subcommand for DUMP_USER_PAGES_TEST to specify that pin test should be performed. Remove unconditional overwriting of gup_flags via FOLL_WRITE. But, preserve the previous behaviour where FOLL_WRITE was the default flag, and add a new option "-W" to unset FOLL_WRITE. Rename flags with gup_flags. With the fix, dump works like this: root@virtme:/# gup_test -c ---- page #0, starting from user virt addr: 0x7f8acb9e4000 page:00000000d3d2ee27 refcount:2 mapcount:1 mapping:0000000000000000 index:0x0 pfn:0x100bcf anon flags: 0x300000000080016(referenced|uptodate|lru|swapbacked) raw: 0300000000080016 ffffd0e204021608 ffffd0e208df2e88 ffff8ea04243ec61 raw: 0000000000000000 0000000000000000 0000000200000000 0000000000000000 page dumped because: gup_test: dump_pages() test DUMP_USER_PAGES_TEST: done root@virtme:/# gup_test -c -p ---- page #0, starting from user virt addr: 0x7fd19701b000 page:00000000baed3c7d refcount:1025 mapcount:1 mapping:0000000000000000 index:0x0 pfn:0x108008 anon flags: 0x300000000080014(uptodate|lru|swapbacked) raw: 0300000000080014 ffffd0e204200188 ffffd0e205e09088 ffff8ea04243ee71 raw: 0000000000000000 0000000000000000 0000040100000000 0000000000000000 page dumped because: gup_test: dump_pages() test DUMP_USER_PAGES_TEST: done Refcount shows the difference between pin vs no-pin case. Also change type of nr from int to long, as it counts number of pages. Link: https://lkml.kernel.org/r/20210215161349.246722-14-pasha.tatashin@soleen.com Signed-off-by:
Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by:
John Hubbard <jhubbard@nvidia.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sashal@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Axel Rasmussen authored
Fix a dormant bug in userfaultfd_events_test(), where we did `return faulting_process(0)` instead of `exit(faulting_process(0))`. This caused the forked process to keep running, trying to execute any further test cases after the events test in parallel with the "real" process. Add a simple test case which exercises minor faults. In short, it does the following: 1. "Sets up" an area (area_dst) and a second shared mapping to the same underlying pages (area_dst_alias). 2. Register one of these areas with userfaultfd, in minor fault mode. 3. Start a second thread to handle any minor faults. 4. Populate the underlying pages with the non-UFFD-registered side of the mapping. Basically, memset() each page with some arbitrary contents. 5. Then, using the UFFD-registered mapping, read all of the page contents, asserting that the contents match expectations (we expect the minor fault handling thread can modify the page contents before resolving the fault). The minor fault handling thread, upon receiving an event, flips all the bits (~) in that page, just to prove that it can modify it in some arbitrary way. Then it issues a UFFDIO_CONTINUE ioctl, to setup the mapping and resolve the fault. The reading thread should wake up and see this modification. Currently the minor fault test is only enabled in hugetlb_shared mode, as this is the only configuration the kernel feature supports. Link: https://lkml.kernel.org/r/20210301222728.176417-7-axelrasmussen@google.com Signed-off-by:
Axel Rasmussen <axelrasmussen@google.com> Reviewed-by:
Peter Xu <peterx@redhat.com> Cc: Adam Ruprecht <ruprecht@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Cannon Matthews <cannonmatthews@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chinwen Chang <chinwen.chang@mediatek.com> Cc: David Rientjes <rientjes@google.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michal Koutn" <mkoutny@suse.com> Cc: Michel Lespinasse <walken@google.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oliver Upton <oupton@google.com> Cc: Shaohua Li <shli@fb.com> Cc: Shawn Anastasio <shawn@anastas.io> Cc: Steven Price <steven.price@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Zi Yan authored
Further extend <debugfs>/split_huge_pages to accept "<path>,<pgoff_start>,<pgoff_end>" for file-backed THP split tests since tmpfs may have file backed by THP that mapped nowhere. Update selftest program to test file-backed THP split too. Link: https://lkml.kernel.org/r/20210331235309.332292-2-zi.yan@sent.com Signed-off-by:
Zi Yan <ziy@nvidia.com> Suggested-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by:
Yang Shi <shy828301@gmail.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mika Penttila <mika.penttila@nextfour.com> Cc: David Rientjes <rientjes@google.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Zi Yan authored
We did not have a direct user interface of splitting the compound page backing a THP and there is no need unless we want to expose the THP implementation details to users. Make <debugfs>/split_huge_pages accept a new command to do that. By writing "<pid>,<vaddr_start>,<vaddr_end>" to <debugfs>/split_huge_pages, THPs within the given virtual address range from the process with the given pid are split. It is used to test split_huge_page function. In addition, a selftest program is added to tools/testing/selftests/vm to utilize the interface by splitting PMD THPs and PTE-mapped THPs. This does not change the old behavior, i.e., writing 1 to the interface to split all THPs in the system. Link: https://lkml.kernel.org/r/20210331235309.332292-1-zi.yan@sent.com Signed-off-by:
Zi Yan <ziy@nvidia.com> Reviewed-by:
Yang Shi <shy828301@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mika Penttila <mika.penttila@nextfour.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Apr 30, 2021
-
-
Uladzislau Rezki (Sony) authored
A 'single_cpu_test' parameter is odd and it does not exist anymore. Instead there was introduced a 'nr_threads' one. If it is not set it behaves as the former parameter. That is why update a "stress mode" according to this change specifying number of workers which are equal to number of CPUs. Also update an output of help message based on a new interface. Link: https://lkml.kernel.org/r/20210402202237.20334-3-urezki@gmail.com Signed-off-by:
Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Brian Geffon authored
This test extends the current mremap tests to validate that the MREMAP_DONTUNMAP operation can be performed on shmem mappings. Link: https://lkml.kernel.org/r/20210323182520.2712101-3-bgeffon@google.com Signed-off-by:
Brian Geffon <bgeffon@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Peter Xu <peterx@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: "Michael S . Tsirkin" <mst@redhat.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Sonny Rao <sonnyrao@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Dmitry Safonov <dima@arista.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Alejandro Colomar <alx.manpages@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Johannes Weiner authored
With memcg having switched to rstat, memory.stat output is precise. Update the cgroup selftest to reflect the expectations and error tolerances of the new implementation. Also add newly tracked types of memory to the memory.stat side of the equation, since they're included in memory.current and could throw false positives. Link: https://lkml.kernel.org/r/20210209163304.77088-9-hannes@cmpxchg.org Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Reviewed-by:
Shakeel Butt <shakeelb@google.com> Reviewed-by:
Michal Koutný <mkoutny@suse.com> Acked-by:
Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Apr 27, 2021
-
-
Pedro Tammela authored
Follows the same logic as the hashtable tests. Signed-off-by:
Pedro Tammela <pctammela@mojatatu.com> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210424214510.806627-3-pctammela@mojatatu.com
-
Daniel Borkmann authored
Similarly as b0270958 ("bpf: Fix propagation of 32-bit signed bounds from 64-bit bounds."), we also need to fix the propagation of 32 bit unsigned bounds from 64 bit counterparts. That is, really only set the u32_{min,max}_value when /both/ {umin,umax}_value safely fit in 32 bit space. For example, the register with a umin_value == 1 does /not/ imply that u32_min_value is also equal to 1, since umax_value could be much larger than 32 bit subregister can hold, and thus u32_min_value is in the interval [0,1] instead. Before fix, invalid tracking result of R2_w=inv1: [...] 5: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0) R10=fp0 5: (35) if r2 >= 0x1 goto pc+1 [...] // goto path 7: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umin_value=1) R10=fp0 7: (b6) if w2 <= 0x1 goto pc+1 [...] // goto path 9: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,smin_value=-9223372036854775807,smax_value=9223372032559808513,umin_value=1,umax_value=18446744069414584321,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_max_value=1) R10=fp0 9: (bc) w2 = w2 10: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv1 R10=fp0 [...] After fix, correct tracking result of R2_w=inv(id=0,umax_value=1,var_off=(0x0; 0x1)): [...] 5: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0) R10=fp0 5: (35) if r2 >= 0x1 goto pc+1 [...] // goto path 7: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umin_value=1) R10=fp0 7: (b6) if w2 <= 0x1 goto pc+1 [...] // goto path 9: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,smax_value=9223372032559808513,umax_value=18446744069414584321,var_off=(0x0; 0xffffffff00000001),s32_min_value=0,s32_max_value=1,u32_max_value=1) R10=fp0 9: (bc) w2 = w2 10: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,umax_value=1,var_off=(0x0; 0x1)) R10=fp0 [...] Thus, same issue as in b0270958 holds for unsigned subregister tracking. Also, align __reg64_bound_u32() similarly to __reg64_bound_s32() as done in b0270958 to make them uniform again. Fixes: 3f50f132 ("bpf: Verifier, do explicit ALU32 bounds tracking") Reported-by:
Manfred Paul <(@_manfp)> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Reviewed-by:
John Fastabend <john.fastabend@gmail.com> Acked-by:
Alexei Starovoitov <ast@kernel.org>
-
Andrii Nakryiko authored
Fix failed tests checks in core_reloc test runner, which allowed failing tests to pass quietly. Also add extra check to make sure that expected to fail test cases with invalid names are caught as test failure anyway, as this is not an expected failure mode. Also fix mislabeled probed vs direct bitfield test cases. Fixes: 124a892d ("selftests/bpf: Test TYPE_EXISTS and TYPE_SIZE CO-RE relocations") Reported-by:
Lorenz Bauer <lmb@cloudflare.com> Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-6-andrii@kernel.org
-
Andrii Nakryiko authored
Negative field existence cases for have a broken assumption that FIELD_EXISTS CO-RE relo will fail for fields that match the name but have incompatible type signature. That's not how CO-RE relocations generally behave. Types and fields that match by name but not by expected type are treated as non-matching candidates and are skipped. Error later is reported if no matching candidate was found. That's what happens for most relocations, but existence relocations (FIELD_EXISTS and TYPE_EXISTS) are more permissive and they are designed to return 0 or 1, depending if a match is found. This allows to handle name-conflicting but incompatible types in BPF code easily. Combined with ___flavor suffixes, it's possible to handle pretty much any structural type changes in kernel within the compiled once BPF source code. So, long story short, negative field existence test cases are invalid in their assumptions, so this patch reworks them into a single consolidated positive case that doesn't match any of the fields. Fixes: c7566a69 ("selftests/bpf: Add field existence CO-RE relocs tests") Reported-by:
Lorenz Bauer <lmb@cloudflare.com> Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-5-andrii@kernel.org
-
Andrii Nakryiko authored
Fix BPF_CORE_READ_BITFIELD() macro used for reading CO-RE-relocatable bitfields. Missing breaks in a switch caused 8-byte reads always. This can confuse libbpf because it does strict checks that memory load size corresponds to the original size of the field, which in this case quite often would be wrong. After fixing that, we run into another problem, which quite subtle, so worth documenting here. The issue is in Clang optimization and CO-RE relocation interactions. Without that asm volatile construct (also known as barrier_var()), Clang will re-order BYTE_OFFSET and BYTE_SIZE relocations and will apply BYTE_OFFSET 4 times for each switch case arm. This will result in the same error from libbpf about mismatch of memory load size and original field size. I.e., if we were reading u32, we'd still have *(u8 *), *(u16 *), *(u32 *), and *(u64 *) memory loads, three of which will fail. Using barrier_var() forces Clang to apply BYTE_OFFSET relocation first (and once) to calculate p, after which value of p is used without relocation in each of switch case arms, doing appropiately-sized memory load. Here's the list of relevant relocations and pieces of generated BPF code before and after this patch for test_core_reloc_bitfields_direct selftests. BEFORE ===== #45: core_reloc: insn #160 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32 #46: core_reloc: insn #167 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #47: core_reloc: insn #174 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #48: core_reloc: insn #178 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #49: core_reloc: insn #182 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 157: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll 159: 7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1 160: b7 02 00 00 04 00 00 00 r2 = 4 ; BYTE_SIZE relocation here ^^^ 161: 66 02 07 00 03 00 00 00 if w2 s> 3 goto +7 <LBB0_63> 162: 16 02 0d 00 01 00 00 00 if w2 == 1 goto +13 <LBB0_65> 163: 16 02 01 00 02 00 00 00 if w2 == 2 goto +1 <LBB0_66> 164: 05 00 12 00 00 00 00 00 goto +18 <LBB0_69> 0000000000000528 <LBB0_66>: 165: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 167: 69 11 08 00 00 00 00 00 r1 = *(u16 *)(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 168: 05 00 0e 00 00 00 00 00 goto +14 <LBB0_69> 0000000000000548 <LBB0_63>: 169: 16 02 0a 00 04 00 00 00 if w2 == 4 goto +10 <LBB0_67> 170: 16 02 01 00 08 00 00 00 if w2 == 8 goto +1 <LBB0_68> 171: 05 00 0b 00 00 00 00 00 goto +11 <LBB0_69> 0000000000000560 <LBB0_68>: 172: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 174: 79 11 08 00 00 00 00 00 r1 = *(u64 *)(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 175: 05 00 07 00 00 00 00 00 goto +7 <LBB0_69> 0000000000000580 <LBB0_65>: 176: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 178: 71 11 08 00 00 00 00 00 r1 = *(u8 *)(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 179: 05 00 03 00 00 00 00 00 goto +3 <LBB0_69> 00000000000005a0 <LBB0_67>: 180: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 182: 61 11 08 00 00 00 00 00 r1 = *(u32 *)(r1 + 8) ; BYTE_OFFSET relo here w/ RIGHT size ^^^^^^^^^^^^^^^^ 00000000000005b8 <LBB0_69>: 183: 67 01 00 00 20 00 00 00 r1 <<= 32 184: b7 02 00 00 00 00 00 00 r2 = 0 185: 16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71> 186: c7 01 00 00 20 00 00 00 r1 s>>= 32 187: 05 00 01 00 00 00 00 00 goto +1 <LBB0_72> 00000000000005e0 <LBB0_71>: 188: 77 01 00 00 20 00 00 00 r1 >>= 32 AFTER ===== #30: core_reloc: insn #132 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #31: core_reloc: insn #134 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32 129: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll 131: 7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1 132: b7 01 00 00 08 00 00 00 r1 = 8 ; BYTE_OFFSET relo here ^^^ ; no size check for non-memory dereferencing instructions 133: 0f 12 00 00 00 00 00 00 r2 += r1 134: b7 03 00 00 04 00 00 00 r3 = 4 ; BYTE_SIZE relocation here ^^^ 135: 66 03 05 00 03 00 00 00 if w3 s> 3 goto +5 <LBB0_63> 136: 16 03 09 00 01 00 00 00 if w3 == 1 goto +9 <LBB0_65> 137: 16 03 01 00 02 00 00 00 if w3 == 2 goto +1 <LBB0_66> 138: 05 00 0a 00 00 00 00 00 goto +10 <LBB0_69> 0000000000000458 <LBB0_66>: 139: 69 21 00 00 00 00 00 00 r1 = *(u16 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 140: 05 00 08 00 00 00 00 00 goto +8 <LBB0_69> 0000000000000468 <LBB0_63>: 141: 16 03 06 00 04 00 00 00 if w3 == 4 goto +6 <LBB0_67> 142: 16 03 01 00 08 00 00 00 if w3 == 8 goto +1 <LBB0_68> 143: 05 00 05 00 00 00 00 00 goto +5 <LBB0_69> 0000000000000480 <LBB0_68>: 144: 79 21 00 00 00 00 00 00 r1 = *(u64 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 145: 05 00 03 00 00 00 00 00 goto +3 <LBB0_69> 0000000000000490 <LBB0_65>: 146: 71 21 00 00 00 00 00 00 r1 = *(u8 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 147: 05 00 01 00 00 00 00 00 goto +1 <LBB0_69> 00000000000004a0 <LBB0_67>: 148: 61 21 00 00 00 00 00 00 r1 = *(u32 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 00000000000004a8 <LBB0_69>: 149: 67 01 00 00 20 00 00 00 r1 <<= 32 150: b7 02 00 00 00 00 00 00 r2 = 0 151: 16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71> 152: c7 01 00 00 20 00 00 00 r1 s>>= 32 153: 05 00 01 00 00 00 00 00 goto +1 <LBB0_72> 00000000000004d0 <LBB0_71>: 154: 77 01 00 00 20 00 00 00 r1 >>= 323 Fixes: ee26dade ("libbpf: Add support for relocatable bitfields") Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-4-andrii@kernel.org
-
Andrii Nakryiko authored
Add BTF_KIND_FLOAT support when doing CO-RE field type compatibility check. Without this, relocations against float/double fields will fail. Also adjust one error message to emit instruction index instead of less convenient instruction byte offset. Fixes: 22541a9e ("libbpf: Add BTF_KIND_FLOAT support") Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-3-andrii@kernel.org
-
Andrii Nakryiko authored
Add ASSERT_TRUE/ASSERT_FALSE for conditions calculated with custom logic to true/false. Also add remaining arithmetical assertions: - ASSERT_LE -- less than or equal; - ASSERT_GT -- greater than; - ASSERT_GE -- greater than or equal. This should cover most scenarios where people fall back to error-prone CHECK()s. Also extend ASSERT_ERR() to print out errno, in addition to direct error. Also convert few CHECK() instances to ensure new ASSERT_xxx() variants work as expected. Subsequent patch will also use ASSERT_TRUE/ASSERT_FALSE more extensively. Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-2-andrii@kernel.org
-
- Apr 26, 2021
-
-
Jiri Olsa authored
Replacing CHECK with ASSERT macros. Suggested-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Jiri Olsa <jolsa@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210414195147.1624932-8-jolsa@kernel.org
-
Jiri Olsa authored
Adding test to verify that once we attach module's trampoline, the module can't be unloaded. Signed-off-by:
Jiri Olsa <jolsa@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210414195147.1624932-7-jolsa@kernel.org
-
Jiri Olsa authored
Adding the test to re-attach (detach/attach again) lsm programs, plus check that already linked program can't be attached again. Signed-off-by:
Jiri Olsa <jolsa@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210414195147.1624932-6-jolsa@kernel.org
-
Jiri Olsa authored
Adding the test to re-attach (detach/attach again) tracing fexit programs, plus check that already linked program can't be attached again. Also switching to ASSERT* macros. Signed-off-by:
Jiri Olsa <jolsa@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210414195147.1624932-5-jolsa@kernel.org
-
Jiri Olsa authored
Adding the test to re-attach (detach/attach again) tracing fentry programs, plus check that already linked program can't be attached again. Signed-off-by:
Jiri Olsa <jolsa@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210414195147.1624932-4-jolsa@kernel.org
-
- Apr 24, 2021
-
-
Masahiro Yamada authored
Since commit 57fd251c ("kbuild: split cc-option and friends to scripts/Makefile.compiler"), some kselftests fail to build. The tools/ directory opted out Kbuild, and went in a different direction. People copied scripts and Makefiles to the tools/ directory to create their own build system. tools/build/Build.include mimics scripts/Kbuild.include, but some tool Makefiles include the Kbuild one to import a feature that is missing in tools/build/Build.include: - Commit ec04aa3a ("tools/thermal: tmon: use "-fstack-protector" only if supported") included scripts/Kbuild.include from tools/thermal/tmon/Makefile to import the cc-option macro. - Commit c2390f16 ("selftests: kvm: fix for compilers that do not support -no-pie") included scripts/Kbuild.include from tools/testing/selftests/kvm/Makefile to import the try-run macro. - Commit 9cae4ace ("selftests/bpf: do not ignore clang failures") included scripts/Kbuild.include from tools/testing/selftests/bpf/Makefile to import the .DELETE_ON_ERROR target. - Commit 0695f8bc ("selftests/powerpc: Handle Makefile for unrecognized option") included scripts/Kbuild.include from tools/testing/selftests/powerpc/pmu/ebb/Makefile to import the try-run macro. Copy what they need into tools/build/Build.include, and make them include it instead of scripts/Kbuild.include. Link: https://lore.kernel.org/lkml/86dadf33-70f7-a5ac-cb8c-64966d2f45a1@linux.ibm.com/ Fixes: 57fd251c ("kbuild: split cc-option and friends to scripts/Makefile.compiler") Reported-by:
Janosch Frank <frankja@linux.ibm.com> Reported-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org> Tested-by:
Christian Borntraeger <borntraeger@de.ibm.com> Acked-by:
Yonghong Song <yhs@fb.com>
-
- Apr 23, 2021
-
-
Vasily Averin authored
slabinfo.py script does not work with actual kernel version. First, it was unable to recognise SLUB susbsytem, and when I specified it manually it failed again with AttributeError: 'struct page' has no member 'obj_cgroups' .. and then again with File "tools/cgroup/memcg_slabinfo.py", line 221, in main memcg.kmem_caches.address_of_(), AttributeError: 'struct mem_cgroup' has no member 'kmem_caches' Link: https://lkml.kernel.org/r/cec1a75e-43b4-3d64-2084-d9f98fda037f@virtuozzo.com Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Tested-by:
Roman Gushchin <guro@fb.com> Acked-by:
Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Po-Hsu Lin authored
We found that with the latest mainline kernel (5.12.0-051200rc8) on some KVM instances / bare-metal systems, the following tests will take longer than the kselftest framework default timeout (45 seconds) to run and thus got terminated with TIMEOUT error: * xfrm_policy.sh - took about 2m20s * pmtu.sh - took about 3m5s * udpgso_bench.sh - took about 60s Bump the timeout setting to 5 minutes to allow them have a chance to finish. https://bugs.launchpad.net/bugs/1856010 Signed-off-by:
Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Yonglong Li authored
Extend mptcp_connect tool with MSG_PEEK support and add a test case in mptcp_connect.sh that checks the data received from/after recv() with MSG_PEEK. Acked-by:
Paolo Abeni <pabeni@redhat.com> Co-developed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by:
Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Andrii Nakryiko authored
Document which fixes are required to generate correct static linking selftests. Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-19-andrii@kernel.org
-
Andrii Nakryiko authored
Add selftest validating various aspects of statically linking BTF-defined map definitions. Legacy map definitions do not support extern resolution between object files. Some of the aspects validated: - correct resolution of extern maps against concrete map definitions; - extern maps can currently only specify map type and key/value size and/or type information; - weak concrete map definitions are resolved properly. Static map definitions are not yet supported by libbpf, so they are not explicitly tested, though manual testing showes that BPF linker handles them properly. Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-18-andrii@kernel.org
-
Andrii Nakryiko authored
Add selftest validating various aspects of statically linking global variables: - correct resolution of extern variables across .bss, .data, and .rodata sections; - correct handling of weak definitions; - correct de-duplication of repeating special externs (.kconfig, .ksyms). Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-17-andrii@kernel.org
-
Andrii Nakryiko authored
Add selftest validating various aspects of statically linking functions: - no conflicts and correct resolution for name-conflicting static funcs; - correct resolution of extern functions; - correct handling of weak functions, both resolution itself and libbpf's handling of unused weak function that "lost" (it leaves gaps in code with no ELF symbols); - correct handling of hidden visibility to turn global function into "static" for the purpose of BPF verification. Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-16-andrii@kernel.org
-
Andrii Nakryiko authored
Skip generating individual BPF skeletons for files that are supposed to be linked together to form the final BPF object file. Very often such files are "incomplete" BPF object files, which will fail libbpf bpf_object__open() step, if used individually, thus failing BPF skeleton generation. This is by design, so skip individual BPF skeletons and only validate them as part of their linked final BPF object file and skeleton. Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-15-andrii@kernel.org
-
Andrii Nakryiko authored
While -Og is designed to work well with debugger, it's still inferior to -O0 in terms of debuggability experience. It will cause some variables to still be inlined, it will also prevent single-stepping some statements and otherwise interfere with debugging experience. So switch to -O0 which turns off any optimization and provides the best debugging experience. Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-14-andrii@kernel.org
-
Andrii Nakryiko authored
Add extra logic to handle map externs (only BTF-defined maps are supported for linking). Re-use the map parsing logic used during bpf_object__open(). Map externs are currently restricted to always match complete map definition. So all the specified attributes will be compared (down to pining, map_flags, numa_node, etc). In the future this restriction might be relaxed with no backwards compatibility issues. If any attribute is mismatched between extern and actual map definition, linker will report an error, pointing out which one mismatches. The original intent was to allow for extern to specify attributes that matters (to user) to enforce. E.g., if you specify just key information and omit value, then any value fits. Similarly, it should have been possible to enforce map_flags, pinning, and any other possible map attribute. Unfortunately, that means that multiple externs can be only partially overlapping with each other, which means linker would need to combine their type definitions to end up with the most restrictive and fullest map definition. This requires an extra amount of BTF manipulation which at this time was deemed unnecessary and would require further extending generic BTF writer APIs. So that is left for future follow ups, if there will be demand for that. But the idea seems intresting and useful, so I want to document it here. Weak definitions are also supported, but are pretty strict as well, just like externs: all weak map definitions have to match exactly. In the follow up patches this most probably will be relaxed, with __weak map definitions being able to differ between each other (with non-weak definition always winning, of course). Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-13-andrii@kernel.org
-
Andrii Nakryiko authored
Add BPF static linker logic to resolve extern variables and functions across multiple linked together BPF object files. For that, linker maintains a separate list of struct glob_sym structures, which keeps track of few pieces of metadata (is it extern or resolved global, is it a weak symbol, which ELF section it belongs to, etc) and ties together BTF type info and ELF symbol information and keeps them in sync. With adding support for extern variables/funcs, it's now possible for some sections to contain both extern and non-extern definitions. This means that some sections may start out as ephemeral (if only externs are present and thus there is not corresponding ELF section), but will be "upgraded" to actual ELF section as symbols are resolved or new non-extern definitions are appended. Additional care is taken to not duplicate extern entries in sections like .kconfig and .ksyms. Given libbpf requires BTF type to always be present for .kconfig/.ksym externs, linker extends this requirement to all the externs, even those that are supposed to be resolved during static linking and which won't be visible to libbpf. With BTF information always present, static linker will check not just ELF symbol matches, but entire BTF type signature match as well. That logic is stricter that BPF CO-RE checks. It probably should be re-used by .ksym resolution logic in libbpf as well, but that's left for follow up patches. To make it unnecessary to rewrite ELF symbols and minimize BTF type rewriting/removal, ELF symbols that correspond to externs initially will be updated in place once they are resolved. Similarly for BTF type info, VAR/FUNC and var_secinfo's (sec_vars in struct bpf_linker) are staying stable, but types they point to might get replaced when extern is resolved. This might leave some left-over types (even though we try to minimize this for common cases of having extern funcs with not argument names vs concrete function with names properly specified). That can be addresses later with a generic BTF garbage collection. That's left for a follow up as well. Given BTF type appending phase is separate from ELF symbol appending/resolution, special struct glob_sym->underlying_btf_id variable is used to communicate resolution and rewrite decisions. 0 means underlying_btf_id needs to be appended (it's not yet in final linker->btf), <0 values are used for temporary storage of source BTF type ID (not yet rewritten), so -glob_sym->underlying_btf_id is BTF type id in obj-btf. But by the end of linker_append_btf() phase, that underlying_btf_id will be remapped and will always be > 0. This is the uglies part of the whole process, but keeps the other parts much simpler due to stability of sec_var and VAR/FUNC types, as well as ELF symbol, so please keep that in mind while reviewing. BTF-defined maps require some extra custom logic and is addressed separate in the next patch, so that to keep this one smaller and easier to review. Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-12-andrii@kernel.org
-
Andrii Nakryiko authored
It should never fail, but if it does, it's better to know about this rather than end up with nonsensical type IDs. Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-11-andrii@kernel.org
-
Andrii Nakryiko authored
Add logic to validate extern symbols, plus some other minor extra checks, like ELF symbol #0 validation, general symbol visibility and binding validations. Signed-off-by:
Andrii Nakryiko <andrii@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-10-andrii@kernel.org
-