
page_alloc.c

      mm: zero reserved and unavailable struct pages · a4a3ede2
      Pavel Tatashin authored
      Some memory is reserved but unavailable: not present in memblock.memory
      (because not backed by physical pages), but present in memblock.reserved.
      Such memory has backing struct pages, but they are not initialized by
      going through __init_single_page().
      
      In some cases these struct pages are accessed even though they do not
      contain any data.  For example, page_to_pfn() may access page->flags when
      that is where the section information is stored (CONFIG_SPARSEMEM,
      SECTION_IN_PAGE_FLAGS), as sketched below.
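      
      For illustration, a minimal sketch of the flags-based section lookup that
      makes an uninitialized page->flags dangerous; the shift and mask follow
      the upstream SPARSEMEM layout:
      
      	/* With SECTION_IN_PAGE_FLAGS, the section number lives in
      	 * page->flags, so looking it up on an uninitialized struct
      	 * page resolves to a bogus section. */
      	static inline unsigned long page_to_section(const struct page *page)
      	{
      		return (page->flags >> SECTIONS_PGSHIFT) & SECTIONS_MASK;
      	}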
      
      One example of such memory: trim_low_memory_range() unconditionally
      reserves from pfn 0, but e820__memblock_setup() might provide the
      existing memory starting from pfn 1 (e.g., under KVM).
      
      Since struct pages are zeroed in __init_single_page(), and not at
      allocation time, we must zero such struct pages explicitly.
      
      The patch adds a new memblock iterator:
      	for_each_resv_unavail_range(i, p_start, p_end)
      
      which iterates over ranges that are in memblock.reserved but not in
      memblock.memory; the struct pages covering those ranges are zeroed
      explicitly by calling mm_zero_struct_page(), as sketched below.
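      
      A minimal sketch of the consuming loop, modeled on the zero_resv_unavail()
      function this patch introduces (the body is abridged; the bookkeeping of
      the zeroed-page count is omitted):
      
      	void __paginginit zero_resv_unavail(void)
      	{
      		phys_addr_t start, end;
      		unsigned long pfn;
      		u64 i;
      
      		/* Walk ranges that are reserved but not reported as
      		 * physical memory, zeroing their struct pages. */
      		for_each_resv_unavail_range(i, &start, &end)
      			for (pfn = PFN_DOWN(start); pfn < PFN_UP(end); pfn++)
      				mm_zero_struct_page(pfn_to_page(pfn));
      	}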
      
      ===
      
      Here is a more detailed example of the problem this patch addresses.
      
      The run was tested on QEMU with the following arguments:
      
      	-enable-kvm -cpu kvm64 -m 512 -smp 2
      
      This patch reports that there are 98 unavailable pages.
      
      They are: pfn 0 and the pfns in the range [159, 255] (1 + 97 = 98).
      
      Note that trim_low_memory_range() reserves only pfns in the range
      [0, 15]; it does not reserve the [159, 255] ones.
      
      e820__memblock_setup() reports to Linux that the following physical pfn
      ranges are available:
      
      	[  1,    158]
      	[256, 130783]
      
      Notice that exactly the unavailable pfns are missing!
      
      Now, let's check what we have in zone 0: [1, 131039]
      
      pfn 0 is not part of the zone, but pfns [1, 158] are.
      
      However, the bigger problem with leaving these struct pages uninitialized
      is memory hotplug.  That path operates at 2M boundaries (section_nr) and
      checks whether a 2M range of pages is hot-removable.  It starts with the
      first pfn of the zone, rounds it down to a 2M boundary (struct pages are
      allocated at 2M boundaries when the vmemmap is created), and checks
      whether that section is hot-removable.  In this case it starts with pfn 1
      and rounds it down to pfn 0; the pfn is later converted to a struct page,
      and some of its fields are checked.  If we do not zero these struct
      pages, we get unpredictable results.  A sketch of the round-down follows.
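      
      A minimal sketch of that round-down, assuming the 2M granularity
      described above (the variable names are illustrative, not the exact
      hotplug code):
      
      	/* Round the zone's first pfn (1) down to a 2M boundary (pfn 0).
      	 * pfn 0's struct page never went through __init_single_page(),
      	 * so its fields are garbage unless explicitly zeroed. */
      	unsigned long pfns_per_2m = (2UL << 20) >> PAGE_SHIFT; /* 512 with 4K pages */
      	unsigned long start_pfn = round_down(zone->zone_start_pfn, pfns_per_2m);
      	struct page *first = pfn_to_page(start_pfn);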
      
      In fact, when CONFIG_DEBUG_VM is enabled and all vmemmap memory is
      explicitly set to ones, the following panic is observed in a kernel test
      without this patch applied:
      
        BUG: unable to handle kernel NULL pointer dereference at          (null)
        IP: is_pageblock_removable_nolock+0x35/0x90
        PGD 0 P4D 0
        Oops: 0000 [#1] PREEMPT
        ...
        task: ffff88001f4e2900 task.stack: ffffc90000314000
        RIP: 0010:is_pageblock_removable_nolock+0x35/0x90
        Call Trace:
         ? is_mem_section_removable+0x5a/0xd0
         show_mem_removable+0x6b/0xa0
         dev_attr_show+0x1b/0x50
         sysfs_kf_seq_show+0xa1/0x100
         kernfs_seq_show+0x22/0x30
         seq_read+0x1ac/0x3a0
         kernfs_fop_read+0x36/0x190
         ? security_file_permission+0x90/0xb0
         __vfs_read+0x16/0x30
         vfs_read+0x81/0x130
         SyS_read+0x44/0xa0
         entry_SYSCALL_64_fastpath+0x1f/0xbd
      
      Link: http://lkml.kernel.org/r/20171013173214.27300-7-pasha.tatashin@oracle.com
      
      
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Tested-by: Bob Picco <bob.picco@oracle.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    drm_fbdev_generic.c 9.37 KiB
    // SPDX-License-Identifier: MIT
    
    #include <linux/moduleparam.h>
    #include <linux/vmalloc.h>
    
    #include <drm/drm_crtc_helper.h>
    #include <drm/drm_drv.h>
    #include <drm/drm_fb_helper.h>
    #include <drm/drm_framebuffer.h>
    #include <drm/drm_gem.h>
    #include <drm/drm_print.h>
    
    #include <drm/drm_fbdev_generic.h>
    
    /* @user: 1=userspace, 0=fbcon */
    static int drm_fbdev_generic_fb_open(struct fb_info *info, int user)
    {
    	struct drm_fb_helper *fb_helper = info->par;
    
    	/* No need to take a ref for fbcon because it unbinds on unregister */
    	if (user && !try_module_get(fb_helper->dev->driver->fops->owner))
    		return -ENODEV;
    
    	return 0;
    }
    
    static int drm_fbdev_generic_fb_release(struct fb_info *info, int user)
    {
    	struct drm_fb_helper *fb_helper = info->par;
    
    	if (user)
    		module_put(fb_helper->dev->driver->fops->owner);
    
    	return 0;
    }
    
    FB_GEN_DEFAULT_DEFERRED_SYSMEM_OPS(drm_fbdev_generic,
    				   drm_fb_helper_damage_range,
    				   drm_fb_helper_damage_area);
    
    static void drm_fbdev_generic_fb_destroy(struct fb_info *info)
    {
    	struct drm_fb_helper *fb_helper = info->par;
    	void *shadow = info->screen_buffer;
    
    	if (!fb_helper->dev)
    		return;
    
    	fb_deferred_io_cleanup(info);
    	drm_fb_helper_fini(fb_helper);
    	vfree(shadow);
    	drm_client_framebuffer_delete(fb_helper->buffer);
    
    	drm_client_release(&fb_helper->client);
    	drm_fb_helper_unprepare(fb_helper);
    	kfree(fb_helper);
    }
    
    static const struct fb_ops drm_fbdev_generic_fb_ops = {
    	.owner		= THIS_MODULE,
    	.fb_open	= drm_fbdev_generic_fb_open,
    	.fb_release	= drm_fbdev_generic_fb_release,
    	FB_DEFAULT_DEFERRED_OPS(drm_fbdev_generic),
    	DRM_FB_HELPER_DEFAULT_OPS,
    	.fb_destroy	= drm_fbdev_generic_fb_destroy,
    };
    
    /*
     * This function uses the client API to create a framebuffer backed by a dumb buffer.
     */
    static int drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper,
    					     struct drm_fb_helper_surface_size *sizes)
    {
    	struct drm_client_dev *client = &fb_helper->client;
    	struct drm_device *dev = fb_helper->dev;
    	struct drm_client_buffer *buffer;
    	struct fb_info *info;
    	size_t screen_size;
    	void *screen_buffer;
    	u32 format;
    	int ret;
    
    	drm_dbg_kms(dev, "surface width(%d), height(%d) and bpp(%d)\n",
    		    sizes->surface_width, sizes->surface_height,
    		    sizes->surface_bpp);
    
    	format = drm_driver_legacy_fb_format(dev, sizes->surface_bpp,
    					     sizes->surface_depth);
    	buffer = drm_client_framebuffer_create(client, sizes->surface_width,
    					       sizes->surface_height, format);
    	if (IS_ERR(buffer))
    		return PTR_ERR(buffer);
    
    	fb_helper->buffer = buffer;
    	fb_helper->fb = buffer->fb;
    
    	screen_size = buffer->gem->size;
    	screen_buffer = vzalloc(screen_size);
    	if (!screen_buffer) {
    		ret = -ENOMEM;
    		goto err_drm_client_framebuffer_delete;
    	}
    
    	info = drm_fb_helper_alloc_info(fb_helper);
    	if (IS_ERR(info)) {
    		ret = PTR_ERR(info);
    		goto err_vfree;
    	}
    
    	drm_fb_helper_fill_info(info, fb_helper, sizes);
    
    	info->fbops = &drm_fbdev_generic_fb_ops;
    
    	/* screen */
    	info->flags |= FBINFO_VIRTFB | FBINFO_READS_FAST;
    	info->screen_buffer = screen_buffer;
    	info->fix.smem_len = screen_size;
    
    	/* deferred I/O */
    	fb_helper->fbdefio.delay = HZ / 20;
    	fb_helper->fbdefio.deferred_io = drm_fb_helper_deferred_io;
    
    	info->fbdefio = &fb_helper->fbdefio;
    	ret = fb_deferred_io_init(info);
    	if (ret)
    		goto err_drm_fb_helper_release_info;
    
    	return 0;
    
    err_drm_fb_helper_release_info:
    	drm_fb_helper_release_info(fb_helper);
    err_vfree:
    	vfree(screen_buffer);
    err_drm_client_framebuffer_delete:
    	fb_helper->fb = NULL;
    	fb_helper->buffer = NULL;
    	drm_client_framebuffer_delete(buffer);
    	return ret;
    }
    
    static void drm_fbdev_generic_damage_blit_real(struct drm_fb_helper *fb_helper,
    					       struct drm_clip_rect *clip,
    					       struct iosys_map *dst)
    {
    	struct drm_framebuffer *fb = fb_helper->fb;
    	size_t offset = clip->y1 * fb->pitches[0];
    	size_t len = clip->x2 - clip->x1;
    	unsigned int y;
    	void *src;
    
    	switch (drm_format_info_bpp(fb->format, 0)) {
    	case 1:
    		offset += clip->x1 / 8;
    		len = DIV_ROUND_UP(len + clip->x1 % 8, 8);
    		break;
    	case 2:
    		offset += clip->x1 / 4;
    		len = DIV_ROUND_UP(len + clip->x1 % 4, 4);
    		break;
    	case 4:
    		offset += clip->x1 / 2;
    		len = DIV_ROUND_UP(len + clip->x1 % 2, 2);
    		break;
    	default:
    		offset += clip->x1 * fb->format->cpp[0];
    		len *= fb->format->cpp[0];
    		break;
    	}
    
    	src = fb_helper->info->screen_buffer + offset;
    	iosys_map_incr(dst, offset); /* go to first pixel within clip rect */
    
    	for (y = clip->y1; y < clip->y2; y++) {
    		iosys_map_memcpy_to(dst, 0, src, len);
    		iosys_map_incr(dst, fb->pitches[0]);
    		src += fb->pitches[0];
    	}
    }
    
    static int drm_fbdev_generic_damage_blit(struct drm_fb_helper *fb_helper,
    					 struct drm_clip_rect *clip)
    {
    	struct drm_client_buffer *buffer = fb_helper->buffer;
    	struct iosys_map map, dst;
    	int ret;
    
    	/*
    	 * We have to pin the client buffer to its current location while
    	 * flushing the shadow buffer. In the general case, concurrent
    	 * modesetting operations could try to move the buffer and would
    	 * fail. The modeset has to be serialized by acquiring the reservation
    	 * object of the underlying BO here.
    	 *
    	 * For fbdev emulation, we only have to protect against fbdev modeset
    	 * operations. Nothing else will involve the client buffer's BO. So it
    	 * is sufficient to acquire struct drm_fb_helper.lock here.
    	 */
    	mutex_lock(&fb_helper->lock);
    
    	ret = drm_client_buffer_vmap_local(buffer, &map);
    	if (ret)
    		goto out;
    
    	dst = map;
    	drm_fbdev_generic_damage_blit_real(fb_helper, clip, &dst);
    
    	drm_client_buffer_vunmap_local(buffer);
    
    out:
    	mutex_unlock(&fb_helper->lock);
    
    	return ret;
    }
    
    static int drm_fbdev_generic_helper_fb_dirty(struct drm_fb_helper *helper,
    					     struct drm_clip_rect *clip)
    {
    	struct drm_device *dev = helper->dev;
    	int ret;
    
    	/* Call damage handlers only if necessary */
    	if (!(clip->x1 < clip->x2 && clip->y1 < clip->y2))
    		return 0;
    
    	ret = drm_fbdev_generic_damage_blit(helper, clip);
    	if (drm_WARN_ONCE(dev, ret, "Damage blitter failed: ret=%d\n", ret))
    		return ret;
    
    	if (helper->fb->funcs->dirty) {
    		ret = helper->fb->funcs->dirty(helper->fb, NULL, 0, 0, clip, 1);
    		if (drm_WARN_ONCE(dev, ret, "Dirty helper failed: ret=%d\n", ret))
    			return ret;
    	}
    
    	return 0;
    }
    
    static const struct drm_fb_helper_funcs drm_fbdev_generic_helper_funcs = {
    	.fb_probe = drm_fbdev_generic_helper_fb_probe,
    	.fb_dirty = drm_fbdev_generic_helper_fb_dirty,
    };
    
    static void drm_fbdev_generic_client_unregister(struct drm_client_dev *client)
    {
    	struct drm_fb_helper *fb_helper = drm_fb_helper_from_client(client);
    
    	if (fb_helper->info) {
    		drm_fb_helper_unregister_info(fb_helper);
    	} else {
    		drm_client_release(&fb_helper->client);
    		drm_fb_helper_unprepare(fb_helper);
    		kfree(fb_helper);
    	}
    }
    
    static int drm_fbdev_generic_client_restore(struct drm_client_dev *client)
    {
    	drm_fb_helper_lastclose(client->dev);
    
    	return 0;
    }
    
    static int drm_fbdev_generic_client_hotplug(struct drm_client_dev *client)
    {
    	struct drm_fb_helper *fb_helper = drm_fb_helper_from_client(client);
    	struct drm_device *dev = client->dev;
    	int ret;
    
    	if (dev->fb_helper)
    		return drm_fb_helper_hotplug_event(dev->fb_helper);
    
    	ret = drm_fb_helper_init(dev, fb_helper);
    	if (ret)
    		goto err_drm_err;
    
    	if (!drm_drv_uses_atomic_modeset(dev))
    		drm_helper_disable_unused_functions(dev);
    
    	ret = drm_fb_helper_initial_config(fb_helper);
    	if (ret)
    		goto err_drm_fb_helper_fini;
    
    	return 0;
    
    err_drm_fb_helper_fini:
    	drm_fb_helper_fini(fb_helper);
    err_drm_err:
    	drm_err(dev, "fbdev: Failed to setup generic emulation (ret=%d)\n", ret);
    	return ret;
    }
    
    static const struct drm_client_funcs drm_fbdev_generic_client_funcs = {
    	.owner		= THIS_MODULE,
    	.unregister	= drm_fbdev_generic_client_unregister,
    	.restore	= drm_fbdev_generic_client_restore,
    	.hotplug	= drm_fbdev_generic_client_hotplug,
    };
    
    /**
     * drm_fbdev_generic_setup() - Setup generic fbdev emulation
     * @dev: DRM device
     * @preferred_bpp: Preferred bits per pixel for the device.
     *
     * This function sets up generic fbdev emulation for drivers that support
     * dumb buffers with a virtual address and that can be mmap'ed.
     * drm_fbdev_generic_setup() shall be called after the DRM driver registered
     * the new DRM device with drm_dev_register().
     *
     * Restore, hotplug events and teardown are all taken care of. Drivers that do
     * suspend/resume need to call drm_fb_helper_set_suspend_unlocked() themselves.
     * Simple drivers might use drm_mode_config_helper_suspend().
     *
     * In order to provide fixed mmap-able memory ranges, generic fbdev emulation
     * uses a shadow buffer in system memory. The implementation blits the shadow
     * fbdev buffer onto the real buffer in regular intervals.
     *
     * This function is safe to call even when there are no connectors present.
     * Setup will be retried on the next hotplug event.
     *
     * The fbdev is destroyed by drm_dev_unregister().
     */
    void drm_fbdev_generic_setup(struct drm_device *dev, unsigned int preferred_bpp)
    {
    	struct drm_fb_helper *fb_helper;
    	int ret;
    
    	drm_WARN(dev, !dev->registered, "Device has not been registered.\n");
    	drm_WARN(dev, dev->fb_helper, "fb_helper is already set!\n");
    
    	fb_helper = kzalloc(sizeof(*fb_helper), GFP_KERNEL);
    	if (!fb_helper)
    		return;
    	drm_fb_helper_prepare(dev, fb_helper, preferred_bpp, &drm_fbdev_generic_helper_funcs);
    
    	ret = drm_client_init(dev, &fb_helper->client, "fbdev", &drm_fbdev_generic_client_funcs);
    	if (ret) {
    		drm_err(dev, "Failed to register client: %d\n", ret);
    		goto err_drm_client_init;
    	}
    
    	drm_client_register(&fb_helper->client);
    
    	return;
    
    err_drm_client_init:
    	drm_fb_helper_unprepare(fb_helper);
    	kfree(fb_helper);
    	return;
    }
    EXPORT_SYMBOL(drm_fbdev_generic_setup);
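    
    As the kernel-doc for drm_fbdev_generic_setup() notes, a driver calls the
    helper once, after registering the DRM device. A minimal sketch of that
    call order in a hypothetical driver probe path (only drm_dev_register()
    and drm_fbdev_generic_setup() are real entry points here):
    
    	ret = drm_dev_register(drm, 0);
    	if (ret)
    		return ret;
    
    	/* preferred_bpp = 32; the fbdev client tears itself down on
    	 * drm_dev_unregister(), so no explicit cleanup call is needed. */
    	drm_fbdev_generic_setup(drm, 32);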