Skip to content
Snippets Groups Projects
Select Git revision
  • 147e1a97c4a0bdd43f55a582a9416bb9092563a9
  • vme-testing default
  • ci-test
  • master
  • remoteproc
  • am625-sk-ov5640
  • pcal6534-upstreaming
  • lps22df-upstreaming
  • msc-upstreaming
  • imx8mp
  • iio/noa1305
  • vme-next
  • vme-next-4.14-rc4
  • v4.14-rc4
  • v4.14-rc3
  • v4.14-rc2
  • v4.14-rc1
  • v4.13
  • vme-next-4.13-rc7
  • v4.13-rc7
  • v4.13-rc6
  • v4.13-rc5
  • v4.13-rc4
  • v4.13-rc3
  • v4.13-rc2
  • v4.13-rc1
  • v4.12
  • v4.12-rc7
  • v4.12-rc6
  • v4.12-rc5
  • v4.12-rc4
  • v4.12-rc3
32 results

file.c

Blame
    • Johannes Weiner's avatar
      147e1a97
      fs: kernfs: add poll file operation · 147e1a97
      Johannes Weiner authored
      Patch series "psi: pressure stall monitors", v3.
      
      Android is adopting psi to detect and remedy memory pressure that
      results in stuttering and decreased responsiveness on mobile devices.
      
      Psi gives us the stall information, but because we're dealing with
      latencies in the millisecond range, periodically reading the pressure
      files to detect stalls in a timely fashion is not feasible.  Psi also
      doesn't aggregate its averages at a high enough frequency right now.
      
      This patch series extends the psi interface such that users can
      configure sensitive latency thresholds and use poll() and friends to be
      notified when these are breached.
      
      As high-frequency aggregation is costly, it implements an aggregation
      method that is optimized for fast, short-interval averaging, and makes
      the aggregation frequency adaptive, such that high-frequency updates
      only happen while monitored stall events are actively occurring.
      
      With these patches applied, Android can monitor for, and ward off,
      mounting memory shortages before they cause problems for the user.  For
      example, using memory stall monitors in userspace low memory killer
      daemon (lmkd) we can detect mounting pressure and kill less important
      processes before device becomes visibly sluggish.
      
      In our memory stress testing psi memory monitors produce roughly 10x
      less false positives compared to vmpressure signals.  Having ability to
      specify multiple triggers for the same psi metric allows other parts of
      Android framework to monitor memory state of the device and act
      accordingly.
      
      The new interface is straightforward.  The user opens one of the
      pressure files for writing and writes a trigger description into the
      file descriptor that defines the stall state - some or full, and the
      maximum stall time over a given window of time.  E.g.:
      
              /* Signal when stall time exceeds 100ms of a 1s window */
              char trigger[] = "full 100000 1000000";
              fd = open("/proc/pressure/memory");
              write(fd, trigger, sizeof(trigger));
              while (poll() >= 0) {
                      ...
              }
              close(fd);
      
      When the monitored stall state is entered, psi adapts its aggregation
      frequency according to what the configured time window requires in order
      to emit event signals in a timely fashion.  Once the stalling subsides,
      aggregation reverts back to normal.
      
      The trigger is associated with the open file descriptor.  To stop
      monitoring, the user only needs to close the file descriptor and the
      trigger is discarded.
      
      Patches 1-4 prepare the psi code for polling support.  Patch 5
      implements the adaptive polling logic, the pressure growth detection
      optimized for short intervals, and hooks up write() and poll() on the
      pressure files.
      
      The patches were developed in collaboration with Johannes Weiner.
      
      This patch (of 5):
      
      Kernfs has a standardized poll/notification mechanism for waking all
      pollers on all fds when a filesystem node changes.  To allow polling for
      custom events, add a .poll callback that can override the default.
      
      This is in preparation for pollable cgroup pressure files which have
      per-fd trigger configurations.
      
      Link: http://lkml.kernel.org/r/20190124211518.244221-2-surenb@google.com
      
      
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: default avatarSuren Baghdasaryan <surenb@google.com>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      147e1a97
      History
      fs: kernfs: add poll file operation
      Johannes Weiner authored
      Patch series "psi: pressure stall monitors", v3.
      
      Android is adopting psi to detect and remedy memory pressure that
      results in stuttering and decreased responsiveness on mobile devices.
      
      Psi gives us the stall information, but because we're dealing with
      latencies in the millisecond range, periodically reading the pressure
      files to detect stalls in a timely fashion is not feasible.  Psi also
      doesn't aggregate its averages at a high enough frequency right now.
      
      This patch series extends the psi interface such that users can
      configure sensitive latency thresholds and use poll() and friends to be
      notified when these are breached.
      
      As high-frequency aggregation is costly, it implements an aggregation
      method that is optimized for fast, short-interval averaging, and makes
      the aggregation frequency adaptive, such that high-frequency updates
      only happen while monitored stall events are actively occurring.
      
      With these patches applied, Android can monitor for, and ward off,
      mounting memory shortages before they cause problems for the user.  For
      example, using memory stall monitors in userspace low memory killer
      daemon (lmkd) we can detect mounting pressure and kill less important
      processes before device becomes visibly sluggish.
      
      In our memory stress testing psi memory monitors produce roughly 10x
      less false positives compared to vmpressure signals.  Having ability to
      specify multiple triggers for the same psi metric allows other parts of
      Android framework to monitor memory state of the device and act
      accordingly.
      
      The new interface is straightforward.  The user opens one of the
      pressure files for writing and writes a trigger description into the
      file descriptor that defines the stall state - some or full, and the
      maximum stall time over a given window of time.  E.g.:
      
              /* Signal when stall time exceeds 100ms of a 1s window */
              char trigger[] = "full 100000 1000000";
              fd = open("/proc/pressure/memory");
              write(fd, trigger, sizeof(trigger));
              while (poll() >= 0) {
                      ...
              }
              close(fd);
      
      When the monitored stall state is entered, psi adapts its aggregation
      frequency according to what the configured time window requires in order
      to emit event signals in a timely fashion.  Once the stalling subsides,
      aggregation reverts back to normal.
      
      The trigger is associated with the open file descriptor.  To stop
      monitoring, the user only needs to close the file descriptor and the
      trigger is discarded.
      
      Patches 1-4 prepare the psi code for polling support.  Patch 5
      implements the adaptive polling logic, the pressure growth detection
      optimized for short intervals, and hooks up write() and poll() on the
      pressure files.
      
      The patches were developed in collaboration with Johannes Weiner.
      
      This patch (of 5):
      
      Kernfs has a standardized poll/notification mechanism for waking all
      pollers on all fds when a filesystem node changes.  To allow polling for
      custom events, add a .poll callback that can override the default.
      
      This is in preparation for pollable cgroup pressure files which have
      per-fd trigger configurations.
      
      Link: http://lkml.kernel.org/r/20190124211518.244221-2-surenb@google.com
      
      
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: default avatarSuren Baghdasaryan <surenb@google.com>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    file.c 25.72 KiB
    /*
     * fs/kernfs/file.c - kernfs file implementation
     *
     * Copyright (c) 2001-3 Patrick Mochel
     * Copyright (c) 2007 SUSE Linux Products GmbH
     * Copyright (c) 2007, 2013 Tejun Heo <tj@kernel.org>
     *
     * This file is released under the GPLv2.
     */
    
    #include <linux/fs.h>
    #include <linux/seq_file.h>
    #include <linux/slab.h>
    #include <linux/poll.h>
    #include <linux/pagemap.h>
    #include <linux/sched/mm.h>
    #include <linux/fsnotify.h>
    
    #include "kernfs-internal.h"
    
    /*
     * There's one kernfs_open_file for each open file and one kernfs_open_node
     * for each kernfs_node with one or more open files.
     *
     * kernfs_node->attr.open points to kernfs_open_node.  attr.open is
     * protected by kernfs_open_node_lock.
     *
     * filp->private_data points to seq_file whose ->private points to
     * kernfs_open_file.  kernfs_open_files are chained at
     * kernfs_open_node->files, which is protected by kernfs_open_file_mutex.
     */
    static DEFINE_SPINLOCK(kernfs_open_node_lock);
    static DEFINE_MUTEX(kernfs_open_file_mutex);
    
    struct kernfs_open_node {
    	atomic_t		refcnt;
    	atomic_t		event;
    	wait_queue_head_t	poll;
    	struct list_head	files; /* goes through kernfs_open_file.list */
    };
    
    /*
     * kernfs_notify() may be called from any context and bounces notifications
     * through a work item.  To minimize space overhead in kernfs_node, the
     * pending queue is implemented as a singly linked list of kernfs_nodes.
     * The list is terminated with the self pointer so that whether a
     * kernfs_node is on the list or not can be determined by testing the next
     * pointer for NULL.
     */
    #define KERNFS_NOTIFY_EOL			((void *)&kernfs_notify_list)
    
    static DEFINE_SPINLOCK(kernfs_notify_lock);
    static struct kernfs_node *kernfs_notify_list = KERNFS_NOTIFY_EOL;
    
    static struct kernfs_open_file *kernfs_of(struct file *file)
    {
    	return ((struct seq_file *)file->private_data)->private;
    }
    
    /*
     * Determine the kernfs_ops for the given kernfs_node.  This function must
     * be called while holding an active reference.
     */
    static const struct kernfs_ops *kernfs_ops(struct kernfs_node *kn)
    {
    	if (kn->flags & KERNFS_LOCKDEP)
    		lockdep_assert_held(kn);
    	return kn->attr.ops;
    }