  1. Jan 21, 2024
  2. Jan 06, 2024
  3. Jan 01, 2024
    • e06af207
    • bcachefs: btree write buffer now slurps keys from journal · 09caeabe
      Kent Overstreet authored
      
      Previously, the transaction commit path would have to add keys to the
      btree write buffer as a separate operation, requiring additional global
      synchronization.
      
      This patch introduces a new journal entry type, which indicates that the
      keys need to be copied into the btree write buffer prior to being
      written out. We switch the journal entry type back to
      JSET_ENTRY_btree_keys prior to write, so this is not an on disk format
      change.
      
      Flushing the btree write buffer may require pulling keys out of journal
      entries yet to be written, and quiescing outstanding journal
      reservations; we previously added journal->buf_lock for synchronization
      with the journal write path.
      
      We also can't put strict bounds on the number of keys in the journal
      destined for the write buffer, which means we might overflow the size of
      the preallocated buffer and have to reallocate - this introduces a
      potentially fatal memory allocation failure. This is something we'll
      have to watch for; if it becomes an issue in practice we can do
      additional mitigation.
      
      The transaction commit path no longer has to explicitly check if the
      write buffer is full and wait on flushing; this is another performance
      optimization. Instead, when the btree write buffer is close to full we
      change the journal watermark, so that only reservations for journal
      reclaim are allowed.
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      09caeabe
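The flow described in this commit message can be modelled with a minimal Python sketch. It is an illustration, not the kernel implementation: the name `WRITE_BUFFER_KEYS` is a placeholder for the new entry type (the message does not name it), and the `capacity` field, the 0.75 "close to full" threshold, and the string watermark values are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class EntryType(Enum):
    BTREE_KEYS = auto()         # stands in for JSET_ENTRY_btree_keys
    WRITE_BUFFER_KEYS = auto()  # placeholder name for the new, in-memory-only type


@dataclass
class JournalEntry:
    type: EntryType
    keys: list


@dataclass
class WriteBuffer:
    capacity: int  # assumed size of the preallocated buffer
    keys: list = field(default_factory=list)

    def nearly_full(self, threshold=0.75):  # threshold is an assumption
        return len(self.keys) >= self.capacity * threshold


def prepare_journal_write(entries, wb):
    """Copy write-buffer keys out of the journal, then retag the entries."""
    for e in entries:
        if e.type is EntryType.WRITE_BUFFER_KEYS:
            # May exceed capacity; the real code would have to reallocate here.
            wb.keys.extend(e.keys)
            # Switch back before writing, so the on-disk format is unchanged.
            e.type = EntryType.BTREE_KEYS
    return entries


def journal_watermark(wb):
    """Only journal reclaim may take reservations when the buffer is close to full."""
    return "reclaim" if wb.nearly_full() else "normal"
```

In this model the commit path never waits on the write buffer itself; back-pressure comes only from the watermark returned by `journal_watermark`.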
    • bcachefs: Improve btree write buffer tracepoints · 56db2429
      Kent Overstreet authored
      
       - add a tracepoint for write_buffer_flush_sync; this is expensive
       - fix the write_buffer_flush_slowpath tracepoint
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      56db2429
    • bcachefs: Kill dev_usage->buckets_ec · 9b34f02c
      Kent Overstreet authored
      
      This counter is redundant; it's simply the sum of BCH_DATA_stripe and
      BCH_DATA_parity buckets.
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      9b34f02c
    • bcachefs: Rename bch_replicas_entry -> bch_replicas_entry_v1 · 086a52f7
      Kent Overstreet authored
      
      Prep work for introducing bch_replicas_entry_v2
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      086a52f7
    • bcachefs: bch_sb_field_downgrade · 84f16387
      Kent Overstreet authored
      
      Add a new superblock section that contains a list of
        { minor version, recovery passes, errors_to_fix }
      
      that is - a list of recovery passes that must be run when downgrading
      past a given version, and a list of errors to silently fix.
      
      The upcoming disk accounting rewrite is not going to be fully
      compatible: we're going to have to regenerate accounting both when
      upgrading to the new version, and also when downgrading from the new
      version, since the new method of doing disk space accounting is a
      completely different architecture based on deltas, and synchronizing
      them for every journal entry write to maintain compatibility is going to
      be too expensive and impractical.
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      84f16387
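As a rough illustration of how such a table might be consulted, the Python sketch below collects the recovery passes and errors for a downgrade. The class and function names, the integer version numbers, and the exact comparison used to decide that a downgrade goes "past" an entry are assumptions; the commit message only specifies the three fields of each entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DowngradeEntry:
    minor_version: int          # version this entry is attached to
    recovery_passes: frozenset  # passes to run when downgrading past it
    errors_to_fix: frozenset    # errors to fix silently in that case


def downgrade_requirements(table, current_minor, target_minor):
    """Union the requirements of every entry the downgrade crosses."""
    passes, errors = set(), set()
    for e in table:
        # Assumed rule: an entry applies if the filesystem is at or above
        # its version and the target version is below it.
        if target_minor < e.minor_version <= current_minor:
            passes |= e.recovery_passes
            errors |= e.errors_to_fix
    return passes, errors
```

For example, with hypothetical entries at minor versions 5 and 8, downgrading from 9 to 6 would pick up only the entry for version 8, while downgrading from 9 to 4 would pick up both.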
    • bcachefs: bch_sb.recovery_passes_required · 8b16413c
      Kent Overstreet authored
      
      Add two new superblock fields. Since the main section of the superblock
      is now full, we have to add a new variable length section for them -
      bch_sb_field_ext.
      
       - recovery_passes_required: recovery passes that must be run on the
         next mount
       - errors_silent: errors that will be silently fixed
      
      These are to improve upgrading and downgrading: these fields won't be
      cleared until after recovery successfully completes, so there won't be
      any issues with crashing partway through an upgrade or a downgrade.
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      8b16413c
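The crash-safety property described here comes from clearing the fields only after every pass has finished. The Python sketch below models that ordering; the `Superblock` class, `run_recovery`, and the `run_pass` callback are hypothetical stand-ins, and an exception is used to simulate a crash partway through recovery.

```python
class Superblock:
    """Toy model holding only the two new fields."""

    def __init__(self):
        self.recovery_passes_required = set()
        self.errors_silent = set()


def run_recovery(sb, run_pass):
    """Run every required pass; clear the fields only if all of them succeed."""
    for p in sorted(sb.recovery_passes_required):
        run_pass(p)  # may raise, standing in for a crash mid-recovery
    # Reached only after recovery completed successfully.
    sb.recovery_passes_required.clear()
    sb.errors_silent.clear()
```

If `run_pass` fails partway through, both fields still hold their original contents, so the next mount repeats the same passes.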
  4. Nov 28, 2023
  5. Nov 26, 2023
  6. Nov 05, 2023
  7. Nov 04, 2023
  8. Nov 02, 2023
    • bcachefs: bch_sb_field_errors · f5d26fa3
      Kent Overstreet authored
      
      Add a new superblock section to keep counts of errors seen since
      filesystem creation: we'll be adding counters for every distinct fsck
      error.
      
      The new superblock section has entries of the form [ id, count,
      time_of_last_error ]; this is intended to let us see what errors are
      occurring - and getting fixed - via show-super output.
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      f5d26fa3
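A small Python sketch of such a counter table follows, assuming one entry per error id and a caller-supplied timestamp. The class and method names are hypothetical; only the three fields of each entry come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class ErrorEntry:
    id: int
    count: int = 0
    time_of_last_error: float = 0.0


class ErrorCounters:
    """Toy model of the per-error counters kept in the superblock."""

    def __init__(self):
        self._entries = {}

    def record(self, err_id, now):
        # Create the entry on first sight, then bump it.
        e = self._entries.setdefault(err_id, ErrorEntry(err_id))
        e.count += 1
        e.time_of_last_error = now

    def entries(self):
        # Sorted by id here purely to give stable output.
        return [self._entries[k] for k in sorted(self._entries)]
```

Recording the same id twice leaves a single entry with a count of 2 and the later timestamp, which is the kind of summary the message expects show-super to expose.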
    • bcachefs: Add IO error counts to bch_member · 94119eeb
      Kent Overstreet authored
      
      We now track IO errors per device since filesystem creation.
      
      IO error counts can be viewed in sysfs, or with the 'bcachefs
      show-super' command.
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      94119eeb
    • bcachefs: rebalance_work · fb3f57bb
      Kent Overstreet authored
      
      This adds a new btree, rebalance_work, to eliminate scanning required
      for finding extents that need work done on them in the background - i.e.
      for the background_target and background_compression options.
      
      rebalance_work is a bitset btree, where a KEY_TYPE_set corresponds to an
      extent in the extents or reflink btree at the same pos.
      
      A new extent field is added, bch_extent_rebalance, which indicates that
      this extent has work that needs to be done in the background - and which
      options to use. This allows per-inode options to be propagated to
      indirect extents - at least in some circumstances. In this patch,
      changing IO options on a file will not propagate the new options to
      indirect extents pointed to by that file.
      
      Updating (setting/clearing) the rebalance_work btree is done by the
      extent trigger, which looks at the bch_extent_rebalance field.
      
      Scanning is still required after changing IO path options - either just
      for a given inode, or for the whole filesystem. We indicate that
      scanning is required by adding a KEY_TYPE_cookie key to the
      rebalance_work btree: the cookie's counter lets us detect that
      scanning is still required when an option has been flipped mid-way
      through an existing scan.
      
      Future possible work:
       - Propagate options to indirect extents when being changed
       - Add other IO path options - nr_replicas, ec, to rebalance_work so
         they can be applied in the background when they change
       - Add a counter, for bcachefs fs usage output, showing the pending
         amount of rebalance work: we'll probably want to do this after the
         disk space accounting rewrite (moving it to a new btree)
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
      fb3f57bb
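The interaction between the bitset and the cookie can be sketched in Python as below. This is a simplified model under stated assumptions: positions are plain integers, a single filesystem-wide cookie is kept, a counter of 0 means no scan is pending, and the cookie is cleared when a scan finishes without the counter having changed. None of these details are confirmed by the commit message, and all names are hypothetical.

```python
class RebalanceWork:
    """Toy model of the rebalance_work btree."""

    def __init__(self):
        self.set_positions = set()  # positions that hold a KEY_TYPE_set
        self.cookie = 0             # counter of the KEY_TYPE_cookie key

    def extent_trigger(self, pos, needs_rebalance):
        """Mirror an extent's bch_extent_rebalance state into the bitset."""
        if needs_rebalance:
            self.set_positions.add(pos)
        else:
            self.set_positions.discard(pos)

    def option_changed(self):
        """An IO path option changed, so a (re)scan is required."""
        self.cookie += 1

    def finish_scan(self, cookie_at_start):
        """Return True if the scan covered every option change."""
        if self.cookie == cookie_at_start:
            self.cookie = 0  # nothing changed mid-scan
            return True
        return False         # an option was flipped mid-scan; scan again
```

A scan records the counter when it starts; if `finish_scan` sees a different value, an option was flipped during the scan and another scan is still needed.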
  9. Oct 22, 2023