    gdb: move displaced stepping logic to gdbarch, allow starting concurrent displaced steps · 187b041e
    Simon Marchi authored
    Today, GDB only allows a single displaced stepping operation to happen
    per inferior at a time.  There is a single displaced stepping buffer per
    inferior, whose address is fixed (obtained with
    gdbarch_displaced_step_location), managed by infrun.c.
    
    In the case of the AMD ROCm target [1] (in the context of which this
    work has been done), it is typical to have thousands of threads (or
    waves, in SMT terminology) executing the same code, hitting the same
    breakpoint (possibly conditional) and needing to displaced step it at
    the same time.  The limitation of only one displaced step executing at
    any given time becomes a real bottleneck.
    
    To fix this bottleneck, we want to make it possible for threads of the
    same inferior to execute multiple displaced steps in parallel.  This
    patch builds the foundation for that.
    
    In essence, this patch moves the task of preparing a displaced step and
    cleaning up after to gdbarch functions.  This allows using different
    schemes for allocating and managing displaced stepping buffers for
    different platforms.  The gdbarch decides how to assign a buffer to a
    thread that needs to execute a displaced step.
    
    On the ROCm target, we are able to allocate one displaced stepping
    buffer per thread, so a thread will never have to wait to execute a
    displaced step.
    
    On Linux, the entry point of the executable is used as the displaced
    stepping buffer, since we assume that this code won't get used after
    startup.  From what I saw (I checked with a binary generated against
    glibc and musl), on AMD64 we have enough space there to fit two
    displaced stepping buffers.  A subsequent patch makes AMD64/Linux use
    two buffers.
    
    In addition to having multiple displaced stepping buffers, there is also
    the idea of sharing displaced stepping buffers between threads.  Two
    threads doing displaced steps for the same PC could use the same buffer
    at the same time.  Two threads stepping over the same instruction (same
    opcode) at two different PCs may also be able to share a displaced
    stepping buffer.  This is an idea for future patches, but the
    architecture built by this patch is made to allow this.
    
    Now, the implementation details.  The main part of this patch is moving
    the responsibility of preparing and finishing a displaced step to the
    gdbarch.  Before this patch, preparing a displaced step is driven by the
    displaced_step_prepare_throw function.  It does some calls to the
    gdbarch to do some low-level operations, but the high-level logic is
    there.  The steps are roughly:
    
    - Ask the gdbarch for the displaced step buffer location
    - Save the existing bytes in the displaced step buffer
    - Ask the gdbarch to copy the instruction into the displaced step buffer
    - Set the pc of the thread to the beginning of the displaced step buffer
    
    Similarly, the "fixup" phase, executed after the instruction was
    successfully single-stepped, is driven by the infrun code (function
    displaced_step_finish).  The steps are roughly:
    
    - Restore the original bytes in the displaced stepping buffer
    - Ask the gdbarch to fixup the instruction result (adjust the target's
      registers or memory to do as if the instruction had been executed in
      its original location)
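    
    The pre-patch prepare and fixup sequences above can be sketched on a toy
    model.  This is only an illustration, not GDB code: Memory, ToyThread,
    SingleBufferState, toy_prepare and toy_finish are hypothetical names,
    instructions are assumed to have a fixed length, and the "fixup" only
    relocates the PC.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Toy model of inferior memory.  All names are hypothetical, not GDB's API.
using Memory = std::vector<std::uint8_t>;

struct ToyThread
{
  std::size_t pc;
};

// The single, fixed-address buffer that exists per inferior before the patch.
struct SingleBufferState
{
  std::size_t buffer_addr;                 // fixed buffer location
  std::size_t insn_len;                    // toy fixed instruction length
  std::optional<std::size_t> original_pc;  // set while a step is in progress
  std::vector<std::uint8_t> saved;         // bytes overwritten in the buffer
};

// Prepare: save the buffer bytes, copy the instruction into the buffer and
// point the thread's PC at the buffer.  Returns false if the buffer is busy.
bool toy_prepare (Memory &mem, ToyThread &thr, SingleBufferState &st)
{
  if (st.original_pc.has_value ())
    return false;

  assert (st.buffer_addr + st.insn_len <= mem.size ());
  assert (thr.pc + st.insn_len <= mem.size ());

  st.saved.assign (mem.data () + st.buffer_addr,
                   mem.data () + st.buffer_addr + st.insn_len);
  std::copy_n (mem.data () + thr.pc, st.insn_len,
               mem.data () + st.buffer_addr);
  st.original_pc = thr.pc;
  thr.pc = st.buffer_addr;
  return true;
}

// Finish: restore the buffer bytes, then adjust the PC as if the instruction
// had executed at its original location.
void toy_finish (Memory &mem, ToyThread &thr, SingleBufferState &st)
{
  assert (st.original_pc.has_value ());

  std::copy (st.saved.begin (), st.saved.end (),
             mem.data () + st.buffer_addr);
  thr.pc = *st.original_pc + (thr.pc - st.buffer_addr);
  st.original_pc.reset ();
}
```

    In this sketch a second thread calling toy_prepare while the buffer is
    in use simply gets false back, which is the one-step-at-a-time
    limitation described earlier.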
    
    The displaced_step_inferior_state::step_thread field indicates which
    thread (if any) is currently using the displaced stepping buffer, so it
    is used by displaced_step_prepare_throw to check if the displaced
    stepping buffer is free to use or not.
    
    This patch defers the whole task of preparing and cleaning up after a
    displaced step to the gdbarch.  Two new main gdbarch methods are added,
    with the following semantics:
    
      - gdbarch_displaced_step_prepare: Prepare for the given thread to
        execute a displaced step of the instruction located at its current PC.
        Upon return, everything should be ready for GDB to resume the thread
        (with either a single step or continue, as indicated by
        gdbarch_displaced_step_hw_singlestep) to make it displaced step the
        instruction.
    
      - gdbarch_displaced_step_finish: Called when the thread stopped after
        having started a displaced step.  Verify whether the instruction
        was executed and, if so, apply any fixup required to compensate for
        the fact that the instruction was executed at a different place than
        its original pc.  Release any resources that were allocated for this
        displaced step.  Upon return, everything should be ready for GDB to
        resume the thread in its "normal" code path.
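    
    To illustrate the division of labour, here is a minimal sketch of an
    architecture that owns a small pool of buffers and hands them out from
    its prepare method.  It is not the actual gdbarch interface: ToyArch,
    ToyBuffer and PrepareStatus are hypothetical names, threads are plain
    integers, and the fixup is reduced to relocating the PC.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result of a prepare request, for illustration only.
enum class PrepareStatus { ok, unavailable };

struct ToyBuffer
{
  std::size_t addr;         // where the buffer lives
  int owner_thread;         // -1 when the buffer is free
  std::size_t original_pc;  // PC of the thread before it was redirected
};

// The architecture, not the core, knows how many buffers exist and where.
class ToyArch
{
public:
  explicit ToyArch (const std::vector<std::size_t> &buffer_addrs)
  {
    for (std::size_t addr : buffer_addrs)
      m_buffers.push_back (ToyBuffer {addr, -1, 0});
  }

  // Assign a free buffer to the thread and redirect its PC, or report that
  // no buffer is available right now.
  PrepareStatus prepare (int thread_id, std::size_t &pc)
  {
    for (ToyBuffer &buf : m_buffers)
      if (buf.owner_thread == -1)
        {
          buf.owner_thread = thread_id;
          buf.original_pc = pc;
          pc = buf.addr;
          return PrepareStatus::ok;
        }

    return PrepareStatus::unavailable;
  }

  // Relocate the PC back to the original code and release the buffer.
  void finish (int thread_id, std::size_t &pc)
  {
    for (ToyBuffer &buf : m_buffers)
      if (buf.owner_thread == thread_id)
        {
          pc = buf.original_pc + (pc - buf.addr);
          buf.owner_thread = -1;
          return;
        }

    assert (false && "thread has no displaced step in progress");
  }

private:
  std::vector<ToyBuffer> m_buffers;
};
```

    With two buffer addresses, two threads can be prepared at once and a
    third gets PrepareStatus::unavailable until one of them finishes.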
    
    The displaced_step_prepare_throw function now pretty much just offloads
    to gdbarch_displaced_step_prepare and the displaced_step_finish function
    offloads to gdbarch_displaced_step_finish.
    
    The gdbarch_displaced_step_location method is now unnecessary, so it is
    removed.  Indeed, the core of GDB doesn't know how many displaced step
    buffers there are nor where they are.
    
    To keep the existing behavior for existing architectures, the logic that
    was previously implemented in infrun.c for preparing and finishing a
    displaced step is moved to displaced-stepping.c, to the
    displaced_step_buffer class.  Architectures are modified to implement
    the new gdbarch methods using this class.  The behavior is not expected
    to change.
    
    The other important change (which arises from the above) is that the
    core of GDB no longer prevents concurrent displaced steps.  Before this
    patch, start_step_over walks the global step over chain and tries to
    initiate a step over (whether it is in-line or displaced).  It follows
    these rules:
    
      - if an in-line step is in progress (in any inferior), don't start any
        other step over
      - if a displaced step is in progress for an inferior, don't start
        another displaced step for that inferior
    
    After starting a displaced step for a given inferior, it won't start
    another displaced step for that inferior.
    
    In the new code, start_step_over simply tries to initiate step overs for
    all the threads in the list.  But because threads may be added back to
    the global list as it iterates the global list, trying to initiate step
    overs, start_step_over now starts by stealing the global queue into a
    local queue and iterates on the local queue.  In the typical case, each
    thread will either:
    
      - have initiated a displaced step and be resumed
      - have been added back to the global step over queue by
        displaced_step_prepare_throw, because the gdbarch will have returned
        that there aren't enough resources (i.e. buffers) to initiate a
        displaced step for that thread
    
    Lastly, if start_step_over initiates an in-line step, it stops
    iterating, and moves back whatever remaining threads it had in its local
    step over queue to the global step over queue.
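    
    The queue handling can be sketched as follows.  This is a simplified
    model, not the code in infrun.c: toy_start_step_over, StepKind and the
    try_step callback are hypothetical, threads are plain integers, and the
    re-queueing that the text attributes to displaced_step_prepare_throw is
    done directly in the loop.

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Hypothetical outcome of trying to start a step over for one thread.
enum class StepKind
{
  displaced_started,      // a displaced step was started, thread resumed
  displaced_unavailable,  // no buffer available, thread must wait
  inline_started          // an in-line step was started
};

// Try to start step overs for every queued thread.  Returns the number of
// step overs that were started.
int toy_start_step_over (std::deque<int> &global_queue,
                         const std::function<StepKind (int)> &try_step)
{
  // Steal the global queue so that threads re-queued during the walk are
  // not visited again in this pass.
  std::deque<int> local;
  local.swap (global_queue);

  int started = 0;
  while (!local.empty ())
    {
      int thread_id = local.front ();
      local.pop_front ();

      switch (try_step (thread_id))
        {
        case StepKind::displaced_started:
          ++started;
          break;

        case StepKind::displaced_unavailable:
          // Not enough resources: back to the global queue.
          global_queue.push_back (thread_id);
          break;

        case StepKind::inline_started:
          ++started;
          // Stop iterating and move the remaining threads back.
          global_queue.insert (global_queue.end (),
                               local.begin (), local.end ());
          return started;
        }
    }

  return started;
}
```

    Iterating on a local copy is what keeps the loop finite here: a thread
    that cannot get a buffer lands on the global queue, which this pass no
    longer reads.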
    
    Two other gdbarch methods are added, to handle some slightly annoying
    corner cases.  They feel awkwardly specific to these cases, but I don't
    see any way around them:
    
      - gdbarch_displaced_step_copy_insn_closure_by_addr: in
        arm_pc_is_thumb, arm-tdep.c wants to get the closure for a given
        buffer address.
    
      - gdbarch_displaced_step_restore_all_in_ptid: when a process forks
        (at least on Linux), the address space is copied.  If some displaced
        step buffers were in use at the time of the fork, we need to restore
        the original bytes in the child's address space.
    
    These two adjustments are also made in infrun.c:
    
      - prepare_for_detach: there may be multiple threads doing displaced
        steps when we detach, so wait until all of them are done
    
      - handle_inferior_event: when we handle a fork event for a given
        thread, it's possible that other threads are doing a displaced step at
        the same time.  Make sure to restore the displaced step buffer
        contents in the child for them.
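    
    The fork case can be pictured with a toy model in which the child's
    address space starts as a byte-for-byte copy of the parent's.  The
    names below (Memory, InUseBuffer, toy_restore_all_in_child) are
    hypothetical and only illustrate the idea of writing the saved bytes
    back into the child for every buffer that was in use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of an address space.  All names are hypothetical.
using Memory = std::vector<std::uint8_t>;

// A buffer that some thread was using when the fork happened.
struct InUseBuffer
{
  std::size_t addr;                       // buffer location
  std::vector<std::uint8_t> saved_bytes;  // original contents of the buffer
};

// Write the original bytes back into the child's copy of every in-use
// buffer.  The parent's memory is left alone, since its threads are still
// in the middle of their displaced steps.
void toy_restore_all_in_child (Memory &child_mem,
                               const std::vector<InUseBuffer> &in_use)
{
  for (const InUseBuffer &buf : in_use)
    {
      assert (buf.addr + buf.saved_bytes.size () <= child_mem.size ());
      std::copy (buf.saved_bytes.begin (), buf.saved_bytes.end (),
                 child_mem.data () + buf.addr);
    }
}
```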
    
    [1] https://github.com/ROCm-Developer-Tools/ROCgdb
    
    gdb/ChangeLog:
    
    	* displaced-stepping.h (struct
    	displaced_step_copy_insn_closure): Adjust comments.
    	(struct displaced_step_inferior_state) <step_thread,
    	step_gdbarch, step_closure, step_original, step_copy,
    	step_saved_copy>: Remove fields.
    	(struct displaced_step_thread_state): New.
    	(struct displaced_step_buffer): New.
    	* displaced-stepping.c (displaced_step_buffer::prepare): New.
    	(write_memory_ptid): Move from infrun.c.
    	(displaced_step_instruction_executed_successfully): New,
    	factored out of displaced_step_finish.
    	(displaced_step_buffer::finish): New.
    	(displaced_step_buffer::copy_insn_closure_by_addr): New.
    	(displaced_step_buffer::restore_in_ptid): New.
    	* gdbarch.sh (displaced_step_location): Remove.
    	(displaced_step_prepare, displaced_step_finish,
    	displaced_step_copy_insn_closure_by_addr,
    	displaced_step_restore_all_in_ptid): New.
    	* gdbarch.c: Re-generate.
    	* gdbarch.h: Re-generate.
    	* gdbthread.h (class thread_info) <displaced_step_state>: New
    	field.
    	(thread_step_over_chain_remove): New declaration.
    	(thread_step_over_chain_next): New declaration.
    	(thread_step_over_chain_length): New declaration.
    	* thread.c (thread_step_over_chain_remove): Make non-static.
    	(thread_step_over_chain_next): New.
    	(global_thread_step_over_chain_next): Use
    	thread_step_over_chain_next.
    	(thread_step_over_chain_length): New.
    	(global_thread_step_over_chain_enqueue): Add debug print.
    	(global_thread_step_over_chain_remove): Add debug print.
    	* infrun.h (get_displaced_step_copy_insn_closure_by_addr):
    	Remove.
    	* infrun.c (get_displaced_stepping_state): New.
    	(displaced_step_in_progress_any_inferior): Remove.
    	(displaced_step_in_progress_thread): Adjust.
    	(displaced_step_in_progress): Adjust.
    	(displaced_step_in_progress_any_thread): New.
    	(get_displaced_step_copy_insn_closure_by_addr): Remove.
    	(gdbarch_supports_displaced_stepping): Use
    	gdbarch_displaced_step_prepare_p.
    	(displaced_step_reset): Change parameter from inferior to
    	thread.
    	(displaced_step_prepare_throw): Implement using
    	gdbarch_displaced_step_prepare.
    	(write_memory_ptid): Move to displaced-stepping.c.
    	(displaced_step_restore): Remove.
    	(displaced_step_finish): Implement using
    	gdbarch_displaced_step_finish.
    	(start_step_over): Allow starting more than one displaced step.
    	(prepare_for_detach): Handle possibly multiple threads doing
    	displaced steps.
    	(handle_inferior_event): Handle possibility that fork event
    	happens while another thread displaced steps.
    	* linux-tdep.h (linux_displaced_step_prepare): New.
    	(linux_displaced_step_finish): New.
    	(linux_displaced_step_copy_insn_closure_by_addr): New.
    	(linux_displaced_step_restore_all_in_ptid): New.
    	(linux_init_abi): Add supports_displaced_step parameter.
    	* linux-tdep.c (struct linux_info) <disp_step_buf>: New field.
    	(linux_displaced_step_prepare): New.
    	(linux_displaced_step_finish): New.
    	(linux_displaced_step_copy_insn_closure_by_addr): New.
    	(linux_displaced_step_restore_all_in_ptid): New.
    	(linux_init_abi): Add supports_displaced_step parameter,
    	register displaced step methods if true.
    	(_initialize_linux_tdep): Register inferior_execd observer.
    	* amd64-linux-tdep.c (amd64_linux_init_abi_common): Add
    	supports_displaced_step parameter, adjust call to
    	linux_init_abi.  Remove call to
    	set_gdbarch_displaced_step_location.
    	(amd64_linux_init_abi): Adjust call to
    	amd64_linux_init_abi_common.
    	(amd64_x32_linux_init_abi): Likewise.
    	* aarch64-linux-tdep.c (aarch64_linux_init_abi): Adjust call to
    	linux_init_abi.  Remove call to
    	set_gdbarch_displaced_step_location.
    	* arm-linux-tdep.c (arm_linux_init_abi): Likewise.
    	* i386-linux-tdep.c (i386_linux_init_abi): Likewise.
    	* alpha-linux-tdep.c (alpha_linux_init_abi): Adjust call to
    	linux_init_abi.
    	* arc-linux-tdep.c (arc_linux_init_osabi): Likewise.
    	* bfin-linux-tdep.c (bfin_linux_init_abi): Likewise.
    	* cris-linux-tdep.c (cris_linux_init_abi): Likewise.
    	* csky-linux-tdep.c (csky_linux_init_abi): Likewise.
    	* frv-linux-tdep.c (frv_linux_init_abi): Likewise.
    	* hppa-linux-tdep.c (hppa_linux_init_abi): Likewise.
    	* ia64-linux-tdep.c (ia64_linux_init_abi): Likewise.
    	* m32r-linux-tdep.c (m32r_linux_init_abi): Likewise.
    	* m68k-linux-tdep.c (m68k_linux_init_abi): Likewise.
    	* microblaze-linux-tdep.c (microblaze_linux_init_abi): Likewise.
    	* mips-linux-tdep.c (mips_linux_init_abi): Likewise.
    	* mn10300-linux-tdep.c (am33_linux_init_osabi): Likewise.
    	* nios2-linux-tdep.c (nios2_linux_init_abi): Likewise.
    	* or1k-linux-tdep.c (or1k_linux_init_abi): Likewise.
    	* riscv-linux-tdep.c (riscv_linux_init_abi): Likewise.
    	* s390-linux-tdep.c (s390_linux_init_abi_any): Likewise.
    	* sh-linux-tdep.c (sh_linux_init_abi): Likewise.
    	* sparc-linux-tdep.c (sparc32_linux_init_abi): Likewise.
    	* sparc64-linux-tdep.c (sparc64_linux_init_abi): Likewise.
    	* tic6x-linux-tdep.c (tic6x_uclinux_init_abi): Likewise.
    	* tilegx-linux-tdep.c (tilegx_linux_init_abi): Likewise.
    	* xtensa-linux-tdep.c (xtensa_linux_init_abi): Likewise.
    	* ppc-linux-tdep.c (ppc_linux_init_abi): Adjust call to
    	linux_init_abi.  Remove call to
    	set_gdbarch_displaced_step_location.
    	* arm-tdep.c (arm_pc_is_thumb): Call
    	gdbarch_displaced_step_copy_insn_closure_by_addr instead of
    	get_displaced_step_copy_insn_closure_by_addr.
    	* rs6000-aix-tdep.c (rs6000_aix_init_osabi): Adjust calls to
    	clear gdbarch methods.
    	* rs6000-tdep.c (struct ppc_inferior_data): New structure.
    	(get_ppc_per_inferior): New function.
    	(ppc_displaced_step_prepare): New function.
    	(ppc_displaced_step_finish): New function.
    	(ppc_displaced_step_restore_all_in_ptid): New function.
    	(rs6000_gdbarch_init): Register new gdbarch methods.
    	* s390-tdep.c (s390_gdbarch_init): Don't call
    	set_gdbarch_displaced_step_location, set new gdbarch methods.
    
    gdb/testsuite/ChangeLog:
    
    	* gdb.arch/amd64-disp-step-avx.exp: Adjust pattern.
    	* gdb.threads/forking-threads-plus-breakpoint.exp: Likewise.
    	* gdb.threads/non-stop-fair-events.exp: Likewise.
    
    Change-Id: I387cd235a442d0620ec43608fd3dc0097fcbf8c8