Skip to content

ld: Always consider -L args and allow a Gentoo-specific prefix override#6

Open
lu-zero wants to merge 2 commits into
gentoo:gentoo/binutils-2.43from
lu-zero:gentoo-2.43
Open

ld: Always consider -L args and allow a Gentoo-specific prefix override#6
lu-zero wants to merge 2 commits into
gentoo:gentoo/binutils-2.43from
lu-zero:gentoo-2.43

Conversation

@lu-zero

@lu-zero lu-zero commented Aug 5, 2024

Copy link
Copy Markdown
Member

Rebase of #5

chewi added 2 commits August 5, 2024 21:17
This is for consistency with other linkers, including gold. Without
this, we typically rely on ld.so.conf to find libraries such as
libstdc++.so.6, while other linkers do not use this file at all.
This prefix is only used to locate $prefix/etc/ld.so.conf, with $prefix
usually being /usr. This file is important on Gentoo Prefix systems,
where the /usr prefix is within another directory. The problem is that
Gentoo already passes the same directory as the sysroot, and ld.bfd
therefore looks for /myprefix/myprefix/usr/etc/ld.so.conf.

The sysroot is dynamic, while the prefix is hardcoded. A hardcoded
prefix that isn't just /usr is unhelpful, not just because of the
doubled prefix issue above, but also because it prevents ld.bfd from
working effectively outside its native environment. We will therefore
hardcode it to just /usr.

This change does not simply do that though and sets up a $gentoo_prefix
variable instead, with $prefix as a fallback. This is necessary because
Gentoo prefix-guest systems, which use the host's libc, do not apply a
sysroot like RAP prefix systems do. In that case, we must preserve the
existing behaviour. The binutils ebuild will be responsible for setting
this variable appropriately.
@chewi

chewi commented Aug 7, 2024

Copy link
Copy Markdown
Member

I've rebased #5 now, so you can close this. Thanks!

gentoo-bot pushed a commit that referenced this pull request Nov 28, 2024
The commit:

  commit c6b4867
  Date:   Thu Mar 30 19:21:22 2023 +0100

      gdb: parse pending breakpoint thread/task immediately

Introduce a use bug where the value of a temporary variable was being
used after it had gone out of scope.  This was picked up by the
address sanitizer and would result in this error:

  (gdb) maintenance selftest create_breakpoint_parse_arg_string
  Running selftest create_breakpoint_parse_arg_string.
  =================================================================
  ==2265825==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fbb08046511 at pc 0x000001632230 bp 0x7fff7c2fb770 sp 0x7fff7c2fb768
  READ of size 1 at 0x7fbb08046511 thread T0
      #0 0x163222f in create_breakpoint_parse_arg_string(char const*, std::unique_ptr<char, gdb::xfree_deleter<char> >*, int*, int*, int*, std::unique_ptr<char, gdb::xfree_deleter<char> >*, bool*) ../../src/gdb/break-cond-parse.c:496
      #1 0x1633026 in test ../../src/gdb/break-cond-parse.c:582
      #2 0x163391b in create_breakpoint_parse_arg_string_tests ../../src/gdb/break-cond-parse.c:649
      #3 0x12cfebc in void std::__invoke_impl<void, void (*&)()>(std::__invoke_other, void (*&)()) /usr/include/c++/13/bits/invoke.h:61
      #4 0x12cc8ee in std::enable_if<is_invocable_r_v<void, void (*&)()>, void>::type std::__invoke_r<void, void (*&)()>(void (*&)()) /usr/include/c++/13/bits/invoke.h:111
      #5 0x12c81e5 in std::_Function_handler<void (), void (*)()>::_M_invoke(std::_Any_data const&) /usr/include/c++/13/bits/std_function.h:290
      #6 0x18bb51d in std::function<void ()>::operator()() const /usr/include/c++/13/bits/std_function.h:591
      #7 0x4193ef9 in selftests::run_tests(gdb::array_view<char const* const>, bool) ../../src/gdbsupport/selftest.cc:100
      #8 0x21c2206 in maintenance_selftest ../../src/gdb/maint.c:1172
      ... etc ...

The problem was caused by three lines like this one:

  thread_info *thr
    = parse_thread_id (std::string (t.get_value ()).c_str (), &tmptok);

After parsing the thread-id TMPTOK would be left pointing into the
temporary string which had been created on this line.  When on the
next line we did this:

  gdb_assert (*tmptok == '\0');

The value of *TMPTOK is undefined.

Fix this by creating the std::string earlier in the scope.  Now the
contents of the string will remain valid when we check *TMPTOK.  The
address sanitizer issue is now resolved.
gentoo-bot pushed a commit that referenced this pull request Nov 28, 2024
The binary provided with bug 32165 [1] has 36139 ELF sections.  GDB
crashes on it with (note that my GDB is build with -D_GLIBCXX_DEBUG=1:

    $ ./gdb  -nx -q --data-directory=data-directory ./vmlinux
    Reading symbols from ./vmlinux...
    (No debugging symbols found in ./vmlinux)
    (gdb) info func
    /usr/include/c++/14.2.1/debug/vector:508:
    In function:
        std::debug::vector<_Tp, _Allocator>::reference std::debug::vector<_Tp,
        _Allocator>::operator[](size_type) [with _Tp = long unsigned int;
        _Allocator = std::allocator<long unsigned int>; reference = long
        unsigned int&; size_type = long unsigned int]

    Error: attempt to subscript container with out-of-bounds index -29445, but
    container only holds 36110 elements.

    Objects involved in the operation:
        sequence "this" @ 0x514000007340 {
          type = std::debug::vector<unsigned long, std::allocator<unsigned long> >;
        }

The crash occurs here:

    #3  0x00007ffff5e334c3 in __GI_abort () at abort.c:79
    #4  0x00007ffff689afc4 in __gnu_debug::_Error_formatter::_M_error (this=<optimized out>) at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/debug.cc:1320
    #5  0x0000555561119a16 in std::__debug::vector<unsigned long, std::allocator<unsigned long> >::operator[] (this=0x514000007340, __n=18446744073709522171)
        at /usr/include/c++/14.2.1/debug/vector:508
    #6  0x0000555562e288e8 in minimal_symbol::value_address (this=0x5190000bb698, objfile=0x514000007240) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:517
    #7  0x0000555562e5a131 in global_symbol_searcher::expand_symtabs (this=0x7ffff0f5c340, objfile=0x514000007240, preg=std::optional [no contained value])
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:4983
    #8  0x0000555562e5d2ed in global_symbol_searcher::search (this=0x7ffff0f5c340) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5189
    #9  0x0000555562e5ffa4 in symtab_symbol_info (quiet=false, exclude_minsyms=false, regexp=0x0, kind=FUNCTION_DOMAIN, t_regexp=0x0, from_tty=1)
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5361
    #10 0x0000555562e6131b in info_functions_command (args=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5525

That is, at this line of `minimal_symbol::value_address`, where
`objfile->section_offsets` is an `std::vector`:

    return (CORE_ADDR (this->unrelocated_address ())
	    + objfile->section_offsets[this->section_index ()]);

A section index of -29445 is suspicious.  The minimal_symbol at play
here is:

    (top-gdb) p m_name
    $1 = 0x521001de10af "_sinittext"

So I restarted debugging, breaking on:

   (top-gdb) b general_symbol_info::set_section_index if $_streq("_sinittext", m_name)

And I see that weird -29445 value:

    (top-gdb) frame
    #0  general_symbol_info::set_section_index (this=0x525000082390, idx=-29445) at /home/smarchi/src/binutils-gdb/gdb/symtab.h:611
    611       { m_section = idx; }

But going up one frame, the section index is 36091:

    (top-gdb) frame
    #1  0x0000555562426526 in minimal_symbol_reader::record_full (this=0x7ffff0ead560, name="_sinittext", copy_name=false,
        address=-2111475712, ms_type=mst_text, section=36091) at /home/smarchi/src/binutils-gdb/gdb/minsyms.c:1228
    1228      msymbol->set_section_index (section);

It seems like the problem is just that the type used for the section
index (short) is not big enough.  Change from short to int.  If somebody
insists, we could even go long long / int64_t, but I doubt it's
necessary.

With that fixed, I get:

    (gdb) info func
    All defined functions:

    Non-debugging symbols:
    0xffffffff81000000  _stext
    0xffffffff82257000  _sinittext
    0xffffffff822b4ebb  _einittext

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=32165

Change-Id: Icb1c3de9474ff5adef7e0bbbf5e0b67b279dee04
Reviewed-By: Tom de Vries <tdevries@suse.de>
Reviewed-by: Keith Seitz <keiths@redhat.com>
gentoo-bot pushed a commit that referenced this pull request Nov 28, 2024
When building gdb with gcc 12 and -fsanitize=threads while renabling
background dwarf reading by setting dwarf_synchronous to false, I run into:
...
(gdb) file amd64-watchpoint-downgrade
Reading symbols from amd64-watchpoint-downgrade...
(gdb) watch global_var
==================
WARNING: ThreadSanitizer: data race (pid=20124)
  Read of size 8 at 0x7b80000500d8 by main thread:
    #0 cooked_index_entry::full_name(obstack*, bool) const cooked-index.c:220
    #1 cooked_index::get_main_name(obstack*, language*) const cooked-index.c:735
    #2 cooked_index_worker::wait(cooked_state, bool) cooked-index.c:559
    #3 cooked_index::wait(cooked_state, bool) cooked-index.c:631
    #4 cooked_index_functions::wait(objfile*, bool) cooked-index.h:729
    #5 cooked_index_functions::compute_main_name(objfile*) cooked-index.h:806
    #6 objfile::compute_main_name() symfile-debug.c:461
    #7 find_main_name symtab.c:6503
    #8 main_language() symtab.c:6608
    #9 set_initial_language_callback symfile.c:1634
    #10 get_current_language() language.c:96
    ...

  Previous write of size 8 at 0x7b80000500d8 by thread T1:
    #0 cooked_index_shard::finalize(parent_map_map const*) \
         dwarf2/cooked-index.c:409
    #1 operator() cooked-index.c:663
    ...

  ...

SUMMARY: ThreadSanitizer: data race cooked-index.c:220 in \
  cooked_index_entry::full_name(obstack*, bool) const
==================
Hardware watchpoint 1: global_var
(gdb) PASS: gdb.arch/amd64-watchpoint-downgrade.exp: watch global_var
...

This was also reported in PR31715.

This is due do gcc PR110799 [1], generating wrong code with
-fhoist-adjacent-loads, and causing a false positive for
-fsanitize=threads.

Work around the gcc PR by forcing -fno-hoist-adjacent-loads for gcc <= 13
and -fsanitize=threads.

Tested in that same configuration on x86_64-linux.  Remaining ThreadSanitizer
problems are the ones reported in PR31626 (gdb.rust/dwindex.exp) and
PR32247 (gdb.trace/basic-libipa.exp).

PR gdb/31715
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31715

Tested-By: Bernd Edlinger <bernd.edlinger@hotmail.de>
Approved-By: Tom Tromey <tom@tromey.com>

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799
gentoo-bot pushed a commit that referenced this pull request Nov 28, 2024
When calling a function with double arguments, I get this asan error:

==7920==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x0053131ece38 at pc 0x7ff79697a68f bp 0x0053131ec790 sp 0x0053131ebf40
READ of size 16 at 0x0053131ece38 thread T0
    #0 0x7ff79697a68e in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long long), void const*, void const*, unsigned long long) C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:814
    #1 0x7ff79697aebd in memcmp C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:845
    #2 0x7ff79697aebd in memcmp C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:840
    #3 0x7ff7927e237f in regcache::raw_write(int, gdb::array_view<unsigned char const>) C:/gdb/src/gdb.git/gdb/regcache.c:874
    #4 0x7ff7927e3c85 in regcache::cooked_write(int, gdb::array_view<unsigned char const>) C:/gdb/src/gdb.git/gdb/regcache.c:914
    #5 0x7ff7927e5d89 in regcache::cooked_write(int, unsigned char const*) C:/gdb/src/gdb.git/gdb/regcache.c:933
    #6 0x7ff7911d5965 in amd64_windows_store_arg_in_reg C:/gdb/src/gdb.git/gdb/amd64-windows-tdep.c:216

Address 0x0053131ece38 is located in stack of thread T0 at offset 40 in frame
    #0 0x7ff7911d565f in amd64_windows_store_arg_in_reg C:/gdb/src/gdb.git/gdb/amd64-windows-tdep.c:208

  This frame has 4 object(s):
    [32, 40) 'buf' (line 211) <== Memory access at offset 40 overflows this variable

It's because the first 4 double arguments are passed via XMM registers,
and they need a buffer of 16 bytes, even if we only use 8 bytes of them.

Approved-By: Tom Tromey <tom@tromey.com>
gentoo-bot pushed a commit that referenced this pull request Nov 28, 2024
On Windows gcore is not implemented, and if you try it, you get an
heap-use-after-free error:

(gdb) gcore C:/gdb/build64/gdb-git-python3/gdb/testsuite/outputs/gdb.base/gcore-buffer-overflow/gcore-buffer-overflow.test
warning: cannot close "=================================================================
==10108==ERROR: AddressSanitizer: heap-use-after-free on address 0x1259ea503110 at pc 0x7ff6806e3936 bp 0x0062e01ed990 sp 0x0062e01ed140
READ of size 111 at 0x1259ea503110 thread T0
    #0 0x7ff6806e3935 in strlen C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:391
    #1 0x7ff6807169c4 in __pformat_puts C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_pformat.c:558
    #2 0x7ff6807186c1 in __mingw_pformat C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_pformat.c:2514
    #3 0x7ff680713614 in __mingw_vsnprintf C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_vsnprintf.c:41
    #4 0x7ff67f34419f in vsnprintf(char*, unsigned long long, char const*, char*) C:/msys64/mingw64/x86_64-w64-mingw32/include/stdio.h:484
    #5 0x7ff67f34419f in string_vprintf[abi:cxx11](char const*, char*) C:/gdb/src/gdb.git/gdbsupport/common-utils.cc:106
    #6 0x7ff67b37b739 in cli_ui_out::do_message(ui_file_style const&, char const*, char*) C:/gdb/src/gdb.git/gdb/cli-out.c:227
    #7 0x7ff67ce3d030 in ui_out::call_do_message(ui_file_style const&, char const*, ...) C:/gdb/src/gdb.git/gdb/ui-out.c:571
    #8 0x7ff67ce4255a in ui_out::vmessage(ui_file_style const&, char const*, char*) C:/gdb/src/gdb.git/gdb/ui-out.c:740
    #9 0x7ff67ce2c873 in ui_file::vprintf(char const*, char*) C:/gdb/src/gdb.git/gdb/ui-file.c:73
    #10 0x7ff67ce7f83d in gdb_vprintf(ui_file*, char const*, char*) C:/gdb/src/gdb.git/gdb/utils.c:1881
    #11 0x7ff67ce7f83d in vwarning(char const*, char*) C:/gdb/src/gdb.git/gdb/utils.c:181
    #12 0x7ff67f3530eb in warning(char const*, ...) C:/gdb/src/gdb.git/gdbsupport/errors.cc:33
    #13 0x7ff67baed27f in gdb_bfd_close_warning C:/gdb/src/gdb.git/gdb/gdb_bfd.c:437
    #14 0x7ff67baed27f in gdb_bfd_close_or_warn C:/gdb/src/gdb.git/gdb/gdb_bfd.c:646
    #15 0x7ff67baed27f in gdb_bfd_unref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.c:739
    #16 0x7ff68094b6f2 in gdb_bfd_ref_policy::decref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.h:82
    #17 0x7ff68094b6f2 in gdb::ref_ptr<bfd, gdb_bfd_ref_policy>::~ref_ptr() C:/gdb/src/gdb.git/gdbsupport/gdb_ref_ptr.h:91
    #18 0x7ff67badf4d2 in gcore_command C:/gdb/src/gdb.git/gdb/gcore.c:176

0x1259ea503110 is located 16 bytes inside of 4064-byte region [0x1259ea503100,0x1259ea5040e0)
freed by thread T0 here:
    #0 0x7ff6806b1687 in free C:/gcc/src/gcc-14.2.0/libsanitizer/asan/asan_malloc_win.cpp:90
    #1 0x7ff67f2ae807 in objalloc_free C:/gdb/src/gdb.git/libiberty/objalloc.c:187
    #2 0x7ff67d7f56e3 in _bfd_free_cached_info C:/gdb/src/gdb.git/bfd/opncls.c:247
    #3 0x7ff67d7f2782 in _bfd_delete_bfd C:/gdb/src/gdb.git/bfd/opncls.c:180
    #4 0x7ff67d7f5df9 in bfd_close_all_done C:/gdb/src/gdb.git/bfd/opncls.c:960
    #5 0x7ff67d7f62ec in bfd_close C:/gdb/src/gdb.git/bfd/opncls.c:925
    #6 0x7ff67baecd27 in gdb_bfd_close_or_warn C:/gdb/src/gdb.git/gdb/gdb_bfd.c:643
    #7 0x7ff67baecd27 in gdb_bfd_unref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.c:739
    #8 0x7ff68094b6f2 in gdb_bfd_ref_policy::decref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.h:82
    #9 0x7ff68094b6f2 in gdb::ref_ptr<bfd, gdb_bfd_ref_policy>::~ref_ptr() C:/gdb/src/gdb.git/gdbsupport/gdb_ref_ptr.h:91
    #10 0x7ff67badf4d2 in gcore_command C:/gdb/src/gdb.git/gdb/gcore.c:176

It happens because gdb_bfd_close_or_warn uses a bfd-internal name for
the failing-close warning, after the close is finished, and the name
already freed:

static int
gdb_bfd_close_or_warn (struct bfd *abfd)
{
  int ret;
  const char *name = bfd_get_filename (abfd);

  for (asection *sect : gdb_bfd_sections (abfd))
    free_one_bfd_section (sect);

  ret = bfd_close (abfd);

  if (!ret)
    gdb_bfd_close_warning (name,
			   bfd_errmsg (bfd_get_error ()));

  return ret;
}

Fixed by making a copy of the name for the warning.

Approved-By: Andrew Burgess <aburgess@redhat.com>
gentoo-bot pushed a commit that referenced this pull request Feb 3, 2025
This commit adds support for a `gstack' command which Fedora has
been carrying for many years. gstack is a natural counterpart to
the gcore command. Whereas gcore dumps a core file, gstack prints
stack traces of a running process.

There are many improvements over Fedora's version of this script.
The dependency on procfs is gone; gstack will run anywhere gdb
runs. The only runtime dependencies are bash and awk.

The script includes suggestions from gdb/32325 to include
versioning and help. [If this approach to gdb/32325 is acceptable,
I could propagate the solution to gcore/gdb-add-index.]

I've rewritten the documentation, integrating it into the User Manual.
The manpage is now output using this one source.

Example run (on x86_64 Fedora 40)

$ gstack --help
Usage: gstack [-h|--help] [-v|--version] PID
Print a stack trace of a running program

  -h, --help         Print this message then exit.
  -v, --version      Print version information then exit.
$ gstack -v
GNU gstack (GDB) 16.0.50.20241119-git
$ gstack 12345678
Process 12345678 not found.
$ gstack $(pidof emacs)
Thread 6 (Thread 0x7fd5ec1c06c0 (LWP 2491423) "pool-spawner"):
#0  0x00007fd6015ca3dd in syscall () at /lib64/libc.so.6
#1  0x00007fd60b31eccd in g_cond_wait () at /lib64/libglib-2.0.so.0
#2  0x00007fd60b28a61b in g_async_queue_pop_intern_unlocked () at /lib64/libglib-2.0.so.0
#3  0x00007fd60b2f1a03 in g_thread_pool_spawn_thread () at /lib64/libglib-2.0.so.0
#4  0x00007fd60b2f0813 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#5  0x00007fd6015486d7 in start_thread () at /lib64/libc.so.6
#6  0x00007fd6015cc60c in clone3 () at /lib64/libc.so.6
#7  0x0000000000000000 in ??? ()

Thread 5 (Thread 0x7fd5eb9bf6c0 (LWP 2491424) "gmain"):
#0  0x00007fd6015be87d in poll () at /lib64/libc.so.6
#1  0x0000000000000001 in ??? ()
#2  0xffffffff00000001 in ??? ()
#3  0x0000000000000001 in ??? ()
#4  0x000000002104cfd0 in ??? ()
#5  0x00007fd5eb9be320 in ??? ()
#6  0x00007fd60b321c34 in g_main_context_iterate_unlocked.isra () at /lib64/libglib-2.0.so.0

Thread 4 (Thread 0x7fd5eb1be6c0 (LWP 2491425) "gdbus"):
#0  0x00007fd6015be87d in poll () at /lib64/libc.so.6
#1  0x0000000020f9b558 in ??? ()
#2  0xffffffff00000003 in ??? ()
#3  0x0000000000000003 in ??? ()
#4  0x00007fd5d8000b90 in ??? ()
#5  0x00007fd5eb1bd320 in ??? ()
#6  0x00007fd60b321c34 in g_main_context_iterate_unlocked.isra () at /lib64/libglib-2.0.so.0

Thread 3 (Thread 0x7fd5ea9bd6c0 (LWP 2491426) "emacs"):
#0  0x00007fd6015ca3dd in syscall () at /lib64/libc.so.6
#1  0x00007fd60b31eccd in g_cond_wait () at /lib64/libglib-2.0.so.0
#2  0x00007fd60b28a61b in g_async_queue_pop_intern_unlocked () at /lib64/libglib-2.0.so.0
#3  0x00007fd60b28a67c in g_async_queue_pop () at /lib64/libglib-2.0.so.0
#4  0x00007fd603f4d0d9 in fc_thread_func () at /lib64/libpangoft2-1.0.so.0
#5  0x00007fd60b2f0813 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#6  0x00007fd6015486d7 in start_thread () at /lib64/libc.so.6
#7  0x00007fd6015cc60c in clone3 () at /lib64/libc.so.6
#8  0x0000000000000000 in ??? ()

Thread 2 (Thread 0x7fd5e9e6d6c0 (LWP 2491427) "dconf worker"):
#0  0x00007fd6015be87d in poll () at /lib64/libc.so.6
#1  0x0000000000000001 in ??? ()
#2  0xffffffff00000001 in ??? ()
#3  0x0000000000000001 in ??? ()
#4  0x00007fd5cc000b90 in ??? ()
#5  0x00007fd5e9e6c320 in ??? ()
#6  0x00007fd60b321c34 in g_main_context_iterate_unlocked.isra () at /lib64/libglib-2.0.so.0

Thread 1 (Thread 0x7fd5fcc45280 (LWP 2491417) "emacs"):
#0  0x00007fd6015c9197 in pselect () at /lib64/libc.so.6
#1  0x0000000000000000 in ??? ()

Since this is essentially a complete rewrite of the original
script and documentation, I've chosen to only keep a 2024 copyright date.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>
gentoo-bot pushed a commit that referenced this pull request Feb 3, 2025
…read call

Commit 7fcdec0 ("GDB: Use gdb::array_view for buffers used in
register reading and unwinding") introduces a regression in
gdb.base/jit-reader.exp:

    $ ./gdb -q -nx --data-directory=data-directory testsuite/outputs/gdb.base/jit-reader/jit-reader -ex 'jit-reader-load /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/jit-reader/jit-reader.so' -ex r -batch

    This GDB supports auto-downloading debuginfo from the following URLs:
      <https://debuginfod.archlinux.org>
    Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
    Debuginfod has been disabled.
    To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".

    Program received signal SIGTRAP, Trace/breakpoint trap.
    Recursive internal problem.

The "Recusive internal problem" part is not good, but it's not the point
of this patch.  It still means we hit an internal error.

The stack trace is:

    #0  internal_error_loc (file=0x55555ebefb20 "/home/simark/src/binutils-gdb/gdb/frame.c", line=1207, fmt=0x55555ebef500 "%s: Assertion `%s' failed.") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:53
    #1  0x0000555561604d83 in frame_register_unwind (next_frame=..., regnum=16, optimizedp=0x7ffff12e5a20, unavailablep=0x7ffff12e5a30, lvalp=0x7ffff12e5a40, addrp=0x7ffff12e5a60, realnump=0x7ffff12e5a50, buffer=...)
        at /home/simark/src/binutils-gdb/gdb/frame.c:1207
    #2  0x0000555561608334 in deprecated_frame_register_read (frame=..., regnum=16, myaddr=...) at /home/simark/src/binutils-gdb/gdb/frame.c:1496
    #3  0x0000555561a74259 in jit_unwind_reg_get_impl (cb=0x7ffff1049ca0, regnum=16) at /home/simark/src/binutils-gdb/gdb/jit.c:988
    #4  0x00007fffd26e634e in read_register (callbacks=0x7ffff1049ca0, dw_reg=16, value=0x7fffffffb4c8) at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/jit-reader.c:100
    #5  0x00007fffd26e645f in unwind_frame (self=0x50400000ac10, cbs=0x7ffff1049ca0) at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/jit-reader.c:143
    #6  0x0000555561a74a12 in jit_frame_sniffer (self=0x55556374d040 <jit_frame_unwind>, this_frame=..., cache=0x5210002905f8) at /home/simark/src/binutils-gdb/gdb/jit.c:1042
    #7  0x00005555615f499e in frame_unwind_try_unwinder (this_frame=..., this_cache=0x5210002905f8, unwinder=0x55556374d040 <jit_frame_unwind>) at /home/simark/src/binutils-gdb/gdb/frame-unwind.c:138
    #8  0x00005555615f512c in frame_unwind_find_by_frame (this_frame=..., this_cache=0x5210002905f8) at /home/simark/src/binutils-gdb/gdb/frame-unwind.c:209
    #9  0x00005555616178d0 in get_frame_type (frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:2996
    #10 0x000055556282db03 in do_print_frame_info (uiout=0x511000027500, fp_opts=..., frame=..., print_level=0, print_what=SRC_AND_LOC, print_args=1, set_current_sal=1) at /home/simark/src/binutils-gdb/gdb/stack.c:1033

The problem is that function `jit_unwind_reg_get_impl` passes field
`gdb_reg_value::value`, a gdb_byte array of 1 element (used as a
flexible array member), as the array view parameter of
`deprecated_frame_register_read`.  This results in an array view of size
1.  The assertion in `frame_register_unwind` that verifies the passed in
buffer is larger enough to hold the unwound register value then fails.

Fix this by explicitly creating an array view of the right size.

Change-Id: Ie170da438ec9085863e7be8b455a067b531635dc
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
gentoo-bot pushed a commit that referenced this pull request May 22, 2025
Recent work in the TUI has improved GDB's use of the curses
wnoutrefresh and doupdate mechanism, which improves performance by
batching together updates and then doing a single set of writes to the
screen when doupdate is finally called.

The tui_batch_rendering type is a RAII class which, in its destructor,
calls doupdate to send the batched updates to the screen.

However, if there is no tui_batch_rendering active on the call stack
then any wnoutrefresh calls will remain batched but undisplayed until
the next time doupdate happens to be called.

This problem can be seen in PR gdb/32623.  When an inferior is started
the 'Starting program' message is not immediately displayed to the
user.

The 'Starting program' message originates from run_command_1 in
infcmd.c, the message is sent to the current_uiout, which will be the
TUI ui_out.  After the message is sent, ui_out::flush() is called,
here's the backtrace when that happens:

  #0  tui_file::flush (this=0x36e4ab0) at ../../src/gdb/tui/tui-file.c:42
  #1  0x0000000001004f4b in pager_file::flush (this=0x36d35f0) at ../../src/gdb/utils.c:1531
  #2  0x0000000001004f71 in gdb_flush (stream=0x36d35f0) at ../../src/gdb/utils.c:1539
  #3  0x00000000006975ab in cli_ui_out::do_flush (this=0x35a50b0) at ../../src/gdb/cli-out.c:250
  #4  0x00000000009fd1f9 in ui_out::flush (this=0x35a50b0) at ../../src/gdb/ui-out.h:263
  #5  0x00000000009f56ad in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at ../../src/gdb/infcmd.c:449
  #6  0x00000000009f599a in run_command (args=0x0, from_tty=1) at ../../src/gdb/infcmd.c:511

And if we check out tui_file::flush (tui-file.c) we can see that this
just calls tui_win_info::refresh_window(), which in turn, just uses
wnoutrefresh to batch any pending output.

The problem is that, in the above backtrace, there is no
tui_batch_rendering active, and so there will be no doupdate call to
flush the output to the screen.

We could add a tui_batch_rendering into tui_file::flush.  And
tui_file::write.  And tui_file::puts .....

... but that all seems a bit unnecessary.  Instead, I propose that
tui_win_info::refresh_window() should be changed.  If suppress_output
is true (i.e. a tui_batch_rendering is active) then we should continue
to call wnoutrefresh().  But if suppress_output is false, meaning that
no tui_batch_rendering is in place, then we should call wrefresh(),
which immediately writes the output to the screen.

Testing but PR gdb/32623 was a little involved.  We need to 'run' the
inferior and check for the 'Starting program' message.  But DejaGNUU
can only check for the message once it knows the message should have
appeared.  But, as the bug is that output is not displayed, we don't
have any output hints that the inferior is started yet...

In the end, I have the inferior create a file in the test's output
directory.  Now DejaGNU can send the 'run' command, and wait for the
file to appear.  Once that happens, we know that the 'Starting
program' message should have appeared.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32623

Approved-By: Tom Tromey <tom@tromey.com>
gentoo-bot pushed a commit that referenced this pull request Jul 28, 2025
This resolves the following memory leak reported by ASAN:

Direct leak of 17 byte(s) in 1 object(s) allocated from:
    #0 0x3ffb32fbb1d in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x2aa149861cf in xmalloc ../../libiberty/xmalloc.c:149
    #2 0x2aa149868ff in xstrdup ../../libiberty/xstrdup.c:34
    #3 0x2aa1312391f in s390_machinemode ../../gas/config/tc-s390.c:2241
    #4 0x2aa130ddc7b in read_a_source_file ../../gas/read.c:1293
    #5 0x2aa1304f7bf in perform_an_assembly_pass ../../gas/as.c:1223
    #6 0x2aa1304f7bf in main ../../gas/as.c:1436
    #7 0x3ffb282be35 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #8 0x3ffb282bf33 in __libc_start_main_impl ../csu/libc-start.c:360
    #9 0x2aa1305758f  (/home/jremus/git/binutils/build-asan/gas/as-new+0x2d5758f) (BuildId: ...)

gas/
	* config/tc-s390.c (s390_machinemode): Free mode_string before
	returning.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
gentoo-bot pushed a commit that referenced this pull request Jul 28, 2025
Simplify the .machine directive parsing logic, so that cpu_string is
always xstrdup'd and can therefore always be xfree'd before returning
to the caller.

This resolves the following memory leak reported by ASAN:

Direct leak of 13 byte(s) in 3 object(s) allocated from:
    #0 0x3ff8aafbb1d in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x2aa338861cf in xmalloc ../../libiberty/xmalloc.c:149
    #2 0x2aa338868ff in xstrdup ../../libiberty/xstrdup.c:34
    #3 0x2aa320253cb in s390_machine ../../gas/config/tc-s390.c:2172
    #4 0x2aa31fddc7b in read_a_source_file ../../gas/read.c:1293
    #5 0x2aa31f4f7bf in perform_an_assembly_pass ../../gas/as.c:1223
    #6 0x2aa31f4f7bf in main ../../gas/as.c:1436
    #7 0x3ff8a02be35 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #8 0x3ff8a02bf33 in __libc_start_main_impl ../csu/libc-start.c:360
    #9 0x2aa31f5758f  (/home/jremus/git/binutils/build-asan/gas/as-new+0x2d5758f) (BuildId: ...)

While at it add tests with double quoted .machine
"<cpu>[+<extension>...]" values.

gas/
	* config/tc-s390.c (s390_machine): Simplify parsing and free
	cpu_string before returning.

gas/testsuite/
	* gas/s390/machine-parsing-1.l: Add tests with double quoted
	values.
	* gas/s390/machine-parsing-1.s: Likewise.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
gentoo-bot pushed a commit that referenced this pull request Jul 28, 2025
Recent work in the TUI has improved GDB's use of the curses
wnoutrefresh and doupdate mechanism, which improves performance by
batching together updates and then doing a single set of writes to the
screen when doupdate is finally called.

The tui_batch_rendering type is a RAII class which, in its destructor,
calls doupdate to send the batched updates to the screen.

However, if there is no tui_batch_rendering active on the call stack
then any wnoutrefresh calls will remain batched but undisplayed until
the next time doupdate happens to be called.

This problem can be seen in PR gdb/32623.  When an inferior is started
the 'Starting program' message is not immediately displayed to the
user.

The 'Starting program' message originates from run_command_1 in
infcmd.c, the message is sent to the current_uiout, which will be the
TUI ui_out.  After the message is sent, ui_out::flush() is called,
here's the backtrace when that happens:

  #0  tui_file::flush (this=0x36e4ab0) at ../../src/gdb/tui/tui-file.c:42
  #1  0x0000000001004f4b in pager_file::flush (this=0x36d35f0) at ../../src/gdb/utils.c:1531
  #2  0x0000000001004f71 in gdb_flush (stream=0x36d35f0) at ../../src/gdb/utils.c:1539
  #3  0x00000000006975ab in cli_ui_out::do_flush (this=0x35a50b0) at ../../src/gdb/cli-out.c:250
  #4  0x00000000009fd1f9 in ui_out::flush (this=0x35a50b0) at ../../src/gdb/ui-out.h:263
  #5  0x00000000009f56ad in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at ../../src/gdb/infcmd.c:449
  #6  0x00000000009f599a in run_command (args=0x0, from_tty=1) at ../../src/gdb/infcmd.c:511

And if we check out tui_file::flush (tui-file.c) we can see that this
just calls tui_win_info::refresh_window(), which in turn, just uses
wnoutrefresh to batch any pending output.

The problem is that, in the above backtrace, there is no
tui_batch_rendering active, and so there will be no doupdate call to
flush the output to the screen.

We could add a tui_batch_rendering into tui_file::flush.  And
tui_file::write.  And tui_file::puts .....

... but that all seems a bit unnecessary.  Instead, I propose that
tui_win_info::refresh_window() should be changed.  If suppress_output
is true (i.e. a tui_batch_rendering is active) then we should continue
to call wnoutrefresh().  But if suppress_output is false, meaning that
no tui_batch_rendering is in place, then we should call wrefresh(),
which immediately writes the output to the screen.

Testing but PR gdb/32623 was a little involved.  We need to 'run' the
inferior and check for the 'Starting program' message.  But DejaGNUU
can only check for the message once it knows the message should have
appeared.  But, as the bug is that output is not displayed, we don't
have any output hints that the inferior is started yet...

In the end, I have the inferior create a file in the test's output
directory.  Now DejaGNU can send the 'run' command, and wait for the
file to appear.  Once that happens, we know that the 'Starting
program' message should have appeared.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32623

Approved-By: Tom Tromey <tom@tromey.com>
gentoo-bot pushed a commit that referenced this pull request Jul 28, 2025
Consider the test-case sources main.c and foo.c:

  $ cat main.c
  extern int foo (void);

  int
  main (void)
  {
    return foo ();
  }
  $ cat foo.c
  extern int foo (void);

  int
  foo (void)
  {
    return 0;
  }

and main.c compiled with debug info, and foo.c without:

  $ gcc -g main.c -c
  $ gcc foo.c -c
  $ gcc -g main.o foo.o

In TUI mode, if we run to foo:

  $ gdb -q a.out -tui -ex "b foo" -ex run

it gets us "[ No Source Available ]":

  ┌─main.c─────────────────────────────────────────┐
  │                                                │
  │                                                │
  │                                                │
  │            [ No Source Available ]             │
  │                                                │
  │                                                │
  └────────────────────────────────────────────────┘
  (src) In: foo                  L??   PC: 0x400566
  ...
  Breakpoint 1, 0x0000000000400566 in foo ()
  (gdb)

But after resizing (pressing ctrl-<minus> in the gnome-terminal), we
get instead the source for main.c:

  ┌─main.c─────────────────────────────────────────┐
  │        3 int                                   │
  │        4 main (void)                           │
  │        5 {                                     │
  │        6   return foo ();                      │
  │        7 }                                     │
  │                                                │
  │                                                │
  └────────────────────────────────────────────────┘
  (src) In: foo                   L??   PC: 0x400566
  ...
  Breakpoint 1, 0x0000000000400566 in foo ()
  (gdb)

which is inappropriate because we're stopped in function foo, which is
not in main.c.

The problem is that, when the window is resized, GDB ends up calling
tui_source_window_base::rerender.  The rerender function has three
cases, one for when the window already has some source code
content (which is not the case here), a case for when the inferior is
active, and we have a selected frame (which is the case that applies
here), and a final case for when the inferior is not running.

For the case which we end up in, the source code window has no
content, but the inferior is running, so we have a selected frame, GDB
calls the get_current_source_symtab_and_line() function to get the
symtab_and_line for the current location.

The get_current_source_symtab_and_line() will actually return the last
recorded symtab and line location, not the current symtab and line
location.

What this means, is that, if the current location has no debug
information, get_current_source_symtab_and_line() will return any
previously recorded location, or failing that, the default (main)
location.

This behaviour of get_current_source_symtab_and_line() also causes
problems for the 'list' command.  Consider this pure CLI session:

  (gdb) break foo
  Breakpoint 1 at 0x40110a
  (gdb) run
  Starting program: /tmp/a.out

  Breakpoint 1, 0x000000000040110a in foo ()
  (gdb) list
  1	extern int foo (void);
  2
  3	int
  4	main (void)
  5	{
  6	  return foo ();
  7	}
  (gdb) list .
  Insufficient debug info for showing source lines at current PC (0x40110a).
  (gdb)

However, if we look at how GDB's TUI updates the source window during
a normal stop, we see that GDB does a better job of displaying the
expected contents.  Going back to our original example, when we start
GDB with:

  $ gdb -q a.out -tui -ex "b foo" -ex run

we do get the "[ No Source Available ]" message as expected.  Why is
that?

The answer is that, in this case GDB uses tui_show_frame_info to
update the source window, tui_show_frame_info is called each time a
prompt is displayed, like this:

  #0  tui_show_frame_info (fi=...) at ../../src/gdb/tui/tui-status.c:269
  #1  0x0000000000f55975 in tui_refresh_frame_and_register_information () at ../../src/gdb/tui/tui-hooks.c:118
  #2  0x0000000000f55ae8 in tui_before_prompt (current_gdb_prompt=0x31ef930 <top_prompt+16> "(gdb) ") at ../../src/gdb/tui/tui-hooks.c:165
  #3  0x000000000090ea45 in std::_Function_handler<void(char const*), void (*)(char const*)>::_M_invoke (__functor=..., __args#0=@0x7ffc955106b0: 0x31ef930 <top_prompt+16> "(gdb) ") at /usr/include/c++/9/bits/std_function.h:300
  #4  0x00000000009020df in std::function<void(char const*)>::operator() (this=0x5281260, __args#0=0x31ef930 <top_prompt+16> "(gdb) ") at /usr/include/c++/9/bits/std_function.h:688
  #5  0x0000000000901c35 in gdb::observers::observable<char const*>::notify (this=0x31dda00 <gdb::observers::before_prompt>, args#0=0x31ef930 <top_prompt+16> "(gdb) ") at ../../src/gdb/../gdbsupport/observable.h:166
  #6  0x00000000008ffed8 in notify_before_prompt (prompt=0x31ef930 <top_prompt+16> "(gdb) ") at ../../src/gdb/event-top.c:518
  #7  0x00000000008fff08 in top_level_prompt () at ../../src/gdb/event-top.c:534
  #8  0x00000000008ffdeb in display_gdb_prompt (new_prompt=0x0) at ../../src/gdb/event-top.c:487

If we look at how tui_show_frame_info figures out what source to
display, it doesn't use get_current_source_symtab_and_line(), instead,
it finds a symtab_and_line directly from a frame_info_pt.  This means
we are not dependent on get_current_source_symtab_and_line() returning
the current location (which it does not).

I propose that we change tui_source_window_base::rerender() so that,
for the case we are discussing here (the inferior has a selected
frame, but the source window has no contents), we move away from using
get_current_source_symtab_and_line(), and instead use find_frame_sal
instead, like tui_show_frame_info does.

This means that we will always use the inferior's current location.

Tested on x86_64-linux.

Reviewed-By: Tom de Vries <tdevries@suse.de>
Reported-By: Andrew Burgess <aburgess@redhat.com>
Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32614
gentoo-bot pushed a commit that referenced this pull request Jul 28, 2025
…get_file_names

PR 32742 shows this failing:

    $ make check TESTS="gdb.ada/access_to_unbounded_array.exp" RUNTESTFLAGS="--target_board=fission"
    Running /home/simark/src/binutils-gdb/gdb/testsuite/gdb.ada/access_to_unbounded_array.exp ...
    FAIL: gdb.ada/access_to_unbounded_array.exp: scenario=all: gdb_breakpoint: set breakpoint at foo.adb:23 (GDB internal error)

Or, interactively:

    $ ./gdb -q -nx --data-directory=data-directory testsuite/outputs/gdb.ada/access_to_unbounded_array/foo-all -ex 'b foo.adb:23' -batch
    /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:19567: internal-error: set_lang: Assertion `old_value == language_unknown || old_value == language_minimal || old_value == lang' failed.

The symptom is that for a given dwarf2_per_cu, the language gets set
twice.  First, set to `language_ada`, and then, to `language_minimal`.
It's unexpected for the language of a CU to get changed like this.

The CU at offset 0x0 in the main file looks like:

    0x00000000: Compile Unit: length = 0x00000030, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000034)

    0x0000000b: DW_TAG_compile_unit
                  DW_AT_low_pc [DW_FORM_addr]       (0x000000000000339a)
                  DW_AT_high_pc [DW_FORM_data8]     (0x0000000000000432)
                  DW_AT_stmt_list [DW_FORM_sec_offset]      (0x00000000)
                  DW_AT_GNU_dwo_name [DW_FORM_strp] ("b~foo.dwo")
                  DW_AT_comp_dir [DW_FORM_strp]     ("/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.ada/access_to_unbounded_array")
                  DW_AT_GNU_pubnames [DW_FORM_flag_present] (true)
                  DW_AT_GNU_addr_base [DW_FORM_sec_offset]  (0x00000000)
                  DW_AT_GNU_dwo_id [DW_FORM_data8]  (0x277aee54e7bd47f7)

This refers to the DWO file b~foo.dwo, whose top-level DIE is:

    .debug_info.dwo contents:
    0x00000000: Compile Unit: length = 0x00000b63, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000b67)

    0x0000000b: DW_TAG_compile_unit
                  DW_AT_producer [DW_FORM_GNU_str_index]    ("GNU Ada 14.2.1 20250207 -fgnat-encodings=minimal -gdwarf-4 -fdebug-types-section -fuse-ld=gold -gnatA -gnatWb -gnatiw -gdwarf-4 -gsplit-dwarf -ggnu-pubnames -gnatws -mtune=generic -march=x86-64")
                  DW_AT_language [DW_FORM_data1]    (DW_LANG_Ada95)
                  DW_AT_name [DW_FORM_GNU_str_index]        ("/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.ada/access_to_unbounded_array/b~foo.adb")
                  DW_AT_comp_dir [DW_FORM_GNU_str_index]    ("/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.ada/access_to_unbounded_array")
                  DW_AT_GNU_dwo_id [DW_FORM_data8]  (0xdbeffefab180a2cb)

The thing to note is that the language attribute is only present in the
DIE in the DWO file, not on the DIE in the main file.

The first time the language gets set is here:

    #0  dwarf2_per_cu::set_lang (this=0x50f0000044b0, lang=language_ada, dw_lang=DW_LANG_Ada95) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:20788
    #1  0x0000555561666af6 in cutu_reader::prepare_one_comp_unit (this=0x7ffff10bf2b0, cu=0x51700008e000, pretend_language=language_minimal) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:21029
    #2  0x000055556159f740 in cutu_reader::cutu_reader (this=0x7ffff10bf2b0, this_cu=0x50f0000044b0, per_objfile=0x516000066080, abbrev_table=0x510000004640, existing_cu=0x0, skip_partial=false, pretend_language=language_minimal, cache=0x7ffff11b95e0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:3371
    #3  0x00005555615a547a in process_psymtab_comp_unit (this_cu=0x50f0000044b0, per_objfile=0x516000066080, storage=0x7ffff11b95e0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:3799
    #4  0x00005555615a9292 in cooked_index_worker_debug_info::process_cus (this=0x51700008dc80, task_number=0, first=std::unique_ptr<dwarf2_per_cu> = {...}, end=std::unique_ptr<dwarf2_per_cu> = {...}) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:4122

In this code path (particularly this specific cutu_reader constructir),
the work is done to find and read the DWO file.  So the language is
properly identifier as language_ada, all good so far.

The second time the language gets set is:

    #0  dwarf2_per_cu::set_lang (this=0x50f0000044b0, lang=language_minimal, dw_lang=0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:20788
    #1  0x0000555561666af6 in cutu_reader::prepare_one_comp_unit (this=0x7ffff0f42730, cu=0x517000091b80, pretend_language=language_minimal) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:21029
    #2  0x00005555615a1822 in cutu_reader::cutu_reader (this=0x7ffff0f42730, this_cu=0x50f0000044b0, per_objfile=0x516000066080, pretend_language=language_minimal, parent_cu=0x0, dwo_file=0x0) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:3464
    #3  0x000055556158c850 in dw2_get_file_names (this_cu=0x50f0000044b0, per_objfile=0x516000066080) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:1956
    #4  0x000055556158f4f5 in dw_expand_symtabs_matching_file_matcher (per_objfile=0x516000066080, file_matcher=...) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:2157
    #5  0x00005555616329e2 in cooked_index_functions::expand_symtabs_matching (this=0x50200002ab50, objfile=0x516000065780, file_matcher=..., lookup_name=0x0, symbol_matcher=..., expansion_notify=..., search_flags=..., domain=..., lang_matcher=...) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:15912
    #6  0x0000555562ca8a14 in objfile::map_symtabs_matching_filename (this=0x516000065780, name=0x50200002ad90 "break pck.adb", real_path=0x0, callback=...) at /home/smarchi/src/binutils-gdb/gdb/symfile-debug.c:207
    #7  0x0000555562d68775 in iterate_over_symtabs (pspace=0x513000005600, name=0x50200002ad90 "break pck.adb", callback=...) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:727

Here, we use the other cutu_reader constructor, the one that does not
look up the DWO file for the passed CU.  If a DWO file exists for this
CU, the caller is expected to pass it as a parameter.  That cutu_reader
constructor also ends up setting the language of the CU.  But because it
didn't read the DWO file, it didn't figure out the language is
language_ada, so it tries to set the language to the default,
language_minimal.

A question is: why do we end up trying to set the CU's language is this
context.  This is completely unrelated to what we're trying to do, that
is get the file names from the line table.  Setting the language is a
side-effect of just constructing a cutu_reader, which we need to look up
attributes in dw2_get_file_names_reader.  There are probably some
cleanups to be done here, to avoid doing useless work like looking up
and setting the CU's language when all we need is an object to help
reading the DIEs and attributes.  But that is future work.

The same cutu_reader constructor is used in
`dwarf2_per_cu::ensure_lang`.  Since this is the version of cutu_reader
that does not look up the DWO file, it will conclude that the language
is language_minimal and set that as the CU's language.  In other words,
`dwarf2_per_cu::ensure_lang` will get the language wrong, pretty ironic.

Fix this by using the other cutu_reader constructor in those two spots.
Pass `per_objfile->get_cu (this_cu)`, as the `existing_cu` parameter.  I
think this is necessary, because that constructor has an assert to check
that if `existing_cu` is nullptr, then there must not be an existing
`dwarf2_cu` in the per_objfile.

To avoid getting things wrong like this, I think that the second
cutu_reader constructor should be reserved for the spots that do pass a
non-nullptr dwo_file.  The only spot at the moment in
create_cus_hash_table, where we read multiple units from the same DWO
file.  In this context, I guess it makes sense for efficiency to get the
dwo_file once and pass it down to cutu_reader.  For that constructor,
make the parameters non-optional, add "non-nullptr" asserts, and update
the code to assume the passed values are not nullptr.

What I don't know is if this change is problematic thread-wise, if the
functions I have modified to use the other cutu_reader constructor can
be called concurrently in worker threads.  If so, I think it would be
problematic.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32742
Change-Id: I980d16875b9a43ab90e251504714d0d41165c7c8
Approved-By: Tom Tromey <tom@tromey.com>
gentoo-bot pushed a commit that referenced this pull request Jul 28, 2025
…_all_symtabs

Commit 2920415 ("gdb/dwarf: use ranged for loop in some spots")
broke some tests notably gdb.base/maint.exp with the fission board.

    $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.base/maint/maint -ex start -ex "maint expand-sym" -batch
    ...
    Temporary breakpoint 1, main (argc=1, argv=0x7fffffffdc48, envp=0x7fffffffdc58) at /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.base/break.c:43
    43          if (argc == 12345) {  /* an unlikely value < 2^16, in case uninited */ /* set breakpoint 6 here */
    /usr/include/c++/14.2.1/debug/safe_iterator.h:392:
    In function:
        gnu_debug::_Safe_iterator<_Iterator, _Sequence, _Category>&
        gnu_debug::_Safe_iterator<_Iterator, _Sequence, _Category>::operator++()
        [with _Iterator = gnu_cxx::
        normal_iterator<std::unique_ptr<dwarf2_per_cu, dwarf2_per_cu_deleter>*,
        std::vector<std::unique_ptr<dwarf2_per_cu, dwarf2_per_cu_deleter>,
        std::allocator<std::unique_ptr<dwarf2_per_cu, dwarf2_per_cu_deleter> > >
        >; _Sequence = std::debug::vector<std::unique_ptr<dwarf2_per_cu,
        dwarf2_per_cu_deleter> >; _Category = std::forward_iterator_tag]

    Error: attempt to increment a singular iterator.

Note that this is caught because I build with -D_GLIBCXX_DEBUG=1.
Otherwise, it might crash more randomly, or just not crash at all (but
still be buggy).

While iterating on the all_units vector, some type units get added
there:

    #0  add_type_unit (per_bfd=0x51b000044b80, section=0x50e0000c2280, sect_off=0, length=74, sig=4367013491293299229) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:2576
    #1  0x00005555618a3a40 in lookup_dwo_signatured_type (cu=0x51700009b580, sig=4367013491293299229) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:2664
    #2  0x00005555618ee176 in queue_and_load_dwo_tu (dwo_unit=0x521000120e00, cu=0x51700009b580) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:8329
    #3  0x00005555618eeafe in queue_and_load_all_dwo_tus (cu=0x51700009b580) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:8366
    #4  0x00005555618966a6 in dw2_do_instantiate_symtab (per_cu=0x50f0000043c0, per_objfile=0x516000065a80, skip_partial=true) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:1695
    #5  0x00005555618968d4 in dw2_instantiate_symtab (per_cu=0x50f0000043c0, per_objfile=0x516000065a80, skip_partial=true) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:1719
    #6  0x000055556189ac3f in dwarf2_base_index_functions::expand_all_symtabs (this=0x502000024390, objfile=0x516000065780) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:1977

This invalidates the iterator in
dwarf2_base_index_functions::expand_all_symtabs, which is caught by the
libstdc++ debug mode.

I'm not entirely sure that it is correct to append type units from dwo
files to the all_units vector like this.  The
dwarf2_find_containing_comp_unit function expects a precise ordering of
the elements of the all_units vector, to be able to do a binary search.
Appending a type unit at the end at this point certainly doesn't respect
that ordering.

For now I'd just like to undo the regression.  Do that by using
all_units_range in the ranged for loop.  I will keep in mind to
investigate whether this insertion of type units in all_units after the
fact really makes sense or not.

Change-Id: Iec131e59281cf2dbd12d3f3d163b59018fdc54da
gentoo-bot pushed a commit that referenced this pull request Jul 28, 2025
On Debian 12, with gcc 12 and ld 2.40, I get some failures when running:

    $ make check TESTS="gdb.base/style.exp" RUNTESTFLAGS="--target_board=fission"

I think I stumble on this bug [1], preventing the test from doing
anything that requires expanding the compilation unit:

    $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.base/style/style
    Reading symbols from testsuite/outputs/gdb.base/style/style...
    (gdb) p main
    DW_FORM_strp pointing outside of .debug_str section [in module /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/style/style]
    (gdb)

The error is thrown here:

    #0  0x00007ffff693f0a1 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
    #1  0x0000555569ce6852 in throw_it(return_reason, errors, const char *, typedef __va_list_tag __va_list_tag *) (reason=RETURN_ERROR, error=GENERIC_ERROR, fmt=0x555562a9fc40 "%s pointing outside of %s section [in module %s]", ap=0x7fffffff8df0) at /home/smarchi/src/binutils-gdb/gdbsupport/common-exceptions.cc:203
    #2  0x0000555569ce690f in throw_verror (error=GENERIC_ERROR, fmt=0x555562a9fc40 "%s pointing outside of %s section [in module %s]", ap=0x7fffffff8df0) at /home/smarchi/src/binutils-gdb/gdbsupport/common-exceptions.cc:211
    #3  0x000055556879c0cb in verror (string=0x555562a9fc40 "%s pointing outside of %s section [in module %s]", args=0x7fffffff8df0) at /home/smarchi/src/binutils-gdb/gdb/utils.c:193
    #4  0x0000555569cfa88d in error (fmt=0x555562a9fc40 "%s pointing outside of %s section [in module %s]") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:45
    #5  0x000055556667dbff in dwarf2_section_info::read_string (this=0x61b000042a08, objfile=0x616000055e80, str_offset=262811, form_name=0x555562886b40 "DW_FORM_strp") at /home/smarchi/src/binutils-gdb/gdb/dwarf2/section.c:211
    #6  0x00005555662486b7 in dwarf_decode_macro_bytes (per_objfile=0x616000056180, builder=0x614000006040, abfd=0x6120000f4b40, mac_ptr=0x60300004f5be "", mac_end=0x60300004f5bb "\002\004", current_file=0x62100007ad70, lh=0x60f000028bd0, section=0x61700008ba78, section_is_gnu=1, section_is_dwz=0, offset_size=4, str_section=0x61700008bac8, str_offsets_section=0x61700008baf0, str_offsets_base=std::optional<unsigned long> = {...}, include_hash=..., cu=0x61700008b600) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/macro.c:511
    #7  0x000055556624af0e in dwarf_decode_macros (per_objfile=0x616000056180, builder=0x614000006040, section=0x61700008ba78, lh=0x60f000028bd0, offset_size=4, offset=0, str_section=0x61700008bac8, str_offsets_section=0x61700008baf0, str_offsets_base=std::optional<unsigned long> = {...}, section_is_gnu=1, cu=0x61700008b600) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/macro.c:934
    #8  0x000055556642cb82 in dwarf_decode_macros (cu=0x61700008b600, offset=0, section_is_gnu=1) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:19435
    #9  0x000055556639bd12 in read_file_scope (die=0x6210000885c0, cu=0x61700008b600) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:6366
    #10 0x0000555566392d99 in process_die (die=0x6210000885c0, cu=0x61700008b600) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:5310
    #11 0x0000555566390d72 in process_full_comp_unit (cu=0x61700008b600, pretend_language=language_minimal) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:5075

The exception is then only caught at the event-loop level
(start_event_loop), causing the whole debug info reading process to be
aborted.  I think it's a little harsh, considering that a lot of things
could work even if we failed to read macro information.

Catch the exception inside read_file_scope, print the exception, and
carry on.  We could go even more fine-grained: if reading the string for
one macro definition fails, we could continue reading the macro
information.  Perhaps it's just that one macro definition that is
broken.  However, I don't need this level of granularity, so I haven't
attempted this.  Also, my experience is that macro reading fails when
the compiler or linker has a bug, in which case pretty much everything
is messed up.

With this patch, it now looks like:

    $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.base/style/style
    Reading symbols from testsuite/outputs/gdb.base/style/style...
    (gdb) p main
    While reading section .debug_macro.dwo: DW_FORM_strp pointing outside of .debug_str section [in module /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/style/style]
    $1 = {int (int, char **)} 0x684 <main>
    (gdb)

In the test I am investigating (gdb.base/style.exp with the fission
board), it allows more tests to run:

    -# of expected passes           107
    -# of unexpected failures       17
    +# of expected passes           448
    +# of unexpected failures       19

Of course, we still see the error about the macro information, and some
macro-related tests still fail (those would be kfailed ideally), but
many tests that are not macro-dependent now pass.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111409

Change-Id: I0bdb01f153eff23c63c96ce3f41114bb027e5796
Approved-By: Tom Tromey <tom@tromey.com>
gentoo-bot pushed a commit that referenced this pull request Jul 28, 2025
When building with gcc, with flags -gdwarf-5, -gsplit-dwarf and
-fdebug-types-section, the resulting .dwo files contain multiple
.debug_info.dwo sections.  One for each type unit and one for the
compile unit.  This is correct, as per DWARF 5, section F.2.3 ("Contents
of the Split DWARF Object Files"):

    The split DWARF object files each contain the following sections:

        ...
        .debug_info.dwo (for the compilation unit)
        .debug_info.dwo (one COMDAT section for each type unit)
	...

GDB currently assumes that there is a single .debug_info.dwo section,
causing unpredictable behavior.  For example, sometimes this crash:

    ==81781==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x508000007a71 at pc 0x58704d32a59c bp 0x7ffc0acc0bb0 sp 0x7ffc0acc0ba0
    READ of size 1 at 0x508000007a71 thread T0
        #0 0x58704d32a59b in bfd_getl32 /home/smarchi/src/binutils-gdb/bfd/libbfd.c:846
        #1 0x58704ae62dce in read_initial_length(bfd*, unsigned char const*, unsigned int*, bool) /home/smarchi/src/binutils-gdb/gdb/dwarf2/leb.c:92
        #2 0x58704aaf76bf in read_comp_unit_head(comp_unit_head*, unsigned char const*, dwarf2_section_info*, rcuh_kind) /home/smarchi/src/binutils-gdb/gdb/dwarf2/comp-unit-head.c:47
        #3 0x58704aaf8f97 in read_and_check_comp_unit_head(dwarf2_per_objfile*, comp_unit_head*, dwarf2_section_info*, dwarf2_section_info*, unsigned char const*, rcuh_kind) /home/smarchi/src/binutils-gdb/gdb/dwarf2/comp-unit-head.c:193
        #4 0x58704b022908 in create_dwo_unit_hash_tables /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:6233
        #5 0x58704b0334a5 in open_and_init_dwo_file /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:7588
        #6 0x58704b03965a in lookup_dwo_cutu /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:7935
        #7 0x58704b03a5b1 in lookup_dwo_comp_unit /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:8009
        #8 0x58704aff5b70 in lookup_dwo_unit /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:2802

The first time that locate_dwo_sections gets called for a
.debug_info.dwo section, dwo_sections::info gets initialized properly.
The second time it gets called for a .debug_info.dwo section, the size
field in dwo_sections::info gets overwritten with the size of the second
section.  But the buffer remains pointing to the contents of the first
section, because the section is already "read in".  So the size does not
match the buffer.  And even if it did, we would only keep the
information about one .debug_info.dwo, out of the many.

First, add an assert in locate_dwo_sections to make sure we don't
try to fill in a dwo section info twice.  Add the assert to other
functions with the same pattern, while at it.

Then, change dwo_sections::info to be a vector of sections (just like we
do for type sections).  Update locate_dwo_sections to append to that
vector when seeing a new .debug_info.dwo section.  Update
open_and_init_dwo_file to read the units from each section.

The problem can be observed by running some tests with the
dwarf5-fission-debug-types target board.  For example,
gdb.base/condbreak.exp crashes (with the ASan failure shown above)
before the patch and passes after).

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119766

Change-Id: Iedf275768b6057dee4b1542396714f3d89903cf3
Reviewed-By: Tom de Vries <tdevries@suse.de>
gentoo-bot pushed a commit that referenced this pull request Jul 28, 2025
I decided to try to build and test gdb on Windows.

I found a page on the wiki [1] suggesting three ways of building gdb:
- MinGW,
- MinGW on Cygwin, and
- Cygwin.

I picked Cygwin, because I've used it before (though not recently).

I managed to install Cygwin and sufficient packages to build gdb and start the
testsuite.

However, testsuite progress ground to a halt at gdb.base/branch-to-self.exp.
[ AFAICT, similar problems reported here [2]. ]

I managed to reproduce this hang by running just the test-case.

I attempted to kill the hanging processes by:
- first killing the inferior process, using the cygwin "kill -9" command, and
- then killing the gdb process, likewise.

But the gdb process remained, and I had to point-and-click my way through task
manager to actually kill the gdb process.

I investigated this by attaching to the hanging gdb process.  Looking at the
main thread, I saw it was stopped in a call to WaitForSingleObject, with
the dwMilliseconds parameter set to INFINITE.

The backtrace in more detail:
...
(gdb) bt
 #0  0x00007fff196fc044 in ntdll!ZwWaitForSingleObject () from
     /cygdrive/c/windows/SYSTEM32/ntdll.dll
 #1  0x00007fff16bbcdcf in WaitForSingleObjectEx () from
     /cygdrive/c/windows/System32/KERNELBASE.dll
 #2  0x0000000100998065 in wait_for_single (handle=0x1b8, howlong=4294967295) at
     gdb/windows-nat.c:435
 #3  0x0000000100999aa7 in
     windows_nat_target::do_synchronously(gdb::function_view<bool ()>)
       (this=this@entry=0xa001c6fe0, func=...) at gdb/windows-nat.c:487
 #4  0x000000010099a7fb in windows_nat_target::wait_for_debug_event_main_thread
     (event=<optimized out>, this=0xa001c6fe0)
     at gdb/../gdbsupport/function-view.h:296
 #5  windows_nat_target::kill (this=0xa001c6fe0) at gdb/windows-nat.c:2917
 #6  0x00000001008f2f86 in target_kill () at gdb/target.c:901
 #7  0x000000010091fc46 in kill_or_detach (from_tty=0, inf=0xa000577d0)
     at gdb/top.c:1658
 #8  quit_force (exit_arg=<optimized out>, from_tty=from_tty@entry=0)
     at gdb/top.c:1759
 #9  0x00000001004f9ea8 in quit_command (args=args@entry=0x0,
     from_tty=from_tty@entry=0) at gdb/cli/cli-cmds.c:483
 #10 0x000000010091c6d0 in quit_cover () at gdb/top.c:295
 #11 0x00000001005e3d8a in async_disconnect (arg=<optimized out>)
     at gdb/event-top.c:1496
 #12 0x0000000100499c45 in invoke_async_signal_handlers ()
     at gdb/async-event.c:233
 #13 0x0000000100eb23d6 in gdb_do_one_event (mstimeout=mstimeout@entry=-1)
     at gdbsupport/event-loop.cc:198
 #14 0x00000001006df94a in interp::do_one_event (mstimeout=-1,
     this=<optimized out>) at gdb/interps.h:87
 #15 start_event_loop () at gdb/main.c:402
 #16 captured_command_loop () at gdb/main.c:466
 #17 0x00000001006e2865 in captured_main (data=0x7ffffcba0) at gdb/main.c:1346
 #18 gdb_main (args=args@entry=0x7ffffcc10) at gdb/main.c:1365
 #19 0x0000000100f98c70 in main (argc=10, argv=0xa000129f0) at gdb/gdb.c:38
...

In the docs [3], I read that using an INFINITE argument to WaitForSingleObject
might cause a system deadlock.

This prompted me to try this simple change in wait_for_single:
...
   while (true)
     {
-      DWORD r = WaitForSingleObject (handle, howlong);
+      DWORD r = WaitForSingleObject (handle,
+                                     howlong == INFINITE ? 100 : howlong);
+      if (howlong == INFINITE && r == WAIT_TIMEOUT)
+        continue;
...
with the timeout of 0.1 second estimated to be:
- small enough for gdb to feel reactive, and
- big enough not to consume too much cpu cycles with looping.

And indeed, the test-case, while still failing, now finishes in ~50 seconds.

While there may be an underlying bug that triggers this behaviour, the failure
mode is so severe that I consider it a bug in itself.

Fix this by avoiding calling WaitForSingleObject with INFINITE argument.

Tested on x86_64-cygwin, by running the testsuite past the test-case.

Approved-By: Pedro Alves <pedro@palves.net>

PR tdep/32894
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32894

[1] https://sourceware.org/gdb/wiki/BuildingOnWindows
[2] https://sourceware.org/pipermail/gdb-patches/2025-May/217949.html
[3] https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitforsingleobject
gentoo-bot pushed a commit that referenced this pull request Jul 28, 2025
This commit fixes a couple of issues relating to the pagination
prompt and styling.  The pagination prompt is this one:

  --Type <RET> for more, q to quit, c to continue without paging--

I did try to split this into multiple patches, based on the three
issues I describe below, but in the end, the fixes were all too
interconnected, so it ended up as one patch that makes two related,
but slightly different changes:

  1. Within the pager_file class, relying on the m_applied_style
  attribute of the wrapped m_stream, as is done when calling
  m_stream->emit_style_escape, is not correct, so stop doing that, and

  2. Failing to update m_applied_style within the pager_file class can
  leave that attribute out of date, which can then lead to styling
  errors later on, so ensure m_applied_style is always updated.

The problems I have seen are:

  1. After quitting from a pagination prompt, the next command can
  incorrectly style its output.  This was reported as bug PR
  gdb/31033, and is fixed by this commit.

  2. The pagination prompt itself could be styled.  The pagination
  prompt should always be shown in the default style.

  3. After continuing the output at a pagination prompt, GDB can fail
  to restore the default style the next time the output (within the
  same command) switches back to the default style.

There are tests for all these issues as part of this patch.

The pager_file class is a sub-class of wrapped_file, this means that a
pager_file is itself a ui_file, while it also manages a pointer to a
ui_file object (called m_stream).  An instance of pager_file can be
installed as the gdb_stdout ui_file object.

Output sent to a pager_file is stored within an internal
buffer (called m_wrap_buffer) until we have a complete line, when the
content is flushed to the wrapped m_stream.  If sufficient lines have
been written out then the pager_file will present the pagination
prompt and allow the user to continue viewing output, or quit the
current command.

As a pager_file is a ui_file, it has an m_applied_style member
variable.

The managed stream (m_stream) is also a ui_file, and so also has an
m_applied_style member variable.

In some places within the pager_file class we attempt to change the
current style of the m_stream using calls like this:

  m_stream->emit_style_escape (style);

See pager_file::emit_style_escape, pager_file::prompt_for_continue,
and pager_file::puts.  These calls will end up in
ui_file::emit_style_escape, which tries to skip emitting unnecessary
style escapes by checking if the requested style matches the current
m_applied_style value.

The m_applied_style value is updated by calls to the emit_style_escape
function.

The problem here is that most of the time pager_file doesn't change
the style of m_stream by calling m_stream->emit_style_escape.  Most of
the time, style changes are performed by pager_file writing the escape
sequence into m_wrap_buffer, and then later flushing this buffer to
m_stream by calling m_stream->puts.

It has to be done this way.  Calling m_stream->emit_style_escape
would, if it actually changed the style, immediately change the style
by emitting an escape sequence.  But pager_file doesn't want that, it
wants the style change to happen later, when m_wrap_buffer is
flushed.

To avoid excessive style escape sequences being written into
m_wrap_buffer, the pager_file::m_applied_style performs a function
similar to the m_applied_style within m_stream, it tracks the current
style for the end of m_wrap_buffer, and only allows style escape
sequences to be emitted if the style is actually changing.

However, a consequence of this is the m_applied_style within m_stream,
is not updated, which means it will be out of sync with the actual
current style of m_stream.  If we then try to make a call to
m_stream->emit_style_escape, if the style we are changing too happens
to match the out of date style in m_stream->m_applied_style, then the
style change will be ignored.

And this is indeed what we see in pager_file::prompt_for_continue with
the call:

  m_stream->emit_style_escape (ui_file_style ());

As m_stream->m_applied_style is not being updated, it will always be
the default style, however m_stream itself might not actually be in
the default style.  This call then will not emit an escape sequence as
the desired style matches the out of date m_applied_style.

The fix in this case is to call m_stream->puts directly, passing in
the escape sequence for the desired style.  This will result in an
immediate change of style for m_stream, which fixes some of the
problems described above.

In fact, given that m_stream's m_applied_style is always going to be
out of sync, I think we should change all of the
m_stream->emit_style_escape calls to instead call m_stream->puts.

However, just changing to use puts doesn't fix all the problems.

I found that, if I run 'apropos time', then quit at the first
pagination prompt.  If for the next command I run 'maintenance time' I
see the expected output:

  "maintenance time" takes a numeric argument.

However, everything after the first double quote is given the command
name style rather than only styling the text between the double
quotes.

Here is GDB's stack while printing the above output:

  #2  0x0000000001050d56 in ui_out::vmessage (this=0x7fff1238a150, in_style=..., format=0x1c05af0 "", args=0x7fff1238a288) at ../../src/gdb/ui-out.c:754
  #3  0x000000000104db88 in ui_file::vprintf (this=0x3f9edb0, format=0x1c05ad0 "\"%ps\" takes a numeric argument.\n", args=0x7fff1238a288) at ../../src/gdb/ui-file.c:73
  #4  0x00000000010bc754 in gdb_vprintf (stream=0x3f9edb0, format=0x1c05ad0 "\"%ps\" takes a numeric argument.\n", args=0x7fff1238a288) at ../../src/gdb/utils.c:1905
  #5  0x00000000010bca20 in gdb_printf (format=0x1c05ad0 "\"%ps\" takes a numeric argument.\n") at ../../src/gdb/utils.c:1945
  #6  0x0000000000b6b29e in maintenance_time_display (args=0x0, from_tty=1) at ../../src/gdb/maint.c:128

The interesting frames here are #3, in here `this` is the pager_file
for GDB's stdout, and this passes its m_applied_style to frame #2 as
the `in_style` argument.

If the m_applied_style is wrong, then frame #2 will believe that the
wrong style is currently in use as the default style, and so, after
printing 'maintenance time' GDB will switch back to the wrong style.

So the question is, why is pager_file::m_applied_style wrong?

In pager_file::prompt_for_continue, there is an attempt to switch back
to the default style using:

  m_stream->emit_style_escape (ui_file_style ());

If this is changed to a puts call (see above) then this still leaves
pager_file::m_applied_style out of date.

The right fix in this case is, I think, to instead do this:

  this->emit_style_escape (ui_file_style ());

this will update pager_file::m_applied_style, and also send the
default style to m_stream using a puts call.

While writing the tests I noticed that I was getting unnecessary style
reset sequences emitted.

The problem is that, around pagination, we don't really know what
style is currently applied to m_stream.  The
pager_file::m_applied_style tracks the style at the end of
m_wrap_buffer, but this can run ahead of the current m_stream style.
For example, if the screen is currently full, such that the next
character of output will trigger the pagination prompt, if the next
call is actually to pager_file::emit_style_escape, then
pager_file::m_applied_style will be updated, but the style of m_stream
will remain unchanged.  When the next character is written to
pager_file::puts then the pagination prompt will be presented, and GDB
will try to switch m_stream back to the default style.  Whether an
escape is emitted or not will depend on the m_applied_style value,
which we know is different than the actual style of m_stream.

It is, after all, only when m_wrap_buffer is flushed to m_stream that
the style of m_stream actually change.

And so, this commit also adds pager_file::m_stream_style.  This new
variable tracks the current style of m_stream.  This really is a
replacement for m_stream's ui_file::m_applied_style, which is not
accessible from pager_file.

When content is flushed from m_wrap_buffer to m_stream then the
current value of pager_file::m_applied_style becomes the current style
of m_stream.  But, when m_wrap_buffer is filling up, but before it is
flushed, then pager_file::m_applied_style can change, but
m_stream_style will remain unchanged.

Now in pager_file::emit_style_escape we are able to skip some of the
direct calls to m_stream->puts() used to emit style escapes.

After all this there are still a few calls to
m_stream->emit_style_escape().  These are all in the wrap_here support
code.  I think that these calls are technically broken, but don't
actually cause any issues due to the way styling works in GDB.  I
certainly haven't been able to trigger any bugs from these calls yet.
I plan to "fix" these in the next commit just for completeness.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31033

Approved-By: Tom Tromey <tom@tromey.com>
gentoo-bot pushed a commit that referenced this pull request Jul 28, 2025
When running gdb.base/foll-fork-syscall.exp with a GDB built with UBSan,
I get:

    /home/simark/src/binutils-gdb/gdb/linux-nat.c:1906:28: runtime error: load of value 3200171710, which is not a valid value for type 'target_waitkind'
    ERROR: GDB process no longer exists
    GDB process exited with wait status 3026417 exp9 0 1
    UNRESOLVED: gdb.base/foll-fork-syscall.exp: follow-fork-mode=child: detach-on-fork=on: test_catch_syscall: continue to breakpoint after fork

The error happens here:

    #0  __sanitizer::Die () at /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_termination.cpp:50
    #1  0x00007ffff600d8dd in __ubsan::__ubsan_handle_load_invalid_value_abort (Data=<optimized out>, Val=<optimized out>) at /usr/src/debug/gcc/gcc/libsanitizer/ubsan/ubsan_handlers.cpp:551
    #2  0x00005555636d37b6 in linux_handle_syscall_trap (lp=0x7cdff1eb1b00, stopping=0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:1906
    #3  0x00005555636e0991 in linux_nat_filter_event (lwpid=3030627, status=1407) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3044
    #4  0x00005555636e407f in linux_nat_wait_1 (ptid=..., ourstatus=0x7bfff0d6cf18, target_options=...) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3381
    #5  0x00005555636e7795 in linux_nat_target::wait (this=0x5555704d35e0 <the_amd64_linux_nat_target>, ptid=..., ourstatus=0x7bfff0d6cf18, target_options=...) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3607
    #6  0x000055556378fad2 in thread_db_target::wait (this=0x55556af42980 <the_thread_db_target>, ptid=..., ourstatus=0x7bfff0d6cf18, options=...) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1398
    #7  0x0000555564811327 in target_wait (ptid=..., status=0x7bfff0d6cf18, options=...) at /home/simark/src/binutils-gdb/gdb/target.c:2593

I believe the problem is that lwp_info::syscall_state is never
initialized.  Fix that by initializing it with TARGET_WAITKIND_IGNORE.
This is the value we use elsewhere when resetting this field to mean
"not stopped at a syscall".

Change-Id: I5b76c63d1466d6e63448fced03305fd5ca8294eb
Approved-By: Tom Tromey <tom@tromey.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants