[AIROCMLIR-49] LLVM Upstream Merge - January 2026 #2239
mirza-halilcevic wants to merge 6 commits into develop
@mirza-halilcevic GitHub is unable to render such large changes.
Review thread on external/llvm-project/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (outdated, resolved)
The diff files are too large to upload as attachments (they are gigabytes in size). If needed, we can generate the diff with:
Historically, the diff files have been only a couple of hundred lines for both LLVM and MLIR. If the files are now a couple of GB, something seems to have gone wrong.
Also, when it comes time to merge this PR, can we try to keep the git history as clean as possible? I think we should have a single commit for bumping the LLVM version, another commit for any external changes that we need to apply, and finally a third commit for any rocMLIR changes that were needed because of the upstream bump.
I meant the git diff between the upstream SHA and this branch, not against the develop branch.
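For illustration only, here is a minimal sketch of the kind of diff being asked for: the upstream SHA compared against the PR branch, limited to the vendored LLVM/MLIR paths. The exact command used for this PR is not shown in the thread, so the helper name `build_upstream_diff_cmd` and the SHA, branch, and path values below are hypothetical placeholders. Only the standard `git diff <commit> <commit> -- <path>...` form is assumed.

```python
from typing import List, Sequence


def build_upstream_diff_cmd(
    upstream_sha: str, branch: str, paths: Sequence[str] = ()
) -> List[str]:
    """Build the argv for `git diff <upstream_sha> <branch> [-- <paths>...]`.

    Hypothetical helper: it only assembles the command and does not run git.
    """
    if not upstream_sha or not branch:
        raise ValueError("both an upstream SHA and a branch name are required")
    cmd = ["git", "diff", upstream_sha, branch]
    if paths:
        # `--` separates revisions from path filters so neither is misparsed.
        cmd.append("--")
        cmd.extend(paths)
    return cmd


if __name__ == "__main__":
    # Placeholder values; substitute the real upstream SHA and PR branch.
    argv = build_upstream_diff_cmd(
        "UPSTREAM_SHA", "PR_BRANCH", ["external/llvm-project/mlir"]
    )
    print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from inside the repository checkout. Restricting the paths keeps the LLVM and MLIR diffs separate, which should make it easier to see whether the output is the expected few hundred lines.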
d4d1212 to c371694
@umangyadav @justinrosner diff files can be found in the PR description |
…6a98
f5f2faf16a98 [amd] Add to CODEOWNERS for device-libs/hipcc (#1649)
403e9af59db7 merge main into amd-staging (#1645)
25fcb507e040 [AMDGPU] Add -amdgpu-gfx1250-b0-specific option (#1644)
2a255ac32fc4 merge main into amd-staging
772c38832056 [Comgr] Add COMGR_STATIC_LLVM option for static LLVM linking (#1640)
ec15263cb898 [lld][WebAssembly] Convert weak-alias tests to assembly. NFC (#184667)
90c6e6374b95 [clang-tidy] Fix readability-else-after-return for if statements appear in unbraced switch case labels (#181878)
d8e86a5cc0d5 [dsymutil] Add option to filter debug map objects by allow/disallow-list (#182083)
f26ff8d86fd8 [LLDB] Allow symbols added by linker scripts to be examined. (#184679)
933e5368ba02 [DebugInfo] Don't specify target triple for cross-cu-linkonce-distinct.ll (#184685)
7822c770cf48 merge main into amd-staging (#1642)
97275e57b4fc [hipcc] Add HIP_CLANG_LAUNCHER for launching Clang through a wrapper executable (#1490)
585928482419 [MemProf] Add stack IDs to MemProfUse optimization remarks (#184670)
5c08616df765 [DAG] isKnownToBeAPowerOfTwo - Power of 2 value is known to be power of 2 after BSWAP/BITREVERSE (#182207)
28b840815c22 [HLSL] groupshared variables should be implicitly extern and should not be initialized (#184459)
f986523c5937 [CIR] Fix Codegen for Complex & Scalar comparisons (#184006)
ade43a54d441 [WebAssembly] MC support for acquire-release atomics (#183656)
bb78a0a47083 [CIR] Fix __builtin_va_start handling (#184654)
149e62c29bf8 [clang-doc] Add DAG directive to MD "All files" test (#184671)
56ed7485b75e [CIR][NFC] Remove unnecessary call to clangCmpToCIRCmp (#184217)
116b44526724 device-libs: Use frexp builtin instead of frexp_exp + frexp_mant (#1469)
63074da25d16 [DebugInfo][DwarfDebug] Move emission of globals from beginModule() to endModule() (5/7) (#184219)
6e1aee4276bb [AMDGPU] Select v_bfe_u32 for i8/i16 (and (srl x, c), mask) (#182446)
d62fbb6a9664 [SPIRV] Update the global registry when expanding function pointer (#183873)
a212ebd471b1 ValueTracking: Handle constant structs in computeKnownFPClass (#184192)
20902f0b721b ValueTracking: Teach computeKnownFPClass to look at bitcast + integer max (#184073)
34541e5a5400 [HLSL] Add WaveActiveAllEqual functions (#183634)
75b2ea57d5f4 [Clang][UnsafeBufferUsage] Warn about two-arg string_view constructors. (#180471)
2032960b5de1 [flang][NFC] Converted five tests from old lowering to new lowering (part 24) (#184538)
a7914aecb2db [RISCV] Allow unsigned immediates for pli.h, pli.dh, pli.w (#184554)
64c0f624cd07 [lldb] Make the PluginManager thread safe (#184452)
9dc65372aa61 [clang-tidy] Don't report unnamed params for misc-const-correctness (#184388)
df52bb4c32e2 [LoopUnrollPass] Don't use clang specific syntax in optimization remarks (#182430)
0baf5a0496a5 Revert "Silence -Wunused-parameter warnings in Unwind-wasm.c" (#175776)
fbc3a312d69d [mlir][xevm] Remove unnecessary attach target pass. (#184432)
2c95b8d5185b AMDGPU: Clean up print handling of AMDGPUTargetID (#184643)
2b3e30d4c470 [CodeGen] Treat hasOrderedMemoryRef as implying arbitrary loads or stores (#182000)
d9d6b16cc622 ValueTracking: Handle ConstantDataSequential in computeKnownFPClass (#184191)
ff0220d236de [flang][acc] Allow orphaned acc cache directive (#184448)
937bf9cef69f [bazel] Fix parse_headers in bolt (#184648)
56a53550d317 [DebugInfo] Emit DW_AT_const_value for constexpr array static members (#182442)
b80248a0ea35 [clang-doc] Add a Mustache Markdown generator (#177221)
c2db12daa171 [AIX] Sort relocations in XCOFF object writer. (#180807)
302860455663 [bazel] Add target for `clang-nvlink-wrapper` (#184644)
9105d9c24949 [lld][Hexagon] Fix findMaskR8 missing duplex support (#183936)
d326a76d9a1f merge main into amd-staging
9cd054b0bb3e [AArch64] Add lowering for misc NEON intrinsics (#183050)
b9f1199581e7 InstCombine: Support extractvalue in SimplifyDemandedFPClass (#184171)
b0b583475a03 [DAG] Improved handling of ISD::ROTL and ISD::ROTR in isKnownToBeAPowerOfTwo (#182744)
a14f9f822f48 [mlir][xegpu] Add support for accessing the default order of a layout. (#184451)
18226e7e2eb8 [RISCV] Lower i8/i16/i32 scalable vector ISD::CLMUL/CLMULH with Zvbc32e. (#184465)
77f1480f7ef1 [SPIRV] Fix return value of runOnModule for SPIRVPrepareFunctions (#184636)
668d09b2846d [clang][Modules] Fixing unexpected warnings triggered by a PCH and a module with config macros (#177078)
4b3a9246a086 [bazel] Fix more parse_headers cases in lldb (#184534)
8f8590e691fe [OpenMP][AIX] Add libpthreads for -fopenmp (#184629)
39d5aea6df90 [OpenACC] Replace terminators with scf.yield in wrapMultiBlockRegionWithSCFExecuteRegion (#184458)
613a5c555ebf [mlir][vector] Replace OneDimMultiReductionToTwoDim with OneDimMultiReductionToReduction (#184241)
7b72b5fde414 [bazel] Fix building lldb without libedit (#184535)
8486d893bd79 [SelectionDAG] Fix -Wunused-variable after #179318 (#184623)
a40e83b29ce9 [libsycl] Add sycl::queue stub (#184110)
53aa77092ea7 [flang] Fix distribution build of Fortran builtin/intrinsic modules. (#184204)
b0f953f0f395 merge main into amd-staging (#1638)
e8e8d30b229a [Hexagon] Use __HVX_IEEE_FP__ to guard protos that need -mhvx-ieee-fp (#184422)
f55080da988a [flang][OpenMP] Avoid implicit default mapper on pointer captures (#184382)
247a9bfc26ad [mlir][AMDGPU] Add folders for memref aliases to TDM base creation (#184567)
a3eb13b5bfd1 [X86] remove unnecessary movs when %rdx is an input to mulx (#184462)
ded64d2417d4 [DTU] fix dominator tree update eliding reachable nodes (#177683)
b28ec5ad1808 [mlir][Func] Fix FuncOp verifier ordering via hasRegionVerifier (#184612)
e5a6a0f10856 [SPIRV] Fix global emission for modules with no functions (#183833)
c123642824fd [CI] Install binutils-dev in pre-merge container (#184608)
33be2d0e7ab2 [AArch64] Update clmul tests after #184403 (#184611)
c9ca768c88cc [mlir][shape] Fix crash when shape.lib array references undefined symbol (#184613)
56e0b6af1d39 [mlir][affine] Fix crash in vectorizeAffineLoopNest test utility for reduction loops (#184617)
c370f5af6c8d [VPlan] Preserve IsSingleScalar for hoisted predicated load. (#184453)
50653e5a0d1d [tosa] : Enhance tosa.slice folding for dynamic dims. (#184615)
11c11ec2e9fd [clang][Lex] Preserve MultipleIncludeOpt state in Lexer::peekNextPPToken (#183425)
5c2740784224 [analyzer] Suppress optin.cplusplus.VirtualCall warnings in system headers (#184183)
073de3b80375 [SPIRV] Rename `selectSelectDefaultArgs` to `selectBoolToInt` (#184120)
0cbba3ed5f12 [flang-rt] Fix incorrect condition for removing backtrace (#184610)
c6bb6a7e4254 [LV] Add `-force-target-supports-masked-memory-ops` option (#184325)
71de1e47c012 Reapply "[AArch64] Wrap integer SCALAR_TO_VECTOR nodes in bitcasts (#172837)" (#183380) (#184403)
21c1ba16edc0 [TableGen] Complete the support for artificial registers (#183371)
c2e22e3b797d [clang][cmake] Add option to control hmaptool installation (#172725)
47766d7f8c39 [AMDGPU][Clang][Doc] Add documentation for WMMA builtins (#183939)
1b3545117df0 [mlir][irdl] Fix crash in TypeOp/AttributeOp verify on empty sym_name (#184598)
05fdd5383967 [Clang] Fix the lambda context for constraint evaluation (#184319)
c985dec6c26d [psdb][Linux] enable aomp smoke test (#1626)
0af2d43e0641 [Clang] Warn if both of `dllexport`/`dllimport` and `exclude_from_explicit_instantiation` are specified (#183515)
5cf09a68a63d [AArch64][ISel] Use vector register for scalar CLMUL (#183282)
98ed41718b0f [LV] Transform tests for early-exit with stores (#183288)
8bb41c929f3a AMDGPU: Fix copy of Triple (#184594)
095e1694d9c0 [clang] Turn misc copy-assign to move-assign (#184144)
c2784e11cc44 [Flang][OpenMP] DEFAULT(NONE) error checking on implicit references (#182214)
1f4074b771be [mlir][llvm] Fix SROA crash on empty LLVM struct types (#184596)
75320d07f881 merge main into amd-staging
0a1e39517b22 [nfc][analyzer][test][z3] Replace "REQUIRES: no-z3" with "UNSUPPORTED: z3" (#184349)
ee8184573f03 Revert "[flang] make lowering to scf.while default" (#184592)
9c2829f2e188 [mlir][Func] Use getMutableSuccessorOperands() in FuncOp verifier (#184589)
7f044944e43e [MLIR][Arith][Vector] Reject i0 integer type in arith and vector ops (#183589)
ee92ac2343f6 [mlir][nvgpu] Fix crash in optimize-shared-memory pass with vector element types (#179111)
943eb6fd958e [LV] Use make_early_inc_range in handleFindLastReductions (#184340)
d0f50d55746a [AMDGPU] Remove DX10_CLAMP and IEEE bits from gfx1170 (#182107)
de5e081a8339 [flang][NFC] Converted five tests from old lowering to new lowering (part 23) (#184533)
8ac00ba7f9f1 [mlir][SCFToEmitC] Fix crash when scf.while carries a memref loop variable (#183944)
f1aa7c3c5fc9 [mlir][cf] Canonicalize block args with uniform incoming values (#183966)
f702ee89c1d7 [VPlan] Fix partially uninitialized accesses after 17aaa0e590a7. (#184583)
2aab31a94e7b [X86] combine-fcopysign.ll - extend test coverage to all x86-64/x86-64-v2/x86-64-v3/x86-64-v4 levels (#184579)
177211a99f70 [AArch64] Generate test checks (NFC) (#184582)
6bdf076137d0 [clang] Predefine `_MSVC_TRADITIONAL` in MSVC compatibility mode (#184278)
1582dd9c31d5 [lldb] Change more uses of AppendMessageWithFormat to AppendMessageWithFormatv (#184337)
d737cd505562 [clang-tools-extra] Turn misc copy-assign into move-assign (#184146)
756d068ead7e [MLIR][Python][Transform] Expose PatternDescriptorOpInterface to Python (#184331)
9cc0df99de85 [clang-repl] Create virtual files for `input_line_N` buffers (#182044)
c62d5f35b678 [AArch64] Avoid folding sign-extend of vector extracts into ALU ops (#183522)
14af5be5da77 [lldb] Add arithmetic binary subtraction to DIL (#184017)
b6761b287f6b [clang-tidy][NFC] Add missing Option tests in `bugprone` [1/N] (#184015)
732f66eccc24 libclc: Reimplement amdhsa workitem functions (#184571)
31f69d333e49 [libc] Fix integration test args/env in LibcTest lit format (#184438)
3d52f0c539d8 [SPIR-V] Don't consider a function be a builtin just by checking name (#182776)
a636928bb4db [SelectionDAG] Add expansion for llvm.convert.from.arbitrary.fp (#179318)
1a7502592f0f [ARM] Generate test checks (NFC) (#184574)
e7db3f1d3df6 [DSE] Handle provenance when eliminating tautological assignments
b86f24fd0ed4 [InstCombine] make `foldBinOpIntoSelectOrPhi` fold on all operands (#183692)
8fcb60aa47e7 [libc++][NFC] Introduce __data() to std::string to replace std::__to_address(__get_pointer()) (#178212)
c9355cc121df [ELF] Move ArmCmseSGSection into Arch/ARM.cpp (#184570)
3bb4a506c590 [WebAssembly] Print type signature and table for call_indirect (#179120)
52dd63d3caa3 [mlir] Add option to ignore commutativity in OperationEquality (#181507)
2be2926b56ea [Loads] Allow replacement of null with ptr in `canReplacePointersIfEqual`
7fbc0734f5b3 [clang][bytecode] Fix a few comment typos (#184561)
9a821584a516 [clang][bytecode] Fix a mishap in HasPtrField calculation (#184557)
deb70a6d643d [InstCombine] Don't strip leading zero index for overaligned vector GEP (#184364)
c612c98fa6dc [VPlan] Add const to VPPredicator methods. nfc (#184359)
ae4386775663 [DA] Remove consistent flag from Dependence class (#181608)
5b156a4372ac [AMDGPU] Add half vector support for table-driven libcall optimization (#178638)
6b59ad6e8d88 [mlir][linalg] Data layout propagation test schedule (#184151)
6fae863eba8a [X86][APX] Add a few pseudo opcodes support EGPR (#184550)
debb2514ea7f [MC] Fuse relaxation and layout into a single forward pass (#184544)
cd01e6526af6 [ELF] Add target-specific relocation scanning for LoongArch (#182236)
027447c61724 [MC][test] Add relax-branch-align.s demonstrating unnecessary branch relaxation (#184551)
dd8d5ffe0d08 [RISCV] Sink instructions so AVL dominates in RISCVVLOptimizer (#184155)
348f4fb9e00a [DA] Add tests that represent edge cases for the Weak Zero SIV tests (NFC) (#183735)
f5f0930c4715 [GVN] Fix crash when svcount is used with globals-aa (#184347)
5f29cdc175f6 [RISCV] Remove OperandType OPERAND_SIMM10_UNSIGNED. Rename OPERAND_SIMM8_UNSIGNED->OPERAND_SIMM8 (#184540)
4b85c1301fb6 [clang-tidy] Fix false positive in readability-redundant-preprocessor for builtin checks (#181734)
a6bbe463ce67 [clang][CIR] Pass VFS to command-line parsing (#184226)
502df3320553 [Flang][OpenMP] Fix unintended write-back shown in SWDEV-579431 (#1594)
e808a7f844ca [RISCV][GISel] Replace buildInstr with BuildMI (#183714)
e6b9d816f6a2 [Hexagon] Ignore formatting of generated proto files (#184427)
dcbc5de7f85f [lldb][NFC] Add missing include to LZMA.h (#184536)
6e1ab3a4310d [Serialization] Stop demote var definition as declaration (#172430) (#177117) (#184287)
7fb92cdf5fa6 [Benchmark] Fix warnings around usage of __COUNTER__ (#184524)
98c46261d926 [TargetLowering][PowerPC] Don't unroll vector CLMUL when MUL is not supported. (#184238)
1c434928d26c [bazel] Remove old zlib config variable (#184527)
3a85d99a1606 [bazel] Fix building lldb with zlib disabled (#184525)
5de659a44310 merge main into amd-staging (#1632)
928505c98345 [lld][WebAssembly] Convert more tests to assembly. NFC (#184418)
3b4d5ffe847c [MLIR][XeGPU] Add blocking and subgroup to lane distribution support for ConvertLayout operation (#183837)
45dbce3a3a3e [lldb] Wrap LLDBLog Initialize/Terminate in a class (NFC) (#184469)
53fbbaa577f2 [lldb] Fix Initialization/Termination for all log channels (#184467)
699563e0da93 [NFC] Don't replicate hasKernelCallingConv. (#184464)
5e5f7efd7706 [lldb] Expose block equality with SBBlock. (#184222)
60d729fdb226 [flang] Fix test breakage from recent preprocessor change (#184455)
630b9570d199 [mlir][math] Add constant folding for math.rsqrt (#184443)
ece4b759327c [lldb] Add C source output mode to formatter_bytecode.py (#184242)
76568dc89916 [LoopUnrollPass] Add `const` to parameters in `computeUnrollCount` (NFC) (#184058)
0c04d019f0a6 [NFC] [Doc] Fix text codeblock being declared llvm (#184461)
62144f48d43f [flang] make lowering to scf.while default (#184234)
393bbd55201a [gn build] Port commits (#184454)
0c9734f12055 [NFC] [doc] fix invalid comment syntax in IR (#184457)
87a4b36fbe7f [WebAssembly] Use MVT::i32 instead of i1 in performAnyAllCombine (#183866)
e71f327b4605 [X86] support reserve r8~r15 on X86_64 (#180242)
1b633d6d6d75 [Clang] Permit floating point and pointer values in most atomic ops (#183843)
28638f519783 [lldb] Remove Debugger::{FindTargetWithProcessID, FindTargetWithProcess} (#184446)
f4e64ceb4bd8 [lldb] AArch64 register 33 is not cpsr (#183860)
685a65a7f03d [clang-tidy] Add zeyi2 as maintainer (#183883)
9264159ae1df [lldb] Fix the GoogleTest teardown in the DAP unit tests (#184262)
5b144c0aec63 [AMDGPU] Add suffix _d4 to tensor load/store with 4 groups D#, NFC (#184176)
f00a05496471 merge main into amd-staging
1953b87a31a9 [CIR][CodeGen] Upstream support for `__builtin_isinf_sign` (#183977)
5f8065ef63e2 merge main into amd-staging (#1631)
89a4bcf02349 [CIR] Split cir.binop into separate per-operation binary ops (#184227)
c4ea6cc3f736 [lldb] Remove call_once wrappers around PluginManager::RegisterPlugin (#184273)
6b5c55ef169c [lldb] Fix 10 year old leak of `g_debugger_list_ptr` (#184259)
fe76fd292cc3 [AMDGPU][SIInsertWaitcnts][NFC] Call applyWaitcnt() in a loop (#184426)
fdc4a982f5d6 [AMDGPU] Add dereferenceable retAttr to a call to llvm.amdgcn.implicitarg.ptr (#182206)
dc1e3e5dbf78 [X86] getFauxShuffleMask - add ISD::ROTL/ROTR handling (#184417)
dc44bcafe08e [flang-rt] Fix NVPTX builds erroneously using backtrace support (#184415)
df1a53ae2424 Disable leak sanitizer test on ppc. (#184414)
4b06e8388559 [Github][CI] Bump CI containers to LLVM v22.1.0 (#184375)
80a1cf4f8058 clang: Add builtin header for amdhsa abi (#181993)
9d0c62c3ddb1 [X86] known-never-zero.ll - improve demandedelts test coverage for #183227 (#184411)
375d65ee8de7 [CIR] Implement EH lowering to Itanium form and LLVM IR (#184386)
5586d93a87ef [NFC] [HWASan] more meaningful BB names in use-after-scope test (#183867)
b4dfa43cb8ae [RISCV] Fix type inference ambiguity in SwapSysReg pattern (#184305)
8272546f6910 [HLSL][SPIRV] Fix `faceforward` pattern matcher logic (#183630)
17aaa0e590a7 [VPlan] Use bitfield to store Cmp predicates and GEP wrap flags. (NFC) (#181571)
899080a87ad9 [Analysis][DXILResource] Correct bound computation (#184198)
b5baf5e062b2 [CIR] Implement func-ptr/void-ptr addition/subtraction/inc/dec. (#184254)
c7c16573b8f3 [CIR] Synchronize CIR with recent changes to atomic ops (#184416)
a5ca0ec16bdd [libc++] Update documentation for _executeWithFakeConfig (#184420)
2d4c8e0d0fa2 [OpenMP][clang] Indirect and Virtual function call mapping from host to device (#184412)
03bd4ef4ecf9 [CIR] Handle vtable pure and deleted virtual functions (#183862)
6893d277575d [flang][acc] Improve clause validity check around do concurrent (#184389)
c5039c184827 [NFC] Refactor the SelectionDAG::getMemcmp etc with a existing helper function getRuntimeCallSDValueHelper (#184200)
e379ad78203b [LifetimeSafety] Use per-container invalidation rules to fix false positives (#183000)
80acaccbe644 [RISCV] Promote i8/i16/i32 scalable vector CLMUL to i64 CLMUL with Zvbc. (#184265)
f42b8a18d904 [sanitizer][Fuchsia] Define interceptor for reallocarray on Fuchsia (#184410)
ac950786b13e merge main into amd-staging
637bb0e37747 [WebAssembly][FastISel] Call materializeLoadStoreOperands in load fold (#184203)
90febba9c4ec [X86] vector-shuffle-combining-xop.ll - tests showing failure to combine shuffles with non-uniform rotates (#184397)
a34d56dee94b [AArch64] Fix relative vtable PLT/GOTPCREL specifiers to use MCSpecifierExpr (#184393)
ea79bcfcc579 [flang][OpenMP] Fix lowering of LINEAR iteration variables (#183794)
d0dd37124979 [MLIR][Canonicalization] Add shape_cast folding patterns (#183061)
6b040b0dee9c [HIP] Fix -save-temps with the new offload driver (#184385)
7161bd94fded [mlir][mpi] fixing 184189 build failures (#184399)
56b5af76cf3c [bazel][mlir] Fix Bazel build for a232b5b (#184394)
b926acfb341b [flang] remove unused variable (NFC) (#184293)
c1bba5ba023a [VPlan][NFC] Remove unnecessary explicit copy constructors (#183863)
7a310b4c5a06 [mlir][linalg] Upstream PackOp/UnPackOp's generateScalarImplementation. (#182838)
bf680bdf1349 [clang-tidy] Fix yet another false positive in `readability-redundant-typename` (#184301)
7b7c8b2eb3f1 [libc] Extend check-libc-lit to cover include, integration, and all src tests (#184366)
200600a06c20 [ELF] Move PPC32Got2Section into Arch/PPC.cpp (#184383)
616656bc5e1a [ELF] Move MIPS synthetic sections into Arch/Mips.cpp (#184384)
a8a2f2fe9976 [MLIR][XeGPU] Remove fold alias pass in xegpu (#182802)
f95662d159dc Revert "[OpenMP][clang] Indirect and Virtual function call mapping from host to device" (#184378)
b6f389e005d7 [clang-doc] Improve complexity of Index construction (#182621)
9081ac255a8b [DirectX][ResourceAccess] Resolve resource handles at access (#182106)
640ba7b05e75 [Github] Bump clang-format/clang-tidy to v22.1.0 (#184374)
b33c7db8eb63 [clang-doc] Add basic benchmarks for library functionality (#182620)
c72d2e503caf [Comgr] Keep LLVM temporary files if AMD_COMGR_SAVE_LLVM_TEMPS=1. (#1543)
779d76c9effd [AArch64] Add basic NPM support for LoadStoreOptimizer. (#184090)
b44dba97d059 [mlir] Install '.pdll' files along with the header files (#183855)
bb2b957c53b0 [Thumb2] Use BXAUT instruction if available (#183056)
829da4927bf1 [CIR][AArch64] Add lowering for vaba_* and vabd_* builtins (#183595)
a232b5b96f67 [mlir][shard, mpi] Adding Shard/MPI reduce_scatter and simplification (#184189)
5f8f1e2afe99 [CIR] Fix unreachable block generation in EH flattening (#184268)
f82f8cf8d498 [ELF] Add TargetInfo::initTargetSpecificSections hook (#184292)
3f1d968db946 [mlir][IR] Add variadic `getParentOfType` overloads (#184071)
e68f696fdae0 [CI][SPIRV][NFC] Remove unneccessary mkdir from workflow (#184353)
6cc42b39556d [libc] Various GPU allocator tweaks and optimizations (#184368)
d61b45cd409d [Clang] Generate ptr and float atomics without integer casts (#183853)
aef962708fe5 Reapply "[SPIRV][NFCI] Use unordered data structures for SPIR-V extensions (#184162)
02b2a1e8fe7f Fix `assignValueToReg` function's argument (#184354)
358f47772023 [Clang] Fix clang crash for fopenmp statement(for) inside lambda function (#146772)
e10655eb1dfc [X86] known-never-zero.ll - add sdiv/udiv vector test coverage for #183047 (#184350)
43503c44c8d0 [NFC][AArch64] isPureCmp is a duplicate of canAdjustCmp, so remove the duplicate (#183568)
81396ebc51c4 [AMDGPU] Generate more swaps (#184164)
e570faa87ed3 [SPIR-V][HIP] Disable SPV_KHR_untyped_pointers (#183530)
acb8a6df1991 [AArch64] Fix type mismatch in bitconvert + vec_extract patterns (#183549)
c9d065abc158 [X86] Add i256 shift / funnel shift coverage to match i512 tests (#184346)
5b976c930189 [libc][sys] add header and functions for sys ipc (#182700)
c782e2d40572 [SPIRV] Don't emit service function basic block names (#184206)
bbde3e3b59c8 [VPlan] Preserve IsSingleScalar for sunken predicated stores. (#184329)
1eeb2eccf8b2 [clang-tidy] Handle specialization of user-defined type in `bugprone-std-namespace-modification` (#183984)
33864efe461e [lld] Turn misc copy-assign to move-assign (#184145)
534d6e887ff8 [Analysis][NFC] Store CallbackVH in vector, not in map (#184323)
97043e50ad41 [mlir][Vector][GPU] Distribute expanding `shape_cast` ops (#183830)
cec32683498a merge main into amd-staging
de69348f80f5 [Reland] [APINotes] Refactor APINotesReader to propagate llvm::Error (#184212)
fa6eef837831 Revert "Avoid maxnum(sNaN, x) optimizations / folds (#170181)" (#184125)
703649554da8 [DAG] isKnownNeverZero - add ISD::OR DemandedElts handling (#183228)
d908184487b9 [AArch64] Limit support to f32 and f64 in performSelectCombine (#184315)
ec7f3503f8d0 [MLIR] Make test-block-is-in-loop pass a module pass (#184036)
a368bd4049db [CIR][CUDA]: Handle duplicate mangled names (#183976)
ee8259dcca82 [mlir][sparse] Fix use-after-free crash in SparseSpaceCollapsePass (#184001)
f934db36aa26 [mlir][sparse] Reject dense level after non-unique level in encoding verifier (#184157)
6884ff014277 [mlir][sparse] Fix crash in ForeachRewriter for rank-0 dense tensors (#183903)
8d082c7c3144 [mlir][sparse] Fix crash in sparse_tensor.new with unsupported element type (#183898)
bbd5b1d3bd07 [mlir][VectorToXeGPU] Fix crash on memref with non-scalar element type (#183905)
a8a5242bb2dc [mlir][XeGPU] Fix crash in wg-to-sg type converter on non-XeGPU tensors (#183914)
7856e9876808 [mlir][XeGPU] Fix crash in getUArch when no chip target attribute is set (#183912)
ecec7920c636 [mlir][func] Move return-type verification from ReturnOp to FuncOp (#184153)
263a22e86556 [mlir][xegpu] Fix crash in XeGPUPropagateLayout when module has llvm.func (#183899)
6ee48f2ce747 [RISCV] Remove VL != 1 restriction in RISCVVLOptimizer (#184298)
36cced2b8244 [NFC][AArch64] Refactor Arm llvm-mca tests (#183294)
245887e343d3 [X86] Added sincos vector lib codegen test coverage (#183702)
0b36d4265e30 [AArch64] Add vector expansion support for ISD::FCBRT when using ArmPL (#183750)
03a9ebc8974b [DAG] isKnownNeverZero - add ISD::UADDSAT/UMAX/UMIN DemandedElts handling and tests (#183992)
5e814e26dd72 [mlir][llvm] Fix crash in LLVM inliner when callee has no recognized terminator (#183949)
d1c563beee79 [lldb] Don't link TestingSupport as a component (#184310)
4d3bdc0f8947 [lldb] Use AppendMessageWithFormatv in ComandObjectWatchpoint (#184128)
91e73b93e881 [MLIR][XeGPU] Allow uniform vectors in layout conflict resolution (#183756)
8879ff136c73 Support unnamed functions in MIR parser (#183018)
b4fffcd8e415 [NFC][Docs] Add documentation for NVPTX conversion intrinsics (#175536)
5d8c6c198dde [LangRef] Mention allocation elision (#177592)
b4743b2641b6 [VPlan] Introduce VPlan::get(Zero|AllOnes) (NFC) (#184085)
39f2740facea [AMDGPU] IGroupLP: Avoid repeating reachability checks in greedy algorithm (#182463)
09217ba90459 [lldb] Disable shared build for TestTemplateArgs,TestEvents,TestTypeList (#184304)
c4e2f79c22d2 [AArch64][GlobalISel] Limit srem by const of small sizes. (#184066)
92bd6eee4db3 [libc] Reland add getc, ungetc, fflush to enable libc++ iostream on baremetal (#183556)
0933b634c6a2 [AMDGPU] IGroupLP: Refactor SchedGroup::initSchedGroup (NFC) (#184122)
eb1e808fdb44 [IR] Mark reduction intrinsics as nocreateundeforpoison (#184173)
9cda40735a17 [clang][Sema] Fix initialization of GRO when GRO-return type mismatches (CWG2563) (#179156)
b67536954eb1 [Clang][NFCI] Make unchanged global state const (#183478)
ecb694de65e8 [Clang][NFCI] Initialize PredefinedNames immediately (#183295)
a631c3f4077c [mlir][spirv] Expand verifier testing for spirv.Tosa ops (#184112)
7fb5a02dcda1 [CMake][AST] Add PCH (#183358)
0fff939c1aa9 [mlir][linalg] Lower unpack - capture handle to created copy op (#183744)
d59a267ad541 device-libs: Use generic fshr builtin instead of alignbit (#1468)
78ac964c47cb [RISCV][NFC] Prepare for Short Forward Branch of branches with immediates (#182456)
f67c2cd75e25 [RISCV] Handle Zvabd and XRivosVizip EEWs in RISCVVLOptimizer (#184117)
0504af9e3bf7 [llvm] Turn misc copy-assign to move-assign (#184143)
e4def2d11fb5 [AMDGPU] Make the options consistent across 3 RA pipelines(NFC) (#184190)
d20395cfa3bb [LegalizeVectorOps][RISCV][PowerPC][AArch64][X86] Enable the clmul/clmulr/clmulh expansion code. (#184257)
4ea39c43e133 [LIT] Use forward slashes in substitutions when LLVM_WINDOWS_PREFER_FORWARD_SLASH is set (#179865)
75b0cf39b2f8 [RISCV] Add scalar saturating add/sub operations for i32 for RV64P (#184062)
84d0f8766de2 [RISCV] Alphabetize riscv_files in clang/lib/Headers/CMakeLists.txt. NFC (#184024)
30fc31aa71fa [NFC][TableGen] Add deleted copy operations for RAII guard classes (#184168)
eba4a76597dd [CFI] Expand test to include minimal runtime (#183646)
da8929bd2404 [bazel][mlir][acc] Port e63e55cae8ce29150f38a758555d9cc712a1cf4c (#184289)
a85dbcfe016d [clang][bytecode] Reject non-VarDecl DeclRefExprs (#184141)
5a53fce8582b [RISCV] Extends RISCVMoveMerger to merge GPRPairs independent of even/odd pair instruction order. (#183657)
198f85ea7c17 [clang][bytecode] Fix newly added pfp test (#184137)
b23438661c10 [OpenMP][clang] Indirect and Virtual function call mapping from host to device (#159857)
572a0e45c637 AMDGPU: Remove "MBUF" from "loadMBUFScalarOperandsFromVGPR" (#184282)
6d25af00ac47 [utils] use annotations from __future__ in lit (#184225)
768240d01952 [AMDGPU] Insert readfirstlane for uniform VGPR arguments (#178198)
e63e55cae8ce [mlir][acc] Add ACCRecipeMaterialization pass and reduction ops (#184252)
582586d36e09 [psdb] staging nightly build status notifier with more details
92aa2d36f020 [Github] Respect LLVM_VERSION when building windows container (#184231)
52f32d780fa2 [Github] Bump Github Runner to v2.332.0 (#184230)
8decfb8a90df [mlir][emitc] Do not convert illegal types to emitc (#156222)
2407564cbfa1 [Clang] Add missing extension cl_intel_split_work_group_barrier declaration (#184269)
a6fa21c5aabb [CIR] Upstream basic CodeGen tests from incubator (#183998)
82319d74aae4 [RISCV] Update Andes45 vector reduction scheduling info (#182980)
4f91d0b322a8 [libc++] Give proper names to a few benchmarks (#183333)
0ced81f7eabc [NFC][OpenMP] Remove redundant prints in `target` regions from tests added in #184260. (#184266)
1d1c83ad7397 Reland "[OpenMP][Offload] Handle `present/to/from` when a different entry did `alloc/delete`." (#184260)
d4d18248fde6 [lldb] Terminate the LLDB Log in SystemInitializerCommon::Terminate (#184261)
03e2af7a65ea [CIR] Fix bitfield store locations for assignment codegen (#184005)
0f8aa9610c0a [lldb][NFC] Whitespace cleanup in RegisterContextMinidump_ARM64 Breaking out the whitespace changes turned up in a separate contentful PR.
743428688fb0 [flang] Recognize compiler directives after expansion in comment (#183626)
4e3e4f25bc4a [WASM] add CheckWasmTableElement helper (#181172)
6719ec1e9512 [Coroutines] Replace struct alloca frame with byte array and ptradd (#178359)
a4f9d43eef7f [alpha.webkit.NoDeleteChecker] Add a test for unsafe function override (#184208)
23f21f3e277d [CIR] Implement function/call attribute parsing (#184185)
abb228af20c9 [CIR] Fix handling of cleanup scopes inside a try body (#183869)
c433ae7e2e57 Revert "Add a test that we recover from a crashing breakpoint condition."
a14d8b2e36d4 [CIR] Upstream vtable thunk handling (#183629)
49c3cd15e8b4 Add a test that we recover from a crashing breakpoint condition.
78f259fcc14b [MLIR] mlir_levelzero_runtime: remove dependency on LLVM (#182942)
4995b2b8591d [Github] Enable long paths in windows CI Container (#184224)
4f50a725fa19 [clang][clang-scan-deps] Add LangOptions::AllowLiteralDigitSeparator to fix #88896 (#184235)
42a0fbc2c792 Revert "[OpenMP][Offload] Handle `present/to/from` when a different entry did `alloc/delete`." (#184240)
5156147824be [libc] Declare reallocarray in stdlib.h / malloc.h (#184223)
4a9e0812c506 [flang] Allow acc cache directive inside acc routine (#184213)
3c43fc16b73b [clang][deps] Remove the `finalize()` API for by-module-name scans (#184232)
895597a1f579 [Github][bazel] Run `buildifier --mode=diff` on error (#184233)
6dcaffaa1757 [psdb] use linux-mi325-4gpu-ossci-rocm resource label
526a4d4d8a6a [LAA] Always use DepCands when grouping runtime checks. (#91196)
f52a2035548f Revert "[mlir][acc] Replace terminators with scf.yield in wrapMultiBlockRegionWithSCFExecuteRegion (#183758)" (#184228)
fab5681686a1 comgr: Add new path to automatically embed from the resource directory (#1476)
183d02d257f6 [clang] NFC: remove unused / untested workaround in pack deduction (#183875)
ea7ff48c3108 [DominanceFrontier] Support multiple root nodes for post-dom (#181257)
61310cd72dd2 [Github] Remove force build from windows container
533f16fe8969 [clang-tidy][NFC] Add `findTokenInRange` and reuse it (#183941)
8107c71511b3 [RISCV] Put Large Code Model Constant Pools in .text (#151393)
1a7060a7b07c [OpenMP][Offload] Handle `present/to/from` when a different entry did `alloc/delete`. (#165494)
501c6fda951b [CMake] Propagate dependencies to OBJECT libraries in add_llvm_library (re-land) (#184201)
ebe3c1ee991c [flang] Remove usage of the `DependencyConsumer::finish()` API (#184229)
5ae64c620750 [Clang][Sema][Builtins] Check argument count for `__builtin_allow_sanitize_check` (#183927)
8a9049198d18 [clang] Replace `finish()` with destructors for `DiagnosticConsumer` (#183831)
a4d786630c47 [lldb][ARM] Support thread local variables on ARM Linux (#181315)
03773c3b06b2 [APINotes][NFC] Fix typos and header comment errors (#183811)
386a3afa553f [mlir] Fix typos that propagate downstream. NFC. (#184220)
8e6e9cb8c203 [HLSL][NFC] Move SemaHLSL resource tests to Resources subdir (#183386)
f7176ee33662 [bazel][mlir][acc] Port 12f4eb2156559c2f8c99fa7dc3b59cb4fef1389d: scf.yield (#184216)
ed524ba0d458 [llvm] Avoid resolving `.incbin` during symbol collection (#172920)
ed085573f402 [SemaHLSL] Warn when a local resource is re-assigned to non-unique global resource (#182101)
0797a10cc537 [MLIR][XeVM] Rewrite llvm.alloca if addr_space is 3 (#183417)
96a02c5eb53c Revert "[APINotes] Refactor APINotesReader to propagate llvm::Error " (#184211)
9aff7b6347f1 [HIP] Fix wrong triple being passed to offload-bundler (#184195)
24873cb95574 [SelectionDAG] Pass DemandedElts to isKnownNeverZero for extend nodes (#183624)
12f4eb215655 [mlir][acc] Replace terminators with scf.yield in wrapMultiBlockRegionWithSCFExecuteRegion (#183758)
973f7606fba4 [HWASan] [MTE] support double lifetime.end in same BB
0cabe933812f [HLSL] Reintroduce dx.disable_optimizations to set DisableOptimization Shader Flag (#180069)
fb6038d93781 [DAG] isKnownNeverZero - add ISD::SRA/SRL DemandedElts handling and tests (#183577)
b3c4d44c4423 [lldb] Batch breakpoint step-over for threads stopped at the same BP (#183412)
a171b8d4d523 [NVPTX] Refactor NVPTXLowerArgs and move helpers to NVPTXUtilities (#183686)
5ff5a1f14761 Revert "[CMake] Use keyword signature in two additional callsites (#1… (#184186)
2846cb31e045 merge main into amd-staging (#1615)
ae363d50ad29 [HLSL][Matrix] Make Matrix InitListExprs and AST row-major order, and respect /Zpr and /Zpc in codegen (#182904)
28d294e080b3 [flang] Let -fdisable-real-10 affect only user code (#183870)
d723d14e4c34 [flang][runtime] Emit "Infinity" rather than "Inf" when required (#183359)
dd3d727c88b5 Revert "[llvm-ir2vec] Adding Inst Embeddings Map API to ir2vec python bindings" (#184179)
f486fc95db20 [clang-tidy] Nominate myself as a maintainer (#183173)
307d912378ac [clang][analyzer] Add taintedness to argv (#178054)
ab5205c916a7 [llvm][DebugInfo] Emit DW_LNAME_Assembly for DWARFv6 assembly CUs (#183897)
573a54120207 [clang-tidy][NFC] Use singe mock string header in tests (#183996)
9d1fd9ec1eb8 [AMDGPU] Extend DS loop wait optimization with flush point tracking (#175658)
447eba88c8d7 [lldb][Target] Allow eLanguageTypeAssembly to use ScratchTypeSystemClang (#183771)
fd578f7c5c98 [libomp] Fix hwloc include for non-standard paths (#184087)
41fc9b98459c [LAA] Fix recordAnalysis receiving null Instruction pointer (#183512)
cf8597bd3b87 [clang][Modules] Handle relocated modules during implicit module builds (#181836)
d7eec97bd83f [APINotes] Refactor APINotesReader to propagate llvm::Error (#183812)
148b10be8ad3 [flang][OpenMP] Support custom mappers in target update to/from clauses (#169673)
f3e8508ac771 [clang][ssaf] Add `JSONFormat` serialization support for `LUSummary` and `LUSummaryEncoding`
9ae143149b6f [mlir][bazel] Fix build after moving AMX into X86 in #183717. (#184165)
95832c9bfd7c [LinkerWrapper] Fix a bunch of minor issues and typos (#183679)
82d747e49142 [X86] known-never-zero.ll - add additional demanded elts vector test coverage (#184159)
b11a424e0582 [flang] Inline trivial scalar allocatable assignments in HLFIR-to-FIR (#183177)
11576569336d [lldb][Process/FreeBSDKernelCore] Fix RegisterContext for arm64 (#183947)
f5e8e98a4ef4 [mlir][VectorOps] Fold extract on constant_mask (#183780)
3357e487cf0e [clang/APINotes] Fix assertion crash in addObjCMethod for protocol DesignatedInit methods (#183799)
64139516e5c2 [X86] known-never-zero.ll - remove unnecessary declarations (#184142)
bd02c1712322 [X86] known-never-zero.ll - add shift right vector test coverage for #183577 (#184140)
a8fb8eb49f00 AMDGPU: Stop copying triple into AMDGPUSubtarget (#184147)
24ac5987b482 [bazel][libc] Add missing dep (#184152)
dbacb148dc41 [bazel][libc] Enable layering_check for libc/BUILD.bazel (#183822)
4f84347b2e7e [llvm-ir2vec] Adding Inst Embeddings Map API to ir2vec python bindings (#180140)
bc0af9901b51 [TableGen] Allow specification of underlying type for GenericEnum (#183769)
dfcbf6c70e70 [CVP] Stop CVP constant propagation from destroying `llvm.assume` (#183688)
070683157766 [mlir][bazel] Fix build after changes from #183856. (#184134)
da6b2db1a6e1 Revert "[VPlan] Remove unused VPExpandSCEVRecipe before expansion" (#184108)
8da1bb891e00 Reapply "[AMDGPU] Elide bitcast fold i64 imm to build_vector" (#160325) (#184114)
919ae1cd2f46 [lldb-dap] Skip return_variable_with_children on arm64 (#184132)
644f07cef5dc [CIR] Use `-verify` on clang/test/CIR/CodeGen/nonzeroinit-struct.cpp (#183910)
bb42c74b05c6 [clang-tidy] Add fixit capability to performance-use-std-move linter (#184072)
cca5bb52f37a [OpenMP] Use CreatePtrDiff() (#184127)
eb8f17162973 [clang][test] Add missing FileCheck pipe in n1311.c (#183965)
e0fa4952fd78 [ARM] Format ARMLoadStoreOptimizer Pass classes. NFC
1175046d14b9 [libc] Fix GPU loader propagation to lit test infrastructure (#184105)
c5e5c9735a33 [MLIR][MemRef] Validate linear size before lowering allocs (#179155)
19be8d60662b [mlir][tosa] Fix crash in TosaInferShapes when while_loop carries sparse tensors (#183943)
977355be38d1 [mlir][tosa] Disallow inferable dim in reshape/slice validation (#182472)
4e8be20faa1c [clang][test] Add multi-dim-array diagnostic test for multi-dimensional array function passing (#183847)
9d5ca5282d13 [IR] Return bool from replaceUsesWithIf() (#184107)
4a907a526dd7 [CMake] Add LLVM_ENABLE_WARNING_SUPPRESSIONS to toggle warning suppressions (#183439)
4af885c0c13c [AArch64] Fix performZExtUZPCombine() DAG combine (#183765)
bcc272b3220f [LV] Remove DataAndControlFlowWithoutRuntimeCheck. NFC (#183762)
ce79fb371245 [InstCombine] Always fold nonnull assumptions into operand bundles (#169923)
cd0eb16a11f8 [AArch64] Add maybe_unused to DstTy in assert. NFC
60fec80bdcb3 Revert "[VPlan] Remove unused VPExpandSCEVRecipe before expansion" (#184108)
88693c49d9ac [NFC][analyzer][test][z3] Move test cases requiring Z3 to the `Analysis/z3/` subdirectory (#183724)
87cbea6cdc9a [openmp][cmake][NFCI] Avoid non-eval uses of ${var} (#182267)
b39247c391c4 [AMDGPU] Fix typo "PGRM" in variable name. NFC. (#184104)
24d21ca03cd2 [flang][OpenMP] Fix counting generated nests (#183957)
0a53c0b9c360 merge main into amd-staging
d2b6a5a3f6e8 [LLVM][NVPTX] Fix infinite legalization loop in tcgen05.st (#183012)
82a1905c4bd3 InstCombine: Pass SimplifyQuery through SimplifyDemandedFPClass (#184096)
52df4a19599b [AMDGPU] Fix typos "SPGR" / "VPGR" in comments
dd871f55f018 [flang] Use CHECK-DAG to check constants (NFC) (#184097)
47383919d111 [AArch64][NFC] Remove unused parameters for `performORCombine` (#184075)
6cce18b9f518 [LoopIdiomVectorize] Avoid wrapping in find_first_of loops. (#180570)
c1d82e2f3e83 [mlir][reducer] Use LDBG in opt-reduction-pass (NFC) (#184026)
00af181a4edd [mlir][emitc] Fix crash in form-expressions when identity cast is folded (#183894)
482a7718a8d8 [DAG] visitCLMUL - fold (clmul x, c_pow2) -> (shl x, log2(c_pow2)) (#184049)
e44fd05035a3 [mlir][x86] Move AMX dialect into X86 dialect (#183717)
e3b01e132908 [lldb] Fix wchar addition tests in DIL (#184082)
13751c87076b [AArch64] Vectorise llvm.pow using vector intrinsic for ArmPL library (#183319)
730587d3be6c [DAG] isKnownNeverZero - add DemandedElts for ISD::SMIN/SMAX (#184054)
61faf7d3db72 [AArch64][GlobalISel] Use GPR for illegal fconstants and extend < 32 bit GPR constants to 32 bits (#178692)
0c89071fa33f github-automation.py: Fix mis-indented statement (#149653)
925ec952ddd8 [llvm][DebugInfo][test] dwarf-asm-multiple-sections.s: refine FileCheck checks
f46aca9bf84e [AArch64] Combine (and/or X, (dup (not Y))) -> (bic/orn X, (dup Y)) (#175739)
900f70258b90 [lldb] Indent option help with ANSI cursor codes when possible. (#183558)
3ad43f2d1c03 [LangRef] Clarify nsz semantics (#180906)
96113ac416e7 [Clang] Use llvm.ptrmask to mask out thumb bit (#183535)
037fd6eaaf45 [AMDGPU] Add VINTERP encoding to gfx13 (#182481)
86b07a79a9c3 [AArch64] Remove -aarch64-load-store-renaming=true from test. NFC
c7e1ec97b979 [flang][OpenMP] Implicitly capture variables in enclosing task for nested firstprivate (#183770)
4c2ac846bb77 [mlir][spirv] Add Element Binary Logical operators to TOSA Ext Inst Set (#183703)
36c6c689dc31 [compiler-rt][ARM] Fix conditions for strict-mode FP testing (#183507)
4922ab9915b0 [RISCV] Relax codegen predicates for HINT-based instructions (#179872)
0704b68a027a [gold] Fix test
c62c00c52405 [VPlan] Remove unused VPExpandSCEVRecipe before expansion (#181329)
51d9b40b0d09 [AArch64] Remove iXLen from sve-lrint.ll. NFC
f7b1107bf564 [IVDescriptors] Remove function FMF attribute check for FP min/max reduction (#183523)
265c1f483398 [LV] Add debug print for TTI.MaxInterleaveFactor (NFC) (#183309)
1c3327561977 [mlir][spirv] Introduce a base class for spirv.TOSA convolution ops (#183751)
7fbbbd7893d4 merge main into amd-staging (#1612)
14bcb1a00954 [BOLT] Make sure IOAddressMap exist before lookup (NFC) (#183184)
b4b32e88dde6 [BOLT][instr] Disable stderr diagnostic output when targeting Android (#183185)
3270bbf04cba [BOLT][instr] Make instrumentation counter reset thread safe (#183186)
b8d0bb2ddc77 [WebKit checkers] Trivial function analysis ignores some nodelete annotation (#183970)
6d82f143dee1 [clang-tidy] New performance linter: performance-use-std-move (#179467)
f1620e44412f [OpenCL] Enable __cl_clang_function_scope_local_variables for AMDGPU and NVPTX targets (#183892)
90eb27e56e3e merge main into amd-staging
ab1d59e72524 [clang-format] Allow InheritParentConfig to accept a directory (#182791)
52a9eb37db83 [Github] Add TODO around actions/attest
8fff1c042d14 Update actions/attest-build-provenance action to v4 (#184051)
686987a540bc ValueTracking/AMDGPU: handle mbcnt in computeKnownBitsFromOperator (#183229)
e95dabef96f4 [MLIR][Python] Support attribute definitions in Python-defined dialects (#183907)
8774da8f2f4d [MLIR][XeGPU] Preserve anchor layouts in recoverTemporaryLayout (#182186)
53a6db6a3eba merge main into amd-staging (#1611)
81872e7049ea [NFC] Fix check lines for `clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl` on Darwin (#184042)
e6aafae828e0 [Polly] Update isl to isl-0.27-86-gcf471c16 (#184044)
d947f8f699eb [clang][Sema] fix crash on __type_pack_element with dependent packs (GH180307) (#180407)
f05d2e8a3998 [AMDGPU] Make uniform-work-group-size a valueless attribute (#183925)
e2ef93fc5750 [NFC] Remove `clang/test/CodeGenOpenCL/.gdb_history` (#184038)
d9ca61b6e7b9 Revert "[NFC][Clang] Auto generate check lines for `clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl`" (#184035)
a06dcc7ccf38 merge main into amd-staging
789bf51f0ce6 [SLP]Do not consider condition with multiple uses and negate predicate as a candidate for inversed select
cf1e76835feb [clang-tidy][NFC] Don't call `getLangOpts` in `isLanguageVersionSupported` (#184029)
d1d2a1ed76a6 [SLP][NFC]Add a test with the incorrect compare, extracted from the transformed vector
dddd06be8c3e [NFC][Clang] Auto generate check lines for `clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl` (#183926)
f62adea305d6 [ProfCheck] Exclude new GVN test
3cf53f684d51 [LV] Handle sunk reverse VPInstruction in planContainsAdditionalSimps.
1dc85c60410f [clang-tidy][NFC] Add `getCommentsInRange` utility (#183940)
48209b6777be [DAG] isKnownToBeAPowerOfTwo - add ISD::EXTRACT_VECTOR_ELT handling (#183924)
a13afe84bb42 [SLP][NFC]Add more bitcast/bswap tests with immediate loads, NFC
3041c90718df [mlir][tensor] Remove hard-coded types from `ConstantOpExtractSliceFolder` (#184013)
e4301c48fdd9 [bazel] Fix windows stack space on llvm driver link (#182998)
451529778d42 [clang] fix common type calculation for l-values of 'void' type (#183972)
02c7a6cd7f35 [SLP][NFC]Add tests for bitcasts/bswaps with large target type
262be3b7cbd8 merge main into amd-staging (#1610)
ae7916539918 [clang][NFC][diagnostics] Remove several uses of `getCustomDiagID()` (#172532)
320220e48b8f [VPlan] Support arbitrary predicated early exits. (#182396)
9730d3128435 [SLP]Fix types for reductions in revec
7b26069828aa [VPlan] Pass ForceTargetInstructionCost insted of NumOccurences.
a6e7c38ea631 [SLP]Do not vectorize select nodes with scalar and vector conditions
49b77e3b4555 [VectorCombine] Fold sign-bit check for multiple vectors (#182911)
3bdee9b5576b [GVN] Forward store values through select addresses in findDominatingValue (#183316)
2c9720972e90 [mlir][python] Add stable ABI (abi3) support (#183856)
0ba4f13b264a [mlir][test] Fix crash in ReifyBoundOp with invalid 'type' attribute (#184004)
785490e9db54 [MLIR] Remove `let constructor = ` from mlir/include/mlir/Transforms/Passes.td (#183950)
7629c5cc32cc merge main into amd-staging
b7e20442d5ed [MLIR][ODS] Fix AllElementCountsMatch crash on dynamic shaped types (#183948)
2cb2fe7f2a1d [mlir][scf] Fix crash in ForOp verifier when body block has no arguments (#183946)
9801e752024f [ARM] tADDrSPi no side effects change (#183071)
74c0ee7e72bf [TTI] Remove TargetLibraryInfo from IntrinsicCostAttributes (NFC) (#183764)
6fa90a3f7e89 [MLIR][SymbolTable] Fix crash when SymbolTable is built on unverified IR (#183945)
d68d47db7b5a [ARM] Explicitly mark certain instructions as having no side effects. (#182771)
0f63db5c665b Attributor: Avoid calling identifyDefaultAbstractAttributes on declarations (#182663)
5768ee2dcdad [clang-repl] fix CleanUpPTU by removing decl according to C implicitly FuncitonDecl. (#178648)
d74c6b1176e9 [mlir][IR] Generalize `DenseElementsAttr` to custom element types (#183920)
9ffa08f097d4 [mlir][NFC] Fix typo in property predicate tests (#183987)
b872179bebe7 [AArch64][test] Add i256 codegen baseline tests (#183587)
d412b04a883c [UBSan] Wrap Location variants in anonymous union (#168866)
10b1b7857b05 [ASan] Mark recent integration tests as accordingly for MSVC (#135889)
b2ce908a48e0 [compiler-rt][CMake] Fix build when specifying --stdlib= (with 2 dashes) (#136111)
bf4ed7903aee [clang-tidy][NFC] Use singe mock vector header in tests (#183963)
dce9aaf48638 [Revert_patches.txt] cleanup (#1608)
c35a726ca979 [Clang][TableGen] Sort undocumented builtins after documented ones in generated docs (#183938)
da8d18190530 [libc][math] Cleanup shared/math (#183971)
4673cecc89d1 [MLIR][Python] Add support of `convert_region_types` and the bf integration test (#183664)
910988b9f15c [AMDGPU] Stop treating AMDGPU_CS_ChainPreserve as a module entry funtion (#183718)
7838bdfaaaea Amd/compiler/rlieberm/reland pch (#1607)
e3bad32ceb12 merge main into amd-staging (#1606)
0b423a5b2738 [MLIR] Fix invalid test after improving the error message (NFC)
2c3d5f958f22 [clang] use typo-corrected name qualifier for expressions (#183937)
20f36a2ff10f [MLIR][GPU] Improve error message on invalid pass option
b72d8ac98c6e [DAG] isKnownNeverZero - add ISD::EXTRACT_VECTOR_ELT handling (#183961)
df616fbe1c90 [lldb][lldb-dap] Correctly format lldb warnings in the debug console (#173852)
a0fb4f670848 [lldb] Add BytecodeSection class to formatter_bytecode.py (#183876)
a34fe9d5354d windows namspace ambiguity: remove using llvm
d89528150c26 [CMake][CodeGen] Add PCH (#183346)
b3af477f9862 [CMake][IR] Add PCH (#183303)
59ba10b9d38a [mlir][spirv] Fix crash when spirv.struct member type is not a SPIR-V type (#183942)
4a602c03ea05 [lldb][Process/FreeBSDKernelCore] Add riscv64 support (#180670)
3e05ab6322cb [ThinLTO] Reduce the number of renaming due to promotions (#183793)
e317f424557c [SLP]Recalculate dependencies for the buildvector schedule node, if they have copyable node
5ed875a06cb0 [lldb][lldb-server] Fix zip file lookup ignoring last entry in the zip file (#173966)
061714cd8c01 merge main into amd-staging
bf52cf2ee677 [Revert_patches.txt] cleanup (#1605)
3034c0966931 [clang-format] bugfix: Whitesmiths with IndentAccessModifiers (#182432)
4d724c074dd1 [X86] known-never-zero.ll - add tests showing failure to handle ISD::EXTRACT_VECTOR_ELT nodes (#183934)
1909e43a4adc [mlir][GPU] Fix crash in WarpExecuteOnLane0Op::verify with wrong terminator (#183930)
712f9637b278 [SimplifyLibCalls] Avoid simplifying pow(x, 2.0) -> x * x with math-e… (#1601)
baed2c8a31a3 merge main into amd-staging (#1600)
2430410b7d87 [lldb][Process/FreeBSDKernelCore] Add ppc64le support (#180669)
4a93b9a1b1ec [ARM] Lower strictfp vector fp16 rounding operations similar to default mode (#183700)
a6ceae48f56c [AMDGPU] Assert non-array alloca does have a size (#183834)
3d086f573dc4 [CIR] Implement ImplicitValueInitExpr for ComplexType (#183836)
7585ab05d6fb [AMDGPU] Enable shift64 hazard recognition for gfx9 (#183839)
d5a8f1eda29a [X86] known-pow2.ll - add tests showing failure to handle ISD::EXTRACT_VECTOR_ELT nodes (#183918)
5b64aeb409ec Revert "[mlir][IR] Generalize `DenseElementsAttr` to custom element types" (#183917)
2342db00ab4d [CMake] Use keyword signature in two additional callsites (#183889)
225b56e742fe [mlir][VectorToLLVM] Fix crash in VectorInsertOpConversion with dynamic index (#183783)
2f7c947946f4 Precommit tests: strictfp rounding vector f16 intrinsics (#183699)
e655c36c16c1 [mlir][IR] Generalize `DenseElementsAttr` to custom element types (#183891)
72525fb4ee37 [VPlan] Materialize UF after unrolling (NFCI).
94ebc8a95baf [LV] Remove duplicated IV expression sinking tests. (NFC)
e61d49ab51aa merge main into amd-staging
0b61f15f2e13 [AArch64] Add fcvt-i256 test cases. NFC
903acc2762d5 [AArch64][PAC] Emit `!dbg` locations in `*_vfpthunk_` functions (#179688)
ba0b395d3f25 [OpenMP] Remove NVPTX local addrspace on parameters (#183195) (#1598)
b3be782c4d14 [mlir][affine] Fix crash in linearize_index fold when multi-index is ub.poison (#183816)
f05b705dd3ce [mlir] Fix crash in testNoSkipErasureCallbacks on empty blocks (#183757)
245621408d03 Restore #125407, Make covmap tolerant of nested Decisions (#183073)
7370091a43e5 [mlir][test-ir-visitors] Fix noSkipBlockErasure crash with block args used across blocks (#183828)
c8e211c2a8b2 [mlir][tensor] Fix crash in expand_shape fold with dynamic result type (#183785)
b2c92bca2e66 [llvm-mc][dwarf] Bump supported version to DWARF 6 (#183779)
3403aac73418 [CMake][LLVM] Disable PCH on Clang for file with custom flags too (#183813)
9b1f7845227e [ARM][MVE] Add SLI and SRI recognition. (#183471)
8f0928252bbe [llvm][DebugInfo] Bump DWARFListTable maximum DWARF version (#183859)
ce3460e00272 [llvm][DebugInfo] Bump DWARFDebugLine maximum DWARF version (#183841)
c40b0b2235e5 [llvm][DebugInfo] Bump DWARFContext maximum DWARF version (#183838)
ab2908ed21e7 [LV] Add tail-folding & required scalar epilogue tests for IG narrowing.
55f9cf33fc14 RISCVMCAsmInfo: Remove redundant `UseAtForSpecifier = false`. NFC (#183890)
a3f9f6a82374 merge main into amd-staging (#1599)
1ff1e5f10a5c InstCombine: Stop applying nofpclass from use nofpclass attribute (#183835)
702e4ec5f705 [lldb/test] Skip TestDelayInitDependency on remote platforms (#183885)
3b30dcddd973 [Driver] Add -Wa,--reloc-section-sym= to control section symbol conversion (#183472)
27d654c4c4e6 [AMDGPU] Fix piggybacking after commute in AMDGPULowerVGPREncoding (#183778)
bed89970c3df AArch64: Replace @plt/%gotpcrel in data directives with %pltpcrel %gotpcrel (#155776)
ce6a3d98cc3e [clang-tidy] Teach `misc-unused-using-decls` that exported using-decls aren't unused (#183638)
04484e4c8fa7 [amd/device-libs] __builtin_elementwise_max ...
fe76e9004b5b [CodeGen] Allow `-enable-ext-tsp-block-placement` and `-apply-ext-tsp-for-size` passed together (#183642)
d72e95bab071 [CIR] Use `-verify` on clang/test/CIR/CodeGenHLSL/matrix-element-expr-load.hlsl (#182817)
0b88ee12dd88 [CIR] Infrastructure and MemorySpaceAttrInterface for Address Spaces (#179073)
53e538a99179 merge main into amd-staging
6f9c68d32074 [VPlan] Don't adjust trip count for DataAndControlFlowWithoutRuntimeCheck (#183729)
b281cdc5b244 [psdb] use latest rock CI backend
5f22decefac0 Clang: Deprecate float support from __builtin_elementwise_max (#180885)
62cfe1659edf [libc][math][c23] implement C23 `acospif` math function (#183661)
fb6b470caedc [libc][math] Refactor floor family to header-only (#182194)
a8d37d3cce19 [Flang][OpenMP] Unxfail omptarget-record-type-with-ptr-member-host.mlir (#1596)
e884a8cbcc51 merge main into amd-staging (#1597)
8bd8d8e6debe [AMDGPU] Remove extra pipes from load-saddr-offset-imm.ll (#183874)
5395d2668968 Revert "[WebAssembly] Incorporate SCCs into WebAssemblyFixIrreducibleControlFlow (#181755)" (#183872)
342e44603dc2 [AMDGPU][SIInsertWaitcnts] Move VCCZ workaround code out of the way (#182619)
795cfaea9cc8 [CIR][NFC] Move some builtin tests to the CodeGenBuitins folder (#183607)
7a5a92d27f68 Manual update of LLVM_MAIN_REVISION to 570809 (#1595)
12e1075b6495 [SLP]Fix operand reordering when estimating profitability of operands
fd9421cccd0b [lldb] Fix sys.path manipulation failure in formatter_bytecode.py (#183868)
e3c045415ae5 [CMake] Propagate dependencies to OBJECT libraries in `add_llvm_library` (#183541)
136ba6e208b2 [Hexagon] Define __HVX_IEEE_FP__ when -mhvx-ieee-fp is enabled (#183829)
dc520a5f493a [mlir][GPU] Add ValueBoundsOphinterface to gpu.subgroup_broadcast (#183848)
c78f37fdebd8 [CIR] Fix dominance problems with values defined in cleanup scopes (#183810)
07891ab5901c [cmake] Disable -Wdangling-pointer on GCC 12+ (#183593)
329c52c1004f [lldb] Change the way the shlib directory helper is set (#183637)
788625757ea4 [NFC] Fix use-after-free: track TargetLibraryAnalysis in BasicAAResult invalidation (#183852)
89d42b316a10 merge main into amd-staging
e35fc30cb8f5 Fix `BuiltinTypeMethodBuilder` uninitialized pointer (#183814)
0a9b5d52188f [libc++] Forward find* algorithms to find_if (#179938)
c5588becb8dd [lldb] Add skip shared build to more API tests
abbba22f4566 [lldb] Add synthetic support to formatter_bytecode.py (#183804)
7ad2c6db54a0 [mlir][arith] Add `exact` to `index_cast{,ui}` (#183395)
73d655a598d7 [VPlan] Support unrolling/cloning masked VPInstructions.
a0f79991dc3a merge main into amd-staging (#1592)
8f268e63e484 [Offload] Remove unused data type (#183840)
cdd431318318 [mlir][LLVM] Let decomposeValue/composeVale pad out larger types (#183825)
d7e037c8383e Revert "[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252)"
94bd8b9444be [NFC] [MTE] add test for duplicated lifetime end
63ab568070c7 [NFC] [HWASan] add test for duplicated lifetime end
c2f66f2a940e [WebAseembly] Fix -Wunused-variable in #181755
a71ded3861aa [BOLT][AArch64] Add a unittest for compare-and-branch inversion. (#181177)
1073951bdb8e [mlir][cf] Fix crash in simplifyBrToBlockWithSinglePred when branch operand is a block argument of its successor (#183797)
d0afaeadecd0 [clang][modulemap] Lazily load module maps by header name (#181916)
977702ccc40d [clang] fix crash when casting a parenthesized unresolved template-id (#183633)
2c98566900f0 Revert "[Metal][HLSL] Add support for dumping reflection" (#183818)
02ebe23163c0 [ASan] Document limitations of container overflow checks (#183590)
fff2f0ba78fe [AMDGPU] Handle GFX1250 hazards between WMMA and VOPD (#183573)
fc153b1e254f [alpha.webkit.NoDeleteChecker] Check if each field is trivially destructive (#183711)
ca04a70891fb [libc][math] Refactor bf16sub family to header-only (#182115)
6f612cfbd921 [clang] stop error recovery in SFINAE for narrowing in converted constant expressions (#183614)
d1f4f9453c78 [flang] Fix explanatory messages for generic resolution error (#183565)
4f05592bc01c [Driver][SYCL] Add tests for -Xarch_<arch> option forwarding to SYCL JIT compilation. (#178025)
3d889c464eb1 [clang-format] Fix SpaceBeforeParens with explicit template instantiations (#183183)
df5bee6afc79 [CIR] Implement TryOp flattening (#183591)
8ce2b9cbc2bc [Clang][ItaniumMangle] Fix recursive mangling for lambda init-captures (#182667)
ee6f5f386f95 [InstCombine] Replace alloca with undef size with poison instead of null (#182919)
25d709e72c97 [SystemZ] Emit external aliases for indirect function descriptors in the ADA section (#183443)
cf28f23f1013 [SLP] Reject duplicate shift amounts in matchesShlZExt reorder path (#183627)
282a2b77c358 [clang][ssaf] Add `JSONFormat` support for `TUSummaryEncoding`
403fd7679f80 [SlotIndexes] Further pack indices to improve spill placement time (#182640)
dce48f2653cb [OpenMP] Enable internalization of 'ockl.bc' for OpenMP (#183685)
c05e323be7ca [WebAssembly] Incorporate SCCs into WebAssemblyFixIrreducibleControlFlow (#181755)
852c6ef5aca6 [mlir][LLVM] Let decomposeValue/composeValue handle aggregates (#183405)
48eb40bee024 [lldb-dap] Adjust VariableReferenceStorage lifetime management. (#183176)
ca0e7d31d05b [flang] [flang-rt] Addition of the Fortran 2023 TOKENIZE intrinsic. (#181030)
6301243a5d69 Reapply "[ValueTracking] Propagate sign information out of loop" (#182512)
c49460bae76c [flang-rt] Enable more runtime functions for the GPU target (#183649)
67a51ea34d25 [NFC][POWER] add Pre-Commit test case for Inefficient std::bit_floor(x) (#183363)
6e7c7131b2c3 [psdb] enable rock CI windows build for debug branches
0d95dda1eeee [LoopInfo] Preserve profile information in makeLoopInvariant (#174171)
c3b3f4195219 [SystemZ] Emit external aliases required for indirect symbol handling support (#183442)
1269a74db9ff [bazel] Enable `parse_headers` for llvm/BUILD.bazel (#183680)
179c25eaefe6 [MTE] [HWASan] support more complicated lifetimes
cd50a3074bdf Revert "[ThinLTO] Reduce the number of renaming due to promotions (#178587)" (#183782)
d2c545266b8b [RISCV] Use getCopyFromReg in unit test to match comment. NFC (#183199)
55d62abadbc5 [lldb] Add arithmetic binary addition to DIL (#177208)
5661ed60e37d [mlir][vector] Fix crashes in MaskOp::fold and CanonializeEmptyMaskOp (#183781)
dc26edd9b660 [ASan] Enable Internalization for 'asanrtl.bc' in Driver (#182825)
7f0a343a8ec4 [flang] Implement -grecord-command-line for Flang (#181686)
bad56dbb2385 [libsycl] Add sycl::context stub (#182826)
35f8ca8b76c6 [flang][NFC] Converted five tests from old lowering to new lowering (part 22) (#183681)
de4a1a77e147 [clang][modules] Prevent deadlock in module cache (#182722)
061762385805 [SPIR-V] Fix non-deterministic compiler output for debug type pointer (#182773)
d1da7f6ee5d7 [clang-scan-deps] Add test for symlink-aliased module map PCM reuse across incremental scans (#183328)
a703d91091ec [lldb-dap] Improve test performance for 'cancel' request. (#183632)
729602e81009 Revert "[SPIRV][NFCI] Use unordered data structures for SPIR-V extensions" (#183774)
975dba28633d [ThinLTO] Reduce the number of renaming due to promotions (#178587)
bb9122b3a558 [RevPatch] update PCH list of reverts
c7a20151621b Revert "[CMake][IR] Add PCH (#183303)"
9c53215d2131 [VPlan] Remove manual region removal when simplifying for VF and UF. (#181252)
4b10a4c17781 [mlir] Enable specifying bytecode producer in mlir-opt. (#182846)
e4a97cd05272 merge main into amd-staging
2265d3240f23 [pdb] Fix libc++ strict-weak-ordering assertion failures from gsiRecordCmp (#183749)
ed05f7012fe9 [mlir][vector] Rename `ReduceMultiDimReductionRank` -> `FlattenMultiReduction` (NFC) (#183721)
401163e3d2fa [psdb] enable rock CI windows build for debug branches
11a92a9305a7 [SystemZ] Add indirect reference bit XATTR REFERENCE(INDIRECT) for indirect symbol handling support (#183441)
4eab75e21fdf [SLP][NFC] Precommit test for zext reorder with duplicate shifts (#183748)
53656d1a2fad [clang][DebugInfo] Rename _vtable$ to __clang_vtable (#183617)
abe0c46e03cf merge main into amd-staging (#1586)
d8956d7796bb [SPIRV][NFCI] Use unordered data structures for SPIR-V extensions (#183567)
ef05d0610940 [lldb][Process/FreeBSDKernelCore] Implement DoWriteMemory() (#183553)
bcd8819aee05 [mlir][transforms] Fix crash in remove-dead-values when function has non-call users (#183655)
620425a88438 [mlir][tensor] Fix crash in tensor.from_elements fold with non-scalar element types (#183659)
f55b86258c91 [mlir][Python] Drop Python <=3.9 compatibility path (#183416)
370273382035 [SelectionDAG] Fix CLMULR/CLMULH expansion (#183537)
7cc27e28db97 [MLIR][Vector] Enhance shape_cast unrolling support in case the target shape is [1, 1, ..1] (#183436)
9c2a3ca4949e [MLIR] Fix OpenACC parser crash with opaque pointers (#183521)
a8a6613cc423 [AMDGPU][Scheduler] Fix compilation fail in EXPENSIVE_CHECKS (#183745)
7402312ae12d [NFC][SPIRV] Fix compile warnings (#183725)
9210d701cbf0 [MIR] Error on signed integer in getUnsigned (#183171)
bf3ab0d873bf [AMDGPU][Scheduler] Add `GCNRegPressure`-based methods to `GCNRPTarget` (#182853)
20df251af50b [LLVM][Runtimes] Add 'llvm-gpu-loader' to dependency list (#183601)
dc2ec04342de [gn] port 3490d28c8cab
8a0be0bc3772 [X86] Fold XOR of two vgf2p8affineqb instructions with same input (#179900)
48a9a2fd20a7 [Flang][OpenMP] Fix close map flag propagation for derived types in USM (#1557)
b2fdc435c823 merge main into amd-staging
2f4624613d05 [analyzer] Fix crash in MallocChecker when a function has both ownership_returns and ownership_takes (#183583)
e3dda81e2a80 [flang][OpenMP] Add `is_range<R>` trait to detect classes with begin/end, NFC (#183615)
fc69531254ca [LLVM][ExecutionEngine] Add vector ConstantInt/FP support to getConstantValue(). (#182538)
d8671280d4bf [VPlan] Add nuw to unrolled canonical IVs (#183716)
3676ae43bff9 [NFC][SPIRV] Remove dead code from `SPIRVPostLegalizer.cpp` (#183585)
6b91049f44d2 [Clang] support C23 constexpr struct member access in constant expressions (#182770)
4d169f38cab5 [LangRef] Clarify in vscale_range that vscale is a power-of-two without the attribute (#183689)
1a6bd39fd498 [flang] Use CHECK-DAG to check constants (NFC) (#183687)
14f73345ff0c [mlir][dataflow] Fix crash in IntegerRangeAnalysis with non-constant loop bounds (#183660)
c5c0fe663c7b [VPlan] Remove non-power-of-2 scalable VF comment. NFC (#183719)
98825908fc51 [mlir][affine] Fix crash in linearize_index fold when basis is ub.poison (#183650)
e7bc02d9a49f [SCEV] Always return true for isKnownToBeAPowerOfTwo for SCEVVScale (#183693)
49f4232a7d73 [AMDGPU] Remove unused CmpLGOp instruction (#180195)
b9f2a489607c [MemorySSA] Make `getBlockDefs` and `getBlockAccesses` return a non-const list (NFC)
5e30ff9e70be [lldb][test] Re-enable TestDyldLaunchLinux.py for Linux/Arm (#181221)
1afd7d40afe3 [AMDGPU] Support i8/i16 GEP indices when promoting allocas to vectors (#175489)
250ebfc30688 [X86] regenerate fcopysign test checks (#183710)
d6fcf47a8934 [libc++] Fix vector::append_range growing before the capacity is reached (#183264)
5e1d99158e0b [X86] stack-align.ll - regenerate test checks with no address scrubing (#183712)
294cf1f6b49b [X86] fnabs.ll - regenerate test checks and add AVX512 test coverage (#183709)
10b48e41e7d7 [InstCombine] Combine extract from get_active_lane_mask where all lanes inactive (#183329)
7a5ba652f08b [AArch64] optimize vselect of bitcast (#180375)
9e95cff5155a [AArch64] Add vector expansion support for ISD::FPOW when using ArmPL (#183526)
28cbc682a911 [NFC][analyzer] Remove NodeBuilders: part I (#183354)
4147cd29e1f2 [WebAssembly][FastISel] Emit signed loads for sext of i8/i16/i32 (#182767)
f71bd1c74fe8 [clang][bytecode] Add `Record::hasPtrField()` (#183513)
d43213fe8012 Revert "[VPlan] Don't drop NUW flag on tail folded canonical IVs (#183301)" (#183698)
16aa1900ef8f [clang][bytecode][NFC] Print more info in Pointer::operator<< (#183691)
c690414f8369 [clang][bytecode][NFC] Refactor visitDeclRef() (#183690)
a1f83ba1b6a7 [LV] NFCI: Move extend optimization to transformToPartialReduction. (#182860)
4a0f451cbd01 merge main into amd-staging (#1581)
5af5bd4f9867 [AMX][NFC] Match pseudo name with isa (#182235)
058705bf76af [Clang][NFCI] Make program state GDM key const pointer (#183477)
92704064e585 [VectorCombine][X86] Ensure we recognise free sign extends of vector comparison results (#183575)
a5bbedf522d4 [LV] Convert test to UTC. NFC
b0b3e3e1c7f6 [VPlan] Don't drop NUW flag on tail folded canonical IVs (#183301)
192acd6d536c [Clang][AMDGPU] Change __fp16 to _Float16 in GFX1250 WMMA/SWMMAC builtin definitions (#183493)
a107c1ccf18b merge main into amd-staging
32134a64b195 [mlirbc] Switch generator to enable write's with failures. (#182464)
ed8f080737de [Clang][docs] Fix proposal number typo for P1847R4 (#183671)
d471646607a6 Amd/dev/rlieberm/reland driver new (#1578)
f5bf00681c99 merge main into amd-staging (#1580)
86b99eff8c4d Revert "[Sema] Fix crash on invalid operator template-id (#181404)" (#183682)
07007b7c8d9e [lldb] Don't add remap entries for empty segments (#183651)
77600cbd9798 [MLIR][XeGPU] XeGPU Layout adds support for fractional-subgroup-size vector (#183434)
f30dfe7de4c3 Revert "[mlir-tblgen] Remove `namespace {}` around OpDocGroup (#182721)" (#183458)
b354b206d3be [SafeStack] Allow -fsanitize-minimal-runtime with -fsanitize=safestack (#183644)
5929c9040fac [mlir][vector] Fix fold result for empty vector.mask with no results (#180345)
8d5b74db2d8f [DenseMap] Add memory barrier for sanitizers in getInlineBuckets/getLargeRep
8f9c926868d1 Revert "AMDGPU: Fix runtime unrolling when cascaded GEPs present (#14… (#183641)
c056d7c5d6ea [Sema] Fix crash on invalid operator template-id (#181404)
46b6c9744f84 [LoopUnrollAndJam] Update test unroll-and-jam.ll (NFC) (#183520)
c1d33452468d [MLIR][Presburger][NFC] Don't add empty regions when unioning PWMA functions (#182468)
c78a4986b055 [RevPatch] PCH and Openmp
8224f11a2735 Revert "[CMake][CodeGen] Add PCH (#183346)"
6da19c1e02bd Revert "[OpenMP] Remove NVPTX local addrspace on parameters (#183195)"
b28ad9cb96c2 [llvm-dwp] Fix typo in --help
408209275e63 [LoopUnrollAndJam] Update test dependencies.ll (NFC) (#183509)
decb5d3ff6a1 [CIR] Remove branch through cleanup fixups (#182953)
361e2359860e [MLIR][Python] Support op adaptor for Python-defined operations (#183528)
9b708b003274 [mlir][arith-to-spirv] Fix null dereference when converting trunci/extui with tensor types (#183654)
99c463512a04 [MLIR] Do not abort on invalid --mlir-debug-counter values (#181751)
d149830b98f8 [AMDGPU] Pre-Commit tests for handle mbcnt in computeKnownBitsFromOperator (#178607)
26b4c25b8bce [flang][cuda] Add support for cudaStreamDestroy (#183648)
5e6f0c45a851 [Clang][Hexagon] Add QURT as recognized OS in target triple (#183622)
7c022af37ef2 [scudo] Add reallocarray C wrapper. (#183385)
00b3ce6b5abc merge main into amd-staging
7e39b280e860 [libc][math] Refactor nextafter family to header-only (#181673)
f2baaeb747b9 merge main into amd-staging (#1577)
20ec9a9bb725 build: correct `MSVC` and Windows mixup for `CLANG_BUILD_STATIC` (#183609)
e55945556a1e [scudo] Change header tagging for the secondary allocator (#182487)
2fc0733805e3 [AArch64] Decompose FADD reductions with known zero elements (#167313)
e92dd71f44c3 [RISCV] Add Defs = VXSAT to P extension instructions. (#183455)
c6db35fd343e [mlir][xegpu] Retain order attribute during load + transpose optimization. (#183608)
6bc9ba786d0f [Hexagon] Fix memory type for vgather intrinsics (#183563)
10abb231d6b4 [flang] Update the Flang Community Call to the new MS Teams series (#183576)
d5e501725e31 Reapply "[VPlan] Use VPInstructionWithType for Load in VPlan0 (NFC)"
46c06a34f1de [VPlan] Fixup C++ unit te…
6c9f97d to 9f3690c
Motivation
Pulled in ~3 months of upstream changes, up to commit f5f2faf16a985e6cddd81644c34c3c05d0a98c99.
Technical Details
AMDGPU
Structural Reorganization
- AMDGPU.td split into: AMDGPUBase.td, AMDGPUOps.td, AMDGPUAttrs.td, AMDGPUEnums.td, AMDGPUTypes.td.
Op Rename
- amdgpu.scaled_ext_packed816 → amdgpu.scaled_ext_packed_matrix
- firstScaleLane: IntIsOneOf<[0, 1]> → IntIsOneOf<[0, 16]>
- firstScaleByte: IntMaxValue<2> → IntMaxValue<3>
New Operations
- amdgpu.sparse_mfma: Sparse MFMA (smfmac) on gfx942+. Operands: sourceA, sourceB, destC, sparseIdx. Attrs: m, n, k, cbsz, abid.
- amdgpu.scaled_wmma: Scaled WMMA on gfx1250+. Operands: sourceA, sourceB, destC, scaleA, scaleB. Shapes: 16×16×128, 32×16×128.
- ds_barrier_init, ds_barrier_poll_state, ds_async_barrier_arrive, ds_barrier_arrive, ds_barrier_state_phase, ds_barrier_state_pending_count, ds_barrier_state_init_count, ds_barrier_state_phase_parity.
- make_dma_base_gather, make_dma_base_scatter: descriptor-building for gather/scatter.
New Types
- TDMGatherBaseType, TDMScatterBaseType for TDM descriptor results.
ROCDL
New Operations
- cluster.id.x/y/z, wave.id.
- s.nop with count attribute.
- s.get.named.barrier.state, s.wakeup.barrier.
- global.load.tr4.b64, global.load.tr8.b64, global.load.tr6.b96, global.load.tr8.b128 and DS variants.
- cluster.load.async.to.lds.b8/b32/b64/b128.
- global.prefetch, flat.prefetch.
- ds.atomic.barrier.arrive.rtn.b64, ds.atomic.async.barrier.arrive.b64.
- rocdl.tanh, sin, cos, rcp, exp, exp2, log, sqrt, rsq.
MFMA/WMMA Rework
- ROCDL_Mfma_IntrOp with explicit ABType, CDType (no more variadic $args).
- ROCDL_Mfma_Scale_IntrOp for scaled MFMA, ROCDL_Smfmac_IntrOp for sparse MFMA.
- ROCDL_WMMA_IntrOp, ROCDL_WMMA_Opsel_IntrOp, etc. with explicit operands/attrs.
- wmma.scale.f32.16x16x128.f8f6f4, wmma.scale16.f32.16x16x128.f8f6f4.
Barrier API Changes
- BarrierInitOp: $id → $memberCnt.
- ROCDL.barrier deprecation notice; prefer gpu.barrier.
New Type
- ROCDLGlobalBuffer: LLVM_PointerInAddressSpace<1>.
AMDGPUToROCDL / GPUToROCDL
New Lowerings
- SparseMFMAOpLowering for amdgpu.sparse_mfma → ROCDL smfmac (gfx942+).
- ScaledWMMAOpLowering for amdgpu.scaled_wmma (gfx1250+).
- ScaledExtPackedMatrixOpLowering (gfx1250+).
- ds_barrier_* ops.
- make_dma_base_gather/scatter on gfx1250+.
- MemoryCounterWaitOp: New tensor counter and WaitTensorcntOp.
API Changes
- convertMFMAVectorOperand → packSmallFloatVectorOperand.
- castMFMAScaleOperand → castScaleOperand (supports vector<8xi8> → i64).
- gpu.barrier lowering: Pat<GPU_BarrierOp, ROCDL_BarrierOp> replaced by GPUBarrierOpLowering that handles memfence address spaces and chipset-specific behavior.
Shared Infrastructure
- amdgpu::populateCommonGPUTypeAndAttributeConversions() centralizes GPU→ROCDL type/attribute conversions (used by both NVVM and ROCDL paths).
Buffer Descriptor (gfx1250+)
- makeBufferRsrc for new flag layout.
- getNumRecords: with boundsCheck=false, returns (1<<45)-1.
Upstream Patches
1. Missing type converter for ConvertMemrefStore
Affected test: Dialect/MHAL/emulate-narrow-type.mlir
Root cause: The upstream memref::populateMemRefNarrowTypeEmulationPatterns now registers ConvertMemrefStore via patterns.insert<>() without the type converter, while all other patterns use patterns.add<>(typeConverter, ...). As a result, the ConvertMemrefStore conversion pattern has no type converter associated, so operand adaptation may not work correctly.
The error is: failed to legalize operation 'memref.store' that was explicitly marked illegal, reported for memref.store on memref<8xi4>.
Fix: Add the ConvertMemrefStore pattern with the type converter associated.
Diff Files
diff -rup llvm-project/llvm rocMLIR/external/llvm-project/llvm
llvm-diff.txt
diff -rup llvm-project/mlir rocMLIR/external/llvm-project/mlir
mlir-diff.txt
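The ConvertMemrefStore fix from Upstream Patches item 1 should appear in mlir-diff.txt as a small change to how one pattern is registered. The sketch below is a toy model of that root cause, not MLIR code: ToyTypeConverter, ToyPattern, ToyPatternSet and both helper functions are hypothetical names invented for illustration, and the memref<8xi4> to memref<4xi8> mapping is an assumed example of narrow-type emulation. It only shows why a pattern registered without a converter cannot adapt its operand, while the same pattern registered with one can.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type converter; not an MLIR class.
struct ToyTypeConverter {
  // Maps a narrow memref type to an assumed emulated storage type.
  std::optional<std::string> convert(const std::string &type) const {
    if (type == "memref<8xi4>")
      return std::string("memref<4xi8>");
    return std::nullopt;
  }
};

// A pattern remembers which op it matches and, optionally, a converter.
struct ToyPattern {
  std::string opName;
  const ToyTypeConverter *converter; // nullptr models "no converter attached"
};

class ToyPatternSet {
public:
  // Models the shape of patterns.add<P>(typeConverter, ...).
  void addWithConverter(const ToyTypeConverter &tc, std::string opName) {
    patterns.push_back({std::move(opName), &tc});
  }
  // Models the shape of patterns.insert<P>() with no converter.
  void insertWithoutConverter(std::string opName) {
    patterns.push_back({std::move(opName), nullptr});
  }
  // An op legalizes only if a matching pattern can convert its operand type.
  bool legalize(const std::string &opName,
                const std::string &operandType) const {
    for (const ToyPattern &p : patterns) {
      if (p.opName != opName)
        continue;
      if (p.converter && p.converter->convert(operandType).has_value())
        return true;
    }
    return false;
  }

private:
  std::vector<ToyPattern> patterns;
};

// Registration as described in the root cause: no converter attached.
bool storeLegalizesBeforeFix() {
  ToyTypeConverter tc;
  ToyPatternSet set;
  set.insertWithoutConverter("memref.store");
  return set.legalize("memref.store", "memref<8xi4>");
}

// Registration as described in the fix: converter attached.
bool storeLegalizesAfterFix() {
  ToyTypeConverter tc;
  ToyPatternSet set;
  set.addWithConverter(tc, "memref.store");
  return set.legalize("memref.store", "memref<8xi4>");
}
```

In this model storeLegalizesBeforeFix() returns false and storeLegalizesAfterFix() returns true, mirroring the "failed to legalize" error and its resolution described above.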
Test Plan
External Tests
- check-llvm
- check-mlir
CI
Submission Checklist