Hybrid Morello varargs is inefficient and not completely foolproof
As discussed on Slack, the current implementation of hybrid Morello varargs is likely not what's desired.
The current implementation spills all capability registers below the normal GPR spill area. The va_arg lowering code, which counts registers using its va_list struct, relies on the GPCR corresponding to a GPR being at a fixed offset from that GPR on the stack.
This breaks in some edge cases when mixing plain AArch64 and hybrid Morello code, specifically when a plain AArch64 trampoline calls va_start and passes the va_list to a hybrid function. In that case the plain AArch64 prologue presumably never spilled the capability registers, so the hybrid va_arg code would find nothing valid at the fixed offset it expects. You could imagine this happening with complex frameworks where the framework has not been recompiled but your application has. So this is not a 100% ABI-compatible change, even if it's unlikely to really matter in practice.
Secondly, and probably more importantly, this adds a bunch of overhead to every varargs function in hybrid that isn't present in plain AArch64. With our hybrid ABIs (well, the RISC-V one, given that MIPS's split register file makes this more difficult) we have tried very hard not to perturb code generation in any way when you're not using CHERI features, with about the only exception being the inability to inline memcpy in certain annoying cases. We often use the hybrid ABI as our baseline for CHERI comparisons for that reason, but also so that the baseline uses the CHERI-aware memcpy, which eliminates any performance wins for CHERI code that come purely from a larger memcpy block size.
Our approach for handling capabilities passed to hybrid varargs functions is to just always pass them indirectly. This is done in the Clang lowering code so that it's explicit in the IR, rather than having to mess with the backend and ensure agreement between it and Clang (which would be a legitimate approach, and perhaps the better one, but isn't really necessary). This means that if you don't have capabilities passed to hybrid varargs functions then code generation is identical to before. I would suggest that Morello also adopt this approach.
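To illustrate what "pass them indirectly" means at the source level, here is a rough sketch in plain C. It is only a model, not what Clang actually emits: `fake_cap_t` is a hypothetical 16-byte stand-in for a capability (real hybrid code would use a `__capability`-qualified pointer), and `sum_addresses` and `demo` are made-up names. The caller copies each value to a temporary and passes the temporary's address, so the variadic slots only ever contain ordinary pointers and the callee's va_arg handling needs no capability spill area.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical 16-byte stand-in for a capability (assumption for this
 * sketch); real hybrid code would use a __capability-qualified pointer. */
typedef struct {
    uint64_t address;
    uint64_t metadata;
} fake_cap_t;

/* Callee: each variadic slot holds an ordinary pointer to a
 * capability-sized temporary, so va_arg only reads pointer-sized values
 * from the normal GPR save area or the stack. */
static uint64_t sum_addresses(int count, ...)
{
    assert(count >= 0);

    va_list ap;
    uint64_t total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* Fetch the pointer, then load the "capability" through it. */
        const fake_cap_t *slot = va_arg(ap, const fake_cap_t *);
        total += slot->address;
    }
    va_end(ap);
    return total;
}

/* Caller: the values live in temporaries and only their addresses are
 * passed. In the approach described above the compiler would insert this
 * indirection itself; here it is written out by hand. */
static uint64_t demo(void)
{
    const fake_cap_t a = { 0x1000, 0 };
    const fake_cap_t b = { 0x0234, 0 };
    return sum_addresses(2, &a, &b);
}
```

Since nothing capability-sized ever sits in a variadic slot, a varargs function that is never handed capabilities looks exactly like its plain AArch64 equivalent, which is the property argued for above.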