@pinkforest well yeah, of course you need to address constraints when designing a "softcore" (I guess this is a term?) :-)
First, you need some register space for bookkeeping (like one or two registers at least), and it unfortunately shares space with the softcore's registers.
So the problem at hand is like:
1. Reduce the softcore's register space, i.e., it will have fewer registers than the host CPU.
2. "Share-and-swap" registers with the softcore for the colliding part.
This of course assumes that the ISA is similar enough to x86-64 and aarch64, but that is quite a fair constraint to set from the get-go for obvious reasons :-)
Anyhow, it will be quite a simple and sound translation with no artificial stack machines blocking the way :-)
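
To make the two options a bit more concrete, here is a rough sketch in Python. It is only a toy model, not an actual translator: the names `reduced_map` and `translate_access`, the `h<N>` host register names, the `spill[...]` slots, and the pseudo-ops are all made up for illustration, and I am assuming the bookkeeping registers sit at the top of the host register file.

```python
from __future__ import annotations


def reduced_map(host_regs: int, reserved: int) -> dict[int, int]:
    """Option 1: the softcore simply gets fewer registers than the host.

    Every softcore register maps 1:1 onto a host register; the top
    `reserved` host registers are kept for bookkeeping (assumption).
    """
    usable = host_regs - reserved
    if usable <= 0:
        raise ValueError("no host registers left for the softcore")
    return {soft: soft for soft in range(usable)}


def translate_access(soft_reg: int, host_regs: int, reserved: int) -> list[str]:
    """Option 2: share-and-swap for the colliding registers.

    The softcore keeps as many registers as the host. Accesses to a
    non-colliding register are used directly. Accesses to a colliding
    one are wrapped in swaps against a hypothetical spill slot, so the
    bookkeeping value is parked and then restored afterwards.
    """
    if not 0 <= soft_reg < host_regs:
        raise ValueError("softcore register out of range")
    first_colliding = host_regs - reserved
    if soft_reg < first_colliding:
        return [f"use h{soft_reg}"]
    slot = soft_reg - first_colliding
    return [
        f"swap h{soft_reg}, spill[{slot}]",  # park bookkeeping, bring in softcore value
        f"use h{soft_reg}",
        f"swap h{soft_reg}, spill[{slot}]",  # restore bookkeeping value
    ]
```

With 16 host registers and 2 reserved, option 1 leaves the softcore 14 registers, while option 2 keeps all 16 and only pays the swap cost when one of the two colliding registers is touched.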