Does anyone see any flaws?
Not in the mechanic, but in the effect on gameplay. What you suggest would work well as a world simulator, but as a game it makes things too complicated for the player in my opinion.
For example, when deciding whether to attack an adjacent monster, I want to know:
a) how much time until the monster takes its next turn (i.e. can attack me)
b) how much time will my attack take
If b < a then I can hit the monster and step away before it can hit me. If not, the monster will hit me after I hit it. That makes a huge difference. With simple RL timing this is easy: nearly always b = a, so I don't even really need to ask the question. Sometimes a and b are in some simple ratio like 1:2 or 2:1 to simulate fast/slow creatures relative to the player, but again it's fairly simple to think through.
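As a rough sketch of that check in Python (the function name is made up for illustration, and it assumes an "act now, pay later" model where my step away happens the moment my next turn comes up):

```python
def can_hit_and_retreat(a: float, b: float) -> bool:
    """a: time until the monster's next turn; b: time my attack takes.

    Hypothetical helper, not taken from any real game.
    """
    return b < a


# Simple RL timing: everything costs one turn, so b == a and there is no free hit.
print(can_hit_and_retreat(a=1.0, b=1.0))  # False
# Monster at half my speed (1:2 ratio): I can hit and step away in time.
print(can_hit_and_retreat(a=2.0, b=1.0))  # True
```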
But with complex RL timing as you describe, a and b are not only different for each monster/player combo, they also change every turn, because different entities drift out of sync with each other. So either you somehow visualize the numbers for a) and b) to the player (a UI nightmare), or you don't and the player has to divine the ratios themselves (for optimal strategy).
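To illustrate the drift, here is a small Python sketch. It is only a toy under my own assumptions: times are integer ticks, the monster acts at every multiple of its period, and the player acts at every multiple of theirs. The function names are hypothetical.

```python
def time_until_monster_acts(now: int, monster_period: int) -> int:
    """Ticks from `now` until the monster's next turn (assumes it acts at multiples of its period)."""
    return monster_period - (now % monster_period)


def gaps_seen_by_player(player_period: int, monster_period: int, turns: int) -> list[int]:
    """The value of a) at the start of each of the player's turns."""
    return [
        time_until_monster_acts(turn * player_period, monster_period)
        for turn in range(turns)
    ]


# Player acts every 10 ticks, monster every 7: a) is different on every turn.
print(gaps_seen_by_player(10, 7, 7))   # [7, 4, 1, 5, 2, 6, 3]
# Simple RL timing, equal periods: a) never changes.
print(gaps_seen_by_player(10, 10, 4))  # [10, 10, 10, 10]
```

With periods of 10 and 7 the player sees a different gap on each of seven turns in a row, which is exactly the number they would have to track by hand.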
This includes the player having to think "how long do I have to wait until b < a so I can kite this monster?" and then performing dummy actions to make that so. Unless of course the game offers a variable wait command (I want to wait for 0.157s!), which is a WTF in itself. With different attack and movement actions of various speeds (player and monster), the calculation required to figure out the optimal action explodes in complexity.
All this calculation is fairly boring IMO, and even if the player does ignore it, it will leave a defeated player thinking "hmm, I lost because I didn't spend time on the calculations" and possibly "I can't win without spending time on calculations. I don't want to spend time on calculations. I'll stop playing this game."
Another problem I found with variable action timings is combat vs movement. Imagine a lumbering monster with a very slow move speed of 2s but a fast attack speed of 0.2s. The player is twice as fast, so movement takes 1s and an attack 0.1s. The monster steps towards the player. Now the player gets 20 attacks (2s / 0.1s) before the monster can attack back??? The problem is that the monster's actions are frontloaded, or as you put it, "act now, pay later". It performs some long move and then can't react again until the cooldown ends, even if in reality it has been "interrupted".
Okay, you could count an attack as an "interruption" and allow the monster to make a reaction attack, but that is just papering over the real problem IMO, which is that the turn/tile thing of roguelikes is a discrete abstraction of time and space, with time still relative to the player. Trying to get variable action times to work on top of that is like trying to fit a square peg into a round hole (which is only possible in R'lyeh). The only way variable actions work somewhat with turn-based play is the (old) X-COM style of time units. Just my opinion though, I could be wrong, but this is the impression I got after spending time making a turn system similar to what you describe, only to find I didn't like how it played.
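The lumbering-monster case can be reproduced with a tiny "act now, pay later" scheduler. This Python sketch is my own toy, not the system from the original post: times are in milliseconds, the constant and function names are hypothetical, and I assume the monster wins ties.

```python
import heapq

MONSTER_MOVE_MS = 2000   # lumbering monster: 2s per step
PLAYER_ATTACK_MS = 100   # player attack: 0.1s


def free_attacks_after_monster_step() -> int:
    """Count player attacks made before the monster gets its next turn."""
    # Entries are (time of next turn, tie-break, actor).
    # The monster has just stepped adjacent at t=0 and now pays for the move.
    queue = [(MONSTER_MOVE_MS, 0, "monster"), (0, 1, "player")]
    heapq.heapify(queue)
    attacks = 0
    while True:
        now, tie, actor = heapq.heappop(queue)
        if actor == "monster":
            return attacks  # the monster can finally respond
        attacks += 1  # act now: the attack lands immediately...
        # ...pay later: the player's next turn is pushed back by the attack cost.
        heapq.heappush(queue, (now + PLAYER_ATTACK_MS, tie, actor))


print(free_attacks_after_monster_step())  # 20
```

An "interruption" rule would amount to rescheduling the monster's queue entry whenever it gets hit, which is the patch described above.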
So now my opinion has gone back to simple RL timing being better. In fact I wonder why I ever wanted complex RL timing. It's like having overly complex combat mechanics. Sometimes simpler is better.