The second is that GC is non-deterministic and can happen at any time, causing potential stalls when rendering a frame. How is Manual Memory Management better for this?
Because you control when and what to do. Even if you need to deallocate something, you decide where in the frame that happens. With GC you don't have such control; the VM runtime decides this for you.
I mean at the end of the day you need to free the memory eventually, right?
Yes, but that can be as cheap as one subtract operation. For example, allocate one "huge" 1GB block for per-frame temporary memory, and at the end of the frame do one subtraction to move the pointer back to the start of the block. That's it. This is what HH does for most allocations: you allocate a big block of memory and then subdivide it into smaller "allocations", which are very cheap, just one add operation each. Deallocation with this kind of scheme is super cheap - it is constant time, regardless of how many "objects" you have allocated. It could be a million objects, and deallocating all of them still costs you ~one assembly instruction.
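To make the scheme concrete, here is a minimal sketch of such a bump allocator in C. The names (Arena, arena_init, arena_push, arena_reset, arena_destroy) are made up for illustration and are not taken from HH's actual code; it also assumes a C11 compiler for _Alignof and rounds each allocation up to the platform's maximum alignment, which costs a couple of extra instructions on top of the plain add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names, for illustration only. */
typedef struct {
    uint8_t *base;  /* start of the one big block */
    size_t   size;  /* total capacity in bytes */
    size_t   used;  /* bump offset: bytes handed out so far */
} Arena;

/* Grab the big block once, up front. Returns 0 on failure. */
static int arena_init(Arena *a, size_t size) {
    assert(a != NULL);
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

/* "Allocate" by bumping the offset. Returns NULL when the block is full. */
static void *arena_push(Arena *a, size_t bytes) {
    size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) & ~(align - 1);  /* round up */
    if (start > a->size || bytes > a->size - start) return NULL;
    a->used = start + bytes;
    return a->base + start;
}

/* Free everything at once: constant time, no matter how many objects. */
static void arena_reset(Arena *a) {
    a->used = 0;
}

/* Return the big block to the OS allocator when the program is done. */
static void arena_destroy(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

In a game loop you would call arena_init once at startup, arena_push as often as you like during the frame, and arena_reset exactly once at the end of the frame, at a point you choose. Nothing ever walks the individual objects.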
But with GC you need to walk the whole object graph to figure out which objects are alive and which are not. And it gets worse - there's no way to do that in a controlled way once per frame. The VM may decide to defer this work until allocated memory crosses some critical threshold, and then you pay for several frames' worth of garbage at once instead of a little bit every frame. Let's say your GC takes ~4 msec to collect all the "garbage" objects that were allocated during one frame. That's reasonable and manageable, even for 60Hz rendering where you have about 16 msec to render one frame. But what if the VM decides to do that only every 4 or 5 frames? Then suddenly you have a 16 or 20 msec delay where everything stops and only the GC runs. That is one whole frame lost. You cannot do 60Hz rendering anymore. And it's even worse for VR, where 90Hz is required for smooth rendering.
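The arithmetic behind those numbers can be sketched in a few lines of C. This assumes the same simplified model as the example above: collection cost grows linearly with the number of frames of garbage that have piled up. Real collectors do not behave this neatly, and the function names are made up for illustration.

```c
/* Time available to produce one frame at a given refresh rate. */
static double frame_budget_ms(double hz) {
    return 1000.0 / hz;
}

/* Pause length if the GC lets `frames` frames of garbage pile up,
 * assuming each frame's garbage costs `per_frame_gc_ms` to collect. */
static double deferred_pause_ms(double per_frame_gc_ms, int frames) {
    return per_frame_gc_ms * (double)frames;
}

/* Time left for actual game work in the frame that takes the pause.
 * Zero or negative means the frame is missed. */
static double time_left_ms(double hz, double per_frame_gc_ms, int frames) {
    return frame_budget_ms(hz) - deferred_pause_ms(per_frame_gc_ms, frames);
}
```

With 4 msec of GC per frame at 60Hz, collecting every frame leaves roughly 12 msec for the game. Collecting every 4 frames leaves well under 1 msec in the unlucky frame, and collecting every 5 frames leaves a negative amount, so that frame is dropped. At 90Hz the budget is only about 11 msec, so even the 4-frame case overshoots.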
Yeah, sure, there are more modern GC algorithms that run in parallel and on background threads. That helps a bit, but it still does not solve the delay issue 100%. Practically all of them still need to stop all the threads in the process at some point to properly clean up the "garbage". And you don't want such a delay.
I mean when an object has no more references, it seems like you would want to free it, so I don't really see the problem.
The problem is what "free it" means. With GC there is no explicit "free" operation. You just "forget" about the object - think of it like assigning null to the reference. The GC will take care of actually "freeing" it only later, at a time you don't control. That's the problem.