EDIT: First off, I should say this is after watching ep 32, so maybe this has already changed in the next couple of videos. ENDEDIT.
Seems like you're spending an awful lot of effort detecting when a tile-relative coordinate leaves its tile, "recanonicalizing", etc. This is all caused by choosing to store your positions in a "decomposed" form, which means you basically have to re-implement your own weirdo-radix math routines (e.g. handle the "carry bit" when you move something out of a tile). I'd suggest storing things in a robust "raw" form and then doing the decomposition only when needed. This removes the edge cases from the common case.
It would be much easier to just use two 64-bit ints for x/y, and define 1 in this unit to be 1/1024 of the tile size (or something like that, enough to give you sub-pixel positioning). All the actual logic happens in "world units", and you don't have to handle any special cases anywhere. If you want to move the player, you just add some world units to its location and there are no edge cases.
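Here's a rough sketch of what I mean, not a drop-in for the actual codebase. The names `WorldPos`, `world_pos_move` and `WORLD_UNITS_PER_TILE` are made up for illustration, and 1024 units per tile is just the example value from above.

```c
#include <stdint.h>

/* Assumed scale: 1 world unit = 1/1024 of a tile (hypothetical constant). */
#define WORLD_UNITS_PER_TILE 1024

/* Hypothetical "raw" position: plain 64-bit integers in world units. */
typedef struct {
    int64_t x;
    int64_t y;
} WorldPos;

/* Moving is plain integer addition. There is no per-tile carry to handle
   and nothing to recanonicalize afterwards. */
static WorldPos world_pos_move(WorldPos p, int64_t dx, int64_t dy)
{
    p.x += dx;
    p.y += dy;
    return p;
}
```

Moving the player half a tile to the right is then just `world_pos_move(p, WORLD_UNITS_PER_TILE / 2, 0)`, whether or not that crosses a tile boundary.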
The "former edge cases" are automatically handled when the code that needs to know e.g. the tilemap coordinates does the modulo etc., but the difference is that this code is exceedingly trivial to write/read/understand. E.g. to get the tile coordinates, just shift down 10 bits; to get the tile-relative offset, mask out the lower 10 bits (and maybe scale it into SI units and use floats...). More importantly, it's no longer an "edge case", it's just the normal code you run all the time.
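Something like this, as a sketch only. `tile_index`, `tile_offset`, `tile_offset_meters` and the `tile_size_meters` parameter are hypothetical names, and it assumes `>>` on a negative signed value is an arithmetic shift (true on the usual compilers, but implementation-defined in C).

```c
#include <stdint.h>

#define TILE_BITS      10                          /* 2^10 = 1024 units per tile */
#define UNITS_PER_TILE ((int64_t)1 << TILE_BITS)
#define TILE_MASK      (UNITS_PER_TILE - 1)

/* Tile coordinate: shift down 10 bits.
   Assumes arithmetic right shift for negative values. */
static int64_t tile_index(int64_t world)
{
    return world >> TILE_BITS;
}

/* Tile-relative offset in world units (0..1023): keep the lower 10 bits. */
static int64_t tile_offset(int64_t world)
{
    return world & TILE_MASK;
}

/* Optional: the same offset scaled to meters as a float.
   tile_size_meters is an assumed parameter, not something from the videos. */
static float tile_offset_meters(int64_t world, float tile_size_meters)
{
    return (float)tile_offset(world) * tile_size_meters / (float)UNITS_PER_TILE;
}
```

Stepping from world coordinate 1023 to 1024 needs no special handling: `tile_index` goes from 0 to 1 and `tile_offset` wraps from 1023 to 0 on its own.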