As I've shown them, la and o collect to the right, so at the moment they run as exceptions to the rules. That wouldn't be a problem as long as the exception is consistent. That's simple enough with la, but o is a bit harder since it can affect subjects or verbs. I think I'd have to choose either to group it with the subject when it is an address or to group it with the verb when it is a command. Whichever is left would have to work as an independent glyph block. I've tried grouping it with the subject, so in 'jan Ekitu o moku e moku ni. o moku e telo ni' o groups with the subject in the first sentence (an address/command) and sits independently in the second.
I have thought about this but am not good enough with programming to be able to do it well. Making vector versions of the glyphs would be easy enough though, so I could make them available for people to try moving around in their favorite vector-based application. That would be the best way to start tackling all the layering and sizing problems you've started to describe. The obvious endpoint would be when you could type in a toki pona sentence and have it spit out as glyphs. Easier said than done.
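For anyone who wants to try the programming side, here is a rough Python sketch of just the o rule described above: o joins the subject when there is one (an address), and otherwise stands as its own block (a bare command). The function names `group_sentence` and `group_text` are made up for illustration, and the sketch makes some big assumptions: sentences are split on full stops, only the first o in a sentence is handled, and everything after o is left as one flat block rather than being laid out properly. It says nothing about la, layering, or sizing.

```python
def group_sentence(sentence):
    """Split one toki pona sentence into glyph blocks (lists of words).

    Only the 'o' rule is handled; this is an illustrative sketch,
    not a full layout engine.
    """
    words = sentence.split()
    if "o" not in words:
        # No 'o' at all: leave the sentence as a single flat block.
        return [words]

    i = words.index("o")  # assumption: only the first 'o' matters
    subject, rest = words[:i], words[i + 1:]

    blocks = []
    if subject:
        # Address: 'o' collects to the right of the subject it follows.
        blocks.append(subject + ["o"])
    else:
        # Bare command: 'o' sits as an independent glyph block.
        blocks.append(["o"])
    if rest:
        # Placeholder: the verb phrase is not grouped any further here.
        blocks.append(rest)
    return blocks


def group_text(text):
    """Apply group_sentence to each full-stop-separated sentence."""
    return [group_sentence(s) for s in text.split(".") if s.strip()]


if __name__ == "__main__":
    for blocks in group_text("jan Ekitu o moku e moku ni. o moku e telo ni"):
        print(blocks)
```

Run on the example sentences, the first sentence gives a 'jan Ekitu o' block followed by the rest, and the second gives a lone 'o' block followed by the rest. Choosing the other option (grouping o with the verb on commands) would only mean changing the two branches in `group_sentence`.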