Load Balancing Strategies for Multimodal AI Models