QoS Load Balancing for Edge AI Applications