Latency Profiling for Large Language Models