Batch vs. Layer Normalization in RNNs