[Relax][Frontend][TFLite] Add LSTM and SVDF converter#19633
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces support for the TFLite LSTM (single-step) and SVDF (single-step) operators in the TVM Relax frontend, along with corresponding unit tests. The review feedback highlights several critical issues: both LSTM and SVDF are stateful operators, but their state updates are currently not being propagated back to the expression table, which will cause incorrect behavior across multiple steps. Additionally, using to_int_list on the SVDF batch dimension will crash on dynamic shapes, assertions should be replaced with proper TVM exceptions to prevent silent failures when optimization flags are enabled, and a minor simplification can be made when creating float32 constants.
|
@tvm-bot rerun |
tlopex
left a comment
There was a problem hiding this comment.
Thanks for working on these converters. I think there are still a few semantic issues that should be fixed before this lands.
- The LSTM activation is applied in the wrong places.
TFLite uses LSTMOptions.FusedActivationFunction() as the internal cell activation: it is applied to the cell update gate and again to c_new before multiplying by the output gate. It is not a post-fused activation on o * tanh(c_new).
The current converter always uses tanh for the cell candidate and tanh(c_new), then applies convert_fused_activation_function to the final hidden state. This gives incorrect results for both NONE and TANH: for example, with NONE, TFLite would use a linear cell candidate/output activation, but this converter still applies tanh.
- Unsupported LSTM modes are silently ignored.
The docstring says peephole, projection, and layer norm are unsupported, but the converter only rejects the non-CIFG input gate case. If tensors [9:12], [16:18], or [20:24] are present, the converter currently ignores them and produces incorrect IR instead of raising OpNotImplemented.
- The SVDF shape/modeling looks inconsistent with TFLite.
In TFLite SVDF, rank is not the memory size. The output shape is [batch, num_units], where num_units = num_filters / rank; weights_time has shape [num_filters, memory_size]; and the state has shape [batch, memory_size * num_filters].
The current converter treats rank as the time/state dimension, reshapes state to [batch, num_filters, rank], and returns [batch, num_filters]. It also computes feat + time_output, but TFLite updates the state by shifting it, writes the feature projection into the latest state slot, then applies time weights and reduces over rank groups. The feature projection is not directly added to the output.
The current tests only verify parameter/output shapes, so they do not catch these numerical/semantic issues. I think we need numeric tests against TFLite or structural tests that check the actual equations, especially for activation=NONE, activation=TANH, and SVDF with rank > 1 and memory_size != rank.
|
Thanks for the detailed review. I updated both converters to address the semantic issues you pointed out.
I also strengthened the tests with LSTM NONE/TANH structural checks, an unsupported-mode rejection test, an SVDF shape/modeling test with rank > 1 and memory_size != rank, and a shared-state SVDF regression test covering two consecutive ops. |
6ff25aa to
a0d898f
Compare
Summary
Add LSTM (coupled input-forget) and SVDF single-step converters to the TFLite frontend. Both are float32-only; quantized variants are not supported yet.
From #19519.
Changes
Testing
test_lstm_none_activation— verifies LSTM converter produces correct IR shapes (batch, input_size) → (batch, num_units) with 3 params (input, h_state, c_state)test_svdf_none_activation— verifies SVDF converter produces correct IR shapes (batch, input_size) → (batch, num_filters) with 2 params (input, state)python -m pytest tests/python/relax/test_frontend_tflite.py -k "lstm or svdf" -vReferences