
Missing post-activation layer for GRU & LSTM when parsing JSON #124

christoph-hart opened this issue Dec 30, 2023 · 3 comments

@christoph-hart
Hi there,

I'm currently toying around with the library, and I noticed that the JSON parser for the TensorFlow model does not add an activation layer after the GRU / LSTM layers:

model->addLayer(gru.release());

I compared the layout of the model from your RTNeuralExample repository with the JSON and noticed this inconsistency. The TensorFlow JSON does list an activation function, as you can see here:

"type": "gru",
"activation": "tanh",
"shape": [ null, null, 8 ],

I'm just getting started with ML, so this might be a silly question, but is there a reason the activation layers are omitted by the JSON parser for the GRU and LSTM layers?

@janaboy74 commented Dec 30, 2023

I'm working on it. The parser is more or less fixed, but the output formatter is still wrong.

@jatinchowdhury18 (Owner)

Hello!

So the root of the problem here is a compatibility mismatch between TensorFlow and RTNeural.

In TensorFlow (and I think PyTorch as well), the GRU and LSTM layers have their own internal activation functions. RTNeural uses the default tanh activation for these layers, built directly into the layer implementations.
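
For example, here is a minimal Keras sketch (the model shape is illustrative, assuming TF 2.x) showing that the GRU's activation argument configures its internal activation rather than appending a separate activation layer:

import tensorflow as tf

# The GRU's "activation" argument is the layer's internal activation
# (tanh by default), not a separate layer appended after the GRU.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, 1)),            # (time steps, features)
    tf.keras.layers.GRU(8, return_sequences=True),
    tf.keras.layers.Dense(1),
])
print(model.layers[0].activation)  # <function tanh at 0x...>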

The JSON files are typically generated from TensorFlow's representation of a model. It works something like this:

for layer in model.layers:
    # Record the layer's activation function in its JSON entry.
    layer_dict["activation"] = layer.activation

This way, when you define a TensorFlow layer with an activation function, that activation becomes part of the JSON file. However, for TensorFlow's GRU and LSTM implementations, layer.activation returns the internal "tanh" activation. Since the RTNeural implementations of these layers already build in that activation, and we don't want to apply it twice, we ignore the "activation" JSON field for those layer types. The full Python script can be found here.
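
To illustrate, here is a hedged sketch of that kind of export loop; the function name, file layout, and top-level "layers" key are illustrative, not RTNeural's actual script:

import json

def export_model(model, outfile):
    """Illustrative only: RTNeural's real export script differs in detail."""
    layers = []
    for layer in model.layers:
        layer_dict = {
            "type": layer.__class__.__name__.lower(),  # e.g. "gru", "dense"
            "shape": list(layer.output_shape),         # e.g. [None, None, 8]
        }
        if hasattr(layer, "activation"):
            # For GRU/LSTM this is the internal activation ("tanh"),
            # which RTNeural's parser deliberately skips for those layers.
            layer_dict["activation"] = layer.activation.__name__
        layers.append(layer_dict)
    with open(outfile, "w") as f:
        json.dump({"layers": layers}, f, indent=4)

Note that json.dump writes Python None as JSON null, which matches the null entries in the shape arrays above.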

If you're manually writing or generating your own JSON file and would like an additional activation applied after your GRU or LSTM layer, you can add another layer entry to your JSON file that looks something like:

{
    "type": "activation",
    "activation": "tanh",
    "shape": [ null, null, 8 ]
}
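
Continuing the hedged sketch above, one way to patch an already-exported file is shown below; the file name and the top-level "layers" key are assumptions about the file layout:

import json

with open("model.json") as f:          # illustrative file name
    model_json = json.load(f)

# Find the GRU entry and insert a standalone activation layer after it.
gru_index = next(i for i, layer in enumerate(model_json["layers"])
                 if layer["type"] == "gru")
model_json["layers"].insert(gru_index + 1, {
    "type": "activation",
    "activation": "tanh",
    "shape": [None, None, 8],          # match the GRU's output shape
})

with open("model.json", "w") as f:
    json.dump(model_json, f, indent=4)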

Hope this helps! I'm also curious about the fixes @janaboy74 is making to the parser.

@janaboy74

I think I have fixed the JSON parser: it now uses recursion, and I hope it works correctly now.
