Missing post-activation layer for GRU & LSTM when parsing JSON #124
Comments
I'm working on it. The parser is more or less fixed, but the output formatter is still wrong.
Hello! So the "root" of the problem here is a bit of a compatibility problem between TensorFlow and RTNeural. In TensorFlow (and I think PyTorch as well), the GRU and LSTM layers have their own "internal" activation functions. In RTNeural, we use the "default" internal activations for those layers as well.

The JSON files are typically generated from TensorFlow's representation of a model. It works something like this:

```python
# Copy each layer's activation function into that layer's JSON description
for layer in model.layers:
    layer_dict["activation"] = layer.activation
```

This way, when you define a TensorFlow layer with an activation function, the activation function will be part of the JSON file. However, for TensorFlow's GRU and LSTM implementations, the "activation" field refers to the layer's internal activation function rather than a separate activation applied to the output, so the parser does not add a post-activation layer for those layer types.

If you're manually writing/generating your own JSON file, and would like to have an additional activation applied after your GRU or LSTM layer, you could add another layer to your JSON file that looks something like:

```json
{
"type": "gru",
"activation": "tanh",
"shape": [ null, null, 8 ],
} Hope that this is helpful! I'm also curious about the fixes @janaboy74 is making for the parser? |
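As a concrete illustration of that workaround, here is a minimal Python sketch that post-processes an exported model file, appending a standalone activation layer after each GRU/LSTM layer. The top-level `"layers"` list and the `"activation"` layer type are assumptions based on the discussion above, not a confirmed schema:

```python
import json

def append_post_activation(path, activation="tanh"):
    """Insert a standalone activation layer after each GRU/LSTM layer.

    Assumes an RTNeural-style model file whose top-level "layers" list
    holds dicts with "type" and "shape" keys (an assumption, not a
    documented schema).
    """
    with open(path) as f:
        model = json.load(f)

    patched = []
    for layer in model["layers"]:
        patched.append(layer)
        if layer["type"] in ("gru", "lstm"):
            patched.append({
                "type": "activation",     # assumed standalone activation layer type
                "activation": activation,
                "shape": layer["shape"],  # same output shape as the recurrent layer
            })
    model["layers"] = patched

    with open(path, "w") as f:
        json.dump(model, f, indent=4)
```

For example, `append_post_activation("model.json")` would patch a file in place, leaving all other layers untouched.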
I think I have fixed the JSON parser:
Hi there,

I'm currently toying around with the library, and I noticed that the JSON parser for the TensorFlow model does not add an activation layer after the GRU / LSTM layers:

RTNeural/RTNeural/model_loader.h, line 687 (commit 0485da9)

I've compared the model layout of the model from your RTNeuralExample repository with the JSON and noticed that inconsistency. The TensorFlow JSON does list an activation function, as you can see here:

I'm just starting out with the entire ML stuff, so it might be a silly question, but is there a reason for the activation layers to be omitted from the JSON parser for the GRU and LSTM layers?
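For what it's worth, TensorFlow reports these activations as properties of the recurrent layer itself rather than as separate layers, which is presumably why they show up in the exported JSON. A quick sketch to confirm:

```python
import tensorflow as tf

gru = tf.keras.layers.GRU(8)
config = gru.get_config()

# Both activations are internal to the GRU cell; neither is a
# separate layer applied after the GRU's output.
print(config["activation"])            # "tanh"
print(config["recurrent_activation"])  # "sigmoid"
```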