
Large GRU Segmentation Fault #157

Open
MattyAB opened this issue Nov 21, 2024 · 3 comments

MattyAB commented Nov 21, 2024

A GRU larger than a certain size causes a segmentation fault. This is not specific to any one backend; the crash has been reproduced with all backends.

Minimal Replication

#include "RTNeural/RTNeural/RTNeural.h"
#include "RTNeural/tests/functional/load_csv.hpp"
#include <filesystem>
#include <iostream>

namespace fs = std::filesystem;

constexpr int vocab_size = 27;
constexpr int hidden_size = 512;

using ModelType = RTNeural::ModelT<float, vocab_size, vocab_size,
    RTNeural::DenseT<float, vocab_size, hidden_size>,
    RTNeural::GRULayerT<float, hidden_size, hidden_size>,
    RTNeural::GRULayerT<float, hidden_size, hidden_size>,
    RTNeural::DenseT<float, hidden_size, vocab_size>>;

int main([[maybe_unused]] int argc, [[maybe_unused]] char* argv[])
{
    ModelType model; // segfaults here: the whole model is allocated on the stack

    return 0;
}

Build Environment

MacBook Pro with M2 Pro processor. CMakeLists.txt is as follows:

cmake_minimum_required(VERSION 3.10)
project(GenerativeGRU VERSION 1.0 LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(GenerativeGRU main.cpp)

set(RTNEURAL_STL ON CACHE BOOL "Use RTNeural with this backend" FORCE)
add_subdirectory(RTNeural)
target_link_libraries(GenerativeGRU LINK_PUBLIC RTNeural)
@jatinchowdhury18 (Owner)

Thanks for the report! Do you have any more information about the root cause of the seg fault? My guess is that it's just a stack overflow, since the model might be too large to be allocated on the stack.


MattyAB commented Nov 21, 2024

Hi! Yes, I think it's because the weights are stored on the stack. How practical would it be to reorganize the code so that layer weights are stored on the heap, using a std::vector or similar?


jatinchowdhury18 commented Nov 21, 2024

It would be possible to store the layer weights on the heap, but I'd rather not do that in the "compile-time" implementations of the layers, for performance reasons.

I would suggest trying one of two options:

  • Using the "run-time" API rather than the compile-time API. With the run-time API, the weights are stored on the heap.
  • Store the entire model on the heap, e.g. auto model = std::make_unique<ModelType>();.
