Nested heredocs are not parsed correctly #148

Open
aibaars opened this issue Dec 16, 2020 · 4 comments

@aibaars
Contributor

aibaars commented Dec 16, 2020

puts <<HERE
  hello #{<<HERE}
  world
HERE
HERE

In the following parse tree, the ranges of the heredoc bodies are wrong:

program [0, 0] - [6, 0])
  method_call [0, 0] - [0, 11])
    method: identifier [0, 0] - [0, 4])
    arguments: argument_list [0, 5] - [0, 11])
      heredoc_beginning [0, 5] - [0, 11])
  heredoc_body [0, 11] - [3, 4])
    interpolation [1, 8] - [1, 17])
      heredoc_beginning [1, 10] - [1, 16])
    heredoc_end [3, 0] - [3, 4])
  heredoc_body [3, 4] - [4, 4])
    heredoc_end [4, 0] - [4, 4])
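
For comparison, here is a minimal sketch of how Ruby itself appears to treat the snippet above, going by the interpreter's lexing of heredocs: the inner heredoc's body is the line after the interpolation, the first HERE closes the inner heredoc, and the second HERE closes the outer one. The variable name `text` is only for illustration; the string is captured rather than printed so it can be inspected.

```ruby
# Same nesting as the snippet above, assigned to a variable instead of printed.
text = <<HERE
  hello #{<<HERE}
  world
HERE
HERE

# The inner heredoc contributes "  world\n"; the rest of the interpolation
# line contributes a trailing "\n" to the outer heredoc.
p text
```

If that reading is right, the outer heredoc_body should extend to the second HERE (row 4) and the inner body should cover rows 2 to 3, rather than the first heredoc_body ending at row 3 as in the tree above.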

@aegarbutt-stripe

I've run into this issue when trying to run semgrep on a Ruby file with nested heredocs (semgrep/semgrep#3151).

When I paste the following into https://tree-sitter.github.io/tree-sitter/playground:

output =
  <<~ABC
    Top
    #{
      <<~DEF
        Middle
      DEF
    }
    Bottom
  ABC

puts output

I get the following output:

program [0, 0] - [13, 0])
  assignment [0, 0] - [1, 8])
    left: identifier [0, 0] - [0, 6])
    right: heredoc_beginning [1, 2] - [1, 8])
  heredoc_body [1, 8] - [9, 5])
    heredoc_content [1, 8] - [3, 4])
    interpolation [3, 4] - [7, 5])
      heredoc_beginning [4, 6] - [4, 12])
      constant [5, 8] - [5, 14])
      constant [6, 6] - [6, 9])  <-- This is the closing DEF of the heredoc string
    heredoc_content [7, 5] - [9, 2])
    heredoc_end [9, 2] - [9, 5])
  heredoc_body [9, 5] - [13, 0])
    heredoc_content [9, 5] - [13, 0])
    heredoc_end [13, 0] - [13, 0])

@dumblob

dumblob commented Oct 1, 2023

I have the same issue with Bourne shell. Nested heredocs are parsed incorrectly and therefore highlighted incorrectly, e.g. by Neovim.

Is there any news on the state of heredocs in tree-sitter?

@aibaars
Contributor Author

aibaars commented Oct 2, 2023

@dumblob heredocs are not a tree-sitter feature. Support for special lexical constructs such as heredocs is implemented by each grammar's scanner.cc. I guess the scanner used by tree-sitter-bash has a bug similar to the one in tree-sitter-ruby.
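
To illustrate the kind of bookkeeping a scanner needs for nested heredocs, here is a toy model in Ruby. It is not how scanner.cc is written; it is only a sketch under simplifying assumptions: identifiers are upper-case, there is at most one heredoc marker per line, and any marker seen while a body is open sits inside an interpolation. The names `HEREDOC_START` and `heredoc_ranges` are hypothetical.

```ruby
# Toy model only: this is NOT how scanner.cc is written.
# It tracks open heredocs on a stack and reports, for each one,
# [identifier, first_body_row, terminator_row] (rows are 0-based).
HEREDOC_START = /<<[~-]?([A-Z_]+)/

def heredoc_ranges(source)
  open_heredocs = [] # innermost heredoc is last
  ranges = []

  source.each_line.with_index do |line, row|
    innermost = open_heredocs.last

    # A line consisting only of the identifier closes the innermost heredoc.
    # Simplification: leading whitespace is always allowed, as with <<~ and <<-.
    if innermost && line.strip == innermost[0]
      ranges << [innermost[0], innermost[1], row]
      open_heredocs.pop
      next
    end

    # Assumption: a marker seen while a body is open sits inside #{...},
    # and there is at most one marker per line.
    identifier = line[HEREDOC_START, 1]
    open_heredocs.push([identifier, row + 1]) if identifier
  end

  ranges
end

# The snippet from the first comment, quoted so nothing is interpolated here.
sample = <<'RUBY'
puts <<HERE
  hello #{<<HERE}
  world
HERE
HERE
RUBY

p heredoc_ranges(sample)
# prints [["HERE", 2, 3], ["HERE", 1, 4]]
```

In this model the inner heredoc is closed by the first HERE (row 3) and the outer one by the second (row 4), whereas the tree in the first comment ends the first heredoc_body at row 3. A stack is used because the innermost open heredoc must be closed first; several heredocs started on one line would need a queue instead, which this sketch does not handle.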

@AlexWayfer

My example (here is the original: pulsar-edit/pulsar#881 (comment)):

# frozen_string_literal: true

some_hosts = ['192.168.1.56', '10.0.3.2']

some_config = <<~CONFIG
  #{some_hosts.map do |some_host|
    <<~SSH_HOST
      ---
      kind: SSHHost
      host: #{some_host}
    SSH_HOST
  end.join}
CONFIG

def log(data)
  puts data
end

log some_config
