-2

I have the following builder that builds Tokens:

// token_builder.h
class TokenBuilder
{
    Keyword m_keyword;
    TokenType m_type;
    std::string m_symbol;

public:
    TokenBuilder& set_keyword(Keyword);
    TokenBuilder& set_toktype(TokenType);
    TokenBuilder& set_symbol(std::string);
    std::unique_ptr<Token>&& build();
};
// token_builder.cpp
TokenBuilder& TokenBuilder::set_keyword(Keyword k)
  {
      m_keyword = k;
      return *this;
  }

  TokenBuilder& TokenBuilder::set_toktype(TokenType t)
  {
      m_type = t;
      return *this;
  }

  TokenBuilder& TokenBuilder::set_symbol(std::string sym)
  {
      m_symbol = sym; // This is the point where it segfaults
      return *this;
  }

  std::unique_ptr<Token>&& TokenBuilder::build()
  {
      std::unique_ptr<Token> token;
      token->m_keyword = m_keyword;
      token->m_type = m_type;
      token->m_symbol = m_symbol;
      return std::move(token);
  }

The program halts with SIGSEGV at the point where I am trying to initialize m_symbol by calling TokenBuilder::set_symbol. Here is how main looks like:

int main()
{
    auto token = Token::builder()
            .set_keyword(Keyword::IF)
            .set_toktype(TokenType::ID)
            .set_symbol("if")
            .build();

    std::cout << token->get_symbol() << std::endl;
}

As you can see, call to set_keyword and set_toktype don't result in any errors. But passing the literal "if" (or a std::string object) to set_symbol causes the program to crash. Passing by value is intentional. Stepping into set_symbol and analyzing the state of TokenBuilder doesn't reveal anything. As expected, m_keyword is set to Keyword::IF and m_type to TokenType::ID. This is the line that crashes: m_symbol = sym.

The program above might have other, non-trivial issues (am I using std::move correctly? returning to C++ after a long time) but this kind of feels like a silly mistake a rookie would make. I am using MinGW-w64 compiler on Windows 10 (64-bit).

Here is the entire code for conveniently copy pasting to your machine:

// src/include/token_builder.h
#ifndef TOKEN_BUILDER_H
#define TOKEN_BUILDER_H

#include <memory>
#include "token.h"

namespace funk
{
class Token;

class TokenBuilder
{
    Keyword m_keyword;
    TokenType m_type;
    std::string m_symbol;

public:
    TokenBuilder& set_keyword(Keyword);
    TokenBuilder& set_toktype(TokenType);
    TokenBuilder& set_symbol(std::string);
    std::unique_ptr<Token>&& build();
};

}

#endif
// src/token_builder.cpp
#include "include/token.h"
#include "include/token_builder.h"

namespace funk
{
  TokenBuilder& TokenBuilder::set_keyword(Keyword k)
  {
      m_keyword = k;
      return *this;
  }

  TokenBuilder& TokenBuilder::set_toktype(TokenType t)
  {
      m_type = t;
      return *this;
  }

  TokenBuilder& TokenBuilder::set_symbol(std::string sym)
  {
      m_symbol = sym;
      return *this;
  }

  std::unique_ptr<Token>&& TokenBuilder::build()
  {
      std::unique_ptr<Token> token;

      token->m_is_keyword = m_keyword != Keyword::_NONE_;
      token->m_keyword = m_keyword;
      token->m_type = m_type;
      token->m_symbol = m_symbol;

      return std::move(token);
  }
}
// src/include/token.h
#ifndef TOKEN_H
#define TOKEN_H
#include <memory>
#include "keyword.h"
#include "token_type.h"
#include "token_builder.h"

namespace funk
{
  class TokenBuilder;

  class Token
  {
      friend class TokenBuilder;

      Keyword m_keyword;
      bool m_is_keyword;
      std::string m_symbol;
      TokenType m_type;

      Token();

  public:

      Keyword& get_keyword ();
      bool is_keyword();
      std::string& get_symbol();
      TokenType& get_type();
      static TokenBuilder&& builder();
  };
}

#endif // END TOKEN_H
// src/token.cpp
#include <string>
#include <memory>
#include "include/keyword.h"
#include "include/token_builder.h"
#include "include/token_type.h"

namespace funk
{
  Keyword& Token::get_keyword()
  {
    return m_keyword;
  }

  bool Token::is_keyword()
  {
    return m_is_keyword;
  }

  std::string& Token::get_symbol()
  {
    return m_symbol;
  }

  TokenType& Token::get_type()
  {
    return m_type;
  }

  TokenBuilder&& Token::builder()
  {
    TokenBuilder builder;
    return std::move(builder);
  }
}

// src/main.cpp
#include <iostream>
#include "include/buffered_reader.h"
#include "include/token.h"
#include "include/token_builder.h"

int main()
{
    using namespace funk;

    auto token = Token::builder()
            .set_keyword(Keyword::IF)
            .set_toktype(TokenType::ID)
            .set_symbol("if")
            .build();

    std::cout << token->get_symbol() << std::endl;
}

Note: I looked up for similar questions on StackOverflow as best as I could but didn't find one that matches my case. If you think it is similar to a question you previously encountered, feel free to mark duplicate and post the link to the original question. Thanks!

  • 1
    Run your program under gdb and then run backtrace. It will tell you what line you are segfaulting on. This is way too much code you've provided and it will be better for you to learn how to debug your own programs instead of relying on StackOverflow to debug them for you. – Gillespie Feb 10 '20 at 15:56
  • 2
    `Token::builder()` returns a reference to a local variable. The variable is destroyed and the reference becomes dangling as soon as the function returns. Using that reference by the caller then exhibits undefined behavior. – Igor Tandetnik Feb 10 '20 at 15:57
  • @Gillespie I know which line causes the crash. Read the question. –  Feb 10 '20 at 15:57
  • 1
    @DevashishJaiswal Sorry, it was lost in the wall of code you posted as I scrolled down. I recommend trying to make a https://stackoverflow.com/help/minimal-reproducible-example next time. – Gillespie Feb 10 '20 at 15:58
  • The object referenced by the rvalue reference is dead before the caller has a chance to do anything with it. Remove the `&&` from the return type of `build()` and remove the `move()`: the compiler will elide moving the value entirely anyway. – Dietmar Kühl Feb 10 '20 at 16:02
  • 1
    In `TokenBuilder::build` you never set `token` to refer to a valid `Token`. It's effectively dereferenced whilst still a null pointer. – G.M. Feb 10 '20 at 16:03

1 Answers1

2
  std::unique_ptr<Token>&& TokenBuilder::build()
  {
      std::unique_ptr<Token> token;
      // ...
      return std::move(token);
  }

You return reference to a local variable. At the end of the function, the local variable is destroyed, and the returned reference will be left referring to an object outside of its lifetime. When you later attempt to access the non-existing object through the dangling reference, the behaviour of the program is undefined.

eerorika
  • 181,943
  • 10
  • 144
  • 256
  • I had suspicions that I was wrong about that part. But shouldn't the crash happen somewhere else? I mean, if I call `set_symbol` before calling `set_keyword` and `set_type`, the crash happens before any of `set_type` and `set_keyword` are called. –  Feb 10 '20 at 16:03
  • 1
    Allthough this is true too, this is not the reason for the segfault. There are multiple bugs in the shown code, and the fatal segfault occurs prior to this point. Its reason is much more obvious. – Sam Varshavchik Feb 10 '20 at 16:03
  • @DevashishJaiswal Behaviour of the program is undefined. There is no guarantee about when it crashes or whether it crashes at all. – eerorika Feb 10 '20 at 16:03
  • 1
    @DevashishJaiswal To help find all bugs, I recommend creating a [mcve]. – eerorika Feb 10 '20 at 16:05
  • @DevashishJaiswal `std::move`'ing the return value is also [pessimization](https://stackoverflow.com/a/4986802/12863668), simply [return `std::unique_ptr`](https://en.cppreference.com/w/cpp/language/copy_elision). –  Feb 10 '20 at 16:41