In a C++ string, why does accessing the position after the last character behave differently with operator[] and at()?

Question

string ss("test");
cout << ss[ss.size()] ;

The output is a null character and the program terminates normally, but when I run the same program with at()

string ss("test");
cout << ss.at(ss.size()) ;

it throws an exception. My question is: shouldn't both behave the same, so that either both (access by index and at()) terminate abnormally or both exit normally?

`at` specifically has error handling for ensuring the index is in range, whereas `operator[]` does not. Note you are likely using `sizeof` wrong, why not test with values such as `0`, `3` and `4` where it might become apparent — Tas, Jul 02 '19 at 01:27
Why do you think `sizeof` on a `std::string` has anything to do with how many characters are held in it? — Shawn, Jul 02 '19 at 01:29
*I think this is the right behavior* -- There is no "right behavior". Second, read the documentation for the [at() function](https://en.cppreference.com/w/cpp/string/basic_string/at). — PaulMcKenzie, Jul 02 '19 at 01:29
Also, a thrown exception is **not** an abnormal termination. A thrown exception indicates that the underlying C++ library code detected the problem, and actually uses the `throw` keyword to inform the caller of the error. This is not a "segmentation fault" or "access violation" type error that the OS detects. — PaulMcKenzie, Jul 02 '19 at 01:40
Possible duplicate of [What is the difference between string::at and string::operator\[\]?](https://stackoverflow.com/questions/14699060/what-is-the-difference-between-stringat-and-stringoperator) — phuclv, Jul 02 '19 at 02:13
@phuclv This question is specifically about the behavior of `s[s.size()]` and `s.at(s.size())` which involves difference between standards. That question doesn't seem to address that. — L. F., Jul 02 '19 at 08:54

eerorika · Accepted Answer · 2019-07-02T02:48:24.630

shouldn't both behave the same, so that either both (access by index and at()) terminate abnormally or both exit normally?

No, they should not have the same behaviour. The behaviour is different intentionally. If it wasn't, then there would only be a need for one of them to exist.

The at member function performs bounds checks. Any access outside the bounds of the container results in an exception. This is the same as the at member function of std::array or std::vector for example. Note that an uncaught throw will cause the program to be terminated.
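As a minimal sketch of the bounds check described above: the helper name at_throws is hypothetical (it is not part of the standard library), and it only reports whether at() throws for a given index.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper for illustration: returns true if s.at(pos)
// throws std::out_of_range, false if the access succeeds.
bool at_throws(const std::string& s, std::size_t pos) {
    try {
        (void)s.at(pos);  // bounds-checked access; throws when pos >= s.size()
        return false;
    } catch (const std::out_of_range&) {
        return true;      // the exception was caught, so the program keeps running
    }
}

// Small self-check using the string from the question.
void check_at_throws() {
    const std::string ss("test");
    assert(!at_throws(ss, ss.size() - 1));  // last stored character: in range
    assert(at_throws(ss, ss.size()));       // one past the end: out of range
}
```

Calling check_at_throws() from main should pass silently. Because the exception is caught inside the helper, nothing is terminated; only an uncaught exception ends the program.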

The subscript operator does not perform any bounds checks. Access at any index > size() has undefined behaviour, and prior to C++11 so did access at index == size() through the non-const overload. Under no circumstances is the subscript operator guaranteed to throw an exception. This is the same as the subscript operator of an array, std::array or std::vector, for example.

Since C++11, the behaviour of the subscript operator of std::string was changed such that reading the element at index == size() (i.e. one past the last element) is well defined, and returns a null terminator. Only modifying the object through the returned reference has undefined behaviour. Reading other indices outside the bounds still has undefined behaviour.
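A small sketch of the well-defined case, assuming a C++11 or later compiler. The helper name one_past_end is hypothetical and only wraps the read at index size().

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration: reads the element at index size().
// Since C++11 this is well defined and yields the null character.
char one_past_end(const std::string& s) {
    return s[s.size()];  // read only; the returned character is never modified
}

// Small self-check using the string from the question.
void check_one_past_end() {
    const std::string ss("test");
    assert(one_past_end(ss) == '\0');
    // ss[ss.size() + 1] is deliberately not exercised: that read is
    // undefined behaviour, so no particular outcome can be asserted.
}
```

Calling check_one_past_end() from main should pass silently. The sketch only reads the character, which is the part the standard defines.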

I do not know for a fact the rationale for not making a corresponding change to at to allow access to the null terminator, but I suspect that it was considered to be a backwards incompatible change. Making UB well defined is always backwards compatible, while ceasing to throw an exception is not. Another possible reason is that it would have opened a route to UB (if the null terminator is modified), and the design of at is to keep it free from UB.

score 0 · Answer 2 · edited Jul 02 '19 at 03:50

operator[] does not check whether the index is valid, and invokes undefined behaviour when the index is out of range. Importantly, the position after the last character is valid when you use operator[] and holds '\0': ss[ss.size()] is ss[4], which is '\0'. So the output is null and the program terminates normally. Please try cout << ss[ss.size() + 1], you will get the error.
ss.at(ss.size()): for at(), the position of the '\0' after the last character is invalid. Passing an invalid index (less than 0 or greater than or equal to size()) throws an out_of_range exception: https://www.geeksforgeeks.org/string-at-in-cpp/

answered Jul 02 '19 at 02:13 by bruce_zhu
edited Jul 02 '19 at 03:50 by Tas

`ss[ss.size()]` refers to the char just past the end of the string. In all implementations I'm aware of this is a `\0` but there is no guarantee of this. Formally, it's UB. – doug Jul 02 '19 at 02:24
@doug Reading `ss[ss.size()]` is formally well defined since C++11. – eerorika Jul 02 '19 at 02:38
@bruce_zhu `please try cout << ss[ss.size() + 1],you will get the error.` There is no guarantee for there to be an error. The behaviour is undefined. – eerorika Jul 02 '19 at 02:42
`an invalid index (less than 0 or ...` the index is unsigned, so it is impossible to pass a negative index. – eerorika Jul 02 '19 at 02:44
When using operator[] (`reference operator[](size_type pos);`) to get characters from a std::string, the valid range of pos is from 0 to size(). Indices 0 to size()-1 are the stored characters, and for pos == size() the following applies: a reference to the character with value CharT() (the null character) is returned. For the first (non-const) version, the behavior is undefined if this character is modified. – bruce_zhu Jul 02 '19 at 02:47
@eerorika Thanks. Hadn't been aware this had been done formally back in C++11. – doug Jul 02 '19 at 04:01
