73

When working with COM in C++, strings are usually of the BSTR data type. One can use a BSTR wrapper such as CComBSTR or Microsoft's CString, but since I can't use ATL or MFC with the MinGW compiler, is there a standard code snippet for converting a BSTR to std::string (or std::wstring) and vice versa?

Are there also any non-Microsoft wrappers for BSTR similar to CComBSTR?

Update

Thanks to everyone who helped me out in any way! Since no one has addressed the conversion between BSTR and std::string, I would like to provide some clues here on how to do it.

Below are the functions I use to convert BSTR to std::string and std::string to BSTR respectively:

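#include <windows.h>
#include <oleauto.h>   // SysStringLen, SysAllocStringLen
#include <string>

// Declared up front because ConvertBSTRToMBS calls it before its definition.
std::string ConvertWCSToMBS(const wchar_t* pstr, long wslen);

std::string ConvertBSTRToMBS(BSTR bstr)
{
    // SysStringLen returns 0 for a NULL BSTR, so NULL converts to an empty string.
    int wslen = ::SysStringLen(bstr);
    return ConvertWCSToMBS((wchar_t*)bstr, wslen);
}

std::string ConvertWCSToMBS(const wchar_t* pstr, long wslen)
{
    // WideCharToMultiByte rejects a zero-length input, so handle it here.
    if (pstr == NULL || wslen <= 0)
        return std::string();

    // First call: ask for the required buffer size in bytes.
    int len = ::WideCharToMultiByte(CP_ACP, 0, pstr, wslen, NULL, 0, NULL, NULL);

    std::string dblstr(len, '\0');
    // Second call: perform the conversion into the buffer.
    len = ::WideCharToMultiByte(CP_ACP, 0 /* no flags */,
                                pstr, wslen /* not necessarily NULL-terminated */,
                                &dblstr[0], len,
                                NULL, NULL /* no default char */);

    return dblstr;
}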

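// The caller owns the returned BSTR and must release it with SysFreeString.
BSTR ConvertMBSToBSTR(const std::string& str)
{
    int len = static_cast<int>(str.length());
    int wslen = ::MultiByteToWideChar(CP_ACP, 0 /* no flags */,
                                      str.data(), len,
                                      NULL, 0);

    // Returns NULL on allocation failure; a zero length yields an empty BSTR.
    BSTR wsdata = ::SysAllocStringLen(NULL, wslen);
    if (wsdata != NULL && wslen > 0)
        ::MultiByteToWideChar(CP_ACP, 0 /* no flags */,
                              str.data(), len,
                              wsdata, wslen);
    return wsdata;
}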
ezpresso

4 Answers

98

BSTR to std::wstring:

// given BSTR bs
assert(bs != nullptr);
std::wstring ws(bs, SysStringLen(bs));

 
std::wstring to BSTR:

// given std::wstring ws
assert(!ws.empty());
BSTR bs = SysAllocStringLen(ws.data(), ws.size());

Doc refs:

  1. std::basic_string<typename CharT>::basic_string(const CharT*, size_type)
  2. std::basic_string<>::empty() const
  3. std::basic_string<>::data() const
  4. std::basic_string<>::size() const
  5. SysStringLen()
  6. SysAllocStringLen()
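
To make the embedded-null point concrete, here is a minimal portable sketch. It uses a plain wchar_t pointer and an explicit length in place of a real BSTR and SysStringLen; the function names copy_counted and copy_terminated are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length-aware copy, mirroring std::wstring(bs, SysStringLen(bs)).
// The length is passed in explicitly here; a real BSTR stores it in a prefix.
std::wstring copy_counted(const wchar_t* data, std::size_t length) {
    assert(data != nullptr);             // same precondition as the answer's code
    return std::wstring(data, length);   // keeps embedded null characters
}

// Terminator-based copy, mirroring std::wstring(bs).
std::wstring copy_terminated(const wchar_t* data) {
    assert(data != nullptr);
    return std::wstring(data);           // stops at the first null character
}
```

Given the literal L"ab\0cd" (five characters plus the terminator), copy_counted with a length of 5 keeps all five characters, while copy_terminated keeps only "ab".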
ildjarn
  • Won't this fail if `bs` contains null? – Mooing Duck Jul 02 '14 at 20:29
  • @MooingDuck : Assuming you mean an embedded null character, then no – that's why the constructor taking a length is used instead of the one only taking a `wchar_t const*`. `SysStringLen` handles embedded nulls, unlike e.g. `wcslen`. – ildjarn Jul 02 '14 at 20:35
  • 2
    @ildjam I'm sorry if this is a rookie question, but I started doing "go to definition" on `BSTR` (you'll need to run VS as admin) and `BSTR` seems to be no more than a `wchar_t*`. On the other hand, I also found [Microsoft's documentation](http://msdn.microsoft.com/en-us/library/windows/desktop/ms221240(v=vs.85).aspx) which, as you say, says that this constructor can handle embedded null characters. How can this constructor find the length of a `BSTR` if all it contains is a pointer to `wchar_t`? – HerrKaputt Nov 13 '14 at 17:35
  • 2
    @HerrKaputt : Because BSTRs are allocated on a special heap that retains the length of the allocation, and allows that length to be queried given a BSTR. – ildjarn Nov 13 '14 at 17:38
  • I see. Thanks for your reply! :) – HerrKaputt Nov 14 '14 at 10:41
  • 2
    `NULL` is a **valid** state for `BSTR`, it is equivalent to an empty string. So the code should perhaps be `std::wstring(bs ? bs : L"");` – M.M Sep 10 '15 at 02:48
  • @M.M : COM considers a null string equivalent to an empty string, but IME, programmers do not. I intentionally had the code treat them differently; IMO, the possibility of null `BSTR`s should be represented via `boost::optional<>` or somesuch, not via invisible semantics that contributed heavily to COM's considerable Pain List™ to begin with. – ildjarn Sep 10 '15 at 02:52
  • @M.M : "*`BSTR` are null-terminated (as well as length-counted) so `std::wstring(bs)` works too.*" `std::wstring(bs)` does not support embedded null-characters, my code does – this was intentional. :-] – ildjarn Sep 10 '15 at 02:54
  • @ildjarn OK. Being permissive in what one accepts is often a good idea. Good point about embedded nulls – M.M Sep 10 '15 at 02:55
  • @M.M : Embedded nulls are unfortunately quite common on Windows, as the Windows Shell often uses nulls as delimiters and double-nulls as terminators. :-[ This usually leads to using `std::vector` + algorithms rather than `std::wstring` in shell code, as it's more obvious how to handle these obtuse values; or rather, because dealing with them as string-typed objects appears highly non-idiomatic/scary. – ildjarn Sep 10 '15 at 03:00
  • This is not about personal preference: `NULL` **is** a valid `BSTR`, and your conversion code needs to be prepared for it. This is part of the documented contract for `BSTR`s. Otherwise you'll have to add a disclaimer, saying: *"Converts most `BSTR`s to `std::wstring`!"* – IInspectable Feb 25 '16 at 19:22
  • @IInspectable : It's not handled here because it's not obvious how it should be handled. I made a pointed note of it so the reader knows it needs to be addressed somehow, but there is no single correct semantic thus it would only confuse a useful answer. – ildjarn Feb 25 '16 at 21:48
  • 1
    There is no confusion about semantics. A `NULL` `BSTR` is semantically identical to an empty `BSTR`. That maps easily to a `std::wstring`. Constructing an empty `std::wstring` is the single correct and obvious conversion for a `NULL` `BSTR`. – IInspectable Feb 25 '16 at 22:16
  • @IInspectable : A `NULL` `BSTR` _can be_ semantically identical to an empty `BSTR`, because the runtime _allows_ for that, but literally every real codebase I've ever seen does **not** treat them semantically identically. If one wants to go that route, then the solution is obvious and I won't be motivated to edit it into my answer; and in the (IMO, likely) event that one _doesn't_ want to treat `NULL` and empty the same, they'll have to encode that however they see fit, which again, I can't predict. In any case, your point is valid, but sufficiently covered by my answer as-is. – ildjarn Feb 25 '16 at 22:25
  • @ildjarn see point 1 of the [first list here](https://blogs.msdn.microsoft.com/ericlippert/2003/09/12/erics-complete-guide-to-bstr-semantics/). It would be a bug if a codebase treated a null BSTR differently to an empty one. – M.M Mar 09 '18 at 04:52
10

You could also do this

#include <comdef.h>

BSTR bs = SysAllocString(L"Hello"); // SysAllocString takes a wide (OLECHAR) string
std::wstring myString = _bstr_t(bs, false); // will take over ownership, so no need to free

or std::string if you prefer

EDIT: if your original string contains embedded \0 characters, this approach will not work; the result is truncated at the first one.

AndersK
  • This answer is not correct and will give you incorrect results if your string contains NULL characters. – Chronial Jul 21 '20 at 13:28
  • In that case it is a problem with _bstr_t although I have never had any problems with that. – AndersK Jul 21 '20 at 14:19
  • It's only a problem of `_bstr_t` insofar that it's a bit misleading. It just implicitly casts to `wchar_t*`. It never claims that that points to a null-terminated string. It's when you pass that value to the `std::wstring` constructor that you create the problem. It's just as wrong as `std::wstring otherstring = something(); std::wstring mystring = otherstring.c_str()`. – Chronial Jul 21 '20 at 15:27
  • The example I gave is not wrong, however as you point out if the BSTR contains multiple \0 it will not work. No surprise there, the worst that can happen is a truncated string. – AndersK Jul 21 '20 at 16:21
8

There is a C++ class called _bstr_t. It has useful methods and a collection of overloaded operators.

For example, you can assign from a const wchar_t * or a const char * simply by writing _bstr_t bstr = L"My string";. You can then convert it back with const wchar_t * s = bstr.operator const wchar_t *();, or even to a narrow string with const char * c = bstr.operator char *();. Either pointer can be used to initialize a new std::wstring or std::string. The pointers remain valid only while bstr is alive.
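
If _bstr_t is not an option, similar ownership semantics can be approximated with std::unique_ptr and a custom deleter. The following is a rough sketch rather than a drop-in replacement: UniqueBstr, BstrDeleter, to_bstr, to_wstring and roundtrip_demo are names invented here, and the #else branch holds hypothetical stand-ins for the Sys* functions so that the sketch also builds off Windows.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <oleauto.h>
#else
// Hypothetical stand-ins (NOT the real COM allocator): the length is kept in
// the element just before the characters, loosely imitating a BSTR prefix.
typedef wchar_t OLECHAR;
typedef OLECHAR* BSTR;
typedef unsigned int UINT;

inline BSTR SysAllocStringLen(const OLECHAR* src, UINT len) {
    OLECHAR* block = new OLECHAR[len + 2]();   // [length][chars...][terminator]
    block[0] = static_cast<OLECHAR>(len);
    if (src) std::copy(src, src + len, block + 1);
    return block + 1;
}
inline UINT SysStringLen(BSTR s) { return s ? static_cast<UINT>(s[-1]) : 0; }
inline void SysFreeString(BSTR s) { if (s) delete[] (s - 1); }
#endif

// Deleter that releases the string with SysFreeString.
struct BstrDeleter {
    void operator()(OLECHAR* s) const { ::SysFreeString(s); }
};
using UniqueBstr = std::unique_ptr<OLECHAR, BstrDeleter>;

// std::wstring -> owned BSTR; embedded nulls are preserved.
// The result holds a null pointer if the allocation fails.
UniqueBstr to_bstr(const std::wstring& ws) {
    return UniqueBstr(::SysAllocStringLen(ws.data(), static_cast<UINT>(ws.size())));
}

// BSTR -> std::wstring; a null BSTR is treated as an empty string here,
// which is one possible policy, not the only one.
std::wstring to_wstring(BSTR bs) {
    return bs ? std::wstring(bs, ::SysStringLen(bs)) : std::wstring();
}

// Example use: the BSTR is freed automatically when b goes out of scope.
inline void roundtrip_demo() {
    UniqueBstr b = to_bstr(L"Hello");
    assert(to_wstring(b.get()) == L"Hello");
}
```

Calling b.get() yields a plain BSTR that can be passed to a COM method expecting an input string; b.release() hands ownership to a callee that takes it over.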

4

Simply pass the BSTR directly to the wstring constructor; it is compatible with a wchar_t* (note that this form stops at the first embedded null character):

BSTR btest = SysAllocString(L"Test");
assert(btest != NULL);
std::wstring wtest(btest);
assert(0 == wcscmp(wtest.c_str(), btest));

Converting BSTR to std::string requires a conversion to char* first. That's lossy, since BSTR stores a UTF-16 encoded Unicode string, unless you encode the result as UTF-8. You'll find helper methods to do this, as well as to manipulate the resulting string, in the ICU library.
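
On Windows, WideCharToMultiByte with CP_UTF8 performs this conversion without ICU. To show what the encoding involves, here is a portable sketch that takes a std::u16string as a stand-in for the BSTR contents. The function name utf16_to_utf8 and the choice to replace unpaired surrogates with U+FFFD are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encodes UTF-16 code units as UTF-8. Unpaired surrogates become U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    out.reserve(in.size() * 3);  // upper bound: at most 3 bytes per code unit
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {  // high surrogate
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                // Combine the pair into one code point above U+FFFF.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;  // unpaired high surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;      // stray low surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    assert(out.size() <= in.size() * 3);
    return out;
}
```

Because the input carries its own length, embedded null characters pass through unchanged, and no character outside the active code page is lost the way it can be with CP_ACP.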

Hans Passant