The simple code below (a malloc()/free() sequence running in 100 threads) crashes on every Windows OS I have tried it on.

Any help would be greatly appreciated.

Maybe using some compiler directive can help?

We build the executable in VS2017 as Release/x64; it crashes after several minutes of running on every Windows platform I tried.

I tried building with VS2015 as well but it doesn't help.

The same code on Linux works fine.

Actually, the problem is more serious than it looks: our server code crashes several times a day in a production environment for no apparent reason (once the number of user calls exceeds a certain value). We tried to nail down the issue and created the simplest solution that reproduces the problem.

An archive with the VS project is here: https://drive.google.com/open?id=1A9NC52glqnIsaGLUxNgF6d1xJZRo-77e

VS says that the command line is:

/Yu"stdafx.h" /GS /GL /W3 /Gy /Zc:wchar_t /Zi /Gm- /O2 /sdl 
/Fd"x64\Release\vc140.pdb" /Zc:inline /fp:precise /D "NDEBUG"
/D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /Gd
/Oi /MD /Fa"x64\Release\" /EHsc /nologo /Fo"x64\Release\" /Fp"x64\Release\MallocTest.pch" 

Code:

#include "stdafx.h"
#include <cstdlib>   // malloc, free
#include <iostream>
#include <thread>
#include <conio.h>   // _getch (Windows-specific)

using namespace std;

#define MAX_THREADS 100

void task(void) {
    while (true) {
        char *buffer;
        buffer = (char *)malloc(4096);
        if (buffer == NULL) {
            cout << "malloc error" << endl;
        }
        free(buffer);
    }
}

int main(int argc, char** argv) {    
    thread some_threads[MAX_THREADS];

    for (int i = 0; i < MAX_THREADS; i++) {
        some_threads[i] = thread(task);
    }

    for (int i = 0; i < MAX_THREADS; i++) {
        some_threads[i].join();
    }

    _getch();
    return 0;
}
chqrlie
peterg
  • 3
    Are you linking with a thread-safe version of the runtime library? – molbdnilo Feb 21 '18 at 12:54
  • Which version of the CRT are you using? You should paste the exact compiler and linker command line. – Mike Vine Feb 21 '18 at 12:54
  • @Mike Vine I posted the VS project archive here: https://drive.google.com/open?id=1A9NC52glqnIsaGLUxNgF6d1xJZRo-77e VS says that the command line is: /Yu"stdafx.h" /GS /GL /W3 /Gy /Zc:wchar_t /Zi /Gm- /O2 /sdl /Fd"x64\Release\vc140.pdb" /Zc:inline /fp:precise /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /Gd /Oi /MD /Fa"x64\Release\" /EHsc /nologo /Fo"x64\Release\" /Fp"x64\Release\MallocTest.pch" – peterg Feb 21 '18 at 13:02
  • Possible duplicate of [Visual C++ thread safety of free and malloc?](https://stackoverflow.com/questions/4826479/visual-c-thread-safety-of-free-and-malloc) – stijn Feb 21 '18 at 13:05
  • 1
    @molbdnilo "Are you linking with a thread-safe version of the runtime library?" I guess so, because there is the /MD flag in the command line: /Yu"stdafx.h" /GS /GL /W3 /Gy /Zc:wchar_t /Zi /Gm- /O2 /sdl /Fd"x64\Release\vc140.pdb" /Zc:inline /fp:precise /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /Gd /Oi /MD /Fa"x64\Release\" /EHsc /nologo /Fo"x64\Release\" /Fp"x64\Release\MallocTest.pch" – peterg Feb 21 '18 at 13:10
  • Do you get a crash if you use new/delete instead? – AndersK Feb 21 '18 at 13:13
  • Does it work with a smaller number of threads? Is 100 your magic limit? – molbdnilo Feb 21 '18 at 13:13
  • @stijn "Possible duplicate of Visual C++ thread safety of free and malloc?" It looks similar; one of the replies is "apparently in visual C++ all stdlib and stdio are thread safe, so long as you use the /MD compiler directive to use the multithread libraries.". I use /MD and my code is pretty simple, but I'm still having problems on Windows. – peterg Feb 21 '18 at 13:13
  • You forgot to mention what kind of crash it is. – molbdnilo Feb 21 '18 at 13:23
  • 3
    I was able to reproduce this. The error happens even if `new` is used (it actually still calls the same `_malloc_base`). The call stack ends with `ntdll.dll!RtlpLowFragHeapAllocFromContext(); ntdll.dll!RtlpAllocateHeapInternal(); ucrtbase.dll!_malloc_base()` – user7860670 Feb 21 '18 at 13:23
  • @molbdnilo "Does it work with a smaller number of threads? Is 100 your magic limit?" It definitely crashes with 100 threads; I don't know the actual magic number, maybe it depends on the hardware. In the production environment we often have thousands of threads. – peterg Feb 21 '18 at 13:24
  • 1
    It's definitely not a duplicate of https://stackoverflow.com/questions/4826479/visual-c-thread-safety-of-free-and-malloc. I could reproduce this using the multithreaded CRT (BTW, on my VS2017 there is no single-threaded CRT at all). It's a crash in ntdll, call stack: `ntdll.dll!RtlpLowFragHeapAllocFromContext(); ntdll.dll!RtlpAllocateHeapInternal(); ucrtbase.dll!_malloc_base()` – Jabberwocky Feb 21 '18 at 13:25
  • I could not repro on VS2013, so it could be a new issue in VS. Of course it could also be a complete coincidence that I have failed to crash it. – molbdnilo Feb 21 '18 at 13:28
  • @AndersK. new/delete pair crashes as well. – peterg Feb 21 '18 at 13:32
  • I enclosed both the `malloc` and the `free` in a critical section and now it apparently doesn't crash anymore. – Jabberwocky Feb 21 '18 at 13:49
  • 1
    @molbdnilo crashes on my machine with 1000 threads on VC2017, Release x64. Note: Malloc is threadsafe according to the documentation. Digging into the asm might reveal more information. – Zacharias Feb 21 '18 at 13:59
  • See my previous comment: It has been running now for >15 minutes without crashing but the computer becomes more and more painful to use, now every 10 to 15 seconds the whole computer becomes unresponsive for a couple of seconds, even the mouse cursor won't move anymore. It looks like a MS bug to me. – Jabberwocky Feb 21 '18 at 14:08
  • 3
    You made my day. Reproduced issue with VC2017 100 threads. It shows access violation in HeapAlloc (somewhere in ntdll.dll)... There is some stupid error... or you've found bug in MS c++ or ms windows... – Pavlo K Feb 21 '18 at 14:15
  • I can't spot anything wrong in this code provided that `malloc` and `free` are thread safe which they should be (usage of `/MD` switch). – Jabberwocky Feb 21 '18 at 14:47
  • @Michael Walz "and now it apparently doesn't crash" - Could you please put your code here, in the comments? – peterg Feb 21 '18 at 15:04
  • @peterg This is the modified code that works. https://pastebin.com/wAWxCCeN. It's just an experiment, the code is very poor. – Jabberwocky Feb 21 '18 at 15:27
  • 1
    @peterg. I was able to reproduce the issue with HeapAlloc and a private LFH heap on Windows 10. I'm using similar code, but written mostly with Win32 calls to reduce the influence of C++/C. So this is 99.9% a bug in Windows 10!!! Could you please confirm that your problem is reproducible on Windows Server? – Pavlo K Feb 22 '18 at 07:50
  • @PavloK could you post that code somewhere, for example on pastebin.com? – Jabberwocky Feb 22 '18 at 08:40
  • @MichaelWalz Here you go. https://pastebin.com/TPDeq7wr. – Pavlo K Feb 22 '18 at 08:51
  • @molbdnilo "You forgot to mention what kind of crash it is" - it's APPCRASH, details: Problem signature: Problem Event Name:APPCRASH Fault Module Name: StackHash_e3c2 Fault Module Version: 6.3.9600.17031 Fault Module Timestamp: 530895af Exception Code: c0000374 Exception Offset: PCH_24_FROM_ntdll+0x000000000009B13A OS Version: 6.3.9600.2.0.0.272.7 Locale ID: 1033 Additional Information 1: e3c2 Additional Information 2: e3c2bb91516b405e48fec31ed1cb5192 Additional Information 3: b92e Additional Information 4: b92ebbb6dbf28e28f6b2b620a162a1f4 – peterg Feb 22 '18 at 09:16
  • @PavloK Yes, it is reproduced on Windows Server as well. – peterg Feb 22 '18 at 09:18
  • @peterg. What version of Windows Server? I guess this issue has to be reported to MS. Maybe they can provide a workaround or tell what's wrong... Would you do the honors and report the problem? – Pavlo K Feb 22 '18 at 09:51
  • @PavloK It definitely fails on Windows Server 2008 and Windows Server 2012R2. "I guess this issue has to be reported to MS" - yes, I reported it via the support channel, but I have only the MSDN subscription support level. They told me that this support level accepts issues only when MSDN sample code doesn't work, and my code is self-written. If you could suggest some MS channel for reporting the issue, I would be grateful. – peterg Feb 22 '18 at 10:57
  • I think it makes sense to post this issue on Microsoft TechNet. I don't expect a quick turnaround in this case, but it can work. IMO it sounds like a great potential problem for the MS server platform. I think it makes sense to include my research results, since they localize the problem to core Windows functions. Let me know if you need any further help – Pavlo K Feb 22 '18 at 11:09
  • @PavloK "I think it make sense post this issue on microsoft technet." - which forum would you consider the best for an issue like this? I didn't find a suitable category; all categories are about products, not about development: https://social.technet.microsoft.com/Forums/en-us/newthread Or would you suggest some other way of reporting on TechNet? Thanks in advance. – peterg Feb 22 '18 at 11:22
  • Well, I see dev questions in category Windows Servers... However I support your concern. You can also try social.msdn.microsoft.com. – Pavlo K Feb 22 '18 at 12:15
  • @peterg Could you please share a link to the question/discussion on the MS forum? – Pavlo K Feb 25 '18 at 11:23
  • @PavloK https://social.technet.microsoft.com/Forums/en-US/40ccd599-e386-471f-95ce-721a5e80648b/mallocfree-in-several-threads-crahes-on-windows-whats-wrong?forum=winservercore https://social.msdn.microsoft.com/Forums/en-US/63e75495-245b-43eb-8e03-e629573ee079/mallocfree-in-several-threads-crahes-on-windows-whats-wrong?forum=vcgeneral – peterg Feb 27 '18 at 14:28
  • Posted issue to MS sites, @PavloK https://social.technet.microsoft.com/Forums/en-US/40ccd599-e386-471f-95ce-721a5e80648b/mallocfree-in-several-threads-crahes-on-windows-whats-wrong?forum=winservercore https://social.msdn.microsoft.com/Forums/en-US/63e75495-245b-43eb-8e03-e629573ee079/mallocfree-in-several-threads-crahes-on-windows-whats-wrong?forum=vcgeneral – peterg Feb 27 '18 at 14:58
  • 1
    All threads share the same heap; access to the allocator/deallocator must be synchronized. – Hans Apr 16 '19 at 20:01
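
The workaround reported in the comments (wrapping both `malloc` and `free` in a critical section) could be sketched roughly as follows. This is not the code from the pastebin link, which is not reproduced here; it is a minimal sketch that assumes a single global `std::mutex` in place of whatever locking primitive was actually used. The names `g_heap_mutex`, `locked_malloc`, `locked_free`, `locked_task` and `run_locked_test` are hypothetical, and the loop is bounded so that the test terminates.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical global lock; the name is an assumption, not from the thread.
std::mutex g_heap_mutex;

// Serialize every allocation through the global lock.
void *locked_malloc(std::size_t size) {
    std::lock_guard<std::mutex> lock(g_heap_mutex);
    return std::malloc(size);
}

// Serialize every deallocation through the same lock.
void locked_free(void *block) {
    std::lock_guard<std::mutex> lock(g_heap_mutex);
    std::free(block);
}

// Bounded variant of task(): a fixed number of cycles instead of while (true).
void locked_task(std::size_t iterations, std::atomic<std::size_t> &completed) {
    for (std::size_t i = 0; i < iterations; ++i) {
        void *buffer = locked_malloc(4096);
        if (buffer != nullptr) {
            ++completed;  // count successful allocations only
        }
        locked_free(buffer);  // freeing a null pointer is a no-op
    }
}

// Run the bounded test and return the number of successful allocations.
std::size_t run_locked_test(std::size_t thread_count, std::size_t iterations) {
    std::atomic<std::size_t> completed{0};
    std::vector<std::thread> workers;
    workers.reserve(thread_count);
    for (std::size_t i = 0; i < thread_count; ++i) {
        workers.emplace_back(locked_task, iterations, std::ref(completed));
    }
    for (std::thread &worker : workers) {
        worker.join();
    }
    return completed.load();
}
```

Calling `run_locked_test(100, 1000000)` would approximate the original load. A global lock like this serializes all allocation, so it costs throughput, and nothing in the thread shows that it addresses the underlying cause rather than hiding it.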

1 Answer

Nothing in your remarkably small MCVE indicates a programming error: malloc() and free() are supposed to be thread safe, and so are the methods invoked on cout. The program is designed never to stop, so it appears to be a fine stress test for malloc() in a multi-threaded context.

Note, however, that if malloc() fails, it is questionable to report the error to cout, which might itself call malloc() for buffering. Reporting the error to cerr or making cout unbuffered is advisable. In any case, a malloc() failure should not cause a crash, even in the stream methods.
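
As an illustration of the reporting advice, here is a minimal sketch that sends the failure message to `std::cerr` instead of `cout`. The helper name `alloc_or_report` is hypothetical and not part of the question's code. `std::cerr` is unit-buffered by default, so each output operation is flushed immediately rather than accumulated in a buffer.

```cpp
#include <cstddef>
#include <cstdlib>
#include <iostream>

// Hypothetical helper (not from the question): allocate `size` bytes and,
// on failure, report through std::cerr rather than the buffered std::cout.
char *alloc_or_report(std::size_t size) {
    char *buffer = static_cast<char *>(std::malloc(size));
    if (buffer == nullptr) {
        // A fixed string and '\n' instead of std::endl: nothing to format,
        // and cerr flushes after each operation anyway.
        std::cerr << "malloc error\n";
    }
    return buffer;
}
```

In the question's `task()`, the body of the loop would then become `char *buffer = alloc_or_report(4096); free(buffer);`, which stays valid on failure because freeing a null pointer is a no-op.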

It looks like you found a bug in the runtime library you link to on the VS target platform. It would be interesting to track the memory usage of the program up to the crash. A steady increase in memory usage would also point to a problem in the runtime library. The program never holds more than MAX_THREADS blocks of 4 KB at a time, so the memory use should remain quite low, below 2 MB, including the overhead associated with the thread-based caches used by modern implementations of malloc().
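
The bound on live allocations can be checked with a small amount of instrumentation. With 100 threads each holding at most one 4096-byte block, the requested memory peaks at 100 × 4096 = 409,600 bytes (400 KB). The sketch below is an assumption-laden illustration, not the question's program: `AllocStats`, `counted_task` and `run_counted_test` are hypothetical names, the loop is bounded, and the counters track blocks requested by the program, not the process memory the OS reports, which would have to be observed with an external tool.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical instrumentation, not part of the original test program.
struct AllocStats {
    std::atomic<std::size_t> live{0};      // blocks currently held
    std::atomic<std::size_t> peak{0};      // highest value `live` reached
    std::atomic<std::size_t> failures{0};  // malloc calls that returned null
};

// Bounded malloc/free loop that records how many blocks are live at once.
void counted_task(AllocStats &stats, std::size_t iterations,
                  std::size_t block_size) {
    for (std::size_t i = 0; i < iterations; ++i) {
        void *buffer = std::malloc(block_size);
        if (buffer == nullptr) {
            ++stats.failures;
            continue;
        }
        const std::size_t now = ++stats.live;
        // Raise `peak` to `now` if it is larger (compare-and-swap loop).
        std::size_t seen = stats.peak.load();
        while (seen < now && !stats.peak.compare_exchange_weak(seen, now)) {
        }
        --stats.live;
        std::free(buffer);
    }
}

// Run the instrumented test and return the peak number of live blocks.
std::size_t run_counted_test(std::size_t thread_count, std::size_t iterations,
                             std::size_t block_size) {
    AllocStats stats;
    std::vector<std::thread> workers;
    workers.reserve(thread_count);
    for (std::size_t i = 0; i < thread_count; ++i) {
        workers.emplace_back(counted_task, std::ref(stats), iterations,
                             block_size);
    }
    for (std::thread &worker : workers) {
        worker.join();
    }
    return stats.peak.load();
}
```

Because each thread holds at most one block between the increment and the decrement, the returned peak can never exceed `thread_count`; multiplying it by `block_size` gives the largest amount of memory the program itself requested at any moment. If the process footprint grows far beyond that figure, the growth comes from the allocator or the runtime rather than from the test code.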

chqrlie