8

I'm a little bit confused. In the OS course we were told that all OSes take care of memory fragmentation by paging or segmentation, and that there is no contiguous physical memory allocation at all: the OS uses different levels of addressing (logical/physical) to avoid allocating contiguous physical memory. Yet here there are so many discussions about fragmentation. My question is: is this problem real in C++ programming on OSes that support logical addressing (does any process crash just because of memory fragmentation)? If yes, why does every OS try to avoid contiguous addressing in the first place?

Afshin
  • 272
  • 2
  • 9
  • 4
    There are 2 layers: fragmentation in the virtual process address space and fragmentation in the physical memory. Which one are you interested in? – StaceyGirl Aug 25 '18 at 14:15
  • I think fragmentation in the virtual process address space is negligible and cannot cause memory starvation. Actually, I believe that no process would crash because of memory fragmentation. – Afshin Aug 25 '18 at 14:21
  • 2
    In general C++ programming there is no problem with memory fragmentation. You always see `virtual` memory and you always allocate contiguous virtual memory chunks. The only thing you might notice is that sequentially allocated chunks are not necessarily adjacent in memory. – Serge Aug 25 '18 at 14:24
  • Yes @Serge, but I have seen people looking for ways to solve it in their C++ code! – Afshin Aug 25 '18 at 14:26
  • 2
    @Afshin It depends. If you look at any modern application, you can see how its memory usage grows over time as memory is not released to the OS. You can say this is caused by other things, but memory fragmentation (i.e. the non-contiguous location of allocated memory chunks) is the core reason for this. You can check [this](https://stackoverflow.com/questions/48651432/glibc-application-holding-onto-unused-memory-until-just-before-exit/48652734#48652734) question. – StaceyGirl Aug 25 '18 at 14:26
  • @Ivan I think this is an OS-level problem. Programmers should not have to care about memory fragmentation in their code. – Afshin Aug 25 '18 at 14:30
  • What did your course have to say about memory in embedded systems? – Richard Critten Aug 25 '18 at 14:33
  • @Afshin Depends on what you call an OS. If the OS is just a kernel, then it cannot take care of fragmentation in the process address space. If you include the C library in the OS, then yes, the system allocator ideally should do the job, but it sometimes fails to do so. See the linked question. – StaceyGirl Aug 25 '18 at 14:34
  • @RichardCritten In embedded systems you have more freedom to take on some OS responsibilities yourself. But this problem is still part of OS memory management. – Afshin Aug 25 '18 at 14:36
  • @Afshin: If you define every application where you might encounter fragmentation as "taking on some OS responsibilities," then yes, this is just OS memory management. But semantics games don't actually solve problems. – Nicol Bolas Aug 25 '18 at 14:44
  • @Afshin there are special situations where, for performance or caching reasons, people like to allocate memory themselves from a contiguous virtual region, avoiding the generic new/malloc. They have to take care of fragmentation there. – Serge Aug 25 '18 at 14:45
  • @NicolBolas: My point is that when the OS takes care of fragmentation, application developers have nothing to do with memory management. I'm going to make that clear in my question. – Afshin Aug 25 '18 at 14:52
  • 4
    Fragmentation is absolutely an issue in *any* long-running program. How much of a problem it is depends on the allocation patterns and the quality of the heap allocator. Some OSes (e.g. Windows) have crappy built-in heap allocators, which result in massive fragmentation over time, necessitating the use of alternative allocators (e.g. jemalloc). – rustyx Aug 25 '18 at 14:56
  • @rustyx: My problem is exactly here. I cannot understand this part. If the OS does not allocate any large contiguous chunk of physical memory, then I expect that external fragmentation will not occur at all (I mean for memory larger than the page size). As far as I can make out, jemalloc addresses "scalability" & "cache alignment" in multi-threaded applications, and neither of those has anything to do with "fragmentation". – Afshin Aug 25 '18 at 16:31
  • 2
    @Afshin If you are talking about allocations of physical pages, then this is about how OS kernel works and it has little (read as "nothing") to do with C++ programming in user space. As for jemalloc, it tries to address many issues, but it is about handling memory allocations in user space (outside of OS). – StaceyGirl Aug 25 '18 at 16:52

2 Answers

6

There are 2 layers: fragmentation in the virtual process address space and fragmentation in the physical memory.

If you look at any modern application, you can see how its memory usage grows over time as memory is not released to the OS. You can say this is caused by other things, but memory fragmentation (i.e. the non-contiguous location of allocated memory chunks) is the core reason for this. In short, memory allocators refuse to release memory to the OS.
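For illustration, a minimal sketch of this behavior, assuming Linux with glibc (whose `malloc_trim` asks the allocator to hand free heap memory back to the kernel; the counts and sizes here are arbitrary):

```cpp
// Sketch, glibc-specific: freed heap memory usually stays cached inside
// the allocator instead of being returned to the OS.
#include <cstdlib>
#include <vector>
#include <malloc.h>  // glibc-only: malloc_trim

int main() {
    std::vector<void*> blocks;
    for (int i = 0; i < 100000; ++i)
        blocks.push_back(std::malloc(256));  // ~25 MiB of small blocks

    for (void* p : blocks)
        std::free(p);

    // At this point the process's resident memory is typically still
    // ~25 MiB: the allocator keeps the freed chunks for reuse.
    // malloc_trim(0) returns what free space it can (mostly the
    // contiguous tail of the heap) to the kernel.
    malloc_trim(0);
}
```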

If you are interested in fragmentation in physical memory, then even with memory organized in pages, there is still a need to allocate physically contiguous memory chunks. For example, if you need to avoid virtual memory overhead, you might want to use large pages ("huge pages" in Linux terms). x86_64 supports 4 KiB, 2 MiB and 1 GiB pages. If there is no contiguous physical memory of the required size, you won't be able to use them.
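A minimal sketch of this, assuming Linux on x86_64 with a huge page pool reserved (e.g. via `/proc/sys/vm/nr_hugepages`); the `mmap` call fails when the kernel has no physically contiguous 2 MiB region to hand out:

```cpp
// Sketch, Linux-specific: requests one 2 MiB huge page directly from the
// kernel. The call fails if the kernel cannot find (or has not reserved)
// a physically contiguous 2 MiB region -- physical fragmentation in action.
#include <sys/mman.h>
#include <cstdio>

int main() {
    const size_t kHugePage = 2 * 1024 * 1024;  // 2 MiB
    void* p = mmap(nullptr, kHugePage,
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        std::perror("mmap(MAP_HUGETLB)");  // e.g. ENOMEM: no huge page available
        return 1;
    }
    static_cast<char*>(p)[0] = 1;  // touch the page so the kernel backs it
    munmap(p, kHugePage);
}
```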

If by OS you mean "kernel", then it cannot help you with fragmentation that happens in the process address space (heap fragmentation). The C library should try to avoid fragmentation; unfortunately, it is not always able to do so. See the linked question.

A memory allocator is usually not able to release a large chunk of memory if there is at least something still allocated in it. There is a partial solution to this that takes advantage of the page-based organization of virtual memory: the so-called "lazy free" mechanism, represented by MADV_FREE on Linux and the BSDs and DiscardVirtualMemory on Windows. When you have a huge chunk of memory that is only partially used, you can notify the kernel that part of it is not needed anymore and that it can take that part back under memory pressure. This is done lazily and only under memory pressure, because memory deallocation is extremely expensive. But many memory allocators still do not use it, for performance reasons.
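For illustration, a minimal sketch of the Linux side of this (assumes kernel 4.5 or newer, where MADV_FREE is available; the 64 MiB region and the 1 MiB "still in use" prefix are arbitrary):

```cpp
// Sketch, Linux-specific (MADV_FREE requires kernel >= 4.5): marks a
// partially used region as lazily freeable. The kernel may reclaim those
// pages under memory pressure, but the mapping stays valid and reusable.
#include <sys/mman.h>
#include <cstring>
#include <cstdio>

int main() {
    const size_t kSize = 64 * 1024 * 1024;  // 64 MiB
    char* buf = static_cast<char*>(
        mmap(nullptr, kSize, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
    if (buf == MAP_FAILED) { std::perror("mmap"); return 1; }

    std::memset(buf, 0xAB, kSize);  // fault all pages in

    // Suppose only the first 1 MiB is still needed: tell the kernel the
    // rest may be taken back if memory gets tight. This is cheap -- no
    // pages are freed immediately.
    madvise(buf + 1024 * 1024, kSize - 1024 * 1024, MADV_FREE);

    munmap(buf, kSize);
}
```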

So the answer to your question: it depends on how much you care about the efficiency of your program. Most programs do not care, as the standard allocator just does the job for them. Some programs might suffer when the standard allocator is not able to do its job efficiently.

StaceyGirl
  • 6,826
  • 12
  • 33
  • 59
  • 1
    Reclaiming memory regions may require not only clever allocation but also moving still-live memory objects, e.g. in [compacting garbage collection](https://en.wikipedia.org/wiki/Mark-compact_algorithm). Many programs use only simple end markers (stack pointer, [program break](http://man7.org/linux/man-pages/man2/brk.2.html)), so the OS really can't return the free memory without extensions like madvise. Other structures are possible, e.g. [Mill](https://millcomputing.com/docs/memory/), [hardware garbage collection](https://pdfs.semanticscholar.org/aa06/47e543c5989606c458620d8fdc20fdb90801.pdf) – Yann Vernier Aug 25 '18 at 15:09
  • I agree with you about kernel-level programming, where the memory management task falls upon programmers themselves. About efficiency I do not agree with you. I think memory allocators other than the standard (built-in) ones do not address fragmentation but rather "scalability" etc. for parallel processing. – Afshin Aug 25 '18 at 16:57
  • @Afshin They try to address many issues. Many are willing to give up on fragmentation to achieve higher performance. This might bother you when you see that your long running application starts consuming too much memory and is not able to free it without restarting the whole process. So fragmentation _is_ an issue in some cases, you should be at least aware of this. – StaceyGirl Aug 25 '18 at 17:03
  • @Ivan They can make your application efficient, but I think it is not about fragmentation. I might be wrong, but it is not convincing to me. :) – Afshin Aug 25 '18 at 17:08
  • @Afshin They can make allocation and deallocation faster, but if the allocator does not address fragmentation issues, this may increase your program's memory consumption. So this is still a trade-off. If you are mindlessly allocating lots of data, you might run into a situation where your allocator is just not able to solve fragmentation issues. – StaceyGirl Aug 25 '18 at 17:11
3

The OS is not avoiding contiguous memory allocation. At the top level you have hardware and software. Hardware has limited resources, physical memory in this case. To share that resource, and to relieve user programs of managing the sharing themselves, the virtual addressing layer was invented. It simply maps a contiguous virtual address space onto sparse physical regions. In other words, the virtual address 0x10000 can point to the physical address 0x80000 in one process and to 0xf0000 in another.

Paging and swapping mean writing some pages, or the whole application memory, to disk and bringing them back at some point. The pages will most likely have a different physical mapping afterwards.

So your program will always see a contiguous virtual address space, which is really fragmented across the physical hardware. By the way, this is done with constant block sizes, so there is no waste and no unused memory holes.
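You can even observe this mapping directly. A sketch, assuming Linux (it reads `/proc/self/pagemap`; note that since kernel 4.0 the physical frame numbers read as 0 unless the process has CAP_SYS_ADMIN, so run it as root to see real frames):

```cpp
// Sketch, Linux-specific: shows that two adjacent virtual pages can be
// backed by non-adjacent physical frames, via /proc/self/pagemap.
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

// Returns the physical frame number (PFN) backing a virtual address.
uint64_t pfn_of(int fd, void* vaddr) {
    uint64_t entry = 0;
    // One 8-byte pagemap entry per 4 KiB virtual page.
    off_t offset = (reinterpret_cast<uintptr_t>(vaddr) / 4096) * 8;
    pread(fd, &entry, sizeof entry, offset);
    return entry & ((1ULL << 55) - 1);  // bits 0-54 hold the PFN
}

int main() {
    char* p = static_cast<char*>(mmap(nullptr, 2 * 4096,
        PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
    p[0] = p[4096] = 1;  // fault both pages in

    int fd = open("/proc/self/pagemap", O_RDONLY);
    std::printf("virtual pages %p,%p -> physical frames %llu,%llu\n",
                static_cast<void*>(p), static_cast<void*>(p + 4096),
                (unsigned long long)pfn_of(fd, p),
                (unsigned long long)pfn_of(fd, p + 4096));
    close(fd);
}
```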

Now, the second level of fragmentation is the one caused by the new/malloc functions, and it comes from the fact that you allocate and free memory chunks of different sizes. This fragments your heap in virtual space. The functions try to make sure that as little memory as possible is wasted.
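For illustration, a minimal sketch of how such a pattern punches holes into the heap (the exact addresses, and whether the allocator can reuse or coalesce the holes, are allocator-dependent; the sizes here are arbitrary):

```cpp
// Sketch: heap fragmentation in the virtual address space. Freeing every
// other block leaves holes that a larger request cannot reuse.
#include <cstdio>
#include <vector>

int main() {
    std::vector<char*> blocks;
    for (int i = 0; i < 8; ++i)
        blocks.push_back(new char[4096]);

    for (int i = 0; i < 8; i += 2) {  // free every other block
        delete[] blocks[i];
        blocks[i] = nullptr;
    }

    // The heap now has 4 KiB holes between live blocks. A 16 KiB request
    // cannot fit in any single hole, so the allocator typically has to
    // grow the heap even though 16 KiB in total is free.
    char* big = new char[4 * 4096];
    std::printf("big block at %p\n", static_cast<void*>(big));

    for (char* p : blocks) delete[] p;  // delete[] on nullptr is a no-op
    delete[] big;
}
```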

So in your generic C++ (or any other language) programming you do not care about memory fragmentation. All chunks you allocate are guaranteed to be contiguous in virtual space (though not necessarily in physical space).

1201ProgramAlarm
  • 30,320
  • 7
  • 40
  • 49
Serge
  • 8,185
  • 2
  • 14
  • 22
  • The block sizes aren't necessarily constant. For instance, we have [page size extension](https://en.wikipedia.org/wiki/Page_Size_Extension) in the x86 family. This is mainly done to reduce the overhead of tracking many pages. – Yann Vernier Aug 25 '18 at 15:00
  • @YannVernier Yes, I did not want to complicate the answer with x86 paging modes and different page sizes. The essential point is that the OS takes care of efficient physical memory usage. Sure, if you ask for a couple of 4k pages and then switch to 2M pages, you can get fragmentation there as well. – Serge Aug 25 '18 at 19:18
  • 1
    "in your generic C++ (or any other language) programming you do not care about any of the memory fragmentation" Anyone that develops any long-running applications (for example games) will absolutely care about memory fragmentation. Additionally memory fragmentation can affect performance. – Tara Nov 18 '19 at 07:49