libgomp: fine-grained pinned memory allocator

This patch introduces a new custom memory allocator for use with pinned
memory (in the case where the CUDA allocator isn't available).  In future,
this allocator will also be used for Managed Memory.  Both kinds of memory
are incompatible with the system malloc because the allocated memory cannot
share a page with memory allocated for other purposes.
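
The page-sharing constraint can be illustrated with a little address
arithmetic.  This is only a sketch, under the assumption that pinning is done
with page-granular calls such as mlock/munlock, which act on every page that
an address range touches; the helper names page_floor and ranges_share_page
are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of libgomp.  */

/* Return the address of the first byte of the page containing ADDR.
   PAGE_SIZE must be a power of two.  */
static uintptr_t
page_floor (uintptr_t addr, uintptr_t page_size)
{
  return addr & ~(page_size - 1);
}

/* Return nonzero if the byte ranges [A, A+A_LEN) and [B, B+B_LEN) touch at
   least one common page.  Both lengths must be greater than zero.  */
static int
ranges_share_page (uintptr_t a, size_t a_len, uintptr_t b, size_t b_len,
                   uintptr_t page_size)
{
  uintptr_t a_first = page_floor (a, page_size);
  uintptr_t a_last = page_floor (a + a_len - 1, page_size);
  uintptr_t b_first = page_floor (b, page_size);
  uintptr_t b_last = page_floor (b + b_len - 1, page_size);
  /* The two page intervals overlap if each starts before the other ends.  */
  return a_first <= b_last && b_first <= a_last;
}
```

With 4096-byte pages, two 10-byte blocks at addresses 4100 and 4200 land on
the same page, so locking or unlocking one of them would also affect the
other.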

This means that small allocations will no longer consume an entire page of
pinned memory.  Unfortunately, it also means that pinned memory pages will
never be unmapped (although they may be reused).  This isn't a technical
limitation; the "free" algorithm could be extended in future, if needed.
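
As a rough illustration of the behaviour described above, here is a toy
sub-allocator that hands out small blocks from one registered region and
marks freed blocks as reusable without ever returning the region.  It is a
minimal sketch, not the code in simple-allocator.c: every toy_* name, the
16-byte alignment and the fixed-size block table are assumptions made for
the example, and it does not coalesce neighbouring free blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy for illustration; not the libgomp implementation.  */
#define TOY_MAX_BLOCKS 64
#define TOY_ALIGN 16

struct toy_block
{
  size_t offset;  /* Start of the block within the region.  */
  size_t size;    /* Block size in bytes.  */
  int in_use;     /* Nonzero while the block is allocated.  */
};

struct toy_ctx
{
  char *base;     /* Start of the registered (e.g. pinned) region.  */
  size_t size;    /* Region size in bytes.  */
  size_t nblocks; /* Number of valid entries in BLOCKS.  */
  /* The bookkeeping is stored here, outside the region it describes.  */
  struct toy_block blocks[TOY_MAX_BLOCKS];
};

/* Start managing SIZE bytes at BASE as a single free block.  */
static void
toy_register_memory (struct toy_ctx *ctx, void *base, size_t size)
{
  ctx->base = base;
  ctx->size = size;
  ctx->nblocks = 1;
  ctx->blocks[0] = (struct toy_block) { 0, size, 0 };
}

/* First-fit allocation; returns NULL if no free block is large enough.  */
static void *
toy_alloc (struct toy_ctx *ctx, size_t size)
{
  if (size == 0 || size > ctx->size)
    return NULL;
  size_t need = (size + TOY_ALIGN - 1) & ~(size_t) (TOY_ALIGN - 1);
  for (size_t i = 0; i < ctx->nblocks; i++)
    {
      struct toy_block *b = &ctx->blocks[i];
      if (b->in_use || b->size < need)
        continue;
      /* Split off the unused tail as a new free block if the table has
         room to record it; otherwise hand out the whole block.  */
      if (b->size > need && ctx->nblocks < TOY_MAX_BLOCKS)
        {
          ctx->blocks[ctx->nblocks++]
            = (struct toy_block) { b->offset + need, b->size - need, 0 };
          b->size = need;
        }
      b->in_use = 1;
      return ctx->base + b->offset;
    }
  return NULL;
}

/* Mark the block starting at PTR as reusable.  The region itself is never
   released; unknown pointers are ignored.  */
static void
toy_free (struct toy_ctx *ctx, void *ptr)
{
  for (size_t i = 0; i < ctx->nblocks; i++)
    if (ctx->blocks[i].in_use
        && ctx->base + ctx->blocks[i].offset == (char *) ptr)
      {
        ctx->blocks[i].in_use = 0;
        return;
      }
}
```

Registering a single 4096-byte buffer and requesting 10 and then 20 bytes
yields two blocks 16 bytes apart in the same buffer; freeing the first and
requesting 8 bytes returns the first address again.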

The implementation is not perfect; there are various corner cases (especially
related to extending onto new pages) where allocations and reallocations may
be sub-optimal, but it should still be a step forward in support for small
allocations.

I considered using libmemkind's "fixed" memory but rejected it for three
reasons: 1) libmemkind may not always be present at runtime, 2) there's
currently no documented means to extend a "fixed" kind one page at a time
(although the code appears to have an undocumented function that may do the
job, and/or extending libmemkind to support the MAP_LOCKED mmap flag with its
regular kinds would be straightforward), 3) Managed Memory benefits from
having the metadata located in different memory and using an external
implementation makes it hard to guarantee this.

libgomp/ChangeLog:

	* Makefile.am (libgomp_la_SOURCES): Add simple-allocator.c.
	* Makefile.in: Regenerate.
	* basic-allocator.c: Mention simple-allocator in the comment.
	* config/linux/allocator.c: Include unistd.h.
	(pin_ctx): New variable.
	(ctxlock): New variable.
	(linux_init_pin_ctx): New function.
	(linux_memspace_alloc): Use simple-allocator for pinned memory.
	(linux_memspace_free): Likewise.
	(linux_memspace_realloc): Likewise.
	* libgomp.h (gomp_simple_alloc_init_context): New prototype.
	(gomp_simple_alloc_register_memory): New prototype.
	(gomp_simple_alloc): New prototype.
	(gomp_simple_free): New prototype.
	(gomp_simple_realloc): New prototype.
	* libgomp.texi: Update pinned memory trait documentation.
	* testsuite/libgomp.c/alloc-pinned-8.c: New test.
	* simple-allocator.c: New file.
Andrew Stubbs
2025-10-20 14:57:41 +00:00
parent 3b8d9d579c
commit 9e5a9aa490
8 changed files with 533 additions and 41 deletions

@@ -0,0 +1,122 @@
/* { dg-do run } */
/* { dg-skip-if "Pinning not implemented on this host" { ! *-*-linux-gnu* } } */
/* { dg-additional-options -DOFFLOAD_DEVICE_NVPTX { target offload_device_nvptx } } */
/* Test that pinned memory works for small allocations. */
#include <stdio.h>
#include <stdlib.h>
#ifdef __linux__
#include <sys/types.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/resource.h>
#define PAGE_SIZE sysconf(_SC_PAGESIZE)
#define CHECK_SIZE(SIZE) { \
  struct rlimit limit; \
  if (getrlimit (RLIMIT_MEMLOCK, &limit) \
      || limit.rlim_cur <= SIZE) \
    fprintf (stderr, "insufficient lockable memory; please increase ulimit\n"); \
  }
int
get_pinned_mem ()
{
  int pid = getpid ();
  char buf[100];
  sprintf (buf, "/proc/%d/status", pid);
  FILE *proc = fopen (buf, "r");
  if (!proc)
    abort ();
  while (fgets (buf, 100, proc))
    {
      int val;
      if (sscanf (buf, "VmLck: %d", &val))
        {
          fclose (proc);
          return val;
        }
    }
  abort ();
}
#else
#error "OS unsupported"
#endif
static void
verify0 (char *p, size_t s)
{
  for (size_t i = 0; i < s; ++i)
    if (p[i] != 0)
      abort ();
}
#include <omp.h>
int
main ()
{
  /* Choose a small size where all our allocations fit on one page.  */
  const int SIZE = 10;
#ifndef OFFLOAD_DEVICE_NVPTX
  CHECK_SIZE (SIZE*4);
#endif
  const omp_alloctrait_t traits[] = {
      { omp_atk_pinned, 1 }
  };
  omp_allocator_handle_t allocator = omp_init_allocator (omp_default_mem_space,
                                                         1, traits);
  // Sanity check
  if (get_pinned_mem () != 0)
    abort ();
  void *p = omp_alloc (SIZE, allocator);
  if (!p)
    abort ();
  int amount = get_pinned_mem ();
#ifdef OFFLOAD_DEVICE_NVPTX
  /* This doesn't show up as process 'VmLck'ed memory.  */
  if (amount != 0)
    abort ();
#else
  if (amount == 0)
    abort ();
#endif
  p = omp_realloc (p, SIZE * 2, allocator, allocator);
  int amount2 = get_pinned_mem ();
#ifdef OFFLOAD_DEVICE_NVPTX
  /* This doesn't show up as process 'VmLck'ed memory.  */
  if (amount2 != 0)
    abort ();
#else
  /* A small allocation should not allocate another page.  */
  if (amount2 != amount)
    abort ();
#endif
  p = omp_calloc (1, SIZE, allocator);
#ifdef OFFLOAD_DEVICE_NVPTX
  /* This doesn't show up as process 'VmLck'ed memory.  */
  if (get_pinned_mem () != 0)
    abort ();
#else
  /* A small allocation should not allocate another page.  */
  if (get_pinned_mem () != amount2)
    abort ();
#endif
  verify0 (p, SIZE);
  return 0;
}