Questions on x86-64/32 kernel w/ large arrays..

Mark Hahn hahn at physics.mcmaster.ca
Fri May 30 11:23:14 EDT 2003


> Recalling that with 32-bit systems, the default linux behaviour was to 
> load system stuff around the 1GB mark, making it impossible to 

I'm being persnickety, but the details are somewhat illuminating:
it's not "system stuff", but rather just the mmap arena.  there are,
after all, three arenas: the traditional sbrk/malloc heap, growing 
up from the end of program text; the stack, growing down from 3G,
and the mmap arena, which has to fit in there somewhere (and grows 
up by default).
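
a rough way to see the three arenas for yourself: the sketch below records the current program break (top of the sbrk heap), the address of a fresh anonymous mmap, and the address of a local variable on the stack.  probe_arenas, print_arenas and struct arena_probe are hypothetical names made up for this illustration, and the addresses you get depend on the kernel, the mmap layout policy and any address randomization, so don't expect the exact 1G/3G figures.

```c
#define _DEFAULT_SOURCE             /* for sbrk() and MAP_ANONYMOUS */
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* hypothetical helper type: one sample address from each arena */
struct arena_probe {
    uintptr_t heap_top;    /* current program break (end of sbrk heap) */
    uintptr_t mmap_addr;   /* where the kernel placed a fresh mapping  */
    uintptr_t stack_addr;  /* address of a local variable              */
};

/* fill *out with one address per arena; returns 0 on success, -1 on error */
int probe_arenas(struct arena_probe *out)
{
    int local = 0;
    void *m;

    assert(out != NULL);

    /* NULL hint: let the kernel choose the spot in the mmap arena */
    m = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    out->heap_top   = (uintptr_t)sbrk(0);   /* sbrk(0) only queries the break */
    out->mmap_addr  = (uintptr_t)m;
    out->stack_addr = (uintptr_t)&local;

    munmap(m, 4096);
    return 0;
}

/* print the three sample addresses in hex */
void print_arenas(const struct arena_probe *p)
{
    printf("heap top : 0x%" PRIxPTR "\n", p->heap_top);
    printf("mmap     : 0x%" PRIxPTR "\n", p->mmap_addr);
    printf("stack    : 0x%" PRIxPTR "\n", p->stack_addr);
}
```

call probe_arenas() and then print_arenas() from main, and compare the output with /proc/<pid>/maps for the same process.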

you might think of it as "system stuff" just because you probably
notice shared libraries, which are mmaped, in /proc/<pid>/maps.
yes, it's true: a totally static program can avoid any use of mmap,
and therefore get the whole region for heap or stack!  one caveat:
the last time I tried this, static libc stdio mmaped a single page.

also, there exist patches to tweak this in two ways: you can change
TASK_UNMAPPED_BASE (one even makes it a sysctl), and you can make the 
mmap arena grow down (if you only ever need an 8M stack, this makes 
very good sense).

another alternative would probably be to abandon the sbrk heap,
and use a malloc implementation that was entirely based on allocating
arenas using mmap.  actually, this sounds like a pretty good idea - 
glibc could probably be hacked to do this, since it already uses mmap
for large allocations...
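
to make the idea concrete, here is a minimal sketch of an allocator that never touches the sbrk heap: every request gets its own anonymous mapping, with the mapping length stashed in a small header so it can be unmapped later.  mm_alloc, mm_free and MM_HDR are hypothetical names for this illustration; this is not how glibc does it, and a real implementation would carve many small blocks out of larger mmaped arenas instead of paying at least one page per allocation.

```c
#define _DEFAULT_SOURCE             /* for MAP_ANONYMOUS */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define MM_HDR 16   /* header size; keeps the user pointer 16-byte aligned */

/* round n up to the next multiple of a (a > 0) */
static size_t round_up(size_t n, size_t a)
{
    return (n + a - 1) / a * a;
}

/* allocate size bytes in a private anonymous mapping; NULL on failure */
void *mm_alloc(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t len;
    void *base;

    assert(page > 0);
    if (size > SIZE_MAX - MM_HDR - (size_t)page)
        return NULL;                        /* would overflow when rounded */

    len = round_up(size + MM_HDR, (size_t)page);
    base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    *(size_t *)base = len;                  /* remember length for munmap */
    return (char *)base + MM_HDR;
}

/* release a block obtained from mm_alloc; NULL is ignored */
void mm_free(void *ptr)
{
    char *base;

    if (ptr == NULL)
        return;
    base = (char *)ptr - MM_HDR;
    munmap(base, *(size_t *)base);
}
```

since every block lives in the mmap arena, where the sbrk heap ends stops mattering; the price is page granularity and one syscall per allocation and per free.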

>   (See: http://www.pgroup.com/faq/execute.htm#2GB_mem )

technically, the kernel merely starts mmaps at TASK_UNMAPPED_BASE - 
it's ld.so (userspace) which uses that for shlibs.  actually, it 
occurs to me that you could probably do some trickery wherein 
you did, say, a 1.5 GB mmap *before* ld.so starts grubbing around,
then munmap it when ld.so's done.  that would let the heap expand
all the way up to ~2.5GB, I think.

>   .. Now with the x86-64 kernel, as supplied by GinGin64, in
> 'include/asm-x86-64/processor.h', I see the following:
> 
> #define TASK_UNMAPPED_32 0xa0000000
> #define TASK_UNMAPPED_64 (TASK_SIZE/3)
> #define TASK_UNMAPPED_BASE      \
>         ((current->thread.flags & THREAD_IA32) ? TASK_UNMAPPED_32 : TASK_UNMAPPED_64)
> 
>   .. Does this mean that in 32-bit mode on the Opteron, I automatically 
> get bumped up from the 1GB limit to nearly 2.5GB (0xa0000000)?  And, more 

that's the way I read it.
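
the arithmetic is easy to check.  the sketch below converts the two mmap bases to GiB: 0xa0000000 is taken from the quoted header, while 0x40000000 for a plain 32-bit kernel is an assumption here (it is the "1GB mark" read as exactly 1 GiB).  addr_to_gib and print_mmap_bases are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define GIB 1073741824.0                    /* 2^30 bytes */

/* TASK_UNMAPPED_32 as quoted from include/asm-x86-64/processor.h */
#define UNMAPPED_32_ON_X86_64 0xa0000000ULL
/* assumption: the "1GB mark" on a plain 32-bit kernel is exactly 1 GiB */
#define UNMAPPED_ON_I386      0x40000000ULL

/* convert a byte address to GiB */
double addr_to_gib(uint64_t addr)
{
    return (double)addr / GIB;
}

/* print both mmap bases in GiB */
void print_mmap_bases(void)
{
    /* sanity check: the x86-64 ia32 base sits above the i386 one */
    assert(UNMAPPED_ON_I386 < UNMAPPED_32_ON_X86_64);

    printf("i386 mmap base        : %.2f GiB\n",
           addr_to_gib(UNMAPPED_ON_I386));
    printf("x86-64 ia32 mmap base : %.2f GiB\n",
           addr_to_gib(UNMAPPED_32_ON_X86_64));
}
```

so 0xa0000000 is exactly 2.5 GiB, and the room below the mmap arena (shared with program text and the sbrk heap) grows from about 1 GiB to about 2.5 GiB.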

> importantly, since the OS itself is in 64-bit mode, can I alter this 
> setting to allow myself to have very nearly (or all!) 4GB of space for a 
> static allocation for a 32-bit executable?

hmm, good question.  just for background, the 3G limit (TASK_SIZE)
is also not a hard limit - you can set it to 3.5, for instance.
the area above TASK_SIZE is an "aperture" used by the kernel
so that it can avoid switching address spaces.  if you make it small,
you'll probably run into problems on big-memory machines (page tables
need to live in there, I think), and possibly IO to certain kinds of 
devices.

offhand, I'd guess the x86-64 people are sensible enough to have 
figured out a way to avoid this, which is indeed a pretty cool
advantage even for 32b tasks...

I hope to have my own opteron to play with next week ;)

regards, mark hahn.
