How to map two virtual addresses to the same physical memory on Linux?

I'm facing a quite tricky problem. I'm trying to get 2 virtual memory areas pointing to the same physical memory. The point is to have different page protection parameters on different memory areas.

On this forum, a user seems to have a solution, but it looks kinda hacky and it's pretty clear that something better can be done performance-wise: http://www.linuxforums.org/forum/programming-scripting/19491-map-two-virtual-memory-addres-same-physical-page.html

As I'm facing the same problem, I want to give it a shot here and find out whether somebody has a better idea. Don't be afraid to mention the dirty details under the hood, this is what this question is about.

Thanks in advance.

Since Linux kernel 3.17 (released in October 2014) you can use the memfd_create system call to create a file descriptor backed by anonymous memory. Then mmap the same region several times through that descriptor, as mentioned in the other answers.

Note that the glibc wrapper for the memfd_create system call was added in glibc 2.27 (released in February 2018). The glibc manual also describes how the returned descriptor can be used to create multiple mappings of the same underlying memory.
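
To make this concrete, here is a minimal sketch of the memfd_create approach with two different protections. The helper name map_two_views and the label "two_views" are made up for illustration, and the sketch assumes Linux 3.17+ with glibc 2.27+:

```c
#define _GNU_SOURCE          /* glibc exposes memfd_create only with this */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not from the original answer): maps one
   memfd-backed region twice. *rw_out is readable and writable,
   *ro_out is a read-only alias of the same physical pages.
   Returns 0 on success, -1 on failure. */
int map_two_views(size_t len, void **rw_out, void **ro_out) {
    /* The name is only a debugging label shown in /proc/<pid>/maps. */
    int fd = memfd_create("two_views", MFD_CLOEXEC);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return -1;
    }

    void *rw = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void *ro = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);               /* the mappings keep the memory alive */

    if (rw == MAP_FAILED || ro == MAP_FAILED) {
        if (rw != MAP_FAILED) munmap(rw, len);
        if (ro != MAP_FAILED) munmap(ro, len);
        return -1;
    }
    *rw_out = rw;
    *ro_out = ro;
    return 0;
}
```

A write through the first pointer is immediately visible through the second one, while a write through the second pointer raises SIGSEGV. MAP_SHARED is what matters here: with MAP_PRIVATE the two mappings would stop sharing a page after the first write.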

I'm trying to get 2 virtual memory area pointing on the same physical memory.

mmap the same region in the same file, twice, or use System V shared memory (which does not require mapping a file in memory).
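
For the System V variant, a rough sketch could look like this. The helper name sysv_two_views is hypothetical, and the code assumes Linux, where a segment marked with IPC_RMID stays usable until the last detach:

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper (illustration only): attaches one System V
   shared memory segment at two addresses, once read-write and once
   read-only. Returns 0 on success, -1 on failure. */
int sysv_two_views(size_t len, void **rw_out, void **ro_out) {
    /* IPC_PRIVATE gives a fresh segment that no other key refers to. */
    int id = shmget(IPC_PRIVATE, len, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    void *rw = shmat(id, NULL, 0);           /* read-write attach */
    void *ro = shmat(id, NULL, SHM_RDONLY);  /* read-only attach */

    /* Mark for removal now; the segment is destroyed after the last
       detach, so it does not outlive the process. */
    shmctl(id, IPC_RMID, NULL);

    if (rw == (void *)-1 || ro == (void *)-1) {
        if (rw != (void *)-1) shmdt(rw);
        if (ro != (void *)-1) shmdt(ro);
        return -1;
    }
    *rw_out = rw;
    *ro_out = ro;
    return 0;
}
```

Both pointers refer to the same physical pages, so data stored through the read-write attachment can be read back through the read-only one.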

I suppose if you dislike Sys V shared memory you could use POSIX shared memory objects. They're not very popular, but they are available on Linux and the BSDs at least.

Once you get an fd from shm_open you can immediately call shm_unlink. After that no other process can attach to the same shared memory by name, and you can still mmap the fd multiple times. There is still a small race window between shm_open and shm_unlink, though.
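
A small sketch of that open-then-unlink pattern, mapped twice with different protections. The helper name posix_two_views and the per-process object name are assumptions made for this example, and older glibc versions may need -lrt at link time:

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (illustration only): creates a POSIX shm object,
   unlinks it right away, and maps it twice. Returns 0 or -1. */
int posix_two_views(size_t len, void **rw_out, void **ro_out) {
    char name[64];
    /* A per-process name plus O_EXCL makes a name clash fail loudly
       instead of silently attaching to somebody else's object. */
    snprintf(name, sizeof name, "/two_views_%ld", (long)getpid());

    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    shm_unlink(name);        /* the name disappears, the fd stays valid */

    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return -1;
    }
    void *rw = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void *ro = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);

    if (rw == MAP_FAILED || ro == MAP_FAILED) {
        if (rw != MAP_FAILED) munmap(rw, len);
        if (ro != MAP_FAILED) munmap(ro, len);
        return -1;
    }
    *rw_out = rw;
    *ro_out = ro;
    return 0;
}
```

The race window mentioned above is the short span between shm_open and shm_unlink, during which another process that knows the name could still open the object.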

As suggested by @PerJohansson, I wrote and tested the following code. It works well on Linux: using mmap with the MAP_SHARED|MAP_FIXED flags, we can map the same physical page, allocated through a POSIX shm object, many times and contiguously across a very large virtual address range.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>        /* For mode constants */
#include <fcntl.h>           /* For O_* constants */


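void * alloc_1page_mem(size_t size) {
    int fd;
    char * ptr_base;
    char * rptr;
    /* Create shared memory object and set its size to one page */
    fd = shm_open("/myregion", O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
    if (fd == -1) {
        perror("error in shm_open");
        return NULL;
    }
    /* Unlink right away: the fd keeps the object alive, and the name
       is no longer visible to other processes */
    shm_unlink("/myregion");

    if (ftruncate(fd, 4096) == -1) {
        perror("error in ftruncate");
        close(fd);
        return NULL;
    }

    // reserve a big enough hole in VM space; the PROT_NONE mapping stays
    // in place so nothing else can claim the range before the MAP_FIXED
    // calls below replace it page by page
    ptr_base = mmap(NULL, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ptr_base == MAP_FAILED) {
        perror("error in mmap (reserve)");
        close(fd);
        return NULL;
    }

    // map the single page of the shm object over every page of the hole
    rptr = ptr_base;
    for (size_t i = 0; i < size; i += 4096) {
        if (mmap(rptr, 4096, PROT_READ | PROT_WRITE, MAP_SHARED|MAP_FIXED, fd, 0) == MAP_FAILED) {
            perror("error in mmap");
            munmap(ptr_base, size);
            close(fd);
            return NULL;
        }
        rptr += 4096;
    }
    close(fd);
    return ptr_base;
}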

void check(int * p, int total_cnt){
    for (int i=0;i<4096/sizeof(int);i++) {
        p[i] = i;
    }

    int fail_cnt = 0;
    for (int k=0; k<total_cnt; k+= 4096/sizeof(int)) {
        for (int i=0;i<4096/sizeof(int);i++) {
            if (p[k+i] != i)
                fail_cnt ++;
        }
    }
    printf("fail_cnt=%d\n", fail_cnt);
}

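int main(int argc, const char * argv[]) {
    // argv[1] is "<0|1>" followed by any mix of the ops c, w, r, p
    if (argc < 2 || argv[1][0] == '\0') {
        fprintf(stderr, "usage: %s <0|1>[cwrp...]\n", argv[0]);
        return 1;
    }
    const char * cmd = argv[1];
    int sum;
    int total_cnt = 32*1024*1024;
    int * p = NULL;
    if (*cmd++ == '1')
        p = alloc_1page_mem(total_cnt*sizeof(int));
    else
        p = malloc(total_cnt*sizeof(int));
    if (p == NULL) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }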

    sum = 0;
    while(*cmd) {
        switch(*cmd++) {
            case 'c':
                check(p, total_cnt);
                break;
            case 'w':
                // save only 4bytes per cache line
                for (int k=0;k<total_cnt;k+=64/sizeof(int)){
                    p[k] = sum;
                }
                break;
            case 'r':
                // read only 4bytes per cache line
                for (int k=0;k<total_cnt;k+=64/sizeof(int)) {
                    sum += p[k];
                }
                break;
            case 'p':
                // prevent sum from being optimized
                printf("sum=%d\n", sum);
                break;
        }
    }

    return 0;
}

You can observe a very low cache miss rate on memory allocated this way, because every virtual page is backed by the same physical page:

$ sudo perf stat -e mem_load_retired.l3_miss -- ./a.out 0wrrrrr
  # L3 misses increase linearly with the number of 'r' characters
$ sudo perf stat -e mem_load_retired.l3_miss -- ./a.out 1wrrrrr
  # L3 misses stay almost constant

If you are root, you can open /dev/mem and mmap it, but there are caveats in newer kernels; see the question "accessing mmaped /dev/mem?".