How to create a file with file holes?

Source: https://www.devze.com (reposted from the web), 2023-02-17 04:08
File holes are empty regions in a file that take up no disk space yet read back as null bytes. As a result, the file's apparent size is larger than the space it actually occupies on disk.

However, I don't know how to create a file with holes to experiment with.


Use the dd command with a seek parameter.

dd if=/dev/urandom bs=4096 count=2 of=file_with_holes
dd if=/dev/urandom bs=4096 seek=7 count=2 of=file_with_holes

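That gives you a file with a hole from byte 8192 through byte 28671: the first command writes two 4096-byte blocks, and the second seeks to block 7 (byte 28672) before writing two more.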

Here's an example, demonstrating that indeed the file has holes in it (the ls -s command tells you how many disk blocks are being used by a file):

$ dd if=/dev/urandom bs=4096 count=2 of=fwh # fwh = file with holes
2+0 records in
2+0 records out
8192 bytes (8.2 kB) copied, 0.00195565 s, 4.2 MB/s

$ dd if=/dev/urandom seek=7 bs=4096 count=2 of=fwh
2+0 records in
2+0 records out
8192 bytes (8.2 kB) copied, 0.00152742 s, 5.4 MB/s

$ dd if=/dev/zero bs=4096 count=9 of=fwnh # fwnh = file with no holes
9+0 records in
9+0 records out
36864 bytes (37 kB) copied, 0.000510568 s, 72.2 MB/s

$ ls -ls fw*
16 -rw-rw-r-- 1 hopper hopper 36864 Mar 15 10:25 fwh
36 -rw-rw-r-- 1 hopper hopper 36864 Mar 15 10:29 fwnh

As you can see, the file with holes takes up fewer disk blocks (16 versus 36) despite being the same size (36864 bytes).
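
The same comparison can be made from C instead of ls -s. The following is a minimal sketch, assuming Linux, where st_blocks is counted in 512-byte units (POSIX leaves the unit unspecified). The names file_usage and report_usage are hypothetical helpers, not part of the answer above.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: reports a file's logical size and the bytes
 * actually allocated for it. Returns 0 on success, -1 if stat() fails. */
int file_usage(const char *path, long long *logical, long long *allocated)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    *logical = (long long)st.st_size;
    /* Assumption: st_blocks is in 512-byte units, as on Linux. */
    *allocated = (long long)st.st_blocks * 512;
    return 0;
}

/* Hypothetical helper: prints both figures and returns 1 if fewer bytes
 * are allocated than the logical size (a hint that the file has holes),
 * 0 if not, -1 on error. */
int report_usage(const char *path)
{
    long long logical, allocated;

    if (file_usage(path, &logical, &allocated) != 0) {
        perror(path);
        return -1;
    }
    printf("%s: %lld bytes long, %lld bytes allocated\n",
           path, logical, allocated);
    return allocated < logical;
}
```

Called on the two files above, report_usage("fwh") would be expected to return 1 and report_usage("fwnh") to return 0, though the exact allocated figures depend on the filesystem, and features such as compression can also make a file occupy less space than its length.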

If you want a program that creates such a file, here it is:

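#include <unistd.h>
#include <sys/types.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
    char data[8192];
    int fd = -1;
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
        return 1;
    }
    memset(data, 'x', sizeof data); /* any payload will do */
    fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) {
        perror("Can't open file"); /* perror adds ": " and the reason itself */
        return 2;
    }
    /* Write 8192 bytes, skip five 4096-byte blocks (the hole), write 8192 more */
    if (write(fd, data, sizeof data) != (ssize_t)sizeof data ||
        lseek(fd, 5 * 4096, SEEK_CUR) == (off_t)-1 ||
        write(fd, data, sizeof data) != (ssize_t)sizeof data) {
        perror("Can't write file");
        close(fd);
        return 3;
    }
    close(fd);
    return 0;
}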

The above should work on any Unix. Someone else replied with a nice alternative method that is very Linux specific. I highlight it here because it's a method distinct from the two I gave, and can be used to put holes in existing files.


  1. Create a file.
  2. Seek to position N.
  3. Write some data.

There will be a hole at the start of the file (up to, and excluding, position N). You can similarly create files with holes in the middle.

The following document has some sample C code (search for "Sparse files"): http://www.win.tue.nl/~aeb/linux/lk/lk-6.html


Aside from creating files with holes, since mid-January 2011 (about two months before this answer was written) you can punch holes in existing files on Linux using fallocate(2) with the FALLOC_FL_PUNCH_HOLE flag (see the LWN article, the git commit in Linus' tree, and the patch to the Linux man pages).
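
As a rough sketch of what hole punching looks like in C, assuming Linux with glibc and a filesystem that supports the operation (the call fails with EOPNOTSUPP otherwise). The name punch_hole is a hypothetical wrapper. FALLOC_FL_PUNCH_HOLE has to be combined with FALLOC_FL_KEEP_SIZE, so the file's length stays the same and only the storage behind the range is released.

```c
#define _GNU_SOURCE /* exposes fallocate() in glibc's <fcntl.h> */
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h> /* FALLOC_FL_* flags */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: deallocates `length` bytes starting at `offset`
 * in an existing file. Returns 0 on success, -1 with errno set on failure. */
int punch_hole(const char *path, off_t offset, off_t length)
{
    int fd, rc, saved_errno;

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    /* PUNCH_HOLE must be ORed with KEEP_SIZE: the file length is unchanged
     * and the punched range reads back as zeros afterwards. */
    rc = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                   offset, length);
    saved_errno = errno;
    close(fd);
    errno = saved_errno; /* keep fallocate()'s error, not close()'s */
    return rc;
}
```

For example, punch_hole("fwnh", 4096, 8192) would ask the kernel to release the second and third 4096-byte blocks of a 36864-byte file. Whole filesystem blocks inside the range are freed, while partial blocks are only zeroed.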


The problem is carefully discussed in section 3.6 of W. Richard Stevens' famous book "Advanced Programming in the UNIX Environment" (APUE for short). The technique relies on the lseek function, declared in unistd.h, which explicitly sets an open file's offset. The prototype of the lseek function is as follows:

off_t lseek(int filedes, off_t offset, int whence);

Here, filedes is the file descriptor, offset is the value we want to set, and whence is one of three constants defined in the header file: SEEK_SET, meaning that the file's offset is set to offset bytes from the beginning of the file; SEEK_CUR, meaning that the file's offset is set to its current value plus the offset argument; and SEEK_END, meaning that the file's offset is set to the size of the file plus the offset argument.

An example that creates a file with a hole in C on UNIX-like operating systems is as follows:

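/* Creating a file with a hole of size 810 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Stand-ins for what APUE's apue.h header normally provides */
#define FILE_MODE (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)

static void err_sys(const char *msg)
{
    perror(msg);
    exit(1);
}

/* Two strings to write to the file */
char buf1[] = "abcde";
char buf2[] = "ABCDE";

int main(void)
{
    int fd; /* file descriptor */

    if ((fd = creat("file_with_hole", FILE_MODE)) < 0)
        err_sys("creat error");
    if (write(fd, buf1, 5) != 5)
        err_sys("buf1 write error");
    /* offset now 5 */

    if (lseek(fd, 815, SEEK_SET) == -1)
        err_sys("lseek error");
    /* offset now 815 */

    if (write(fd, buf2, 5) != 5)
        err_sys("buf2 write error");
    /* offset now 820 */

    return 0;
}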

In the code above, err_sys is the function that handles a fatal error from a system call: it prints a message and exits. In the book, both err_sys and FILE_MODE come from the apue.h header.


A hole is created when data is written at an offset beyond the current file size, or when the file is truncated to a length larger than its current size.
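
The truncation route can be sketched like this, assuming a POSIX system. The name make_all_hole is a hypothetical helper. ftruncate makes the extended region read as zeros; whether it is stored as a hole rather than as allocated blocks is up to the filesystem.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates (or empties) a file and then sets its
 * length to `size` without writing any data, so the whole file is
 * typically one big hole. Returns 0 on success, -1 on failure. */
int make_all_hole(const char *path, off_t size)
{
    int fd, rc;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    rc = ftruncate(fd, size); /* grows the file from 0 to `size` bytes */
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

For instance, make_all_hole("big.bin", 1048576) would produce a file that reports a length of 1 MiB and reads back as all zeros, while ls -s would usually show few or no blocks in use.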
