
Redirection and the File Buffering Mechanism

news 2025/11/5 7:33:44

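I. How Redirection Works: Principle and Practice

1. Output redirection: sending data to a new destination

2. Append redirection: adding data to the end of a file

3. Input redirection: reading data from a specified file

4. How standard output and standard error differ

5. Implementing redirection with dup2

II. Inside the FILE Structure

1. The file descriptor in FILE

2. The buffer in FILE

3. Who provides the buffer, and where it lives

4. The operating system's buffer

5. The layers of the buffering hierarchy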


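I. How Redirection Works: Principle and Practice

1. Output redirection: sending data to a new destination

Principle:

Output redirection works by changing the struct file* stored at a given file descriptor index, so that data which would have been written to one file (usually the display) is written to a different file instead.

Code example: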

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main() {
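    // Close standard output (file descriptor 1)
    close(1);

    // Open log.txt. Descriptor 1 is now free, so open() hands it out again.
    // O_TRUNC discards old contents, as an output redirection should.
    int fd = open("log.txt", O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    // Write through printf
    printf("hello world\n");
    printf("hello world\n");
    printf("hello world\n");
    printf("hello world\n");
    printf("hello world\n");

    // Flush the stdio buffer so the data reaches the file
    fflush(stdout);

    // Close the file descriptor
    close(fd);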

    return 0;
}

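Result: after the program runs, nothing appears on the display, and log.txt contains five lines of "hello world".

Explanation: printf writes to the standard output stream (stdout) by default, and the FILE structure that stdout points to stores file descriptor 1. Because open returns the lowest-numbered unused descriptor, closing descriptor 1 and then opening a file makes that file the new descriptor 1, which is what redirects the output.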
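The close-then-open trick depends on open returning the lowest-numbered descriptor that is not in use. Below is a minimal sketch of that rule in isolation; the helper name slot_is_reused and the file name passed to it are illustrative and not part of the original program.

```c
#include <fcntl.h>
#include <unistd.h>

/* Opens `path`, closes it, opens it again, and reports whether the second
 * open() reused the descriptor number freed by the close().
 * Returns 1 if reused, 0 if not, -1 if open() fails. */
int slot_is_reused(const char *path) {
    int first = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (first < 0) return -1;
    close(first);                        /* free that slot in the fd table */

    int second = open(path, O_WRONLY);   /* lowest free slot is handed out */
    if (second < 0) return -1;
    close(second);

    return second == first;
}
```

In a single-threaded program the call should return 1, which is the same reason the file opened after close(1) ends up as descriptor 1.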

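2. Append redirection: adding data to the end of a file

Principle:

Append redirection is like output redirection, except that the data is added to the end of the target file instead of overwriting what is already there. With the O_APPEND flag, every write is positioned at the end of the file automatically, so the existing content is preserved.

Code example: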

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main() {
    // Close standard output (file descriptor 1)
    close(1);

    // Open log.txt in append mode; it takes over descriptor 1
    int fd = open("log.txt", O_WRONLY | O_APPEND | O_CREAT, 0666);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    // Write through printf
    printf("hello Linux\n");
    printf("hello Linux\n");
    printf("hello Linux\n");
    printf("hello Linux\n");
    printf("hello Linux\n");

    // Flush the stdio buffer so the data reaches the file
    fflush(stdout);

    // Close the file descriptor
    close(fd);

    return 0;
}

Result: after the program runs, log.txt has five new lines of "hello Linux", appended after its existing content.
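To see what O_APPEND changes, it helps to compare it with reopening the same file without it. The sketch below uses write directly so no stdio buffering is involved; the function name size_after_second_write and the file name are made up for illustration.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Writes "first\n" to a fresh file, reopens the file with `extra_flag`
 * (O_APPEND or 0), writes "second\n", and returns the final file size.
 * Returns -1 on any error. */
long size_after_second_write(const char *path, int extra_flag) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) return -1;
    if (write(fd, "first\n", 6) != 6) { close(fd); return -1; }
    close(fd);

    /* Reopen without O_TRUNC: the old bytes are still in the file. */
    fd = open(path, O_WRONLY | extra_flag);
    if (fd < 0) return -1;
    if (write(fd, "second\n", 7) != 7) { close(fd); return -1; }

    struct stat st;
    int rc = fstat(fd, &st);
    close(fd);
    return rc == 0 ? (long)st.st_size : -1;
}
```

With O_APPEND the second write lands after the first, giving 13 bytes. With 0 the second write starts at offset 0 and overwrites the first, leaving 7 bytes.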

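3. Input redirection: reading data from a specified file

Principle:

Input redirection takes an operation that would have read from the standard input stream (usually the keyboard) and makes it read from a specified file instead.

Code example: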

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main() {
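    // Close standard input (file descriptor 0)
    close(0);

    // Open log.txt for reading; it takes over descriptor 0
    int fd = open("log.txt", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    // Buffer for each word that is read
    char str[40];

    // Read from the file through scanf; %39s leaves room for the
    // terminating '\0', so a long word cannot overflow str
    while (scanf("%39s", str) != EOF) {
        printf("%s\n", str);
    }

    // Close the file descriptor
    close(fd);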

    return 0;
}

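Result: scanf reads the data from log.txt one whitespace-separated word at a time, and printf prints each word on its own line on the display.

Explanation: scanf reads from the standard input stream (stdin) by default, and the FILE structure that stdin points to stores file descriptor 0. Closing descriptor 0 and then opening a file makes that file the new descriptor 0, which is what redirects the input.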

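4. How standard output and standard error differ

Principle:

Standard output (stdout, file descriptor 1) and standard error (stderr, file descriptor 2) both go to the display by default, but they behave differently under redirection. An ordinary output redirection (for example > in the shell) replaces only file descriptor 1, so standard error is unaffected unless descriptor 2 is redirected as well.

Operation	stdout	stderr
Run directly	Shown on screen	Shown on screen
Redirected to a file	Written to the file	Still shown on screen

Code example: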

#include <stdio.h>

int main() {
    // Write to standard output
    printf("hello printf\n");

    // Write to standard error
    perror("perror");

    // Use fprintf to write to standard output and to standard error
    fprintf(stdout, "stdout:hello fprintf\n");
    fprintf(stderr, "stderr:hello fprintf\n");

    return 0;
}

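Result: run directly, the program prints four lines on the display. If its output is redirected to log.txt, the file contains only the two lines written to standard output, while the two lines written to standard error (from perror and from fprintf(stderr, ...)) still appear on the display.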
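If both streams should end up in the same file, descriptor 2 has to be redirected too. The following is a sketch only, assuming descriptors 0, 1 and 2 are open when it runs; merge_streams_into is an illustrative name, and it uses dup2, which the next section covers. It does not restore the original destinations.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Points both descriptor 1 and descriptor 2 at `path`, writes one line
 * through each stream, and returns the resulting file offset.
 * Returns -1 on error. stdout and stderr stay redirected afterwards. */
long merge_streams_into(const char *path) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) return -1;

    fflush(stdout);                      /* keep earlier output out of the file */
    if (dup2(fd, 1) < 0 || dup2(1, 2) < 0) {
        close(fd);
        return -1;
    }
    if (fd > 2) close(fd);               /* descriptors 1 and 2 keep the file open */

    printf("via stdout\n");              /* 11 bytes, buffered by stdio */
    fprintf(stderr, "via stderr\n");     /* 11 bytes, stderr is unbuffered */
    fflush(stdout);

    /* Both descriptors share one file offset, so it counts both lines. */
    return (long)lseek(1, 0, SEEK_CUR);
}
```

This is roughly what a shell arranges for a command run with > file 2>&1. Because stderr is unbuffered, its line may reach the file before the stdout line even though printf was called first.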

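5. Implementing redirection with dup2

Principle:

dup2 copies the entry for one file descriptor onto another: after the call, newfd refers to the same open file as oldfd. If newfd was already open, dup2 closes it first.

Function prototype: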

int dup2(int oldfd, int newfd);

Code example:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>

int main() {
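    // Open log.txt and get a file descriptor for it
    int fd = open("log.txt", O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    // Make descriptor 1 refer to log.txt. dup2 closes the old
    // descriptor 1 itself, so no separate close(1) is needed.
    if (dup2(fd, 1) < 0) {
        perror("dup2");
        return 1;
    }

    // Both calls now write to log.txt
    printf("hello printf\n");
    fprintf(stdout, "hello fprintf\n");

    // The original descriptor is no longer needed. Descriptor 1 stays
    // open, and stdout is flushed to it when the program exits.
    close(fd);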

    return 0;
}

Result: after the program runs, log.txt contains "hello printf" and "hello fprintf".
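A redirection made with dup2 can also be undone if the original destination is saved first with dup. Here is a hedged sketch of that pattern, assuming descriptor 1 is open on entry; the name print_to_file_then_restore is hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Writes `msg` into `path` through stdout, then points stdout back at
 * its previous destination. Returns 0 on success, -1 on error. */
int print_to_file_then_restore(const char *path, const char *msg) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) return -1;

    fflush(stdout);                  /* send pending output to the old target */
    int saved = dup(1);              /* remember the old target */
    if (saved < 0) { close(fd); return -1; }
    if (dup2(fd, 1) < 0) { close(saved); close(fd); return -1; }
    close(fd);

    fputs(msg, stdout);
    fflush(stdout);                  /* must happen before switching back */

    int rc = dup2(saved, 1) < 0 ? -1 : 0;   /* restore descriptor 1 */
    close(saved);
    return rc;
}
```

The second fflush matters: without it the message could still be in the stdio buffer when descriptor 1 is restored, and it would then go to the original destination instead of the file.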

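II. Inside the FILE Structure

1. The file descriptor in FILE

Principle:

The C library functions are wrappers around the system call interface, and underneath every file operation is carried out through a file descriptor. The FILE structure therefore wraps a file descriptor internally.

In the header /usr/include/stdio.h, FILE is an alias for struct _IO_FILE.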

typedef struct _IO_FILE FILE;

On glibc versions that ship the header /usr/include/libio.h, struct _IO_FILE is defined there, and its _fileno member stores the file descriptor.

struct _IO_FILE {
	int _flags;       /* High-order word is _IO_MAGIC; rest is flags. */
#define _IO_file_flags _flags

	// Buffer-related members
	/* The following pointers correspond to the C++ streambuf protocol. */
	/* Note:  Tk uses the _IO_read_ptr and _IO_read_end fields directly. */
	char* _IO_read_ptr;   /* Current read pointer */
	char* _IO_read_end;   /* End of get area. */
	char* _IO_read_base;  /* Start of putback+get area. */
	char* _IO_write_base; /* Start of put area. */
	char* _IO_write_ptr;  /* Current put pointer. */
	char* _IO_write_end;  /* End of put area. */
	char* _IO_buf_base;   /* Start of reserve area. */
	char* _IO_buf_end;    /* End of reserve area. */
	/* The following fields are used to support backing up and undo. */
	char *_IO_save_base; /* Pointer to start of non-current get area. */
	char *_IO_backup_base;  /* Pointer to first valid character of backup area */
	char *_IO_save_end; /* Pointer to end of non-current get area. */

	struct _IO_marker *_markers;

	struct _IO_FILE *_chain;

	int _fileno; // the wrapped file descriptor
#if 0
	int _blksize;
#else
	int _flags2;
#endif
	_IO_off_t _old_offset; /* This used to be _offset but it's too small.  */

#define __HAVE_COLUMN /* temporary */
	/* 1+column number of pbase(); 0 is unknown. */
	unsigned short _cur_column;
	signed char _vtable_offset;
	char _shortbuf[1];

	/*  char* _save_gptr;  char* _save_egptr; */

	_IO_lock_t *_lock;
#ifdef _IO_USE_OLD_IO_FILE
};

Example code:

#include <stdio.h>

int main() {
    FILE *fp = fopen("test.txt", "w");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }

    // Get the file descriptor stored in the FILE structure
    int fd = fileno(fp);

    // Operate on the file through the descriptor
    // ...

    // Close the file
    fclose(fp);

    return 0;
}

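Explanation: at the library level, fopen allocates a FILE structure for the caller and returns its address. Underneath, it opens the file with the open system call, obtains a file descriptor, and stores it in the _fileno member of that FILE structure.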

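2. The buffer in FILE

Principle:

The C file functions (printf, fputs and so on) use a buffer to improve efficiency. There are three buffering modes: unbuffered, line buffered and fully buffered.

Comparison of buffering modes:

Buffering mode	Flushed when	Typical use
Unbuffered	Written immediately	Standard error
Line buffered	A newline is written or the buffer fills	Terminal output
Fully buffered	The buffer fills or a flush is forced	File I/O

Code example: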

#include <stdio.h>
#include <unistd.h>

int main() {
    // Write with printf
    printf("hello printf\n");

    // Write with fputs
    fputs("hello fputs\n", stdout);

    // Write with the write system call
    write(1, "hello write\n", 12);

    // Create a child process
    fork();

    return 0;
}

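Result: run directly, the program prints each line once on the display. If its output is redirected to log.txt, the file contains the printf and fputs lines twice and the write line only once, with the write line first.

Explanation: printf and fputs go through the stdio buffer. When output is redirected to a file, stdout becomes fully buffered, so their data is still sitting in the user-space buffer when fork is called. The child gets its own copy of that buffer, and the parent and the child each flush it on exit, so the data appears twice. When output goes to the display, stdout is line buffered and each line has already been flushed at its newline before fork, so nothing is duplicated. write is a system call with no user-space buffer, so its data goes straight to the kernel and appears once.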
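The usual remedy is to flush before calling fork, so that the child inherits an empty buffer. The sketch below reproduces both cases on an ordinary file rather than on stdout, so it does not depend on how the program is launched; size_after_fork and the file name are illustrative.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/* Writes one 6-byte line to `path` through a fully buffered FILE, forks,
 * and has parent and child each close the stream.
 * Returns the final file size, or -1 on error. */
long size_after_fork(const char *path, int flush_before_fork) {
    FILE *fp = fopen(path, "w");
    if (fp == NULL) return -1;

    fputs("hello\n", fp);                /* stays in the user-space buffer */
    if (flush_before_fork) fflush(fp);   /* empty the buffer before it is copied */

    pid_t pid = fork();
    if (pid < 0) { fclose(fp); return -1; }
    if (pid == 0) {
        fclose(fp);                      /* child flushes its copy of the buffer */
        _exit(0);                        /* skip flushing any other streams */
    }
    waitpid(pid, NULL, 0);
    fclose(fp);                          /* parent flushes its own copy */

    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}
```

Without the flush the file ends up with 12 bytes, because the line is written once by each process. With the flush it holds 6 bytes.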

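3. Who provides the buffer, and where it lives

Principle:

The buffer is provided by the C standard library and is managed through the FILE structure, which contains several members that record the buffer's state.

In /usr/include/libio.h, struct _IO_FILE defines a number of buffer-related members, such as _IO_read_ptr and _IO_read_end.

// Buffer-related members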
/* The following pointers correspond to the C++ streambuf protocol. */
/* Note:  Tk uses the _IO_read_ptr and _IO_read_end fields directly. */
char* _IO_read_ptr;   /* Current read pointer */
char* _IO_read_end;   /* End of get area. */
char* _IO_read_base;  /* Start of putback+get area. */
char* _IO_write_base; /* Start of put area. */
char* _IO_write_ptr;  /* Current put pointer. */
char* _IO_write_end;  /* End of put area. */
char* _IO_buf_base;   /* Start of reserve area. */
char* _IO_buf_end;    /* End of reserve area. */
/* The following fields are used to support backing up and undo. */
char *_IO_save_base; /* Pointer to start of non-current get area. */
char *_IO_backup_base;  /* Pointer to first valid character of backup area */
char *_IO_save_end; /* Pointer to end of non-current get area. */

Example code:

#include <stdio.h>

int main() {
    FILE *fp = fopen("test.txt", "w");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }

    // Write data with fputs
    fputs("hello fputs", fp);

    // Flush the buffer
    fflush(fp);

    // Close the file
    fclose(fp);

    return 0;
}

Explanation: fputs places the data in the buffer that belongs to the FILE structure, and fflush flushes that buffer, passing the data on to the file.
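One way to observe the buffer is to ask the kernel for the file size before and after fflush. The following sketch assumes a regular file and requests full buffering explicitly with setvbuf; observe_flush and size_seen_by_kernel are names invented for this example.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Returns the file size the kernel currently reports for `fp`, or -1. */
static long size_seen_by_kernel(FILE *fp) {
    struct stat st;
    return fstat(fileno(fp), &st) == 0 ? (long)st.st_size : -1;
}

/* Writes 11 bytes with fputs and records the file size before and after
 * fflush. Returns 0 on success, -1 on error. */
int observe_flush(const char *path, long *before, long *after) {
    FILE *fp = fopen(path, "w");
    if (fp == NULL) return -1;

    /* Ask for full buffering; NULL lets the library allocate the buffer. */
    if (setvbuf(fp, NULL, _IOFBF, BUFSIZ) != 0) { fclose(fp); return -1; }

    fputs("hello fputs", fp);            /* held in the FILE's buffer */
    *before = size_seen_by_kernel(fp);   /* nothing has reached the kernel */

    fflush(fp);                          /* hand the bytes to the kernel */
    *after = size_seen_by_kernel(fp);

    fclose(fp);
    return 0;
}
```

Here before comes back as 0 and after as 11, which shows that the bytes lived in user space until the flush.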

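4. The operating system's buffer

Principle:

The operating system has buffers of its own. Once data has been flushed from the user-space buffer into the operating system's buffer, the operating system writes it to the disk or the display according to its own flushing policy.

Example code: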

#include <stdio.h>
#include <unistd.h>

int main() {
    // Write with printf
    printf("hello printf\n");

    // Flush the user-space buffer
    fflush(stdout);

    // Pause for 2 seconds
    sleep(2);

    return 0;
}

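Explanation: the data written by printf is first stored in the user-space buffer. After fflush flushes that buffer, the data is in the operating system's buffer, and the operating system then delivers it to the display according to its own flushing policy.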
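For a file on disk, a program that cannot wait for the operating system's own schedule can call the POSIX function fsync after fflush. A minimal sketch, assuming fp refers to a regular file; flush_to_disk is an illustrative name.

```c
#include <stdio.h>
#include <unistd.h>

/* Pushes pending data for `fp` through both layers.
 * Returns 0 on success, -1 on error. */
int flush_to_disk(FILE *fp) {
    if (fflush(fp) != 0) return -1;      /* user-space buffer -> kernel */
    return fsync(fileno(fp));            /* kernel buffer -> storage device */
}
```

fflush alone only moves data into the kernel. fsync asks the kernel to write the file's data out to the underlying storage, and it typically blocks until that is done, so it is usually reserved for data that must survive a crash.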