
The POSIX Thread Library

news Source: original 2025/6/15 21:37:45

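I. Introduction to the POSIX Thread Library

II. Creating Threads

2.1 Creating a Single Thread

2.2 Creating Multiple Threads

2.3 Obtaining and Using Thread IDs

2.4 How Threads Relate to Processes

2.5 Thread Scheduling and LWPs

III. Waiting for Threads

1. Why Wait for a Thread?

2. Thread-Waiting Examples

2.1 Basic Thread Waiting

2.2 Retrieving a Thread's Exit Code

3. The Impact of Abnormal Thread Termination

IV. Terminating Threads

1. Three Ways to Terminate a Thread

2. return: The Thread's Natural End

3. pthread_exit: The Thread Bows Out Voluntarily

4. pthread_cancel: Forced Termination

V. Detached Threads

1. What Is a Detached Thread?

2. Detaching a Thread Automatically

3. Advantages of Detached Threads

4. Caveats

VI. Thread IDs and the Process Address Space Layout

1. The Dual Identity of a Thread ID

1.1 The Thread ID Used by the OS Scheduler

1.2 The Thread ID Used by the NPTL Thread Library

2. Storing and Managing Thread IDs

2.1 The Thread Control Block (struct pthread)

2.2 Thread Stacks and Resources

3. Process Address Space Layout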

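I. Introduction to the POSIX Thread Library

The POSIX thread library (pthreads) is a standardized set of thread interfaces, and almost every system that supports the POSIX standard (Linux and macOS, for example) ships with built-in support for it. Its core functions all begin with pthread_. To use it, include the header <pthread.h> and add the -lpthread option when compiling (with GCC and Clang, -pthread is the commonly recommended form).

A thread is a unit of execution inside a process. The threads of one process share its resources (memory, file descriptors, and so on) but run independently. Because threads are lighter than processes to create and to switch between, they are often the more efficient choice for concurrent work.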
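To make the sharing concrete, here is a minimal sketch in which a worker thread writes to a global variable and the creating thread reads it back. The names shared_counter, bump_counter, and run_shared_demo are hypothetical and exist only for this illustration.

```c
#include <pthread.h>

/* Hypothetical global: one copy per process, visible to every thread in it. */
static int shared_counter = 0;

/* Worker: adds the int pointed to by arg to the shared global. */
static void *bump_counter(void *arg)
{
    shared_counter += *(int *)arg;
    return NULL;
}

/* Runs one worker to completion and returns the global as the caller sees it.
 * Returns -1 if the thread could not be created or joined. */
int run_shared_demo(int amount)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, bump_counter, &amount) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)   /* wait until the worker has finished */
        return -1;
    return shared_counter;
}
```

No lock is needed in this sketch only because the caller does not touch shared_counter until the worker has been joined. Threads that access the same data at the same time would need synchronization.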

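II. Creating Threads

The core function for creating a thread is pthread_create. Its prototype is:

int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void*), void *arg);

thread: output parameter that receives the ID of the newly created thread.

attr: thread attributes (stack size, priority, and so on); passing NULL selects the defaults.

start_routine: address of the function the thread runs once it starts.

arg: argument passed to the start function.

The function returns 0 on success and an error code on failure.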
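Because the error is delivered through the return value rather than through errno, a caller would typically pass that value to strerror. The sketch below illustrates this; create_checked and say_hello are hypothetical helper names, not part of the pthreads API.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: creates a thread with default attributes and
 * prints a readable message if pthread_create reports an error. */
int create_checked(pthread_t *tid, void *(*fn)(void *), void *arg)
{
    int err = pthread_create(tid, NULL, fn, arg);
    if (err != 0)
    {
        /* the error number is the return value itself */
        fprintf(stderr, "pthread_create failed: %s\n", strerror(err));
    }
    return err;
}

/* Trivial start routine used for illustration. */
void *say_hello(void *arg)
{
    printf("hello from %s\n", (const char *)arg);
    return arg;   /* hand the argument back as the thread's return value */
}
```

A caller might write create_checked(&tid, say_hello, "worker") and treat any non-zero result as a failure.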

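2.1 Creating a Single Thread

Let's start with the simplest case: the main thread creates one new thread.

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

// Function executed by the new thread
void* thread_function(void* arg) 
{
    char* msg = (char*)arg;
    while (1) 
    {
        printf("I am %s\n", msg); // print which thread this is
        sleep(1); // simulate doing some work
    }
    return NULL; // not reached: the loop never exits
}

int main() 
{
    pthread_t tid; // receives the thread ID
    pthread_create(&tid, NULL, thread_function, (void*)"thread 1"); // create the thread
    while (1) 
    {
        printf("I am main thread!\n"); // the main thread's own work
        sleep(2);
    }
    return 0;
}

When you run it, the output of the main thread and the new thread is interleaved: the new thread prints once per second and the main thread once every two seconds.

If we inspect the process with ps axj, only one process shows up even though it contains two threads, because both threads belong to the same process.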

2.2 Creating Multiple Threads

To create several threads, we only need to modify the code slightly:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

void* thread_function(void* arg) 
{
    char* msg = (char*)arg;
    while (1) 
    {
        printf("I am %s...pid: %d\n", msg, getpid()); // print the process ID
        sleep(1);
    }
    return NULL;
}

int main()
{
    pthread_t threads[5]; // holds the five thread IDs
    for (int i = 0; i < 5; i++) 
    {
        char* msg = (char*)malloc(64); // one buffer per thread, never shared
        snprintf(msg, 64, "thread %d", i); // build the thread's name
        pthread_create(&threads[i], NULL, thread_function, msg);
    }
    while (1) 
    {
        printf("I am main thread...pid: %d\n", getpid()); // same PID as the workers
        sleep(2);
    }
    return 0;
}

When you run it, the five new threads and the main thread run side by side, and all of them print the same PID. Although they are separate flows of execution, they still belong to a single process.

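2.3 Obtaining and Using Thread IDs

A thread ID is what distinguishes one thread from another. There are two ways to obtain it:

Through the output parameter at creation time: pthread_create stores the new thread's ID in *thread.
By calling pthread_self, which returns the ID of the calling thread.

pthread_t pthread_self(void);

Example:

#include <stdio.h>
#include <string.h>   // for strerror
#include <pthread.h>
#include <unistd.h>   // for sleep

// Thread function
void* thread_function(void* arg) 
{
    // Get the ID of the current thread
    pthread_t tid = pthread_self();
    // Cast pthread_t to unsigned long for printing (works on Linux, not portable)
    printf("Child thread ID: %lu\n", (unsigned long)tid);
    sleep(1);

    return NULL;
}

int main() 
{
    pthread_t thread_id;

    // Create a new thread; the error code is the return value, errno is not used
    int err = pthread_create(&thread_id, NULL, thread_function, NULL);
    if (err != 0) 
    {
        fprintf(stderr, "Failed to create thread: %s\n", strerror(err));
        return 1;
    }

    // The main thread prints its own thread ID
    printf("Main thread ID: %lu\n", (unsigned long)pthread_self());

    // Wait for the child; returning from main first could end the process before it prints
    pthread_join(thread_id, NULL);
    return 0;
}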
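Since pthread_t is an opaque type, thread IDs are meant to be compared with pthread_equal rather than with ==. The following sketch checks that a new thread's pthread_self value differs from its creator's. The struct id_check and both function names are hypothetical, chosen for this illustration.

```c
#include <pthread.h>

/* Hypothetical record shared between the creator and the new thread. */
struct id_check
{
    pthread_t creator;      /* ID of the thread that called pthread_create */
    int same_as_creator;    /* set by the new thread: 1 if the IDs compare equal */
};

/* Start routine: compares its own ID with the creator's. */
static void *compare_with_creator(void *arg)
{
    struct id_check *check = arg;
    check->same_as_creator = pthread_equal(pthread_self(), check->creator) != 0;
    return NULL;
}

/* Returns 1 if the new thread saw an ID different from its creator's,
 * 0 if the IDs compared equal, and -1 on error. */
int new_thread_has_own_id(void)
{
    struct id_check check;
    pthread_t tid;

    check.creator = pthread_self();
    check.same_as_creator = -1;

    if (pthread_create(&tid, NULL, compare_with_creator, &check) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return check.same_as_creator == 0;
}
```

The comparison is done while both threads are still alive, because a thread ID is only meaningful during the lifetime of the thread it names.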

2.4 How Threads Relate to Processes

A thread is a unit of execution inside a process, and the threads of a process share its resources. We can verify this with getpid and getppid:

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

void* thread_function(void* arg) 
{
    char* msg = (char*)arg;
    while (1) 
    {
        printf("I am %s...pid: %d, ppid: %d\n", msg, getpid(), getppid());
        sleep(1);
    }
    return NULL;
}

int main() 
{
    pthread_t tid;
    pthread_create(&tid, NULL, thread_function, (void*)"thread 1");
    while (1) 
    {
        printf("I am main thread...pid: %d, ppid: %d\n", getpid(), getppid());
        sleep(2);
    }
    return 0;
}

When you run it, the main thread and the new thread report exactly the same pid and ppid, which shows that they belong to the same process.

2.5 Thread Scheduling and LWPs

On Linux, threads are implemented on top of the kernel's lightweight processes (LWPs). Every thread has a corresponding LWP, and the LWP is the smallest unit the operating system schedules.

#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

#define NUM_THREADS 3

// Thread function: prints the thread's ID on each iteration
void *thread_task(void *arg)
{
    int thread_id = *(int *)arg;
    for (int i = 0; i < 3; i++)
    {
        printf("Thread %d (pthread_t: %lu) - iteration: %d\n",
               thread_id, (unsigned long)pthread_self(), i);
        sleep(1);
    }
    return NULL;
}

int main()
{
    pthread_t threads[NUM_THREADS];
    int thread_args[NUM_THREADS];

    // Create several threads, each with its own argument slot
    for (int i = 0; i < NUM_THREADS; i++)
    {
        thread_args[i] = i;
        pthread_create(&threads[i], NULL, thread_task, &thread_args[i]);
    }

    // The main thread prints its own thread ID as well
    printf("Main thread ID: %lu\n", (unsigned long)pthread_self());

    // Wait for all child threads to finish
    for (int i = 0; i < NUM_THREADS; i++)
    {
        pthread_join(threads[i], NULL);
    }

    return 0;
}

While the program is running, the ps -aL command lists the LWP of each thread.

Its output shows every thread's LWP ID together with the ID of the process (PID) it belongs to.
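A thread can also ask the kernel for its own LWP ID from inside the program. The sketch below is Linux-only and assumes the raw SYS_gettid system call is available through syscall; current_lwp and lwp_of_new_thread are hypothetical names used for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* assumption: glibc, needed for syscall() in strict modes */
#endif
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Linux-only: returns the kernel thread ID (LWP ID) of the calling thread. */
static pid_t current_lwp(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Start routine: stores its LWP ID where arg points. */
static void *store_lwp(void *arg)
{
    *(pid_t *)arg = current_lwp();
    return NULL;
}

/* Returns the LWP ID of a freshly created thread, or -1 on error. */
pid_t lwp_of_new_thread(void)
{
    pthread_t tid;
    pid_t lwp = -1;

    if (pthread_create(&tid, NULL, store_lwp, &lwp) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return lwp;
}
```

In the main thread the LWP ID equals the PID, while every additional thread gets an LWP ID of its own. That matches the PID and LWP columns that ps -aL prints.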

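III. Waiting for Threads

1. Why Wait for a Thread?

Like a process, a thread holds resources that must be reclaimed after it finishes. If a joinable thread ends and nobody waits for it, those resources are never released: they linger much like a zombie process and the program leaks memory. Think of waiters in a restaurant: once they finish their shift, the manager has to confirm that they have clocked off, otherwise the restaurant's operation suffers.

The core function for waiting on a thread is pthread_join. Its prototype is:

int pthread_join(pthread_t thread, void **retval);

thread: ID of the thread to wait for.

retval: receives the value the thread returned when it exited.

Return value: 0 on success, an error code on failure.

The thread that calls pthread_join is suspended until the target thread ends.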

2. Thread-Waiting Examples

2.1 Basic Thread Waiting

Let's begin with a simple example in which the main thread creates five new threads and waits for them to finish.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

// Function executed by each thread
void *thread_function(void *arg)
{
    char *msg = (char *)arg;
    int count = 0;
    while (count < 5)
    {
        printf("I am %s...tid: %lu\n", msg, (unsigned long)pthread_self());
        sleep(1);
        count++;
    }
    return NULL; // normal exit
}

int main()
{
    pthread_t threads[5]; // holds the five thread IDs
    for (int i = 0; i < 5; i++)
    {
        char *msg = (char *)malloc(64);
        snprintf(msg, 64, "thread %d", i);
        pthread_create(&threads[i], NULL, thread_function, msg);
    }

    // Wait for all threads to finish
    for (int i = 0; i < 5; i++)
    {
        pthread_join(threads[i], NULL);
        printf("Thread %d[%lu]...done\n", i, (unsigned long)threads[i]);
    }
    return 0;
}

When you run it, each thread prints its message five times, and the main thread then waits for them one after another.

2.2 Retrieving a Thread's Exit Code

A thread can return a value when it exits, just like a function. We can retrieve that value through pthread_join.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

// Function executed by each thread
void *thread_function(void *arg)
{
    char *msg = (char *)arg;
    int count = 0;
    while (count < 5)
    {
        printf("I am %s...tid: %lu\n", msg, (unsigned long)pthread_self());
        sleep(1);
        count++;
    }
    return (void *)2023; // exit code smuggled through the pointer value
}

int main()
{
    pthread_t threads[5];
    for (int i = 0; i < 5; i++)
    {
        char *msg = (char *)malloc(64);
        snprintf(msg, 64, "thread %d", i);
        pthread_create(&threads[i], NULL, thread_function, msg);
    }

    // Wait for each thread and collect its exit code
    for (int i = 0; i < 5; i++)
    {
        void *exit_code;
        pthread_join(threads[i], &exit_code);
        // Cast through long: a pointer does not fit in an int on 64-bit systems
        printf("Thread %d[%lu]...done, exit code: %ld\n", i, (unsigned long)threads[i], (long)exit_code);
    }
    return 0;
}

When you run it, the exit code of every thread is captured successfully.
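The same mechanism can carry a computed result instead of a fixed code. Below is a minimal sketch in which several threads each add up part of a range and the creator combines the partial sums it collects with pthread_join. WORKERS, struct range, sum_range, and parallel_sum are hypothetical names, and the sketch assumes each partial sum fits in an intptr_t.

```c
#include <pthread.h>
#include <stdint.h>

#define WORKERS 4   /* hypothetical worker count for this sketch */

/* Half-open range [begin, end) of integers for one worker to add up. */
struct range
{
    long begin;
    long end;
};

/* Each worker returns its partial sum packed into the void * return value. */
static void *sum_range(void *arg)
{
    const struct range *r = arg;
    long total = 0;
    for (long i = r->begin; i < r->end; i++)
        total += i;
    return (void *)(intptr_t)total;
}

/* Sums the integers in [0, n) using WORKERS threads; returns -1 on error. */
long parallel_sum(long n)
{
    pthread_t tids[WORKERS];
    struct range ranges[WORKERS];
    long chunk = n / WORKERS;
    long total = 0;
    int started = 0;
    int failed = 0;

    for (int i = 0; i < WORKERS; i++)
    {
        ranges[i].begin = i * chunk;
        ranges[i].end = (i == WORKERS - 1) ? n : (i + 1) * chunk; /* last worker takes the remainder */
        if (pthread_create(&tids[i], NULL, sum_range, &ranges[i]) != 0)
        {
            failed = 1;
            break;
        }
        started++;
    }

    /* Join every thread that was started, even if a later create failed. */
    for (int i = 0; i < started; i++)
    {
        void *ret = NULL;
        if (pthread_join(tids[i], &ret) != 0)
            failed = 1;
        else
            total += (long)(intptr_t)ret;
    }
    return failed ? -1 : total;
}
```

Each worker gets its own struct range element, so no two threads ever read or write the same argument.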

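3. The Impact of Abnormal Thread Termination

When a thread terminates abnormally (for example, it crashes), the whole process goes down with it. Threads share the resources of their process, and the fatal signal raised by one thread terminates the process as a whole.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

// Function executed by each thread
void *thread_function(void *arg)
{
    char *msg = (char *)arg;
    int count = 0;
    while (count < 5)
    {
        printf("I am %s...tid: %lu\n", msg, (unsigned long)pthread_self());
        sleep(1);
        count++;
        volatile int zero = 0; // volatile keeps the compiler from folding the division
        int a = 1 / zero; // deliberate fault: integer division by zero (SIGFPE on x86)
        (void)a;
    }
    return NULL;
}

int main()
{
    pthread_t threads[5];
    for (int i = 0; i < 5; i++)
    {
        char *msg = (char *)malloc(64);
        snprintf(msg, 64, "thread %d", i);
        pthread_create(&threads[i], NULL, thread_function, msg);
    }

    // Wait for the threads (execution may never get past this point)
    for (int i = 0; i < 5; i++)
    {
        pthread_join(threads[i], NULL);
    }
    return 0;
}

Run this code and you will find that as soon as one thread crashes, the entire program exits. Robustness therefore needs particular care in multithreaded programs.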

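IV. Terminating Threads

1. Three Ways to Terminate a Thread

The POSIX thread library offers three ways to terminate a thread:

Returning from the thread function: the thread ends naturally, just as a function does when it finishes.
Calling pthread_exit: the thread asks to end itself.
Calling pthread_cancel: another thread forces the target thread to terminate.

2. return: The Thread's Natural End

Core idea: the flow of execution ends when the thread function returns.
Characteristics:

A return from main terminates the whole process (it has the same effect as calling exit).
A return from a child thread's function terminates only that thread.

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

void* worker(void* arg) {
    int count = 0;
    while(count++ < 3) {
        printf("Child thread working...\n");
        sleep(1);
    }
    return NULL;  // normal exit
}

int main() {
    pthread_t tid;
    pthread_create(&tid, NULL, worker, NULL);
    
    // The main thread returns immediately, which terminates the process
    return 0;  // Bad example! The child thread has no time to run
}

Observed behavior: usually no output at all, because the process terminates as soon as the main thread returns.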
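The usual remedy is to wait for the worker before returning. Here is a minimal sketch, with the hypothetical names count_steps and run_worker_to_completion, in which the creator joins the worker and only then reports how much work was done.

```c
#include <pthread.h>

/* Worker: performs three "work steps", counting them in the int arg points to. */
static void *count_steps(void *arg)
{
    int *steps = arg;
    for (int i = 0; i < 3; i++)
        (*steps)++;
    return NULL;
}

/* Starts the worker and waits for it, so that the caller can return safely
 * afterwards. Returns the number of completed steps, or -1 on error. */
int run_worker_to_completion(void)
{
    pthread_t tid;
    int steps = 0;

    if (pthread_create(&tid, NULL, count_steps, &steps) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)   /* without this wait, main could return first */
        return -1;
    return steps;
}
```

Another option is for main to finish with pthread_exit(NULL) instead of return: that ends only the main thread and lets the remaining threads keep running until they are done.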

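3. pthread_exit: The Thread Bows Out Voluntarily

API: void pthread_exit(void *retval)
Properties:

It can carry an exit status.
The memory the exit status points to must be global or heap-allocated. It must not live on the exiting thread's stack, which is gone once the thread has ended.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

void* task(void* arg) 
{
    int* exit_code = malloc(sizeof(int));  // must outlive the thread, so use the heap
    if (exit_code == NULL)
        pthread_exit(NULL);  // allocation failed: exit without a status
    *exit_code = 2023;
    
    for(int i=0; i<3; i++)
    {
        printf("Processing task stage %d\n", i+1);
        sleep(1);
    }
    pthread_exit(exit_code);  // exit and hand over the status
}

int main() 
{
    pthread_t tid;
    void* ret_val = NULL;
    
    pthread_create(&tid, NULL, task, NULL);
    pthread_join(tid, &ret_val);
    
    if (ret_val != NULL)
        printf("Thread exit code: %d\n", *(int*)ret_val);
    free(ret_val);  // remember to release the heap memory
    return 0;
}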
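Heap allocation is not the only way to satisfy the lifetime rule. As a sketch under that same rule, the creator can own the storage and pass its address in through arg. The names struct job, square_job, and square_on_thread are hypothetical.

```c
#include <pthread.h>

/* Hypothetical job record owned by the creating thread. */
struct job
{
    int input;    /* set by the creator before the thread starts */
    int result;   /* filled in by the worker */
};

/* Worker: computes the square and exits with a pointer to the creator's record. */
static void *square_job(void *arg)
{
    struct job *job = arg;
    job->result = job->input * job->input;
    pthread_exit(job);   /* safe: the record is not on this thread's stack */
}

/* Returns input squared, computed on another thread, or -1 on error. */
int square_on_thread(int input)
{
    struct job job;
    pthread_t tid;
    void *ret = NULL;

    job.input = input;
    job.result = 0;

    if (pthread_create(&tid, NULL, square_job, &job) != 0)
        return -1;
    if (pthread_join(tid, &ret) != 0)
        return -1;
    return ((struct job *)ret)->result;
}
```

This works because the creator stays inside square_on_thread, blocked in pthread_join, for as long as the worker may touch the record. Nothing has to be freed afterwards.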

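4. pthread_cancel: Forced Termination

API: int pthread_cancel(pthread_t thread)
Important properties:

Cancellation can leak resources if the target thread holds memory or locks when it is cancelled. By default the request is deferred and takes effect at the target's next cancellation point, such as sleep.
The exit status of a cancelled thread is PTHREAD_CANCELED, which is (void *)-1 on Linux.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

// Function executed by each thread
void *thread_function(void *arg)
{
    char *msg = (char *)arg;
    int count = 0;
    while (count < 5)
    {
        printf("I am %s...tid: %lu\n", msg, (unsigned long)pthread_self());
        sleep(1);
        count++;
    }
    pthread_exit((void *)2025);
}

int main()
{
    pthread_t threads[5];
    for (int i = 0; i < 5; i++)
    {
        char *msg = (char *)malloc(64);
        snprintf(msg, 64, "thread %d", i);
        pthread_create(&threads[i], NULL, thread_function, msg);
    }

    // Cancel the first four threads
    for (int i = 0; i < 4; i++)
    {
        pthread_cancel(threads[i]);
        printf("Canceled thread %d[%lu]\n", i, (unsigned long)threads[i]);
    }

    // Wait for all threads
    for (int i = 0; i < 5; i++)
    {
        void *ret;
        pthread_join(threads[i], &ret);
        // Cast through long: a pointer does not fit in an int on 64-bit systems
        printf("Thread %d[%lu]...done, exit code: %ld\n", i, (unsigned long)threads[i], (long)ret);
    }
    return 0;
}

When you run it, the first four threads are forcibly terminated with exit code -1, while the fifth thread finishes normally with exit code 2025.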
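One way to limit the leak risk is to register cleanup handlers with pthread_cleanup_push, which run when the thread is cancelled. The sketch below illustrates that idea under the default deferred cancellation. The names release_buffer, mark_cleaned, cancellable_worker, and cancel_with_cleanup are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Cleanup handler: frees the worker's buffer. */
static void release_buffer(void *buffer)
{
    free(buffer);
}

/* Cleanup handler: records that cleanup ran (flag points to an int). */
static void mark_cleaned(void *flag)
{
    *(int *)flag = 1;
}

/* Worker: holds a heap buffer and sleeps until it is cancelled. */
static void *cancellable_worker(void *arg)
{
    char *buffer = malloc(64);

    pthread_cleanup_push(mark_cleaned, arg);
    pthread_cleanup_push(release_buffer, buffer);
    for (;;)
        sleep(1);               /* sleep is a cancellation point */
    pthread_cleanup_pop(1);     /* not reached; each push needs a matching pop */
    pthread_cleanup_pop(1);
    return NULL;
}

/* Cancels a worker and checks the outcome. Returns 1 if the thread ended with
 * PTHREAD_CANCELED and its cleanup handlers ran, 0 if not, and -1 on error. */
int cancel_with_cleanup(void)
{
    pthread_t tid;
    void *ret = NULL;
    int cleaned = 0;

    if (pthread_create(&tid, NULL, cancellable_worker, &cleaned) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &ret) != 0)
        return -1;
    return ret == PTHREAD_CANCELED && cleaned == 1;
}
```

The handlers run in the reverse of the order in which they were pushed, so the buffer is freed before the flag is set.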

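V. Detached Threads

1. What Is a Detached Thread?

Detached is a special thread state that tells the system to release all of the thread's resources automatically when it exits, with no need for anyone to call pthread_join. It is like a courier who simply leaves after finishing a delivery, without the company having to confirm that the job is done.

The core function for detaching a thread is pthread_detach. Its prototype is:

int pthread_detach(pthread_t thread);

thread: ID of the thread to detach.
Return value: 0 on success, an error code on failure.

Once a thread is detached, its resources are released automatically when it exits. A thread is either joinable or detached, never both.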

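2. Detaching a Thread Automatically

We can put a thread into the detached state with pthread_detach. For example:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

// Function executed by the thread
void *thread_function(void *arg)
{
    printf("%s\n", (char *)arg);    // print the message passed in
    pthread_detach(pthread_self()); // mark this thread as detached
    return NULL;
}

int main()
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, thread_function, "Hello from detached thread!") != 0)
    {
        printf("Failed to create thread\n");
        return 1;
    }

    // Wait a moment so the thread has time to finish
    sleep(1);

    // Try to join the thread. POSIX leaves joining a detached thread undefined;
    // on Linux the call typically fails (commonly with EINVAL).
    if (pthread_join(tid, NULL) == 0)
    {
        printf("Thread joined successfully\n");
    }
    else
    {
        printf("Failed to join thread (expected behavior for detached thread)\n");
    }

    return 0;
}

When you run it, the thread prints its message, and pthread_join is expected to fail because the thread has already been detached.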
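A thread can also be created in the detached state from the start by setting the detach-state attribute, which removes the need for the thread to detach itself. The sketch below shows one way to do that. The names background_task and create_detached are hypothetical.

```c
#include <pthread.h>

/* Trivial background task used for illustration. */
void *background_task(void *arg)
{
    (void)arg;
    return NULL;
}

/* Creates a thread that is detached from the moment it exists.
 * Returns 0 on success or an error number on failure. */
int create_detached(void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (err == 0)
        err = pthread_create(&tid, &attr, fn, arg);

    pthread_attr_destroy(&attr);   /* the attribute object is no longer needed */
    return err;
}
```

Because the thread ID is never joined, the helper does not even hand it back to the caller. Note that a detached thread is still ended when the process exits, so the program has to stay alive for as long as the background work matters.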

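3. Advantages of Detached Threads

Detached threads are a good fit in the following situations:

The return value does not matter: for a background logging thread, say, we only need it to run and do not care about its result.
Simpler code: there is no need to wait for the thread manually, which reduces complexity.

4. Caveats

A detached thread cannot be waited for: once a thread is detached, pthread_join on it is no longer valid and should be expected to fail.
Timing matters: in the example above the main thread sleeps so that the child has detached itself before pthread_join is attempted. Calling pthread_detach from the creating thread right after pthread_create avoids relying on such a delay.
Resources are released automatically: a detached thread's resources are freed when it exits, with no manual intervention.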

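VI. Thread IDs and the Process Address Space Layout

1. The Dual Identity of a Thread ID

1.1 The Thread ID Used by the OS Scheduler

A thread is the smallest unit the operating system scheduler works with. The kernel assigns every thread a unique identifier, called the lightweight process ID (LWP ID), and uses it to schedule the thread.

1.2 The Thread ID Used by the NPTL Thread Library

In Linux's NPTL (Native POSIX Thread Library) implementation, the thread ID produced by pthread_create is an identifier used inside the thread library. In essence it is the address of the thread's control block (struct pthread).
pthread_t pthread_self(void);
A thread ID of type pthread_t is therefore an address within the process's address space that points to the thread's control block. The thread library manages the thread's lifetime through this address.

2. Storing and Managing Thread IDs

2.1 The Thread Control Block (struct pthread)

Every thread has a corresponding control block (struct pthread) that records information such as the thread's stack and its thread-local storage. The pthread_t thread ID is in fact the address of this control block.

2.2 Thread Stacks and Resources

A thread's stack and thread-local storage are the core resources it needs while running. They are allocated inside the process's address space and managed by the thread library.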
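The size of a thread's stack can be inspected and chosen through the attribute object passed to pthread_create. The following is a minimal sketch of both operations. The names default_stack_size, use_some_stack, and run_with_stack_size are hypothetical, and the default size reported depends on the implementation and system limits.

```c
#include <pthread.h>
#include <stddef.h>

/* Returns the stack size that a default-initialized attribute object reports,
 * or 0 if the attribute calls fail. */
size_t default_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

/* Start routine: touches a little of its own stack and returns. */
static void *use_some_stack(void *arg)
{
    volatile char scratch[4096];
    scratch[0] = 1;
    (void)arg;
    return NULL;
}

/* Runs one thread with a caller-chosen stack size and waits for it.
 * Returns 0 on success or an error number on failure. */
int run_with_stack_size(size_t stack_size)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    err = pthread_attr_setstacksize(&attr, stack_size);  /* fails below PTHREAD_STACK_MIN */
    if (err == 0)
        err = pthread_create(&tid, &attr, use_some_stack, NULL);
    pthread_attr_destroy(&attr);

    if (err == 0)
        err = pthread_join(tid, NULL);
    return err;
}
```

For instance, run_with_stack_size(1024 * 1024) asks for a 1 MiB stack, while a request smaller than PTHREAD_STACK_MIN is rejected before any thread is created.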

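3. Process Address Space Layout

A process's address space is made up of several regions, including kernel space, the main thread's stack, shared libraries, and the stacks of the other threads. On a typical Linux system the regions are, roughly from high addresses to low:

Kernel space: the operating system kernel's code and data.

Main thread stack: the stack of the main thread.

Shared libraries: the dynamically linked libraries that have been loaded.

Thread control blocks: the control block of each thread.

Thread stacks: the stack of each thread.

Heap: the region used for dynamic memory allocation.

Uninitialized data segment: global variables without an initializer.

Initialized data segment: global variables with an initializer.

Code segment: the program's executable code.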