Postgres数据库源码编译及部署
文章目录
- 一、源码编译及安装
- 1. 指定安装路径
- 2. 编译安装
- 3. 安装后 PostgreSQL 的目录结构
- 4. 设置环境变量
- 二、PG数据库初始化
- 1. 初始化数据库目录
- 2. 启动数据库
- 3. 连接数据库
- 4. 数据库启动失败,端口被占用
- 三、GIST索引使用
- 1. 下载数据文件
- 2. 将数据集加载到 Postgres 中
- 2.1 连接数据库
- 2.2 定义一个枚举类型
- 2.3 创建一张数据表
- 2.4 加载到数据表
- 源码编译安装PROJ库
- 2.5 为数据设置经纬度坐标
- 2.5.1 安装PROJ库
- 2.5.2 安装GDAL
- 2.5.3 安装postgis扩展
- 3. 创建gist空间索引
- 4. 创建Btree-gist索引
一、源码编译及安装
git clone https://github.com/postgres/postgres.git
1. 指定安装路径
--prefix=/opt/pgsql/postgresql
指定安装路径,最终make install
的时候,程序会被安装到这个目录下,而不是默认的/usr/lcoal
./configure --prefix=/opt/pgsql/postgresql
2. 编译安装
make
make install
3. 安装后 PostgreSQL 的目录结构
/opt/pgsql/postgresql/bin/ # 可执行文件
/opt/pgsql/postgresql/lib/ # 库文件
/opt/pgsql/postgresql/share/ # 共享资源,如配置模板、文档
4. 设置环境变量
vi ~/.bashrc
export PATH=/opt/pgsql/postgresql/bin:$PATH
source ~/.bashrc
二、PG数据库初始化
1. 初始化数据库目录
该操作只有在部署完后的首次需要执行,指定数据库文件存放地址-D $HOME/pgdata
,指定超级用户名-U postgres
/opt/pgsql/postgresql/bin/initdb -D /opt/postgres-custom/mydb -U postgres -E UTF8 --locale=en_US.UTF-8
2. 启动数据库
/opt/pgsql/postgresql/bin/pg_ctl -D /opt/postgres-custom/mydb -l logfile start
其中:-D
指定数据库目录,-l logfile
指定日志文件
3. 连接数据库
/usr/local/pgsql/postgresql/bin/psql -U postgres
输出
(base) hhhh@ubuntu-Super-Server:~$ psql -U postgres
psql (13.18)
Type "help" for help.
postgres=#
4. 数据库启动失败,端口被占用
检查端口是否被占用,执行sudo ss -tuln | grep 5432
检查是否被占用
(base) hhhhh@ubuntu-Super-Server:~$ sudo ss -tuln | grep 5432
tcp LISTEN 0 200 127.0.0.1:5432 0.0.0.0:*
或者通过ps aux | grep postgres
检查
(base) hhhh@ubuntu-Super-Server:~$ ps aux | grep postgres
hhhhh 1273591 0.0 0.0 214064 22400 ? Ss 2月12 5:40 /opt/pgsql/postgresql/bin/postgres -D /opt/postgres-custom/mydb
hhhhh 1273592 0.0 0.0 214440 24972 ? Ss 2月12 0:20 postgres: checkpointer
上述内容表明:数据目录是 /opt/postgres-custom/mydb,而且正是它占用了 5432 端口,通过下面的命令关闭当前进程
/opt/pgsql/postgresql/bin/pg_ctl -D /opt/postgres-custom/mydb stop
三、GIST索引使用
GIST作为一个通用索引,支持实现各种树索引如Btree、Rtree等,在这里我们以GIST中的btree实现作为例子
1. 下载数据文件
cd $HOME/gist/test_dataset
wget https://data.police.uk/data/archive/2017-04.zip
查看一共多少条数据xsv cat rows **/*-street.csv | xsv count
,并查看都有哪些列
(base) hhhhh@ubuntu-Super-Server:~/gist/test_dataset$ xsv cat rows **/*-street.csv | xsv count
38595130
(base) hhhhh@ubuntu-Super-Server:~/gist/test_dataset$ xsv headers 2010-12/2010-12-avon-and-somerset-street.csv
1 Crime ID
2 Month
3 Reported by
4 Falls within
5 Longitude
6 Latitude
7 Location
8 LSOA code
9 LSOA name
10 Crime type
11 Last outcome category
12 Context
(base) hanbaofu@ubuntu-Super-Server:~/gist/test_dataset$
统计多个 *-street.csv
文件中,不同类型犯罪(Crime type)出现的频率,并以表格形式展示。
(base) hanbaofu@ubuntu-Super-Server:~/gist/test_dataset$ xsv cat rows **/*-street.csv \
> | pv -l -s 38595130 \
> | xsv frequency --select 'Crime type' --limit 0 \
> | xsv table
38.6M 0:00:38 [1.01M/s] [===============================================================================================>] 100%
field value count
Crime type Anti-social behaviour 13869607
Crime type Violence and sexual offences 4047849
Crime type Other theft 3258648
Crime type Criminal damage and arson 3129836
Crime type Burglary 2866490
Crime type Vehicle crime 2477833
Crime type Other crime 2147319
Crime type Shoplifting 1890071
Crime type Violent crime 1673219
Crime type Drugs 987402
Crime type Public order 792808
Crime type Robbery 391709
Crime type Bicycle theft 369895
Crime type Theft from the person 346421
Crime type Public disorder and weapons 242145
Crime type Possession of weapons 103878
(base) hanbaofu@ubuntu-Super-Server:~/gist/test_dataset$
2. 将数据集加载到 Postgres 中
2.1 连接数据库
(base) hanbaofu@ubuntu-Super-Server:~/gist$ /usr/local/pgsql/postgresql/bin/psql -U postgres
psql (18devel)
Type "help" for help.
postgres=#
执行create database crimes_db;
生成数据库,并切换到当前数据库\c crimes_db
postgres=# create database crimes_db;
CREATE DATABASE
postgres=# \c crimes_db
You are now connected to database "crimes_db" as user "postgres".
crimes_db=#
2.2 定义一个枚举类型
CREATE TYPE crime_type AS ENUM (
'Anti-social behaviour',
'Violence and sexual offences',
'Other theft',
'Criminal damage and arson',
'Burglary',
'Vehicle crime',
'Other crime',
'Shoplifting',
'Violent crime',
'Drugs',
'Public order',
'Robbery',
'Bicycle theft',
'Theft from the person',
'Public disorder and weapons',
'Possession of weapons'
);
2.3 创建一张数据表
CREATE TABLE crimes (
month DATE,
longitude REAL,
latitude REAL,
crime crime_type
);
2.4 加载到数据表
执行下面的命令,把多个 CSV 文件中的数据导入到 PostgreSQL 中的 crimes 表里
(base) hhhhh@ubuntu-Super-Server:~/gist/test_dataset$ xsv cat rows **/*-street.csv \
> | pv -l -s 38595130 \
> | xsv select Month,Longitude,Latitude,'Crime type' \
> | sed -E 's/^([0-9]+-[0-9]+)/\1-01/g' \
> | psql -U postgres -d crimes_db -c 'copy crimes from stdin csv header'
38.6M 0:01:25 [ 452k/s] [===============================================================================================>] 100%
COPY 38595130
(base) hhhhh@ubuntu-Super-Server:~/gist/test_dataset$
通过快速 交叉表快速检查数据是否已加载完毕,在此之前,先安装如下扩展
crimes_db=# create extension tablefunc;
ERROR: extension "tablefunc" is not available
DETAIL: Could not open extension control file "/usr/local/pgsql/postgresql/share/extension/tablefunc.control": No such file or directory.
HINT: The extension must first be installed on the system where PostgreSQL is running.
我们发现安装扩展失败,这是因为在从进行源码编译 PostgreSQL时,默认只编译了核心部分,而 PostgreSQL 自带的很多扩展(比如 tablefunc、uuid-ossp 等)其实都在一
postgresql/contrib
中
执行下面的命令安装 contrib 扩展,这会把所有 .control、.sql 文件安装到 PostgreSQL 的安装目录下
cd ~/gist/postgres/contrib
make
sudo make install
接下来重新回到 psql,执行刚才的创建扩展命令即可。
crimes_db=# CREATE EXTENSION tablefunc;
CREATE EXTENSION
crimes_db=#
通过快速交叉表看看多年来数据的分布情况:
select * from crosstab(
$$ select crime, to_char(month, 'YYYY') as year, count(*) from crimes group by crime, year order by crime asc, year asc $$,
$$ select gs from generate_series(2010, 2017) gs $$
) as ("Crime" text, "2010" int, "2011" int, "2012" int, "2013" int, "2014" int, "2015" int, "2016" int, "2017" int);
得到如下输出:注意在psql中,只要你不输入分号,就会默认这个命令没结束
crimes_db=# select * from crosstab(
crimes_db(# $$ select crime, to_char(month, 'YYYY') as year, count(*) from crimes group by crime, year order by crime asc, year asc $$,
crimes_db(# $$ select gs from generate_series(2010, 2017) gs $$
crimes_db(# ) as ("Crime" text, "2010" int, "2011" int, "2012" int, "2013" int, "2014" int, "2015" int, "2016" int, "2017" int);
Crime | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017
------------------------------+--------+---------+---------+---------+---------+---------+---------+--------
Anti-social behaviour | 201016 | 2792756 | 2369431 | 2191485 | 2035155 | 1875251 | 1850397 | 554116
Violence and sexual offences | | | | 477713 | 839521 | 1049988 | 1232210 | 448417
Other theft | | 251482 | 737010 | 582768 | 515839 | 505800 | 496783 | 168966
Criminal damage and arson | | 207293 | 563360 | 526600 | 516930 | 549252 | 570616 | 195785
Burglary | 37893 | 505841 | 476006 | 455384 | 427448 | 410673 | 409102 | 144143
Vehicle crime | 29416 | 411012 | 392300 | 378911 | 356789 | 368599 | 394087 | 146719
Other crime | 142705 | 1530533 | 182205 | 71454 | 53501 | 63296 | 74160 | 29465
Shoplifting | | 104179 | 303366 | 321207 | 331093 | 339237 | 361982 | 129007
Violent crime | 57580 | 729387 | 676732 | 209520 | | | |
Drugs | | 71007 | 208111 | 198252 | 177470 | 151401 | 137739 | 43422
Public order | | | | 91823 | 152113 | 194164 | 254129 | 100579
Robbery | 5731 | 75068 | 68042 | 60648 | 52416 | 51651 | 56042 | 22111
Bicycle theft | | | | 73342 | 95021 | 88193 | 86670 | 26669
Theft from the person | | | | 68442 | 81964 | 83056 | 83483 | 29476
Public disorder and weapons | | 51555 | 147405 | 43185 | | | |
Possession of weapons | | | | 14133 | 21415 | 25168 | 31218 | 11944
(16 rows)
crimes_db=#
源码编译安装PROJ库
# 安装编译所需依赖
sudo apt-get install build-essential cmake sqlite3 libsqlite3-dev
# 下载和编译PROJ
cd ~
wget https://download.osgeo.org/proj/proj-9.3.0.tar.gz
tar -xvzf proj-9.3.0.tar.gz
cd proj-9.3.0
mkdir build && cd build
cmake ..
make -j$(nproc)
sudo make install
sudo ldconfig
在执行cmake …的时候出现如下错误,这是因为在编译PROJ库时缺少TIFF库
-- Found Sqlite3: /usr/lib/x86_64-linux-gnu/libsqlite3.so
-- Sqlite3 version: 3.45.1
CMake Error at /usr/share/cmake-3.28/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
Could NOT find TIFF (missing: TIFF_LIBRARY TIFF_INCLUDE_DIR)
Call Stack (most recent call first):
/usr/share/cmake-3.28/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
/usr/share/cmake-3.28/Modules/FindTIFF.cmake:272 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
CMakeLists.txt:198 (find_package)
安装TIFF库的开发包:
sudo apt-get install libtiff-dev
如果遇到如下错误,则通过conda install -c conda-forge libtiff
安装即可
-- Configuring incomplete, errors occurred!
(base) hanbaofu@ubuntu-Super-Server:~/proj-9.3.0/build$ sudo apt-get install libtiff-dev
正在读取软件包列表... 完成
正在分析软件包的依赖关系树... 完成
正在读取状态信息... 完成
有一些软件包无法被安装。如果您用的是 unstable 发行版,这也许是
因为系统无法达到您要求的状态造成的。该版本中可能会有一些您需要的软件
包尚未被创建或是它们已被从新到(Incoming)目录移出。
下列信息可能会对解决问题有所帮助:
下列软件包有未满足的依赖关系:
libdeflate-dev : 依赖: libdeflate0 (= 1.19-1build1) 但是 1.19-1build1.1 正要被安装
libzstd-dev : 依赖: libzstd1 (= 1.5.5+dfsg2-2build1) 但是 1.5.5+dfsg2-2build1.1 正要被安装
E: 无法修正错误,因为您要求某些软件包保持现状,就是它们破坏了软件包间的依赖关系。
(base) hanbaofu@ubuntu-Super-Server:~/proj-9.3.0/build$ conda install -c conda-forge libtiff
Retrieving notices: done
2.5 为数据设置经纬度坐标
在正式开始之前,需要先安装相应的依赖
2.5.1 安装PROJ库
cd ~/proj-9.3.0
rm -rf build
mkdir build && cd build
cmake .. -DCMAKE_PREFIX_PATH=$CONDA_PREFIX
make -j$(nproc)
sudo make install
sudo ldconfig
2.5.2 安装GDAL
下载源码
wget https://github.com/OSGeo/gdal/releases/download/v3.7.3/gdal-3.7.3.tar.gz
tar -xzf gdal-3.7.3.tar.gz
cd gdal-3.7.3
mkdir build && cd build
cmake ..
make -j$(nproc)
sudo make install
sudo ldconfig
安装完之后检查是否正确安装
# 检查gdal-config是否存在
which gdal-config
# 验证GDAL库文件是否存在
ls -la /usr/local/lib/libgdal*
# 检查GDAL版本
gdal-config --version
设置环境变量
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
export CPLUS_INCLUDE_PATH=/usr/local/include:$CPLUS_INCLUDE_PATH
export C_INCLUDE_PATH=/usr/local/include:$C_INCLUDE_PATH
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH
2.5.3 安装postgis扩展
在配置PostGIS时,确保 PostGIS 与相应的PG13 环境绑定,如果是其他的PG版本,则修改下面的内容即可
cd ~/gist/postgis-3.4.0
./configure --with-pgconfig=/opt/pgsql/postgresql/bin/pg_config
显式指定GDAL的所有相关路径:
./configure \
--with-gdal=/usr/local/bin/gdal-config \
--with-gdalconfig=/usr/local/bin/gdal-config \
LDFLAGS="-L/usr/local/lib" \
CPPFLAGS="-I/usr/local/include"
make
sudo make install
输出
configure: WARNING: unrecognized options: --with-gdal
PostGIS is now configured for x86_64-pc-linux-gnu
-------------- Compiler Info -------------
C compiler: gcc -std=gnu99 -g -O2 -fno-math-errno -fno-signed-zeros -Wall
C++ compiler (Wagyu): gcc -std=c++11 -x c++
C++ compiler (FlatGeobuf): gcc -std=c++11 -x c++
CPPFLAGS: -I$HOME/miniconda3/include -I$HOME/miniconda3/include/libxml2 -I/home/hanbaofu/miniconda3/include -DNDEBUG -I/usr/local/include
LDFLAGS: -L/usr/local/lib -lm
SQL preprocessor: /usr/bin/cpp -traditional-cpp -w -P -Upixel -Ubool
Archiver: gcc-ar rs
-------------- Additional Info -------------
Interrupt Tests: ENABLED
-------------- Dependencies --------------
GEOS config: $HOME/miniconda3/bin/geos-config
GEOS version: 3.13.1
GDAL config: /usr/local/bin/gdal-config
GDAL version: 3.7.3
PostgreSQL config: /usr/local/pgsql/postgresql/bin/pg_config
PostgreSQL version: PostgreSQL 18devel
PROJ4 version: 93
Libxml2 config: $HOME/miniconda3/bin/xml2-config
Libxml2 version: 2.13.7
JSON-C support: no
protobuf support: yes
protobuf-c version: 1004001
PCRE support: not found
Perl: /usr/bin/perl
--------------- Extensions ---------------
PostgreSQL EXTENSION support: enabled
PostGIS Raster: enabled
PostGIS Topology: enabled
SFCGAL support: disabled
Address Standardizer support: disabled
-------- Documentation Generation --------
xsltproc:
xsl style sheets:
dblatex:
convert:
mathml2.dtd: http://www.w3.org/Math/DTD/mathml2/mathml2.dtd
然后在psql
中执行下面的命令
create extension postgis;