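[Apple CMS (maccms) 2] Collecting anime from a resource site
After the site was set up it still had no data. The collection-site tutorials show how to import data, but they do not allow any customization, so this post records how to quickly collect one specific type of resource, for example anime videos.
Liangzi resource site (量子资源)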
https://lzizy.net/
Following the tutorial, find the API endpoint:
https://cj.lziapi.com/api.php/provide/vod/?ac=list
The response looks like this (truncated to one entry per array):
{
    "code": 1,
    "msg": "数据列表",
    "page": 1,
    "pagecount": 5097,
    "limit": "20",
    "total": 101932,
    "list": [
        {
            "vod_id": 103400,
            "vod_name": "一息尚存",
            "type_id": 11,
            "type_name": "剧情片",
            "vod_en": "zuihoudehuxi",
            "vod_time": "2025-04-06 04:08:49",
            "vod_remarks": "HD",
            "vod_play_from": "liangzi,lzm3u8"
        }
    ],
    "class": [
        {
            "type_id": 30,
            "type_pid": 4,
            "type_name": "日韩动漫"
        }
    ]
}
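Suppose only the 日韩动漫 (Japanese and Korean anime) data is wanted. The results then need to be filtered by type_id, which is 30 for this category in the response above.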
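As a small sketch of that lookup, the category name can be resolved to its type_id from the `class` array of the response. The helper name `find_type_id` is made up for illustration and is not part of maccms; the sample dictionary only repeats the one category shown above.

```python
def find_type_id(response, type_name):
    """Return the type_id whose type_name matches, or None if it is absent."""
    for entry in response.get("class", []):
        if entry.get("type_name") == type_name:
            return entry["type_id"]
    return None


# Trimmed copy of the "class" array from the sample response.
sample = {"class": [{"type_id": 30, "type_pid": 4, "type_name": "日韩动漫"}]}
print(find_type_id(sample, "日韩动漫"))  # 30
```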
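The maccms external API: Provide
The endpoint turns out to be the one maccms provides natively, so the available parameters can be worked out from the source code: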
application\api\controller\Provide.php
if (!empty($this->_param['t'])) {
    if (empty($GLOBALS['config']['api']['vod']['typefilter']) || strpos($GLOBALS['config']['api']['vod']['typefilter'], $this->_param['t']) !== false) {
        $where['type_id'] = $this->_param['t'];
    }
}
if (empty($this->_param['pg'])) {
    $this->_param['pg'] = 1;
}
$pagesize = $GLOBALS['config']['api']['vod']['pagesize'];
if (!empty($this->_param['pagesize']) && $this->_param['pagesize'] > 0) {
    $pagesize = min((int)$this->_param['pagesize'], 100);
}
$res = model('vod')->listData($where, $order, $this->_param['pg'], $pagesize, 0, $field, 0);
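So t maps to type_id and pg to the page number (defaulting to 1). An optional pagesize is also accepted and is capped at 100. Note that t is only applied when the server's typefilter setting is empty or contains that value.
Crawling
Build the request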
https://cj.lziapi.com/api.php/provide/vod/?ac=detail&t=30&pg=1
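The same URL can be assembled programmatically from the parameters read out of Provide.php. This is a minimal sketch: the function name `build_url` is my own, and the client-side cap on pagesize simply mirrors the `min(..., 100)` in the PHP source rather than anything the API requires.

```python
from urllib.parse import urlencode

BASE_URL = "https://cj.lziapi.com/api.php/provide/vod/"


def build_url(type_id, page=1, action="detail", pagesize=None):
    """Build a Provide request URL for one page of one category."""
    params = {"ac": action, "t": type_id, "pg": page}
    if pagesize is not None:
        # The server caps pagesize at 100 anyway; clamp here for clarity.
        params["pagesize"] = min(int(pagesize), 100)
    return BASE_URL + "?" + urlencode(params)


print(build_url(30, 1))
# https://cj.lziapi.com/api.php/provide/vod/?ac=detail&t=30&pg=1
```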
Response format (list contents omitted)
{
    "code": 1,
    "msg": "数据列表",
    "page": 1,
    "pagecount": 189,
    "limit": "20",
    "total": 3761,
    "list": []
}
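A quick consistency check on these numbers: pagecount should be total divided by limit, rounded up. The helper below is only an illustration (the name `expected_pagecount` is made up); note that the API returns limit as a string, so it is converted first.

```python
def expected_pagecount(total, limit):
    """Number of pages needed for `total` items at `limit` items per page."""
    total, limit = int(total), int(limit)  # "limit" arrives as a string
    return -(-total // limit)  # ceiling division without floats


print(expected_pagecount(3761, "20"))  # 189
```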
Iterating pg from 1 to pagecount retrieves all of the data.
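That loop could look roughly like the sketch below, using only the standard library. The names `fetch_page` and `crawl_all` and the `delay` parameter are assumptions of mine; the real site may also expect extra request headers or rate limiting that are not handled here.

```python
import json
import time
from urllib.request import urlopen

URL_TEMPLATE = "https://cj.lziapi.com/api.php/provide/vod/?ac=detail&t={t}&pg={pg}"


def fetch_page(type_id, page):
    """Download and decode one page of the Provide API."""
    with urlopen(URL_TEMPLATE.format(t=type_id, pg=page), timeout=30) as resp:
        return json.load(resp)


def crawl_all(type_id, fetch=fetch_page, delay=0.0):
    """Collect the `list` entries of every page from 1 to pagecount."""
    first = fetch(type_id, 1)
    items = list(first.get("list", []))
    for page in range(2, int(first["pagecount"]) + 1):
        time.sleep(delay)  # optional pause between requests
        items.extend(fetch(type_id, page).get("list", []))
    return items


# Usage (performs real network requests):
# anime = crawl_all(30, delay=0.5)
```

Passing `fetch` as a parameter keeps the loop testable without touching the network.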
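Importing into the database
The entries in list are bulk-imported into the MySQL database with Python, after which the site finally has data. I crawled every 日韩动漫 entry on this resource site: 79 pages in total, 6×8×78 + 3×6 − 1 = 3761 entries (78 full pages of 6×8 = 48 plus a last page of 3×6 − 1 = 17), which matches the total reported by the API. The 79 pages are counted at 48 entries per page, not the API's 189 pages of 20.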
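The import script itself is not shown here, so the following is only a sketch of one way to do the bulk insert through a DB-API connection such as the one PyMySQL returns. The table name `vod_import` and its column layout are hypothetical (they simply reuse the field names from the sample response, not the real maccms schema), and `%s` is the placeholder style used by the common MySQL drivers.

```python
# Field names taken from the sample API response; the target table is hypothetical.
FIELDS = ("vod_id", "vod_name", "type_id", "type_name",
          "vod_en", "vod_time", "vod_remarks", "vod_play_from")


def to_rows(items):
    """Turn API entries into tuples ordered like FIELDS (missing keys become None)."""
    return [tuple(item.get(field) for field in FIELDS) for item in items]


def build_insert(table="vod_import"):
    """Build a parameterized INSERT; `table` must be a trusted name, not user input."""
    columns = ", ".join(FIELDS)
    marks = ", ".join(["%s"] * len(FIELDS))
    return f"INSERT INTO {table} ({columns}) VALUES ({marks})"


def import_items(conn, items, table="vod_import"):
    """Insert all items with a single executemany call and commit."""
    rows = to_rows(items)
    cur = conn.cursor()
    try:
        cur.executemany(build_insert(table), rows)
    finally:
        cur.close()
    conn.commit()
    return len(rows)


# Usage, assuming PyMySQL is installed and the table already exists:
# import pymysql
# conn = pymysql.connect(host="localhost", user="...", password="...",
#                        database="...", charset="utf8mb4")
# import_items(conn, anime)
```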