当前位置: 首页 > news >正文

爬虫逆向——RPC技术

学习目标:

  1. 了解 websocket 协议
  2. 熟悉 websocket 实现原理
  3. 掌握 RPC 启用和注入方式

RPC,英文 RangPaCong,中文让爬虫,旨在为爬虫开路,秒杀一切,让爬虫畅通无阻!

WebSocket的出现,使得浏览器具备了实时双向通信的能力。

参考:https://blog.csdn.net/ningmengjing_/article/details/131693721?spm=1011.2415.3001.5331

参考:https://blog.csdn.net/ningmengjing_/article/details/131693687?spm=1011.2415.3001.5331

参考:https://www.cnblogs.com/chyingp/p/websocket-deep-in.html

一、websocket

1.websocket介绍与原理

WebSocket 是 HTML5 提出的一种基于 TCP 协议的全双工通信协议,它实现了浏览器与服务器之间的持久化连接,能够更高效地节省服务器资源和带宽,并提供实时通信能力。

WebSocket 通过一套特定的握手机制建立连接,使客户端和服务器之间可以建立一个类似于 TCP 的连接,从而支持双向实时通信。在 WebSocket 出现之前,Web 应用通常依赖 HTTP 协议的短连接或长连接进行通信,而 WebSocket 是一种独立于 HTTP 的无状态协议,其协议标识为“ws”。

连接建立过程简要如下:

  1. 客户端发起一个 HTTP 请求,该请求经过三次握手建立 TCP 连接。请求头中包含 Upgrade、Connection、WebSocket-Version 等字段,表明希望升级到 WebSocket 协议;

  2. 服务器确认支持 WebSocket 后,返回 HTTP 响应完成握手;

  3. 握手成功后,客户端和服务器即可基于已建立的 TCP 连接进行全双工通信。

2.websocket实现方式

(1)客户端
<!DOCTYPE html>
<html lang="en">
<head><meta charset="UTF-8"><title>Title</title>
</head>
<body>
<input type="text" id="box">
<button onclick="ps()">发送</button><script><!--    创建链接    -->var ws = new WebSocket('ws://127.0.0.1:8000')// 执行失败的回调ws.onerror = function () {console.log('链接失败')}// 链接成功的回调// ws.onopen// 接收消息的回调方法ws.onmessage = function (event) {console.log(event.data)}// 关闭链接// ws.onclosefunction ps() {var text = document.getElementById('box').value// 发送数据ws.send(text)}
</script>
</body>
</html>
(2)服务端
import asyncio
import websocketsasync def echo(ws):await ws.send('hello')return Trueasync def recv_msg(ws):while 1:data = await ws.recv()print(data)async def main(ws,path):await echo(ws)await recv_msg(ws)if __name__ == '__main__':loop = asyncio.get_event_loop()ser = websockets.serve(main,'127.0.0.1',8000)loop.run_until_complete(ser)loop.run_forever()
(3)实际案例
案例目标
  • 网址:https://jzsc.mohurd.gov.cn/data/company
  • 需求:通过websocket解析加密数据
  • 实际注入网站代码
!(function () {if (window.flag) {} else {const websocket = new WebSocket('ws://127.0.0.1:8000')// 创建一个标记用来判断是否创建套接字window.flag = true;// 接收服务端发送的信息websocket.onmessage = function (event) {var data = event.data// 调用js解密var res = b(data)console.log(res)// 发送解密数据给服务端websocket.send(res)}}}())
解题分析
  • 定位到加密位置
  • 将我们写的websocket命令注入到代码当中(通过替换的方式实现)

  • 注入之后需要刷新页面才能把js 执行起来
  • python执行代码
# encoding: utf-8
import asyncio
import websockets
import requests
import time
import jsondef get_data(page):headers = {"v": "231012","Referer": "https://jzsc.mohurd.gov.cn/data/company","User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36",}url = "https://jzsc.mohurd.gov.cn/APi/webApi/dataservice/query/comp/list"params = {"pg": page,"pgsz": "15","total": "450"}response = requests.get(url, headers=headers, params=params)print(response.text)return response.textasync def echo(websocket):for i in range(1, 4):data = get_data(i)await websocket.send(data)# time.sleep(2)# return Trueasync def recv_msg(websocket):while 1:# 接收数据recv_text = await websocket.recv()print(json.loads(recv_text))async def main_logic(websocket, path):await echo(websocket)await recv_msg(websocket)start_server = websockets.serve(main_logic, '127.0.0.1', 8080)
loop = asyncio.get_event_loop()
loop.run_until_complete(start_server)
# 创建了一个连接对象之后,需要不断监听返回的数据,则调用 run_forever 方法,要保持长连接即可
loop.run_forever()

二、RPC

1.RPC简介

RPC(Remote Procedure Call,远程过程调用)是一种通过网络从远程计算机程序中请求服务,而不需要了解底层网络细节的技术。在使用 WebSocket 等协议进行通信时,通常需要手动建立连接、维护会话并处理数据传输,过程较为繁琐。而 RPC 可以帮助我们封装这些底层操作,直接调用远程服务暴露的接口,极大简化开发流程。

RPC 技术的本质是实现进程间通信,允许一个进程调用另一个进程(往往位于不同的机器或环境中)中的方法或函数。由于其适用于微服务、分布式系统等场景,RPC 在架构设计中广泛使用。尽管 RPC 本身技术复杂度较高,但在爬虫和逆向领域中,我们无需深入其全部细节,重点在于掌握如何借助 RPC 实现高效的数据交互和函数调用。

在逆向工程中,RPC 的一个典型应用是将浏览器环境与本地代码视为服务端和客户端,通过 WebSocket 等协议建立 RPC 通信通道。这样一来,可以在浏览器中暴露加密函数等关键方法,供本地代码直接远程调用。这种方式避免了复杂算法还原、代码扣取和环境补全等操作,显著节省逆向分析的时间成本。

2.Sekiro-RPC

Sekiro-RPC 是一个基于 WebSocket 实现的轻量级 RPC 框架,主要用于浏览器环境与本地服务之间的远程调用。官网文档地址:https://sekiro.iinti.cn/sekiro-doc/

(1)使用方法

1. 启动服务端(本地)
需预先安装 Java 运行环境(JRE)。若未安装,可参考:JDK 8 安装配置指南
JDK 下载地址(华为镜像站):https://repo.huaweicloud.com/java/jdk/8u201-b09/

启动方式:

  • Linux & Mac:执行 bin/sekiro.sh

  • Windows:双击运行 bin/sekiro.bat

2. 客户端环境
在前端中引入 Sekiro 客户端脚本,用于建立与 Sekiro 服务端的通信:

https://file.virjar.com/sekiro_web_client.js?_=123

3. 使用原理与参数说明
核心机制是通过注入浏览器环境的客户端脚本(SekiroClient),与本地启动的 Sekiro 服务端建立通信,从而实现对浏览器内部方法的 RPC 调用。开发者可通过官方提供的 SekiroClient 代码样例,快速实现远程调用功能。

// 生成唯一标记uuid编号
function guid() {function S4() {return (((1 + Math.random()) * 0x10000) | 0).toString(16).substring(1)}return (S4() + S4() + "-" + S4() + "-" + S4() + "-" + S4() + "-" + S4() + S4() + S4());
}// 连接服务端
var client = new SekiroClient("ws://127.0.0.1:5620/business-demo/register?group=ws-group&clientId=" + guid());
// 业务接口 
client.registerAction("登陆", function (request, resolve, reject) {resolve("" + new Date());
})
  • Group(业务组):用于区分不同的业务类型或接口组。每个 Group 下可注册多个终端(SekiroClient),并可挂载多个 Action(接口),实现对不同业务功能的逻辑隔离与管理。

  • ClientId(设备标识):用于唯一标识一个终端设备。支持多设备同时注册,具备负载均衡与群控能力,适用于需要多机协同提供 API 服务的场景。

  • SekiroClient(服务提供端):运行在浏览器、手机等环境中的客户端,作为实际的服务提供者。每个客户端需具有唯一的 ClientId,Sekiro 服务端会将调用请求转发至相应的 SekiroClient 执行。

  • registerAction(接口注册):在同一个 Group 下可注册多个 Action,每个 Action 对应一个具体的功能接口,实现不同业务操作的分离与调用。

  • request(请求对象):代表从服务端接收到的调用请求,其中包含调用参数。可通过键值对形式提取参数,供业务逻辑处理使用。

  • resolve(结果回传):用于将处理结果返回给服务端的方法,确保调用方能及时接收到响应数据。

(2)测试使用
1.前端代码:
<!DOCTYPE html>
<html lang="en">
<head><meta charset="UTF-8"><title>Title</title>
</head>
<body><script>(function () {function SekiroClient(wsURL) {this.wsURL = wsURL;this.handlers = {};this.socket = {};this.base64 = false;// checkif (!wsURL) {throw new Error('wsURL can not be empty!!')}this.webSocketFactory = this.resolveWebSocketFactory();this.connect()}SekiroClient.prototype.resolveWebSocketFactory = function () {if (typeof window === 'object') {var theWebSocket = window.WebSocket ? window.WebSocket : window.MozWebSocket;return function (wsURL) {function WindowWebSocketWrapper(wsURL) {this.mSocket = new theWebSocket(wsURL);}WindowWebSocketWrapper.prototype.close = function () {this.mSocket.close();};WindowWebSocketWrapper.prototype.onmessage = function (onMessageFunction) {this.mSocket.onmessage = onMessageFunction;};WindowWebSocketWrapper.prototype.onopen = function (onOpenFunction) {this.mSocket.onopen = onOpenFunction;};WindowWebSocketWrapper.prototype.onclose = function (onCloseFunction) {this.mSocket.onclose = onCloseFunction;};WindowWebSocketWrapper.prototype.send = function (message) {this.mSocket.send(message);};return new WindowWebSocketWrapper(wsURL);}}if (typeof weex === 'object') {// this is weex env : https://weex.apache.org/zh/docs/modules/websockets.htmltry {console.log("test webSocket for weex");var ws = weex.requireModule('webSocket');console.log("find webSocket for weex:" + ws);return function (wsURL) {try {ws.close();} catch (e) {}ws.WebSocket(wsURL, '');return ws;}} catch (e) {console.log(e);//ignore}}//TODO support ReactNativeif (typeof WebSocket === 'object') {return function (wsURL) {return new theWebSocket(wsURL);}}// weex 鍜� PC鐜鐨剋ebsocket API涓嶅畬鍏ㄤ竴鑷达紝鎵€浠ュ仛浜嗘娊璞″吋瀹�throw new Error("the js environment do not support websocket");};SekiroClient.prototype.connect = function () {console.log('sekiro: begin of connect to wsURL: ' + this.wsURL);var _this = this;// 涓峜heck close锛岃// if (this.socket && this.socket.readyState === 1) {//     this.socket.close();// }try {this.socket = this.webSocketFactory(this.wsURL);} catch (e) {console.log("sekiro: create connection failed,reconnect after 2s");setTimeout(function () {_this.connect()}, 2000)}this.socket.onmessage(function (event) {_this.handleSekiroRequest(event.data)});this.socket.onopen(function (event) {console.log('sekiro: open a sekiro client connection')});this.socket.onclose(function (event) {console.log('sekiro: disconnected ,reconnection after 2s');setTimeout(function () {_this.connect()}, 2000)});};SekiroClient.prototype.handleSekiroRequest = function (requestJson) {console.log("receive sekiro request: " + requestJson);var request = JSON.parse(requestJson);var seq = request['__sekiro_seq__'];if (!request['action']) {this.sendFailed(seq, 'need request param {action}');return}var action = request['action'];if (!this.handlers[action]) {this.sendFailed(seq, 'no action handler: ' + action + ' defined');return}var theHandler = this.handlers[action];var _this = this;try {theHandler(request, function (response) {try {_this.sendSuccess(seq, response)} catch (e) {_this.sendFailed(seq, "e:" + e);}}, function (errorMessage) {_this.sendFailed(seq, errorMessage)})} catch (e) {console.log("error: " + e);_this.sendFailed(seq, ":" + e);}};SekiroClient.prototype.sendSuccess = function (seq, response) {var responseJson;if (typeof response == 'string') {try {responseJson = JSON.parse(response);} catch (e) {responseJson = {};responseJson['data'] = response;}} else if (typeof response == 'object') {responseJson = response;} else {responseJson = {};responseJson['data'] = response;}if (typeof response == 'string') {responseJson = {};responseJson['data'] = response;}if (Array.isArray(responseJson)) {responseJson = {data: responseJson,code: 0}}if (responseJson['code']) {responseJson['code'] = 0;} else if (responseJson['status']) {responseJson['status'] = 0;} else {responseJson['status'] = 0;}responseJson['__sekiro_seq__'] = seq;var responseText = JSON.stringify(responseJson);console.log("response :" + responseText);if (responseText.length < 1024 * 6) {this.socket.send(responseText);return;}if (this.base64) {responseText = this.base64Encode(responseText)}//澶ф姤鏂囪鍒嗘浼犺緭var segmentSize = 1024 * 5;var i = 0, totalFrameIndex = Math.floor(responseText.length / segmentSize) + 1;for (; i < totalFrameIndex; i++) {var frameData = JSON.stringify({__sekiro_frame_total: totalFrameIndex,__sekiro_index: i,__sekiro_seq__: seq,__sekiro_base64: this.base64,__sekiro_is_frame: true,__sekiro_content: responseText.substring(i * segmentSize, (i + 1) * segmentSize)});console.log("frame: " + frameData);this.socket.send(frameData);}};SekiroClient.prototype.sendFailed = function (seq, errorMessage) {if (typeof errorMessage != 'string') {errorMessage = JSON.stringify(errorMessage);}var responseJson = {};responseJson['message'] = errorMessage;responseJson['status'] = -1;responseJson['__sekiro_seq__'] = seq;var responseText = JSON.stringify(responseJson);console.log("sekiro: response :" + responseText);this.socket.send(responseText)};SekiroClient.prototype.registerAction = function (action, handler) {if (typeof action !== 'string') {throw new Error("an action must be string");}if (typeof handler !== 'function') {throw new Error("a handler must be function");}console.log("sekiro: register action: " + action);this.handlers[action] = handler;return this;};SekiroClient.prototype.encodeWithBase64 = function () {this.base64 = arguments && arguments.length > 0 && arguments[0];};SekiroClient.prototype.base64Encode = function (s) {if (arguments.length !== 1) {throw "SyntaxError: exactly one argument required";}s = String(s);if (s.length === 0) {return s;}function _get_chars(ch, y) {if (ch < 0x80) y.push(ch);else if (ch < 0x800) {y.push(0xc0 + ((ch >> 6) & 0x1f));y.push(0x80 + (ch & 0x3f));} else {y.push(0xe0 + ((ch >> 12) & 0xf));y.push(0x80 + ((ch >> 6) & 0x3f));y.push(0x80 + (ch & 0x3f));}}var _PADCHAR = "=",_ALPHA = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",_VERSION = "1.1";//Mr. Ruan fix to 1.1 to support asian char(utf8)//s = _encode_utf8(s);var i,b10,y = [],x = [],len = s.length;i = 0;while (i < len) {_get_chars(s.charCodeAt(i), y);while (y.length >= 3) {var ch1 = y.shift();var ch2 = y.shift();var ch3 = y.shift();b10 = (ch1 << 16) | (ch2 << 8) | ch3;x.push(_ALPHA.charAt(b10 >> 18));x.push(_ALPHA.charAt((b10 >> 12) & 0x3F));x.push(_ALPHA.charAt((b10 >> 6) & 0x3f));x.push(_ALPHA.charAt(b10 & 0x3f));}i++;}switch (y.length) {case 1:var ch = y.shift();b10 = ch << 16;x.push(_ALPHA.charAt(b10 >> 18) + _ALPHA.charAt((b10 >> 12) & 0x3F) + _PADCHAR + _PADCHAR);break;case 2:var ch1 = y.shift();var ch2 = y.shift();b10 = (ch1 << 16) | (ch2 << 8);x.push(_ALPHA.charAt(b10 >> 18) + _ALPHA.charAt((b10 >> 12) & 0x3F) + _ALPHA.charAt((b10 >> 6) & 0x3f) + _PADCHAR);break;}return x.join("");};function startRpc() {if (window.flag) {} else {function guid() {function S4() {return (((1 + Math.random()) * 0x10000) | 0).toString(16).substring(1);}return (S4() + S4() + "-" + S4() + "-" + S4() + "-" + S4() + "-" + S4() + S4() + S4());}// 创建一个标记用来判断是否创建套接字window.flag = true;var client = new SekiroClient("ws://127.0.0.1:5620/business-demo/register?group=rpc-test&clientId=" + guid());client.registerAction("ths", function (request, resolve, reject) {resolve(rt.update());})}}setTimeout(startRpc, 1000)})()function guid() {function S4() {return (((1 + Math.random()) * 0x10000) | 0).toString(16).substring(1);}return (S4() + S4() + "-" + S4() + "-" + S4() + "-" + S4() + "-" + S4() + S4() + S4());}var client = new SekiroClient("ws://127.0.0.1:5620/business-demo/register?group=rpc-test&clientId=" + guid());client.registerAction("clientTime", function (request, resolve, reject) {resolve("" + new Date());})</script>
</body>
</html>
2.SK API

Sekiro 为我们提供了一些 API

  • 查看分组列表:http://127.0.0.1:5620/business-demo/groupList
  • 查看队列状态:http://127.0.0.1:5620/business-demo/clientQueue?group=rpc-test
  • 调用转发:http://127.0.0.1:5620/business-demo/invoke?group=rpc-test&action=clientTime
3.python调用
import requestsparams = {"group": "rpc-test","action": "clientTime",
}
res = requests.get("http://127.0.0.1:5620/business-demo/invoke", params=params)
print(res.text)

三、案例

案例一:

1.逆向目标

地址:http://q.10jqka.com.cn/

加密参数:

2.逆向分析

hook定位加密位置:

// hook   cookie
(function(){cookie_val=document.cookieObject.defineProperty(document,'cookie',{set:function(new_val){debuggercookie_data=new_val},get:function(){return cookie_val},})
})()

3.代码实现

要插入的代码:

(function () {function SekiroClient(wsURL) {this.wsURL = wsURL;this.handlers = {};this.socket = {};this.base64 = false;// checkif (!wsURL) {throw new Error('wsURL can not be empty!!')}this.webSocketFactory = this.resolveWebSocketFactory();this.connect()}SekiroClient.prototype.resolveWebSocketFactory = function () {if (typeof window === 'object') {var theWebSocket = window.WebSocket ? window.WebSocket : window.MozWebSocket;return function (wsURL) {function WindowWebSocketWrapper(wsURL) {this.mSocket = new theWebSocket(wsURL);}WindowWebSocketWrapper.prototype.close = function () {this.mSocket.close();};WindowWebSocketWrapper.prototype.onmessage = function (onMessageFunction) {this.mSocket.onmessage = onMessageFunction;};WindowWebSocketWrapper.prototype.onopen = function (onOpenFunction) {this.mSocket.onopen = onOpenFunction;};WindowWebSocketWrapper.prototype.onclose = function (onCloseFunction) {this.mSocket.onclose = onCloseFunction;};WindowWebSocketWrapper.prototype.send = function (message) {this.mSocket.send(message);};return new WindowWebSocketWrapper(wsURL);}}if (typeof weex === 'object') {// this is weex env : https://weex.apache.org/zh/docs/modules/websockets.htmltry {console.log("test webSocket for weex");var ws = weex.requireModule('webSocket');console.log("find webSocket for weex:" + ws);return function (wsURL) {try {ws.close();} catch (e) {}ws.WebSocket(wsURL, '');return ws;}} catch (e) {console.log(e);//ignore}}//TODO support ReactNativeif (typeof WebSocket === 'object') {return function (wsURL) {return new theWebSocket(wsURL);}}// weex 鍜� PC鐜鐨剋ebsocket API涓嶅畬鍏ㄤ竴鑷达紝鎵€浠ュ仛浜嗘娊璞″吋瀹�throw new Error("the js environment do not support websocket");};SekiroClient.prototype.connect = function () {console.log('sekiro: begin of connect to wsURL: ' + this.wsURL);var _this = this;// 涓峜heck close锛岃// if (this.socket && this.socket.readyState === 1) {//     this.socket.close();// }try {this.socket = this.webSocketFactory(this.wsURL);} catch (e) {console.log("sekiro: create connection failed,reconnect after 2s");setTimeout(function () {_this.connect()}, 2000)}this.socket.onmessage(function (event) {_this.handleSekiroRequest(event.data)});this.socket.onopen(function (event) {console.log('sekiro: open a sekiro client connection')});this.socket.onclose(function (event) {console.log('sekiro: disconnected ,reconnection after 2s');setTimeout(function () {_this.connect()}, 2000)});};SekiroClient.prototype.handleSekiroRequest = function (requestJson) {console.log("receive sekiro request: " + requestJson);var request = JSON.parse(requestJson);var seq = request['__sekiro_seq__'];if (!request['action']) {this.sendFailed(seq, 'need request param {action}');return}var action = request['action'];if (!this.handlers[action]) {this.sendFailed(seq, 'no action handler: ' + action + ' defined');return}var theHandler = this.handlers[action];var _this = this;try {theHandler(request, function (response) {try {_this.sendSuccess(seq, response)} catch (e) {_this.sendFailed(seq, "e:" + e);}}, function (errorMessage) {_this.sendFailed(seq, errorMessage)})} catch (e) {console.log("error: " + e);_this.sendFailed(seq, ":" + e);}};SekiroClient.prototype.sendSuccess = function (seq, response) {var responseJson;if (typeof response == 'string') {try {responseJson = JSON.parse(response);} catch (e) {responseJson = {};responseJson['data'] = response;}} else if (typeof response == 'object') {responseJson = response;} else {responseJson = {};responseJson['data'] = response;}if (typeof response == 'string') {responseJson = {};responseJson['data'] = response;}if (Array.isArray(responseJson)) {responseJson = {data: responseJson,code: 0}}if (responseJson['code']) {responseJson['code'] = 0;} else if (responseJson['status']) {responseJson['status'] = 0;} else {responseJson['status'] = 0;}responseJson['__sekiro_seq__'] = seq;var responseText = JSON.stringify(responseJson);console.log("response :" + responseText);if (responseText.length < 1024 * 6) {this.socket.send(responseText);return;}if (this.base64) {responseText = this.base64Encode(responseText)}//澶ф姤鏂囪鍒嗘浼犺緭var segmentSize = 1024 * 5;var i = 0, totalFrameIndex = Math.floor(responseText.length / segmentSize) + 1;for (; i < totalFrameIndex; i++) {var frameData = JSON.stringify({__sekiro_frame_total: totalFrameIndex,__sekiro_index: i,__sekiro_seq__: seq,__sekiro_base64: this.base64,__sekiro_is_frame: true,__sekiro_content: responseText.substring(i * segmentSize, (i + 1) * segmentSize)});console.log("frame: " + frameData);this.socket.send(frameData);}};SekiroClient.prototype.sendFailed = function (seq, errorMessage) {if (typeof errorMessage != 'string') {errorMessage = JSON.stringify(errorMessage);}var responseJson = {};responseJson['message'] = errorMessage;responseJson['status'] = -1;responseJson['__sekiro_seq__'] = seq;var responseText = JSON.stringify(responseJson);console.log("sekiro: response :" + responseText);this.socket.send(responseText)};SekiroClient.prototype.registerAction = function (action, handler) {if (typeof action !== 'string') {throw new Error("an action must be string");}if (typeof handler !== 'function') {throw new Error("a handler must be function");}console.log("sekiro: register action: " + action);this.handlers[action] = handler;return this;};SekiroClient.prototype.encodeWithBase64 = function () {this.base64 = arguments && arguments.length > 0 && arguments[0];};SekiroClient.prototype.base64Encode = function (s) {if (arguments.length !== 1) {throw "SyntaxError: exactly one argument required";}s = String(s);if (s.length === 0) {return s;}function _get_chars(ch, y) {if (ch < 0x80) y.push(ch);else if (ch < 0x800) {y.push(0xc0 + ((ch >> 6) & 0x1f));y.push(0x80 + (ch & 0x3f));} else {y.push(0xe0 + ((ch >> 12) & 0xf));y.push(0x80 + ((ch >> 6) & 0x3f));y.push(0x80 + (ch & 0x3f));}}var _PADCHAR = "=",_ALPHA = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",_VERSION = "1.1";//Mr. Ruan fix to 1.1 to support asian char(utf8)//s = _encode_utf8(s);var i,b10,y = [],x = [],len = s.length;i = 0;while (i < len) {_get_chars(s.charCodeAt(i), y);while (y.length >= 3) {var ch1 = y.shift();var ch2 = y.shift();var ch3 = y.shift();b10 = (ch1 << 16) | (ch2 << 8) | ch3;x.push(_ALPHA.charAt(b10 >> 18));x.push(_ALPHA.charAt((b10 >> 12) & 0x3F));x.push(_ALPHA.charAt((b10 >> 6) & 0x3f));x.push(_ALPHA.charAt(b10 & 0x3f));}i++;}switch (y.length) {case 1:var ch = y.shift();b10 = ch << 16;x.push(_ALPHA.charAt(b10 >> 18) + _ALPHA.charAt((b10 >> 12) & 0x3F) + _PADCHAR + _PADCHAR);break;case 2:var ch1 = y.shift();var ch2 = y.shift();b10 = (ch1 << 16) | (ch2 << 8);x.push(_ALPHA.charAt(b10 >> 18) + _ALPHA.charAt((b10 >> 12) & 0x3F) + _ALPHA.charAt((b10 >> 6) & 0x3f) + _PADCHAR);break;}return x.join("");};if (window.flag) {} else {function guid() {function S4() {return (((1 + Math.random()) * 0x10000) | 0).toString(16).substring(1);}return (S4() + S4() + "-" + S4() + "-" + S4() + "-" + S4() + "-" + S4() + S4() + S4());}// 创建一个标记用来判断是否创建套接字window.flag = true;var client = new SekiroClient("ws://127.0.0.1:5620/business-demo/register?group=ths&clientId=" + guid());client.registerAction("get_cookie", function (request, resolve, reject) {resolve(qn.update());})}})()

下图表示注入成功

python 代码:

import requests
from lxml import etree
import csv
import os
import time
class TongHuaShun():def __init__(self):self.headers = {"accept": "text/html, */*; q=0.01","accept-language": "zh-CN,zh;q=0.9","cache-control": "no-cache","hexin-v": "A0zJmRO2SEen1Fy7XzZobjejHaF7hfEA8isE4KYNWSuaqOKfzpXAv0I51Ij1","pragma": "no-cache","priority": "u=1, i","referer": "https://q.10jqka.com.cn/","sec-ch-ua": "\"Chromium\";v=\"140\", \"Not=A?Brand\";v=\"24\", \"Google Chrome\";v=\"140\"","sec-ch-ua-mobile": "?0","sec-ch-ua-platform": "\"Windows\"","sec-fetch-dest": "empty","sec-fetch-mode": "cors","sec-fetch-site": "same-origin","user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/140.0.0.0 Safari/537.36","x-requested-with": "XMLHttpRequest"}self.url = "https://q.10jqka.com.cn/index/index/board/all/field/zdf/order/desc/page/{}/ajax/1/"self.filename='ths.csv'def get_info(self,page):# 或者cookies变化的值params = {'group': 'ths','action': 'get_cookie',}res = requests.get('http://127.0.0.1:5620/business-demo/invoke', params=params)if res.status_code != 200:print("无法从Sekiro服务获取cookie")return ""try:cookie_data = res.json()['data']except KeyError:print("返回的数据中没有预期的cookie值")return ""cookies = {"v": cookie_data}response = requests.get(self.url.format(page), headers=self.headers, cookies=cookies)return response.textdef parse_data(self,data):html=etree.HTML(data)tr_list=html.xpath('//table/tbody/tr')for i in tr_list:data_list=i.xpath('./td//text()')print(data_list)item = {'序号': data_list[0],'代码': data_list[1],'名称': data_list[2],'现价': data_list[3],'涨跌幅(%)': data_list[4],'涨跌': data_list[5],'涨速(%)': data_list[6],'换手(%)': data_list[7],'量比': data_list[8],'振幅(%)': data_list[9],'成交额': data_list[10],'流通股': data_list[11],'流通市值': data_list[12],'HR': data_list[13]}print(item)self.save(item)print(f'{data_list[2]}-----保存成功')def save(self,item):file_exists = os.path.exists(self.filename)with open('ths.csv', 'a', encoding='utf-8',newline='')as f:header = ['序号','代码','名称','现价','涨跌幅(%)','涨跌','涨速(%)','换手(%)','量比','振幅(%)','成交额','流通股','流通市值','HR']f_csv = csv.DictWriter(f, fieldnames=header)if not file_exists:f_csv.writeheader()f_csv.writerow(item)def main(self):for page in range(1,15):print(f'正在爬取第{page}页')data=self.get_info(page)self.parse_data(data)time.sleep(3)if __name__ == '__main__':ths=TongHuaShun()ths.main()

案例二:

1.逆向目标

地址:https://jzsc.mohurd.gov.cn/data/company

接口:https://jzsc.mohurd.gov.cn/APi/webApi/dataservice/query/comp/list?pg=1&pgsz=15&total=450

加密参数:

2.逆向分析

3.代码实现

这个案例需要传参数,注意如何传参数

javascirpt代码:

(function () {function SekiroClient(wsURL) {this.wsURL = wsURL;this.handlers = {};this.socket = {};this.base64 = false;// checkif (!wsURL) {throw new Error('wsURL can not be empty!!')}this.webSocketFactory = this.resolveWebSocketFactory();this.connect()}SekiroClient.prototype.resolveWebSocketFactory = function () {if (typeof window === 'object') {var theWebSocket = window.WebSocket ? window.WebSocket : window.MozWebSocket;return function (wsURL) {function WindowWebSocketWrapper(wsURL) {this.mSocket = new theWebSocket(wsURL);}WindowWebSocketWrapper.prototype.close = function () {this.mSocket.close();};WindowWebSocketWrapper.prototype.onmessage = function (onMessageFunction) {this.mSocket.onmessage = onMessageFunction;};WindowWebSocketWrapper.prototype.onopen = function (onOpenFunction) {this.mSocket.onopen = onOpenFunction;};WindowWebSocketWrapper.prototype.onclose = function (onCloseFunction) {this.mSocket.onclose = onCloseFunction;};WindowWebSocketWrapper.prototype.send = function (message) {this.mSocket.send(message);};return new WindowWebSocketWrapper(wsURL);}}if (typeof weex === 'object') {// this is weex env : https://weex.apache.org/zh/docs/modules/websockets.htmltry {console.log("test webSocket for weex");var ws = weex.requireModule('webSocket');console.log("find webSocket for weex:" + ws);return function (wsURL) {try {ws.close();} catch (e) {}ws.WebSocket(wsURL, '');return ws;}} catch (e) {console.log(e);//ignore}}//TODO support ReactNativeif (typeof WebSocket === 'object') {return function (wsURL) {return new theWebSocket(wsURL);}}// weex 鍜� PC鐜鐨剋ebsocket API涓嶅畬鍏ㄤ竴鑷达紝鎵€浠ュ仛浜嗘娊璞″吋瀹�throw new Error("the js environment do not support websocket");};SekiroClient.prototype.connect = function () {console.log('sekiro: begin of connect to wsURL: ' + this.wsURL);var _this = this;// 涓峜heck close锛岃// if (this.socket && this.socket.readyState === 1) {//     this.socket.close();// }try {this.socket = this.webSocketFactory(this.wsURL);} catch (e) {console.log("sekiro: create connection failed,reconnect after 2s");setTimeout(function () {_this.connect()}, 2000)}this.socket.onmessage(function (event) {_this.handleSekiroRequest(event.data)});this.socket.onopen(function (event) {console.log('sekiro: open a sekiro client connection')});this.socket.onclose(function (event) {console.log('sekiro: disconnected ,reconnection after 2s');setTimeout(function () {_this.connect()}, 2000)});};SekiroClient.prototype.handleSekiroRequest = function (requestJson) {console.log("receive sekiro request: " + requestJson);var request = JSON.parse(requestJson);var seq = request['__sekiro_seq__'];if (!request['action']) {this.sendFailed(seq, 'need request param {action}');return}var action = request['action'];if (!this.handlers[action]) {this.sendFailed(seq, 'no action handler: ' + action + ' defined');return}var theHandler = this.handlers[action];var _this = this;try {theHandler(request, function (response) {try {_this.sendSuccess(seq, response)} catch (e) {_this.sendFailed(seq, "e:" + e);}}, function (errorMessage) {_this.sendFailed(seq, errorMessage)})} catch (e) {console.log("error: " + e);_this.sendFailed(seq, ":" + e);}};SekiroClient.prototype.sendSuccess = function (seq, response) {var responseJson;if (typeof response == 'string') {try {responseJson = JSON.parse(response);} catch (e) {responseJson = {};responseJson['data'] = response;}} else if (typeof response == 'object') {responseJson = response;} else {responseJson = {};responseJson['data'] = response;}if (typeof response == 'string') {responseJson = {};responseJson['data'] = response;}if (Array.isArray(responseJson)) {responseJson = {data: responseJson,code: 0}}if (responseJson['code']) {responseJson['code'] = 0;} else if (responseJson['status']) {responseJson['status'] = 0;} else {responseJson['status'] = 0;}responseJson['__sekiro_seq__'] = seq;var responseText = JSON.stringify(responseJson);console.log("response :" + responseText);if (responseText.length < 1024 * 6) {this.socket.send(responseText);return;}if (this.base64) {responseText = this.base64Encode(responseText)}//澶ф姤鏂囪鍒嗘浼犺緭var segmentSize = 1024 * 5;var i = 0, totalFrameIndex = Math.floor(responseText.length / segmentSize) + 1;for (; i < totalFrameIndex; i++) {var frameData = JSON.stringify({__sekiro_frame_total: totalFrameIndex,__sekiro_index: i,__sekiro_seq__: seq,__sekiro_base64: this.base64,__sekiro_is_frame: true,__sekiro_content: responseText.substring(i * segmentSize, (i + 1) * segmentSize)});console.log("frame: " + frameData);this.socket.send(frameData);}};SekiroClient.prototype.sendFailed = function (seq, errorMessage) {if (typeof errorMessage != 'string') {errorMessage = JSON.stringify(errorMessage);}var responseJson = {};responseJson['message'] = errorMessage;responseJson['status'] = -1;responseJson['__sekiro_seq__'] = seq;var responseText = JSON.stringify(responseJson);console.log("sekiro: response :" + responseText);this.socket.send(responseText)};SekiroClient.prototype.registerAction = function (action, handler) {if (typeof action !== 'string') {throw new Error("an action must be string");}if (typeof handler !== 'function') {throw new Error("a handler must be function");}console.log("sekiro: register action: " + action);this.handlers[action] = handler;return this;};SekiroClient.prototype.encodeWithBase64 = function () {this.base64 = arguments && arguments.length > 0 && arguments[0];};SekiroClient.prototype.base64Encode = function (s) {if (arguments.length !== 1) {throw "SyntaxError: exactly one argument required";}s = String(s);if (s.length === 0) {return s;}function _get_chars(ch, y) {if (ch < 0x80) y.push(ch);else if (ch < 0x800) {y.push(0xc0 + ((ch >> 6) & 0x1f));y.push(0x80 + (ch & 0x3f));} else {y.push(0xe0 + ((ch >> 12) & 0xf));y.push(0x80 + ((ch >> 6) & 0x3f));y.push(0x80 + (ch & 0x3f));}}var _PADCHAR = "=",_ALPHA = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",_VERSION = "1.1";//Mr. Ruan fix to 1.1 to support asian char(utf8)//s = _encode_utf8(s);var i,b10,y = [],x = [],len = s.length;i = 0;while (i < len) {_get_chars(s.charCodeAt(i), y);while (y.length >= 3) {var ch1 = y.shift();var ch2 = y.shift();var ch3 = y.shift();b10 = (ch1 << 16) | (ch2 << 8) | ch3;x.push(_ALPHA.charAt(b10 >> 18));x.push(_ALPHA.charAt((b10 >> 12) & 0x3F));x.push(_ALPHA.charAt((b10 >> 6) & 0x3f));x.push(_ALPHA.charAt(b10 & 0x3f));}i++;}switch (y.length) {case 1:var ch = y.shift();b10 = ch << 16;x.push(_ALPHA.charAt(b10 >> 18) + _ALPHA.charAt((b10 >> 12) & 0x3F) + _PADCHAR + _PADCHAR);break;case 2:var ch1 = y.shift();var ch2 = y.shift();b10 = (ch1 << 16) | (ch2 << 8);x.push(_ALPHA.charAt(b10 >> 18) + _ALPHA.charAt((b10 >> 12) & 0x3F) + _ALPHA.charAt((b10 >> 6) & 0x3f) + _PADCHAR);break;}return x.join("");};if (window.flag) {} else {function guid() {function S4() {return (((1 + Math.random()) * 0x10000) | 0).toString(16).substring(1);}return (S4() + S4() + "-" + S4() + "-" + S4() + "-" + S4() + "-" + S4() + S4() + S4());}// 创建一个标记用来判断是否创建套接字window.flag = true;var client = new SekiroClient("ws://127.0.0.1:5620/business-demo/register?group=jzsc&clientId=" + guid());client.registerAction("get_data", function (request, resolve, reject) {py_data=request["data"]resolve(b(py_data));})}})()

python代码:

import requests
import pymysql
import jsonclass Jzsc():def __init__(self):self.db = pymysql.connect(host='localhost',user='root',password='123456',database='py_spider')  # 数据库名字# 使用cursor()方法获取操作游标self.cursor = self.db.cursor()self.headers = {"Accept": "application/json, text/plain, */*","Accept-Language": "zh-CN,zh;q=0.9","Cache-Control": "no-cache","Connection": "keep-alive","Pragma": "no-cache","Referer": "https://jzsc.mohurd.gov.cn/data/company","Sec-Fetch-Dest": "empty","Sec-Fetch-Mode": "cors","Sec-Fetch-Site": "same-origin","User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/140.0.0.0 Safari/537.36","accessToken;": "","sec-ch-ua": "\"Chromium\";v=\"140\", \"Not=A?Brand\";v=\"24\", \"Google Chrome\";v=\"140\"","sec-ch-ua-mobile": "?0","sec-ch-ua-platform": "\"Windows\"","timeout": "30000","v": "231012"}self.cookies = {"Hm_lvt_b1b4b9ea61b6f1627192160766a9c55c": "1758382618,1758992952","Hm_lpvt_b1b4b9ea61b6f1627192160766a9c55c": "1758992952","HMACCOUNT": "C844EA1E4B3823E0"}self.url = "https://jzsc.mohurd.gov.cn/APi/webApi/dataservice/query/comp/list"def get_info(self, page):params = {"pg": str(page),"pgsz": "15","total": "450"}response = requests.get(self.url, headers=self.headers, cookies=self.cookies, params=params)return response.textdef parse_data(self, data1):data = {'group': 'jzsc','action': 'get_data','data': data1}res = requests.post('http://127.0.0.1:5620/business-demo/invoke', data=data)for i in json.loads(res.json()['data'])['data']['list']:code = i['QY_ORG_CODE']Legal_Person = i['QY_FR_NAME']  # 法人company_name = i['QY_NAME']  # 公司address = i['QY_REGION_NAME']  # 企业注册地址self.save(code, Legal_Person, company_name, address)"""创建数据表"""def create_table(self):sql = """create table if not exists jzsc1(id int primary key auto_increment,code varchar(100) ,Legal_Person varchar(50),company_name varchar(100),address varchar(150))"""try:self.cursor.execute(sql)print('创建成功')except Exception as e:print('表创建成功', e)"""插入数据"""def save(self, code, Legal_Person, company_name, address):sql = """insert into jzsc1(code,Legal_Person,company_name,address) values(%s,%s,%s,%s)"""try:self.cursor.execute(sql, (code, Legal_Person, company_name, address))self.db.commit()print('插入成功', code)except Exception as e:print(f'插入失败{e}')self.db.rollback()def main(self):self.create_table()for page in range(1, 15):print(f'正在爬取第{page}页')data = self.get_info(page)self.parse_data(data)if __name__ == '__main__':jz = Jzsc()jz.main()

http://www.dtcms.com/a/418432.html

相关文章:

  • tldr的安装与使用
  • Imatest-Star模块(西门子星图)
  • Unity 3D笔记——《B站阿发你好》
  • R语言从入门到精通Day3之【包的使用】
  • rocr专栏介绍
  • 济南网站建设 推搜点搜索优化的培训免费咨询
  • pc网站建设哪个好重庆seo网站运营
  • 沙箱1111111
  • 2、order-service 企业级代码目录结构规范
  • C# MVVM模式和Qt中MVC模式的比较
  • html mip 网站阿里云装wordpress慢
  • 权限校验是否应该在 Spring Cloud Gateway 中进行?
  • MariaDB数据库管理
  • 21.mariadb 数据库
  • GFM100 地线连续性检测监控器:破解工业接地痛点,筑牢电力系统安全防线
  • 2、Nginx 与 Spring Cloud Gateway 详细对比:定位、场景与分工
  • 玳瑁的嵌入式日记---0928(ARM--I2C)
  • 微服务故障排查
  • 离散时间马尔可夫链
  • 怎么做网站快照网站域名跳转代码html
  • 基于 OpenCV + 深度学习的实时人脸检测与年龄性别识别系统
  • c++ opencv 复现Fiji 配对拼接算法中的加权融合
  • 中秋国庆双节餐饮零售破局!Deepoc 具身模型外拓板打造 “假日智能运营新范式
  • 瑞安网站建设电话百度商桥接入网站
  • 嵌入式硬件——I.MX6ULL EPIT(增强型周期中断定时器)
  • 降低测试成本缩短测试周期 | 车辆OBD数据采集方案
  • 一级消防工程师考试时间新闻类网站怎么做seo
  • window显示驱动开发—确定显示适配器上的 VidPN 支持
  • Kafka05-入门-尚硅谷
  • Visual Studio 2022