
Advanced Web Scraping: A Detailed Introduction to JS Reverse Engineering for Unlocking Encrypted Data

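Overview

This article explains how to use JS reverse engineering to turn encrypted dynamic data back into real, human-readable data. The code builds on basic scraping knowledge, and the basic code is not explained here. If you lack that background, read up on the basics first, for example in my earlier post: 会python就学得会的爬虫基础(只讲实战)_怎么才能把python学到什么都能爬的地步-CSDN博客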

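Locating the target data

Target site: 企名片科创平台 (the Qimingpian sci-tech innovation platform)

Site overview

We want to scrape the 【text content】 on the page. Each large 【title】 is followed by a short 【summary】, and a summary is clearly not the full text of an article. 【Dynamic data】 (loaded through XHR or Fetch) typically shows up in the following situations: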

  • Clicking "Load more"
  • Search box autocompletion
  • Table pagination
  • Real-time data updates

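From this we can roughly infer that the data is 【dynamic data】. Open the 【packet capture tool】 (the browser's developer tools), go to 【Network】, and set the filter to 【Fetch/XHR】, which yields the following resources:

After looking through them, only the second request contains an encrypt_data field in its 【Preview】. That is the data we want, so the target is now located.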

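With the request located, we can build the basic scraper code. For convenience, use the online tool Convert curl commands to Python to generate it: paste the request's data, copied from the capture tool, into the 【Curl command】 box, then copy the Python code generated below it.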

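Decrypting the data

The value is completely unreadable to a human, so it is presumably encrypted. How do we find out how it was encrypted? This is where 【JS reverse engineering】 comes in.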
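Before opening any JS, a quick offline check can support the "this looks encrypted" guess. The sketch below is only a heuristic under stated assumptions: it tests whether a string is valid Base64 and whether the decoded length is a multiple of a block size. The function name and the default block size of 8 bytes (typical of DES-family ciphers; AES uses 16) are illustrative choices, not something taken from the site.

```python
import base64
import binascii


def looks_like_base64_ciphertext(value: str, block_size: int = 8) -> bool:
    """Return True if value is valid Base64 whose decoded length is a
    non-zero multiple of block_size (a common sign of block-cipher output)."""
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(raw) > 0 and len(raw) % block_size == 0


# Hypothetical sample: 16 arbitrary bytes, Base64-encoded for illustration.
sample = base64.b64encode(bytes(range(16))).decode("ascii")
print(looks_like_base64_ciphertext(sample))        # True
print(looks_like_base64_ciphertext("plain text"))  # False
```

A positive result does not identify the cipher; it only suggests that looking for a decrypt function in the site's JS is worth the effort.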

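As the name suggests, 【JS reverse engineering】 means that we, the scraper authors, work backwards from the site developers' JS files to recover the logic of the page. Finding how the data is encrypted therefore comes down to finding the code that decrypts the ciphertext into the text we see on the page.

To find that decryption code, we use the data itself as an 【anchor】 and search the code for the places where it appears.

Click the search button on the far right, type encrypt_data into the search box, and look for .js files in the results:

Conveniently, the results contain exactly one .js file, so we only need to open it and analyze its code. The 【anchor】 appears in three places, though. Which one should we choose?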

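[Appendix] Notes on choosing among anchor hits

1. We are looking for a method, that is, a function, because what we are missing is how this data gets turned into the real data. Hits where the anchor is only accessed as an object property with . are 【out】; hits where it is passed to a call with () are 【in】.

2. If the keyword appears too often, add a filter condition:

add the part of the file's URL that follows the protocol and domain (the path).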
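Rule 1 above can be applied mechanically once the JS source has been saved to disk. The following is a minimal sketch, assuming the bundle text is available as a Python string; the regular expression and the function name are illustrative, and the sample input is the statement quoted later in this article.

```python
import re

# The statement found in the article's bundle, used here as sample input.
js_line = (
    'e.encrypt_data && (t.baseURL === "https://businessapi.qimingpian.cn" '
    '? e.data = Mc(e.encrypt_data) : e.data = Kc(e.encrypt_data))'
)

# Match `name(<something>.encrypt_data)`: the anchor passed to a function call.
CALL_PATTERN = re.compile(r"([A-Za-z_$][\w$]*)\(\s*[\w$.]*\.encrypt_data\s*\)")


def functions_called_with_anchor(source: str) -> list[str]:
    """Return names of functions that receive the anchor as their argument."""
    return CALL_PATTERN.findall(source)


print(functions_called_with_anchor(js_line))  # ['Mc', 'Kc']
```

The bare property access at the start of the statement is skipped, while the two call sites survive, which matches the "out" and "in" distinction made above.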

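So click the second or third hit to jump into the JS file, set breakpoints at both places, and reload the page:

This lets us intercept the page's data and grab what we need. The code at the breakpoint is a simple conditional assignment: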

e.encrypt_data && (t.baseURL === "https://businessapi.qimingpian.cn" ? e.data = Mc(e.encrypt_data) : e.data = Kc(e.encrypt_data))

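This further confirms that the data is loaded dynamically. The ternary operator shows that there are two code paths: when baseURL is https://businessapi.qimingpian.cn the payload is decrypted with Mc, otherwise with Kc. Roughly speaking, the main page gets a 【condensed version】 and everything else gets the 【full version】. To obtain the plaintext, we only need to call the corresponding function on the data.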
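The branch can be restated in Python to make clear which function applies to our request. This is a sketch with placeholder decryptors standing in for the real Mc and Kc; the helper names are hypothetical. The request captured in this article goes to wyiosapi.qmpsee.com rather than the business API host, which is why Kc is the function to extract.

```python
from typing import Callable

BUSINESS_API = "https://businessapi.qimingpian.cn"


def pick_decryptor(base_url: str,
                   mc: Callable[[str], object],
                   kc: Callable[[str], object]) -> Callable[[str], object]:
    """Mirror the ternary: Mc for the business API host, Kc for everything else."""
    return mc if base_url == BUSINESS_API else kc


def handle_response(payload: dict, base_url: str, mc, kc) -> dict:
    """Fill payload['data'] from payload['encrypt_data'] when the latter is present."""
    if payload.get("encrypt_data"):
        payload["data"] = pick_decryptor(base_url, mc, kc)(payload["encrypt_data"])
    return payload


# Placeholder decryptors, standing in for the real Mc / Kc.
fake_mc = lambda s: f"Mc:{s}"
fake_kc = lambda s: f"Kc:{s}"
out = handle_response({"encrypt_data": "abc"}, "https://wyiosapi.qmpsee.com", fake_mc, fake_kc)
print(out["data"])  # Kc:abc
```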

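[Appendix] Getting the JS code

1. Hover the mouse over Kc, and a tooltip appears:

Click the blue link to jump to the function's source, then copy it.

For variables, just search with Ctrl+F.

2. Type the following in the 【Console】:

functionName.toString()

The function body is printed:

Note that this method only works when execution is paused at a breakpoint where the function is reachable. In other words, adjust the position and timing of your breakpoint until you can get the function you want. The same goes for variables: just enter the variable name to print its value.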

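The next step is to debug the JS code and see whether it works. Create a JS file in PyCharm containing the Kc function and an output statement (if it cannot run, install the Node.js plugin):

Clearly there is no result. Check the error message:

ReferenceError: Vc is not defined

This means the function Vc is not defined. Repeat the methods from the appendix above to fill in each missing function and variable, and you end up with the complete decryption JS file.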
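The "run, read the error, copy the missing definition, run again" loop is easy to keep track of with a small helper. The sketch below only parses error text; it does not run Node.js itself. The helper names are hypothetical, and apart from the first message, which is quoted above, the sample messages are illustrative of what successive runs might print.

```python
import re
from typing import Iterable, List, Optional

MISSING_NAME = re.compile(r"ReferenceError: ([A-Za-z_$][\w$]*) is not defined")


def missing_identifier(error_text: str) -> Optional[str]:
    """Extract the undefined name from a Node.js ReferenceError message."""
    match = MISSING_NAME.search(error_text)
    return match.group(1) if match else None


def names_to_copy(error_log: Iterable[str]) -> List[str]:
    """Collect, in order and without duplicates, every name reported missing."""
    todo: List[str] = []
    for line in error_log:
        name = missing_identifier(line)
        if name and name not in todo:
            todo.append(name)
    return todo


# Illustrative messages from successive runs; only the first is quoted in the article.
runs = [
    "ReferenceError: Vc is not defined",
    "ReferenceError: decode is not defined",
    "ReferenceError: Fc is not defined",
]
print(names_to_copy(runs))  # ['Vc', 'decode', 'Fc']
```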

// Entry point: Base64-decode the ciphertext, decrypt it with Vc, and parse the resulting JSON
function Kc(e) {
    return JSON.parse(Vc("sjdqmp20161205#_316@gfmt", decode(e), 0, 0, "012345677890123", 1))
}function Fc(e) {for (var t = new Array(0, 4, 536870912, 536870916, 65536, 65540, 536936448, 536936452, 512, 516, 536871424, 536871428, 66048, 66052, 536936960, 536936964), n = new Array(0, 1, 1048576, 1048577, 67108864, 67108865, 68157440, 68157441, 256, 257, 1048832, 1048833, 67109120, 67109121, 68157696, 68157697), o = new Array(0, 8, 2048, 2056, 16777216, 16777224, 16779264, 16779272, 0, 8, 2048, 2056, 16777216, 16777224, 16779264, 16779272), a = new Array(0, 2097152, 134217728, 136314880, 8192, 2105344, 134225920, 136323072, 131072, 2228224, 134348800, 136445952, 139264, 2236416, 134356992, 136454144), r = new Array(0, 262144, 16, 262160, 0, 262144, 16, 262160, 4096, 266240, 4112, 266256, 4096, 266240, 4112, 266256), i = new Array(0, 1024, 32, 1056, 0, 1024, 32, 1056, 33554432, 33555456, 33554464, 33555488, 33554432, 33555456, 33554464, 33555488), c = new Array(0, 268435456, 524288, 268959744, 2, 268435458, 524290, 268959746, 0, 268435456, 524288, 268959744, 2, 268435458, 524290, 268959746), l = new Array(0, 65536, 2048, 67584, 536870912, 536936448, 536872960, 536938496, 131072, 196608, 133120, 198656, 537001984, 537067520, 537004032, 537069568), s = new Array(0, 262144, 0, 262144, 2, 262146, 2, 262146, 33554432, 33816576, 33554432, 33816576, 33554434, 33816578, 33554434, 33816578), d = new Array(0, 268435456, 8, 268435464, 0, 268435456, 8, 268435464, 1024, 268436480, 1032, 268436488, 1024, 268436480, 1032, 268436488), m = new Array(0, 32, 0, 32, 1048576, 1048608, 1048576, 1048608, 8192, 8224, 8192, 8224, 1056768, 1056800, 1056768, 1056800), C = new Array(0, 16777216, 512, 16777728, 2097152, 18874368, 2097664, 18874880, 67108864, 83886080, 67109376, 83886592, 69206016, 85983232, 69206528, 85983744), E = new Array(0, 4096, 134217728, 134221824, 524288, 528384, 134742016, 134746112, 16, 4112, 134217744, 134221840, 524304, 528400, 134742032, 134746128), P = new Array(0, 4, 256, 260, 0, 4, 256, 260, 1, 5, 257, 261, 1, 5, 257, 261), x = e.length > 8 ? 
3 : 1, y = new Array(32 * x), w = new Array(0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0), p, S, b = 0, h = 0, v, $ = 0; $ < x; $++) {var g = e.charCodeAt(b++) << 24 | e.charCodeAt(b++) << 16 | e.charCodeAt(b++) << 8 | e.charCodeAt(b++),k = e.charCodeAt(b++) << 24 | e.charCodeAt(b++) << 16 | e.charCodeAt(b++) << 8 | e.charCodeAt(b++);v = (g >>> 4 ^ k) & 252645135, k ^= v, g ^= v << 4, v = (k >>> -16 ^ g) & 65535, g ^= v, k ^= v << -16, v = (g >>> 2 ^ k) & 858993459, k ^= v, g ^= v << 2, v = (k >>> -16 ^ g) & 65535, g ^= v, k ^= v << -16, v = (g >>> 1 ^ k) & 1431655765, k ^= v, g ^= v << 1, v = (k >>> 8 ^ g) & 16711935, g ^= v, k ^= v << 8, v = (g >>> 1 ^ k) & 1431655765, k ^= v, g ^= v << 1, v = g << 8 | k >>> 20 & 240, g = k << 24 | k << 8 & 16711680 | k >>> 8 & 65280 | k >>> 24 & 240, k = v;for (var L = 0; L < w.length; L++) w[L] ? (g = g << 2 | g >>> 26, k = k << 2 | k >>> 26) : (g = g << 1 | g >>> 27, k = k << 1 | k >>> 27), g &= -15, k &= -15, p = t[g >>> 28] | n[g >>> 24 & 15] | o[g >>> 20 & 15] | a[g >>> 16 & 15] | r[g >>> 12 & 15] | i[g >>> 8 & 15] | c[g >>> 4 & 15], S = l[k >>> 28] | s[k >>> 24 & 15] | d[k >>> 20 & 15] | m[k >>> 16 & 15] | C[k >>> 12 & 15] | E[k >>> 8 & 15] | P[k >>> 4 & 15], v = (S >>> 16 ^ p) & 65535, y[h++] = p ^ v, y[h++] = S ^ v << 16}return y
}
// Base64 decoder: returns a binary string with one character per decoded byte
var decode = function(p) {
    var u = /[\t\n\f\r ]/g;
    var c = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
    p = String(p).replace(u, "");
    var m = p.length;
    m % 4 == 0 && (p = p.replace(/==?$/, ""), m = p.length);
    if (m % 4 == 1 || /[^+a-zA-Z0-9/]/.test(p)) {
        throw new Error("Invalid character: the string to be decoded is not correctly encoded.");
    }
    for (var _ = 0, T, w, y = "", E = -1; ++E < m; ) {
        w = c.indexOf(p.charAt(E));
        T = _ % 4 ? T * 64 + w : w;
        _++ % 4 && (y += String.fromCharCode(255 & T >> (-2 * _ & 6)));
    }
    return y;
}function Vc(e, t, n, o, a, r) {var i = new Array(16843776, 0, 65536, 16843780, 16842756, 66564, 4, 65536, 1024, 16843776, 16843780, 1024, 16778244, 16842756, 16777216, 4, 1028, 16778240, 16778240, 66560, 66560, 16842752, 16842752, 16778244, 65540, 16777220, 16777220, 65540, 0, 1028, 66564, 16777216, 65536, 16843780, 4, 16842752, 16843776, 16777216, 16777216, 1024, 16842756, 65536, 66560, 16777220, 1024, 4, 16778244, 66564, 16843780, 65540, 16842752, 16778244, 16777220, 1028, 66564, 16843776, 1028, 16778240, 16778240, 0, 65540, 66560, 0, 16842756),c = new Array(-2146402272, -2147450880, 32768, 1081376, 1048576, 32, -2146435040, -2147450848, -2147483616, -2146402272, -2146402304, -2147483648, -2147450880, 1048576, 32, -2146435040, 1081344, 1048608, -2147450848, 0, -2147483648, 32768, 1081376, -2146435072, 1048608, -2147483616, 0, 1081344, 32800, -2146402304, -2146435072, 32800, 0, 1081376, -2146435040, 1048576, -2147450848, -2146435072, -2146402304, 32768, -2146435072, -2147450880, 32, -2146402272, 1081376, 32, 32768, -2147483648, 32800, -2146402304, 1048576, -2147483616, 1048608, -2147450848, -2147483616, 1048608, 1081344, 0, -2147450880, 32800, -2147483648, -2146435040, -2146402272, 1081344),l = new Array(520, 134349312, 0, 134348808, 134218240, 0, 131592, 134218240, 131080, 134217736, 134217736, 131072, 134349320, 131080, 134348800, 520, 134217728, 8, 134349312, 512, 131584, 134348800, 134348808, 131592, 134218248, 131584, 131072, 134218248, 8, 134349320, 512, 134217728, 134349312, 134217728, 131080, 520, 131072, 134349312, 134218240, 0, 512, 131080, 134349320, 134218240, 134217736, 512, 0, 134348808, 134218248, 131072, 134217728, 134349320, 8, 131592, 131584, 134217736, 134348800, 134218248, 520, 134348800, 131592, 8, 134348808, 131584),s = new Array(8396801, 8321, 8321, 128, 8396928, 8388737, 8388609, 8193, 0, 8396800, 8396800, 8396929, 129, 0, 8388736, 8388609, 1, 8192, 8388608, 8396801, 128, 8388608, 8193, 8320, 8388737, 1, 8320, 8388736, 8192, 8396928, 
8396929, 129, 8388736, 8388609, 8396800, 8396929, 129, 0, 0, 8396800, 8320, 8388736, 8388737, 1, 8396801, 8321, 8321, 128, 8396929, 129, 1, 8192, 8388609, 8193, 8396928, 8388737, 8193, 8320, 8388608, 8396801, 128, 8388608, 8192, 8396928),d = new Array(256, 34078976, 34078720, 1107296512, 524288, 256, 1073741824, 34078720, 1074266368, 524288, 33554688, 1074266368, 1107296512, 1107820544, 524544, 1073741824, 33554432, 1074266112, 1074266112, 0, 1073742080, 1107820800, 1107820800, 33554688, 1107820544, 1073742080, 0, 1107296256, 34078976, 33554432, 1107296256, 524544, 524288, 1107296512, 256, 33554432, 1073741824, 34078720, 1107296512, 1074266368, 33554688, 1073741824, 1107820544, 34078976, 1074266368, 256, 33554432, 1107820544, 1107820800, 524544, 1107296256, 1107820800, 34078720, 0, 1074266112, 1107296256, 524544, 33554688, 1073742080, 524288, 0, 1074266112, 34078976, 1073742080),m = new Array(536870928, 541065216, 16384, 541081616, 541065216, 16, 541081616, 4194304, 536887296, 4210704, 4194304, 536870928, 4194320, 536887296, 536870912, 16400, 0, 4194320, 536887312, 16384, 4210688, 536887312, 16, 541065232, 541065232, 0, 4210704, 541081600, 16400, 4210688, 541081600, 536870912, 536887296, 16, 541065232, 4210688, 541081616, 4194304, 16400, 536870928, 4194304, 536887296, 536870912, 16400, 536870928, 541081616, 4210688, 541065216, 4210704, 541081600, 0, 541065232, 16, 16384, 541065216, 4210704, 16384, 4194320, 536887312, 0, 541081600, 536870912, 4194320, 536887312),C = new Array(2097152, 69206018, 67110914, 0, 2048, 67110914, 2099202, 69208064, 69208066, 2097152, 0, 67108866, 2, 67108864, 69206018, 2050, 67110912, 2099202, 2097154, 67110912, 67108866, 69206016, 69208064, 2097154, 69206016, 2048, 2050, 69208066, 2099200, 2, 67108864, 2099200, 67108864, 2099200, 2097152, 67110914, 67110914, 69206018, 69206018, 2, 2097154, 67108864, 67110912, 2097152, 69208064, 2050, 2099202, 69208064, 2050, 67108866, 69208066, 69206016, 2099200, 0, 2, 69208066, 0, 2099202, 69206016, 
2048, 67108866, 67110912, 2048, 2097154),E = new Array(268439616, 4096, 262144, 268701760, 268435456, 268439616, 64, 268435456, 262208, 268697600, 268701760, 266240, 268701696, 266304, 4096, 64, 268697600, 268435520, 268439552, 4160, 266240, 262208, 268697664, 268701696, 4160, 0, 0, 268697664, 268435520, 268439552, 266304, 262144, 266304, 262144, 268701696, 4096, 64, 268697664, 4096, 266304, 268439552, 64, 268435520, 268697600, 268697664, 268435456, 262144, 268439616, 0, 268701760, 262208, 268435520, 268697600, 268439552, 268439616, 0, 268701760, 266240, 266240, 4160, 4160, 262208, 268435456, 268701696),P = Fc(e), x = 0, y, w, p, S, b, h, v, $, g, k, L, F, N, V, H = t.length, q = 0, ce = P.length == 32 ? 3 : 9;ce == 3 ? $ = n ? new Array(0, 32, 2) : new Array(30, -2, -2) : $ = n ? new Array(0, 32, 2, 62, 30, -2, 64, 96, 2) : new Array(94, 62, -2, 32, 64, 2, 30, -2, -2), r == 2 ? t += "        " : r == 1 ? n && (p = 8 - H % 8, t += String.fromCharCode(p, p, p, p, p, p, p, p), p === 8 && (H += 8)) : r || (t += "\0\0\0\0\0\0\0\0");var Y = "", f = "";for (o == 1 && (g = a.charCodeAt(x++) << 24 | a.charCodeAt(x++) << 16 | a.charCodeAt(x++) << 8 | a.charCodeAt(x++), L = a.charCodeAt(x++) << 24 | a.charCodeAt(x++) << 16 | a.charCodeAt(x++) << 8 | a.charCodeAt(x++), x = 0); x < H;) {for (h = t.charCodeAt(x++) << 24 | t.charCodeAt(x++) << 16 | t.charCodeAt(x++) << 8 | t.charCodeAt(x++), v = t.charCodeAt(x++) << 24 | t.charCodeAt(x++) << 16 | t.charCodeAt(x++) << 8 | t.charCodeAt(x++), o == 1 && (n ? 
(h ^= g, v ^= L) : (k = g, F = L, g = h, L = v)), p = (h >>> 4 ^ v) & 252645135, v ^= p, h ^= p << 4, p = (h >>> 16 ^ v) & 65535, v ^= p, h ^= p << 16, p = (v >>> 2 ^ h) & 858993459, h ^= p, v ^= p << 2, p = (v >>> 8 ^ h) & 16711935, h ^= p, v ^= p << 8, p = (h >>> 1 ^ v) & 1431655765, v ^= p, h ^= p << 1, h = h << 1 | h >>> 31, v = v << 1 | v >>> 31, w = 0; w < ce; w += 3) {for (N = $[w + 1], V = $[w + 2], y = $[w]; y != N; y += V) S = v ^ P[y], b = (v >>> 4 | v << 28) ^ P[y + 1], p = h, h = v, v = p ^ (c[S >>> 24 & 63] | s[S >>> 16 & 63] | m[S >>> 8 & 63] | E[S & 63] | i[b >>> 24 & 63] | l[b >>> 16 & 63] | d[b >>> 8 & 63] | C[b & 63]);p = h, h = v, v = p}h = h >>> 1 | h << 31, v = v >>> 1 | v << 31, p = (h >>> 1 ^ v) & 1431655765, v ^= p, h ^= p << 1, p = (v >>> 8 ^ h) & 16711935, h ^= p, v ^= p << 8, p = (v >>> 2 ^ h) & 858993459, h ^= p, v ^= p << 2, p = (h >>> 16 ^ v) & 65535, v ^= p, h ^= p << 16, p = (h >>> 4 ^ v) & 252645135, v ^= p, h ^= p << 4, o == 1 && (n ? (g = h, L = v) : (h ^= k, v ^= F)), f += String.fromCharCode(h >>> 24, h >>> 16 & 255, h >>> 8 & 255, h & 255, v >>> 24, v >>> 16 & 255, v >>> 8 & 255, v & 255), q += 8, q == 512 && (Y += f, f = "", q = 0)}if (Y += f, Y = Y.replace(/\0*$/g, ""), !n) {if (r === 1) {var H = Y.length, _ = 0;H && (_ = Y.charCodeAt(H - 1)), _ <= 8 && (Y = Y.substring(0, H - _))}Y = decodeURIComponent(escape(Y))}return Y
}
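For readers who want to follow the non-cipher parts of this script in Python, the sketch below mirrors three steps as read from the listing: the Base64 `decode` helper, the padding removal at the end of `Vc` when its last argument is 1, and the final `decodeURIComponent(escape(Y))` conversion. The function names are hypothetical and the cipher rounds themselves are not ported, so this does not replace the JS file.

```python
import base64


def decode_to_binary_string(b64_text: str) -> str:
    """Counterpart of the JS `decode`: Base64 text -> one character per byte."""
    cleaned = "".join(b64_text.split())      # drop whitespace, as the JS does
    cleaned += "=" * (-len(cleaned) % 4)     # the JS tolerates missing '=' padding
    return base64.b64decode(cleaned).decode("latin-1")


def strip_padding(plain: str, block_size: int = 8) -> str:
    """Mirror the tail of Vc: drop trailing NULs, then remove n trailing
    characters when the last character code n is <= block_size."""
    plain = plain.rstrip("\0")
    if plain:
        n = ord(plain[-1])
        if n <= block_size:
            plain = plain[: max(len(plain) - n, 0)]
    return plain


def binary_string_to_text(binary: str) -> str:
    """Equivalent of decodeURIComponent(escape(s)): reinterpret bytes as UTF-8."""
    return binary.encode("latin-1").decode("utf-8")


print(decode_to_binary_string("aGVsbG8="))       # hello
print(strip_padding("hello\x03\x03\x03"))        # hello
```

Keeping the intermediate value as a one-character-per-byte string matches how the JS passes data between `decode` and `Vc`.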

Then call the JS file from the basic Python scraper code to get the final data:

import requests
import execjs

# Request headers generated from the captured request
headers = {
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
    'Connection': 'keep-alive',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Origin': 'https://wx.qmpsee.com',
    'Platform': 'web',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-site',
    'Source': 'see',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36 Edg/141.0.0.0',
    'appflag': 'see-h5-1.0.0',
    'sec-ch-ua': '"Microsoft Edge";v="141", "Not?A_Brand";v="8", "Chromium";v="141"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
}

# Form fields of the POST request
data = {
    'page': '1',
    'num': '20',
    'ca_uuid': 'feef62bfdac45a94b9cd89aed5c235be',
    'appflag': 'see-h5-1.0.0',
}

response = requests.post('https://wyiosapi.qmpsee.com/Web/getCaDetail', headers=headers, data=data)
encrypt_data = response.json()['encrypt_data']

# Load the decryption script saved above and call Kc on the ciphertext
with open('encrypt.js', 'r', encoding='utf-8') as f:
    js_code = f.read()

result = execjs.compile(js_code).call('Kc', encrypt_data)
print(result)

And that is the end. Readers can dig through this scattered data on their own.
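As one possible starting point for that exploration: since Kc ends with JSON.parse, the value returned to Python should already be nested dicts and lists rather than a string. The sketch below flattens such a structure into path and value pairs so the fields can be scanned quickly. The sample dictionary is hypothetical and does not reflect the site's real schema.

```python
from typing import Any, Iterator


def iter_leaves(node: Any, path: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (path, value) for every non-container value in nested dicts/lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_leaves(value, f"{path}.{key}" if path else str(key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_leaves(value, f"{path}[{index}]")
    else:
        yield path, node


# Hypothetical stand-in for the decrypted result.
sample_result = {"list": [{"title": "A", "summary": "B"}], "count": 1}
for leaf_path, leaf_value in iter_leaves(sample_result):
    print(leaf_path, "=", leaf_value)
```

Printing the paths first makes it easier to decide which keys hold the titles and summaries before writing any extraction code.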
