Yuanrenxue Web Crawler Attack and Defense Contest, Problem 7: Dynamic Fonts, Drifting with the Wind

Solution steps

  1. Inspect the network traffic.

  2. The session in the captured request carries no encrypted fields, and the request headers contain no encrypted parameters either.

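  3. Values such as `&#xb428` in the response correspond one-to-one to the digits shown on the page, so the obvious approach is to build a dictionary, fetch each page, and look the values up. Unfortunately it is not that simple: request the same page again and the correspondence has changed.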
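
To make the lookup idea concrete, here is a minimal sketch that splits one `value` string from the response into its hex codes. The helper name `parse_codes` is illustrative and not part of the original code; it assumes entries are separated by spaces and prefixed with `&#x`, as in the sample response in this article.

```python
def parse_codes(value):
    """Split a value such as '&#xc591 &#xb684 ' into ['c591', 'b684']."""
    # str.split() with no argument also discards the trailing space.
    return [token.replace("&#x", "") for token in value.split()]


print(parse_codes("&#xc591 &#xb684 &#xc591 &#xa435 "))
```

Since the correspondence changes per request, these codes are only meaningful together with the rest of the response they came from.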

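  4. So the idea above does not work. The `woff` field in the response changes along with the correspondence, which suggests the two are related. Set a breakpoint and trigger the request.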

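    A woff file is a font file; in effect it is a table that maps character codes to glyphs. Take `&#xc134` (rendered as 섴): `&#x` is the prefix of the character reference and `c134` is the character's code.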
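
As a small illustration of that relationship, the standard library can resolve a character reference to the character and back to its hex code. This only shows how the reference and the code relate; it does not involve the font at all.

```python
import html

ref = "&#xc134;"                      # character reference as written in HTML
char = html.unescape(ref)             # the character the browser receives
code_point = format(ord(char), "x")   # its code point as lowercase hex

print(char, code_point)
```

Which digit that character is drawn as depends entirely on the glyph the font assigns to this code point.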

    ttf = data.woff;
    $('.font').text('').append('<style type="text/css">@font-face { font-family:"fonteditor";src: url(data:font/truetype;charset=utf-8;base64,' + ttf + '); }</style>');
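
The data URI above labels the payload `font/truetype` even though the field is called `woff`. One way to check which container the bytes really are is to look at the 4-byte signature at the start of the decoded data. The sketch below is an assumption-labelled helper (`sniff_font` and `SIGNATURES` are names made up here), fed with the first 16 characters of the sample `woff` value from this article.

```python
import struct
from base64 import b64decode

# Well-known 4-byte signatures at the start of a font file.
SIGNATURES = {
    b"\x00\x01\x00\x00": "truetype",
    b"OTTO": "opentype-cff",
    b"wOFF": "woff",
    b"wOF2": "woff2",
}


def sniff_font(b64_text):
    """Return (format, table_count) for a base64-encoded font.

    b64_text may be a prefix, as long as its length is a multiple of 4
    and it covers at least the first 6 bytes of the file.
    """
    raw = b64decode(b64_text)
    kind = SIGNATURES.get(raw[:4], "unknown")
    tables = None
    if kind in ("truetype", "opentype-cff"):
        # In a raw sfnt file, bytes 4-5 hold the number of tables.
        tables = struct.unpack(">H", raw[4:6])[0]
    return kind, tables


print(sniff_font("AAEAAAAKAIAAAwAg"))
```

For the sample value this reports a plain TrueType container, which is consistent with saving the decoded bytes under a `.ttf` name.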
    

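    The `src` is not a download URL but a data URI: the base64 text of the `woff` field is embedded directly and declared as `font/truetype`. Decode it with Python and save it as a `.ttf` file: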

    from fontTools.ttLib import TTFont  # pip install fontTools
    from base64 import b64decode
    from parsel import Selector  # pip install parsel


    def demo(data):
        """data is the content returned by the API"""
        with open('7.ttf', mode='wb') as file:
            file.write(b64decode(data['woff']))  # base64-decode the woff field and write it to a file
        font = TTFont('7.ttf')  # load the font file
        font.saveXML('7.xml')  # save it as an XML file
        # read the XML file
        with open('7.xml', mode='r', encoding='utf-8') as f:
            xml_data = f.read()
        select = Selector(xml_data)
        glyf = select.css('glyf > TTGlyph')  # all TTGlyph tags under glyf
        for TTGlyph in glyf[1:]:  # tag 0 is not needed, so iterate from element 1
            name = TTGlyph.css('::attr(name)').get().replace('uni', '')  # name attribute with 'uni' stripped
            pt_tag = TTGlyph.css('pt')  # all pt tags under this TTGlyph
            on_list = []
            for pt in pt_tag:  # iterate over the pt tags
                on = pt.css('::attr(on)').get()  # the on attribute of this pt tag
                on_list.append(on)  # collect the on value
            print(f"'{''.join(on_list)}': '{name}',")  # print in dictionary-entry form
            # ''.join(on_list) is the dictionary key
            # name is the dictionary value


    resp = {"woff": "AAEAAAAKAIAAAwAgT1MvMv/BOMUAAAEoAAAAYGNtYXDlXV9jAAABpAAAAYZnbHlmS99AtgAAA0QAAAQCaGVhZB6SqjgAAACsAAAANmhoZWEG0QEyAAAA5AAAACRobXR4ArwAAAAAAYgAAAAabG9jYQWOBpkAAAMsAAAAGG1heHABGABFAAABCAAAACBuYW1lUGhGMAAAB0gAAAJzcG9zdCjmdk0AAAm8AAAAiAABAAAAAQAA83P19l8PPPUACQPoAAAAANnIUd8AAAAA418UhQAH/+wCRwMDAAAACAACAAAAAAAAAAEAAAQk/qwAfgJYAAAAKwItAAEAAAAAAAAAAAAAAAAAAAACAAEAAAALADkAAwAAAAAAAgAAAAoACgAAAP8AAAAAAAAABAIqAZAABQAIAtED0wAAAMQC0QPTAAACoABEAWkAAAIABQMAAAAAAAAAAAAAEAAAAAAAAAAAAAAAUGZFZABApDXINwQk/qwAfgQkAVQAAAABAAAAAAAAAAAAAAAgAAAAZAAAAlgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAMAAAAcAAEAAAAAAIAAAwABAAAAHAAEAGQAAAAUABAAAwAEpDWmhaeTtoTCV8KGxCHFkcg3//8AAKQ1poWnk7aEwlfChcQhxZHIN///W89ZhVhwSYQ9sgAAO+A6dTfQAAEAAAAAAAAAAAAAAAoAAAAAAAAAAAACAAUAAAEGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAALADoAcgDIAQkBHwFiAX8BsAHuAgEAAQAH/+wAIwAHAAIAADczFQccBxsAAAIAI//yAhcDAwAMABkAAAEmBwYQFxYyNzYQJyYHMhcWEAcGIicmEDc2ASR8REFBRP1rBwdrgWUfKSkfvCEfHyEC3iWiU/7EdWtrdQE8U6J4bDn+9DtdXTsBDDlsAAABACr/8gIhAt4AJAAAEwMzNjM2MxYWFRQGIyInJjcjFhcWMzI3NjU0JiMGBwYHIzchNWMVQhwpIi1ZY3RGSBs7BWMURVJCfloyfHIfNisbDCABWALe/lVOKBtIZFVcKRg/PVAyTjZjeIYDFQsu+10AAwAj//ICJQLeAB8ALAA4AAABIgcGFRQXFhc1JgcGFRQWMjc2NTQnJgcVNjc0NTQnJgcyFxYUBwYiJyY0NzYTMhcWFAcGIiY0NzYBJGk4NQEcRTgTQov4OEcnI0BDMidUcVMlU0kjqAZJKSxDXCU2NiifZjQyAt5AJWc4HjkJAwRJMUZhbjk1YUYxSQQDCTkeOGclQFcfD38bJiYbfw8f/tYjOoIkJkqCOiMAAAIAKv/yAhoDAwAbACgAAAEiBwYVFBcWFzY2NCYjIgciByM3NDc2FzYXMyYDFjcWFAcGJyInJjQ2AUFsUllBRZFbe0iNMyNAJQcNSBdVjiArKLtGUgInJ0xaIi9gAt6DS7N7m1QBAX/ich1MJHFSWg8Pd9z+mhsncZ8rPQkqfkd0AAABAHAAAAFoAt4ACQAAAQYGBxU2NxEzEQEjH2oqfChUAt5ALhhUHDv9pQLeAAEAKv/yAhcC3gArAAABIgcGFzM2NjcWFxYUBiMjFTMWFhQHBiMmJyYnIxYXFhcyNjU0JyYHNjU0JgE3XFRHC0kTPlREHkVdTEN
NPmERTEJfMikFWBRHP31ceiIVRXRqAt5ALm5CRwMDIw+MMEwCRX0xKxEXQEB8KUEBfGQ7OWMzJ4NBfQAAAgAaAAACRwLeAAoADgAAAQEVIRUzNTM1IxEHMxEhAYP+lwFpSnp6UAb+1gLe/hlilZVDAgaL/oUAAAEANAAAAjAC3gAdAAABIgYHMzY3NhcyFhUUBwYHBgcGFSE1ITY3Njc2NCYBQW2JC0cRJSBaRnJaHltzJk8B5v58P2hwJl2UAt6Wckw3NgRKIV4ZQz1VLjtrQFFfVmMJqYMAAgAq//ICFwLeABwAKAAAASIGFRQXFjMWNjczFxQHBiMiJyMWMzY3NjU0JyYHMhcWFAYjIiY1NDYBJGCaRTqYCG4YEQ9FM1NgOEEEzXlNNS5OhGYgUo9JQ0dPAt6Fc3I6Ogk7NAmDWEhstAF4Wb2hTm5PNjeUVIIqXksAAAEASwAAAg8C3gAGAAATFSEBMwE1SwGG/uBTAQsC3mX9hwKeQAAAAAAAABIA3gABAAAAAAAAABcAAAABAAAAAAABAAwAFwABAAAAAAACAAcAIwABAAAAAAADABQAKgABAAAAAAAEABQAKgABAAAAAAAFAAsAPgABAAAAAAAGABQAKgABAAAAAAAKACsASQABAAAAAAALABMAdAADAAEECQAAAC4AhwADAAEECQABABgAtQADAAEECQACAA4AzQADAAEECQADACgA2wADAAEECQAEACgA2wADAAEECQAFABYBAwADAAEECQAGACgA2wADAAEECQAKAFYBGQADAAEECQALACYBb0NyZWF0ZWQgYnkgZm9udC1jYXJyaWVyLlBpbmdGYW5nIFNDUmVndWxhci5QaW5nRmFuZy1TQy1SZWd1bGFyVmVyc2lvbiAxLjBHZW5lcmF0ZWQgYnkgc3ZnMnR0ZiBmcm9tIEZvbnRlbGxvIHByb2plY3QuaHR0cDovL2ZvbnRlbGxvLmNvbQBDAHIAZQBhAHQAZQBkACAAYgB5ACAAZgBvAG4AdAAtAGMAYQByAHIAaQBlAHIALgBQAGkAbgBnAEYAYQBuAGcAIABTAEMAUgBlAGcAdQBsAGEAcgAuAFAAaQBuAGcARgBhAG4AZwAtAFMAQwAtAFIAZQBnAHUAbABhAHIAVgBlAHIAcwBpAG8AbgAgADEALgAwAEcAZQBuAGUAcgBhAHQAZQBkACAAYgB5ACAAcwB2AGcAMgB0AHQAZgAgAGYAcgBvAG0AIABGAG8AbgB0AGUAbABsAG8AIABwAHIAbwBqAGUAYwB0AC4AaAB0AHQAcAA6AC8ALwBmAG8AbgB0AGUAbABsAG8ALgBjAG8AbQAAAgAAAAAAAAAOAAAAAAAAAAAAAAAAAAAAAAAAAAAACwALAAABCgEEAQkBCAELAQYBBQEDAQIBBwd1bmljMjU3B3VuaWI2ODQHdW5pYzI4NQd1bmljODM3B3VuaWM1OTEHdW5pYTY4NQd1bmlhNDM1B3VuaWE3OTMHdW5pYzQyMQd1bmljMjg2","status": "1","state": "success","data": [{"value": "&#xc591 &#xb684 &#xc591 &#xa435 "},{"value": "&#xc285 &#xc421 &#xc837 &#xc286 "},{"value": "&#xc591 &#xc257 &#xc285 &#xa793 "},{"value": "&#xa793 &#xc285 &#xc285 &#xc421 "},{"value": "&#xa685 &#xc421 &#xc591 &#xa685 "},{"value": "&#xa793 &#xa793 &#xc257 &#xa793 "},{"value": "&#xb684 &#xc286 &#xc257 &#xc421 "},{"value": "&#xa793 &#xc837 &#xc421 &#xc421 "},{"value": "&#xc837 &#xc285 &#xc421 &#xc421 "},{"value": 
"&#xa685 &#xc837 &#xa685 &#xa793 "}]
    }
    demo(resp)
    

    Running it prints the mapping.
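
To show what the script reads from the XML, here is a self-contained sketch that uses only the standard library on a tiny hand-written fragment shaped like the fontTools dump, where each outline point is a `pt` element with an `on` attribute. The fragment, its coordinates, and the name `on_signatures` are made up for illustration. Unlike the script above, which skips tag 0 by position, this version skips glyphs that have no points.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; coordinates and flags are invented.
SAMPLE_TTX = """
<ttFont>
  <glyf>
    <TTGlyph name=".notdef"/>
    <TTGlyph name="unic257">
      <contour>
        <pt x="10" y="0" on="1"/>
        <pt x="10" y="50" on="0"/>
        <pt x="40" y="50" on="1"/>
      </contour>
      <contour>
        <pt x="20" y="10" on="1"/>
      </contour>
    </TTGlyph>
  </glyf>
</ttFont>
"""


def on_signatures(ttx_text):
    """Map each glyph code (name without 'uni') to its joined on flags."""
    root = ET.fromstring(ttx_text)
    result = {}
    for glyph in root.findall("./glyf/TTGlyph"):
        points = glyph.findall(".//pt")  # all points, in document order
        if not points:                   # skip empty glyphs such as .notdef
            continue
        code = glyph.get("name").replace("uni", "")
        result[code] = "".join(pt.get("on") for pt in points)
    return result


print(on_signatures(SAMPLE_TTX))
```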
     

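  5. Turn that output into the mapping dictionary. The script prints each glyph's `on` sequence next to its code; replacing the code with the digit that glyph renders as gives a dictionary keyed by `on` sequence, which can be reused across requests even though the codes change.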

    on_map = {
        '1001101111': '1',
        '101010101101010001010101101010101010010010010101001000010': '8',
        '10101010100001010111010101101010010101000': '6',
        '10100100100101010010010010': '0',
        '1110101001001010110101010100101011111': '5',
        '10010101001110101011010101010101000100100': '9',
        '100110101001010101011110101000': '2',
        '111111111111111': '4',
        '1111111': '7',
        '10101100101000111100010101011010100101010100': '3',
    }
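
Because the dictionary is assembled by hand, a quick sanity check can catch copy mistakes. The sketch below (the function name `check_on_map` is an assumption) verifies that there are ten distinct keys, that every key consists only of 0 and 1, and that each digit appears exactly once as a value.

```python
on_map = {
    '1001101111': '1',
    '101010101101010001010101101010101010010010010101001000010': '8',
    '10101010100001010111010101101010010101000': '6',
    '10100100100101010010010010': '0',
    '1110101001001010110101010100101011111': '5',
    '10010101001110101011010101010101000100100': '9',
    '100110101001010101011110101000': '2',
    '111111111111111': '4',
    '1111111': '7',
    '10101100101000111100010101011010100101010100': '3',
}


def check_on_map(mapping):
    """Return True if the mapping covers the digits 0-9 exactly once."""
    keys_ok = all(set(key) <= {"0", "1"} for key in mapping)
    values_ok = sorted(mapping.values()) == [str(d) for d in range(10)]
    return len(mapping) == 10 and keys_ok and values_ok


print(check_on_map(on_map))
```

The check says nothing about whether each sequence is paired with the right digit; that still has to be confirmed against the rendered glyphs.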
    
  6. With the mapping dictionary in hand, request the API and decode the correct digits.

    from fontTools.ttLib import TTFont  # pip install fontTools
    from base64 import b64decode
    from parsel import Selector
    import requests


    def save_font(font_data):
        on_map = {
            '1001101111': '1',
            '101010101101010001010101101010101010010010010101001000010': '8',
            '10101010100001010111010101101010010101000': '6',
            '10100100100101010010010010': '0',
            '1110101001001010110101010100101011111': '5',
            '10010101001110101011010101010101000100100': '9',
            '100110101001010101011110101000': '2',
            '111111111111111': '4',
            '1111111': '7',
            '10101100101000111100010101011010100101010100': '3',
        }
        with open('7.ttf', mode='wb') as f:
            f.write(b64decode(font_data['woff']))  # save the font file
        font = TTFont('7.ttf')  # load the font file
        font.saveXML('7.xml')  # save it as an XML file
        # read the XML file
        with open('7.xml', mode='r', encoding='utf-8') as f:
            xml_data = f.read()
        select = Selector(xml_data)
        # <glyf> --> all TTGlyph tags
        TTGlyph = select.css('glyf > TTGlyph')[1:]  # tag 0 is not needed; start from tag 1
        rep_dist = {}
        for tt in TTGlyph:
            name = tt.css('::attr(name)').get().replace('uni', '')  # TTGlyph tag --> name value
            pt = tt.css('pt')  # glyf tag --> TTGlyph tag --> pt tags, whose on values are needed
            on_list = []
            for pt_tag in pt:
                on_list.append(pt_tag.css('::attr(on)').get())
            rep_dist[name] = on_map[''.join(on_list)]  # map the on sequence to the real digit
        result_dict = []
        for data in font_data['data']:
            num_list = []
            for nums in data['value'].replace('&#x', '').split(' ')[0:-1]:
                num_list.append(rep_dist[nums])
                # print(rep_dist[nums], end='')
            result_dict.append(int(''.join(num_list)))
            # print()
        return result_dict


    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36"
    }
    cookies = {
        "sessionid": "mhgiqaaxkqpt3ybutbub96ubi9rr5gtk"
    }
    url = "https://match.yuanrenxue.cn/api/match/7?page=1"
    resp = requests.get(url, headers=headers, cookies=cookies)
    print(save_font(resp.json()))
    

    Running it prints the decoded numbers, which match the data on page 1, so the decoding works.
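
The decoding is a two-level lookup: code to `on` sequence (from the font of the current response), then `on` sequence to digit (from the fixed dictionary). The sketch below isolates that logic from file and network I/O. The function names and the `signatures` sample are assumptions for illustration; the three `on` sequences are taken from the dictionary above, but their pairing with these codes is invented.

```python
def build_code_map(signatures, on_map):
    """Combine {code: on_sequence} with {on_sequence: digit} into {code: digit}."""
    return {code: on_map[sig] for code, sig in signatures.items()}


def decode_rows(rows, code_map):
    """Turn [{'value': '&#xc591 &#xb684 '}, ...] into a list of ints."""
    numbers = []
    for row in rows:
        codes = row["value"].replace("&#x", "").split()
        numbers.append(int("".join(code_map[c] for c in codes)))
    return numbers


# Subset of the fixed dictionary, enough for this example.
on_map = {"1111111": "7", "111111111111111": "4", "1001101111": "1"}
# Hypothetical per-response signatures; real ones come from the font.
signatures = {"c591": "1111111", "b684": "111111111111111", "a435": "1001101111"}

code_map = build_code_map(signatures, on_map)
print(decode_rows([{"value": "&#xc591 &#xb684 &#xc591 &#xa435 "}], code_map))
```

Only `signatures` has to be rebuilt for every response; `on_map` stays fixed.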

  7. The next step is to get all the summoner names, which are handled in the page's JS code.
     

    The corresponding JS code:

    let page = 1;
    let name = ['极镀ギ紬荕', '爷灬霸气傀儡', '梦战苍穹', '傲世哥', 'мaη肆風聲', '一刀メ隔世', '横刀メ绝杀', 'Q不死你R死你', '魔帝殤邪', '封刀不再战', '倾城孤狼', '戎马江湖', '狂得像风', '影之哀伤', '謸氕づ独尊', '傲视狂杀', '追风之梦', '枭雄在世', '傲视之巅', '黑夜刺客', '占你心为王', '爷来取你狗命', '御风踏血', '凫矢暮城', '孤影メ残刀', '野区霸王', '噬血啸月', '风逝无迹', '帅的睡不着', '血色杀戮者', '冷视天下', '帅出新高度', '風狆瑬蒗', '灵魂禁锢', 'ヤ地狱篮枫ゞ', '溅血メ破天', '剑尊メ杀戮', '塞外う飛龍', '哥‘K纯帅', '逆風祈雨', '恣意踏江山', '望断、天涯路', '地獄惡灵', '疯狂メ孽杀', '寂月灭影', '骚年霸称帝王', '狂杀メ无赦', '死灵的哀伤', '撩妹界扛把子', '霸刀☆藐视天下', '潇洒又能打', '狂卩龙灬巅丷峰', '羁旅天涯.', '南宫沐风', '风恋绝尘', '剑下孤魂', '一蓑烟雨', '领域★倾战', '威龙丶断魂神狙', '辉煌战绩', '屎来运赚', '伱、Bu够档次', '九音引魂箫', '骨子里的傲气', '霸海断长空', '没枪也很狂', '死魂★之灵'];
    let heroArray = []
    for (let i = 0; i <= 4; i++) {
        let yyq = 1;
        // ['', '', '', '', '', '', '', '', '', ''] corresponds to the ten entries on one page
        ['', '', '', '', '', '', '', '', '', ''].forEach((index, val) => {
            // console.log(name[yyq + (page - 1) * 10]);
            heroArray.push(name[yyq + (page - 1) * 10])
            yyq += 1
        })
        page += 1;
    }
    console.log(heroArray)
    

    Running it prints the names, which match the page.
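
For use alongside the Python decoder, the same indexing can be sketched in Python. The JS reads `name[yyq + (page - 1) * 10]` with `yyq` running from 1 to 10, so page 1 takes indices 1 to 10 and index 0 is never used. The function name `names_for_page` and the placeholder list are assumptions; the real list is the `name` array above.

```python
def names_for_page(names, page, page_size=10):
    """Mirror the JS indexing name[yyq + (page - 1) * 10] for yyq = 1..10."""
    start = 1 + (page - 1) * page_size  # index 0 is skipped, as in the JS
    return names[start:start + page_size]


# Placeholder data standing in for the real `name` array.
names = [f"player{i}" for i in range(60)]

print(names_for_page(names, 1)[0], names_for_page(names, 5)[-1])
```

One difference to keep in mind: an out-of-range index yields `undefined` in the JS, while a Python slice silently returns fewer items.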
