[NLP] 32. Transformers (HuggingFace Pipelines in Practice)
🤖 Transformers (HuggingFace Pipelines in Practice)
This tutorial uses Hugging Face's transformers library to show how pretrained models handle the following tasks:
- Sentiment analysis
- Text generation
- Translation
- Masked language modeling (fill-mask)
- Zero-shot classification
- Task-specific model inference (text2text generation, e.g., T5)
- Summarization
- Loading a custom classification model and running inference (e.g., BERT)
📦 Install dependencies
!pip install datasets evaluate transformers sentencepiece -q
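✅ Sentiment analysis
from transformers import pipeline

# With no model argument, the pipeline falls back to a default checkpoint for the task
classifier = pipeline("sentiment-analysis")

# Single sentence
classifier("I really enjoy learning new things every day.")

# Multiple sentences
classifier([
    "I really enjoy learning new things every day.",
    "I dislike rainy weather on weekends.",
])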
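The classifier returns one dict per input, each with a `label` and a `score`. A minimal sketch of pairing those results back with the input sentences follows; the helper name `pair_with_inputs` and the sample scores are hypothetical, used here only to illustrate the output shape without downloading a model.

```python
def pair_with_inputs(texts, results):
    """Attach each input text to its predicted label and rounded score."""
    return [
        {"text": text, "label": r["label"], "score": round(r["score"], 3)}
        for text, r in zip(texts, results)
    ]

sample_texts = [
    "I really enjoy learning new things every day.",
    "I dislike rainy weather on weekends.",
]
# Hypothetical results shaped like the classifier's output; not real model scores.
sample_results = [
    {"label": "POSITIVE", "score": 0.9991},
    {"label": "NEGATIVE", "score": 0.9873},
]

paired = pair_with_inputs(sample_texts, sample_results)
for row in paired:
    print(f'{row["label"]:<8} {row["score"]:.3f}  {row["text"]}')
```

In a real run, `sample_results` would be replaced by the value returned from `classifier(sample_texts)`.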
✍️ Text generation (default model)
from transformers import pipeline

generator = pipeline("text-generation")
generator("Today we are going to explore something exciting")
🎯 Text generation (with parameters)
# Return two sequences; max_length counts tokens (prompt included), not words
generator("Today we are going to explore something exciting", num_return_sequences=2, max_length=15)
⚙️ Generating text with the distilgpt2 model
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
    "Today we are going to explore something exciting",
    max_length=30,
    num_return_sequences=2,
)
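By default, each generated sequence comes back as a dict whose `generated_text` holds the prompt followed by the continuation. The sketch below strips the prompt so only the new text remains; the helper name `continuations` and the sample strings are hypothetical stand-ins for real model output.

```python
def continuations(prompt, results):
    """Return the generated text of each sequence with the prompt removed."""
    texts = []
    for r in results:
        full = r["generated_text"]
        # The pipeline's default output is prompt + continuation in one string.
        if full.startswith(prompt):
            full = full[len(prompt):]
        texts.append(full.strip())
    return texts

prompt = "Today we are going to explore something exciting"
# Hypothetical results shaped like the generator's output; not real generations.
sample_results = [
    {"generated_text": prompt + " about the ocean."},
    {"generated_text": prompt + ", so stay tuned."},
]

out = continuations(prompt, sample_results)
print(out)
```

In a real run, `sample_results` would be the list returned by `generator(prompt, ...)`.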
🌍 Translation (English → German)
from transformers import pipeline

translator = pipeline("translation_en_to_de")
translator("This is a great day to learn something new!")
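🧩 Masked language modeling (fill in the word)
from transformers import pipeline

unmasker = pipeline("fill-mask")
# The mask token is model-specific: RoBERTa-style models use <mask>, BERT-style models use [MASK]
unmasker("Life is like a <mask> of chocolates.")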
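The fill-mask pipeline returns a list of candidate fills, each a dict that includes `token_str`, `score`, and the completed `sequence`. A small sketch of ranking those candidates is shown below; `top_candidates` is a hypothetical helper, and the sample entries are invented to mirror the output shape (they carry only the keys the helper reads).

```python
def top_candidates(results, k=3):
    """Return the k highest-scoring fills as (token, score) pairs."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    # Some tokenizers keep a leading space in token_str, so strip it.
    return [(r["token_str"].strip(), round(r["score"], 3)) for r in ranked[:k]]

# Hypothetical results shaped like the unmasker's output; not real model scores.
sample_results = [
    {"score": 0.12, "token_str": " bag", "sequence": "Life is like a bag of chocolates."},
    {"score": 0.45, "token_str": " box", "sequence": "Life is like a box of chocolates."},
    {"score": 0.05, "token_str": " lot", "sequence": "Life is like a lot of chocolates."},
]

best_two = top_candidates(sample_results, k=2)
print(best_two)
```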
🧠 Zero-shot classification
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
classifier(
    "This tutorial is about machine learning and natural language processing.",
    candidate_labels=["education", "sports", "politics"],
)
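For a single input, the zero-shot pipeline returns a dict with the original `sequence` plus parallel `labels` and `scores` lists. One way to turn that into a decision is to accept the top label only when its score clears a threshold. The sketch below assumes that output shape; `best_label`, the 0.5 default threshold, and the sample scores are all hypothetical choices for illustration.

```python
def best_label(result, threshold=0.5):
    """Return the top-scoring label if it clears the threshold, else None."""
    label, score = max(zip(result["labels"], result["scores"]), key=lambda p: p[1])
    return label if score >= threshold else None

# Hypothetical result shaped like the classifier's output; not real model scores.
sample_result = {
    "sequence": "This tutorial is about machine learning and natural language processing.",
    "labels": ["education", "politics", "sports"],
    "scores": [0.85, 0.09, 0.06],
}

print(best_label(sample_result))
print(best_label(sample_result, threshold=0.9))
```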
🔁 T5 model (Text2Text)
from transformers import pipeline

text2text = pipeline("text2text-generation")
# T5 was trained with lowercase task prefixes such as "translate English to French:"
text2text("translate English to French: How are you today?")
✂️ Summarization
from transformers import pipeline

summarizer = pipeline("summarization")
summarizer(
    "Machine learning is a field of artificial intelligence that focuses on enabling machines to learn from data..."
)
🧪 Using your own model (BERT as an example)
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import pipeline

model_name = "nlptown/bert-base-multilingual-uncased-sentiment"

# Load the tokenizer and the classification model from the same checkpoint
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)

# Pass the loaded objects to the pipeline instead of a model name
pipe = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
pipe("I had an amazing experience with this product!")
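This checkpoint predicts a star rating rather than a POSITIVE/NEGATIVE label. The sketch below converts such labels to integers, assuming they follow the "1 star" to "5 stars" pattern described on the model card; `stars_from_label` is a hypothetical helper and the sample result is invented to mirror the output shape.

```python
def stars_from_label(label):
    """Convert a label such as '5 stars' or '1 star' to an integer rating."""
    return int(label.split()[0])

# Hypothetical result shaped like the pipeline's output; not a real model score.
sample_results = [{"label": "5 stars", "score": 0.73}]

ratings = [stars_from_label(r["label"]) for r in sample_results]
print(ratings)
```

Numeric ratings are easier to average or threshold than the raw label strings.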