Python Strings

news 2025/8/24 12:38:37

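🌟 Introduction

I. Creating and Representing Strings

II. String Immutability

III. Common String Operations

1. Indexing and Slicing

2. String Concatenation

3. String Formatting

4. Built-in Methods

IV. Advanced String Usage

1. Regular Expressions

2. String Encoding and Decoding

3. String Memory Management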

🌟 Introduction

In Python, the string (str) is one of the most frequently used data types; it stores and processes text data.

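I. Creating and Representing Strings

In Python, a string can be created with single quotes ('), double quotes ("), or triple quotes (''' or """). For example: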

str1 = 'Hello, world!'
str2 = "Python is awesome."
str3 = '''This is a
multi-line string.'''
str4 = """Another
multi-line string."""

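Single and double quotes are functionally equivalent. When the string itself contains one kind of quote, delimit it with the other kind to avoid escaping. For example: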

str1 = 'He said, "Hello!"'
str2 = "It's a sunny day."

Triple quotes create strings that can span multiple lines, which is useful for docstrings and for text that contains line breaks.
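The quoting rules above can be checked with a short sketch. The function name greet and the variable names are made up for this illustration; only the quoting behaviour comes from the text.

```python
def greet(name):
    """Return a greeting for name.

    The rest of this docstring spans several
    lines thanks to the triple quotes.
    """
    return 'Hello, ' + name + '!'


# Without switching quote styles, the inner quote must be escaped.
escaped = 'It\'s a sunny day.'
unescaped = "It's a sunny day."
print(escaped == unescaped)   # True

# A triple-quoted string stores its line break as a "\n" character.
banner = """Another
multi-line string."""
print(banner.count("\n"))     # 1

# The docstring is attached to the function object as __doc__.
print(greet.__doc__.splitlines()[0])   # Return a greeting for name.
print(greet("world"))                  # Hello, world!
```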

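II. String Immutability

Python strings are immutable: once a string has been created, its contents cannot be changed in place. For example:

str1 = "Hello"
str1[0] = 'h'  # TypeError: 'str' object does not support item assignment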

To "modify" a string, build a new one instead. For example:

str1 = "Hello"
str2 = "h" + str1[1:]
print(str2)  # Output: hello
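A minimal sketch of what immutability means in practice: item assignment fails, string methods hand back new objects, and rebinding a name leaves the old string untouched. The variable names here are illustrative only.

```python
text = "Hello"

# Item assignment is rejected with a TypeError.
try:
    text[0] = "h"
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# String methods return a new string; the original is unchanged.
lowered = text.lower()
print(text)      # Hello
print(lowered)   # hello

# Rebinding the name points it at a different object.
# The string that alias refers to is never altered.
alias = text
text = "h" + text[1:]
print(alias)     # Hello
print(text)      # hello
```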

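III. Common String Operations

1. Indexing and Slicing

Every character in a string has an index, and indexes start at 0. A single character can be accessed by its index. For example:

str1 = "Hello"
print(str1[0])  # Output: H
print(str1[-1])  # Output: o (negative indexes count from the end of the string)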

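Slicing extracts part of a string. The syntax is str[start:end:step], where start is the starting index (inclusive), end is the ending index (exclusive), and step is the stride (default 1). For example:

str1 = "Hello"
print(str1[1:4])  # Output: ell
print(str1[:3])   # Output: Hel (from the start up to, but not including, index 3)
print(str1[2:])   # Output: llo (from index 2 to the end)
print(str1[::-1]) # Output: olleH (reverses the string)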
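As a further sketch of the start:end:step syntax, the example below uses a step other than 1 and shows one behaviour worth knowing: slices quietly clip out-of-range bounds, whereas plain indexing raises IndexError. The sample string is chosen only for illustration.

```python
text = "Hello, world!"

every_other = text[::2]   # every second character, starting at index 0
last_six = text[-6:]      # the final six characters
middle = text[7:12]       # indexes 7 through 11

print(every_other)   # Hlo ol!
print(last_six)      # world!
print(middle)        # world

# Slices tolerate out-of-range bounds.
clipped = text[7:100]
empty = text[100:]
print(clipped)       # world!
print(repr(empty))   # ''

# Indexing a single position past the end does not.
try:
    text[100]
except IndexError:
    index_failed = True
    print("IndexError for text[100]")
```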

2. String Concatenation

The + operator joins two strings. For example:

str1 = "Hello"
str2 = "world"
str3 = str1 + " " + str2
print(str3)  # Output: Hello world

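The join() method concatenates an iterable of strings into one string, inserting the string it is called on between the items. For example:

str_list = ["Hello", "world", "Python"]
str1 = " ".join(str_list)
print(str1)  # Output: Hello world Python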
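One detail the examples above do not show is that both + and join() accept only strings. The sketch below, with made-up sample values, converts the items with str() before joining.

```python
values = ["apples", 3, 4.5]

# Joining a list that contains non-string items raises TypeError.
try:
    ", ".join(values)
except TypeError:
    join_failed = True
    print("join() needs every item to be a str")

# Convert each item to str first, then join.
csv_line = ", ".join(str(v) for v in values)
print(csv_line)   # apples, 3, 4.5

# The + operator has the same restriction, so convert explicitly.
label = "count: " + str(3)
print(label)      # count: 3
```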

3. String Formatting

Python offers several ways to format strings: the % operator, the str.format() method, and f-strings (Python 3.6+).

The % operator:

name = "Alice"
age = 25
str1 = "My name is %s and I am %d years old." % (name, age)
print(str1)  # Output: My name is Alice and I am 25 years old.

The str.format() method:

name = "Alice"
age = 25
str1 = "My name is {} and I am {} years old.".format(name, age)
print(str1)  # Output: My name is Alice and I am 25 years old.

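f-strings:

name = "Alice"
age = 25
str1 = f"My name is {name} and I am {age} years old."
print(str1)  # Output: My name is Alice and I am 25 years old.

f-strings were introduced in Python 3.6. The string literal is prefixed with f, and any expression placed inside braces { } is evaluated and substituted into the result. The syntax is concise and fast at run time, which makes f-strings the recommended way to format strings.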
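Beyond plain substitution, the braces can carry a format specification after a colon. The sketch below shows a few common specifications. The values are invented for illustration, and str.format() accepts the same specifications.

```python
name = "Alice"
price = 1234.5
ratio = 0.256

line1 = f"{name:<8}|"           # left-align in a field of width 8
line2 = f"{price:,.2f}"         # thousands separator, two decimals
line3 = f"{ratio:.1%}"          # percentage with one decimal place
line4 = f"{len(name) * 2}"      # any expression may appear in the braces
line5 = "{:>8}|".format(name)   # same spec syntax with str.format()

print(line1)   # Alice   |
print(line2)   # 1,234.50
print(line3)   # 25.6%
print(line4)   # 10
print(line5)   #    Alice|
```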

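4. Built-in Methods

The str class provides many built-in methods for processing and manipulating strings.

Some commonly used ones are listed below:

str.upper(): converts all characters in the string to uppercase.

str1 = "hello"
print(str1.upper())  # Output: HELLO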

str.lower(): converts all characters in the string to lowercase.

str1 = "HELLO"
print(str1.lower())  # Output: hello

str.strip(): removes leading and trailing whitespace (spaces, newlines, and so on).

str1 = "  hello  "
print(str1.strip())  # Output: hello

str.split(): splits the string into a list; by default the separator is any run of whitespace (spaces, newlines, and so on).

str1 = "hello world"
print(str1.split())  # Output: ['hello', 'world']

str.replace(old, new): returns a copy of the string with every occurrence of the substring old replaced by new.

str1 = "hello world"
print(str1.replace("world", "Python"))  # Output: hello Python

str.find(sub): searches for the substring sub and returns the index where its first occurrence starts, or -1 if it is not found.

str1 = "hello world"
print(str1.find("world"))  # Output: 6

str.startswith(prefix): checks whether the string starts with the given prefix.

str1 = "hello world"
print(str1.startswith("hello"))  # Output: True

str.endswith(suffix): checks whether the string ends with the given suffix.

str1 = "hello world"
print(str1.endswith("world"))  # Output: True
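Because each of these methods returns a new string (or list), calls can be chained. The sketch below combines several of the methods listed above to clean up a hypothetical "key: value" line. The input format and variable names are assumptions made for the example. It also shows a few optional arguments: an explicit separator and split limit for split(), a replacement count for replace(), and a tuple of prefixes for startswith().

```python
raw = "  Name: Alice ; Role: ADMIN ;  "

fields = {}
for part in raw.strip().split(";"):
    part = part.strip()
    if not part:
        continue  # skip the empty piece after the trailing ";"
    key, value = part.split(":", 1)  # split on the first ":" only
    fields[key.strip().lower()] = value.strip().lower()

print(fields)   # {'name': 'alice', 'role': 'admin'}

sentence = "hello world, hello Python"
print(sentence.find("hello"))                 # 0 (first occurrence)
print(sentence.find("Java"))                  # -1 (not found)
print(sentence.replace("hello", "bye", 1))    # bye world, hello Python
print(sentence.startswith(("hi", "hello")))   # True
```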

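IV. Advanced String Usage

1. Regular Expressions

Regular expressions are a powerful text-processing tool for matching, searching, and replacing. Python supports them through the re module.

Some commonly used operations are listed below:

re.match(pattern, string): tries to match at the beginning of the string; returns a match object on success and None otherwise.

import re
pattern = r"hello"
str1 = "hello world"
match = re.match(pattern, str1)
if match:
    print("Match found:", match.group())  # Output: Match found: hello
else:
    print("No match")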

re.search(pattern, string): scans the string for the first location where the pattern matches; returns a match object if one is found and None otherwise.

import re
pattern = r"world"
str1 = "hello world"
match = re.search(pattern, str1)
if match:
    print("Match found:", match.group())  # Output: Match found: world
else:
    print("No match")
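The difference between the two functions is easiest to see when the same pattern is applied to a string that does not begin with it. This is a small sketch with an invented sample string.

```python
import re

text = "say hello world"

at_start = re.match(r"hello", text)     # None: text does not start with "hello"
anywhere = re.search(r"hello", text)    # matches further into the string
anchored = re.search(r"^hello", text)   # "^" anchors search to the start too

print(at_start)          # None
print(anywhere.span())   # (4, 9)
print(anchored)          # None
```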

re.findall(pattern, string): returns all non-overlapping matches in the string as a list.

import re
pattern = r"\d+"
str1 = "There are 123 apples and 456 oranges."
matches = re.findall(pattern, str1)
print(matches)  # Output: ['123', '456']

re.sub(pattern, repl, string): replaces every match in the string with the given replacement.

import re
pattern = r"\d+"
str1 = "There are 123 apples and 456 oranges."
str2 = re.sub(pattern, "many", str1)
print(str2)  # Output: There are many apples and many oranges.
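A few variations on findall and sub, sketched on the same sample sentence: repl may be a function that receives each match object, count limits the number of replacements, and a pattern with capture groups makes findall return tuples.

```python
import re

text = "There are 123 apples and 456 oranges."

# A callable repl computes each replacement from the match object.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), text)
print(doubled)      # There are 246 apples and 912 oranges.

# count=1 replaces only the first match.
first_only = re.sub(r"\d+", "many", text, count=1)
print(first_only)   # There are many apples and 456 oranges.

# With two capture groups, findall returns a list of 2-tuples.
pairs = re.findall(r"(\d+) (\w+)", text)
print(pairs)        # [('123', 'apples'), ('456', 'oranges')]
```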

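2. String Encoding and Decoding

In Python 3, a str is a sequence of Unicode code points. The encode() method converts a string to bytes in a given encoding, and the decode() method converts bytes back to a string. For example:

str1 = "你好，世界"
bytes1 = str1.encode("utf-8")  # encode the string as UTF-8 bytes
print(bytes1)  # Output: b'\xe4\xbd\xa0\xe5\xa5\xbd\xef\xbc\x8c\xe4\xb8\x96\xe7\x95\x8c'
str2 = bytes1.decode("utf-8")  # decode the bytes back to a string
print(str2)  # Output: 你好，世界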
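The sketch below illustrates two practical consequences of the str/bytes distinction: the character count and the byte count differ, and decoding or encoding with a codec that cannot represent the data raises an error unless an errors= policy is given. The choice of the ASCII codec is only for demonstration.

```python
text = "你好，世界"
utf8_bytes = text.encode("utf-8")

print(len(text))         # 5 (characters)
print(len(utf8_bytes))   # 15 (bytes: 3 per character in UTF-8 here)

# Decoding with a codec that does not fit the data fails.
try:
    utf8_bytes.decode("ascii")
except UnicodeDecodeError:
    decode_failed = True
    print("these bytes are not valid ASCII")

# errors="replace" substitutes a placeholder instead of raising.
replaced = text.encode("ascii", errors="replace")
print(replaced)          # b'?????'

# A UTF-8 round trip restores the original text exactly.
print(utf8_bytes.decode("utf-8") == text)   # True
```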

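3. String Memory Management

Because strings are immutable, every operation that "modifies" a string creates a new string object. Building a large string through repeated modification, such as concatenating in a loop, can therefore waste memory and hurt performance.

To avoid this, combine the pieces with str.join() or write them to an io.StringIO buffer. For example: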

Using str.join():

words = ["Hello", "world", "Python"]
sentence = " ".join(words)
print(sentence)  # Output: Hello world Python

Using io.StringIO:

import io
words = ["Hello", "world", "Python"]
with io.StringIO() as s:
    for word in words:
        s.write(word + " ")
    sentence = s.getvalue().strip()
print(sentence)  # Output: Hello world Python
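For comparison, here is a sketch that builds the same text in three ways: a += loop, str.join(), and io.StringIO. The helper function names are hypothetical, and the example only confirms that the results are identical. It makes no timing claim, since the actual cost of the += loop depends on the interpreter and on the input size.

```python
import io


def build_with_plus(parts):
    """Concatenate in a loop; each += may copy the text built so far."""
    result = ""
    for part in parts:
        result += part
    return result


def build_with_join(parts):
    """Let join() assemble the final string in one call."""
    return "".join(parts)


def build_with_stringio(parts):
    """Append to an in-memory text buffer, then read it out once."""
    buffer = io.StringIO()
    for part in parts:
        buffer.write(part)
    return buffer.getvalue()


parts = [f"line {i}\n" for i in range(1000)]
a = build_with_plus(parts)
b = build_with_join(parts)
c = build_with_stringio(parts)
print(a == b == c)   # True
```

To measure the difference on a particular machine, the standard timeit module can be pointed at each helper.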