Java Stream (Part 2)
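Contents
- Preface
- Basic set operations (on string/numeric collections)
- 1. Intersection
- 2. Union
- 3. Difference
- Set operations on objects (requires overriding equals/hashCode)
- 1. The User class
- 2. Object intersection
- Complex combined operations
- 1. Intersection of multiple collections
- 2. Mixed intersection/union/difference
- 3. Set operations after grouping
- Parallel optimization and performance comparison
- 1. Speeding up with parallel streams
- 2. Time complexity comparison table
- Utility class
- 1. General-purpose set operation utilities
- Full code
- Output
- Notes
- Summary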
Preface
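The code for this article has been uploaded to a Gitee repository.
This article uses the Java Stream API to compute the intersection, union, and difference of collections, as well as combinations of these operations.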
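Basic set operations (on string/numeric collections)
1. Intersection
List<String> list1 = Arrays.asList("A", "B", "C", "D");
List<String> list2 = Arrays.asList("C", "D", "E", "F");

// Option 1: filter directly against the list. Every contains() call scans list2,
// so the total cost is O(n²) for two lists of size n.
List<String> intersection = list1.stream()
        .filter(list2::contains)
        .collect(Collectors.toList());
System.out.println("Intersection - list filter: " + intersection);

// Option 2 (recommended): copy list2 into a HashSet first. Lookups are O(1) on
// average, so the total cost drops to O(n).
Set<String> set2 = new HashSet<>(list2);
List<String> optimizedIntersection = list1.stream()
        .filter(set2::contains)
        .collect(Collectors.toList());
System.out.println("Intersection - HashSet filter: " + optimizedIntersection);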
2. Union
// Simple concatenation (may contain duplicates): [A, B, C, D, C, D, E, F]
List<String> unionWithDuplicates = Stream.concat(list1.stream(), list2.stream())
        .collect(Collectors.toList());
System.out.println("Union - simple concat (may contain duplicates): " + unionWithDuplicates);

// Deduplicated union: [A, B, C, D, E, F]
List<String> unionDistinct = Stream.concat(list1.stream(), list2.stream())
        .distinct()
        .collect(Collectors.toList());
System.out.println("Union - distinct: " + unionDistinct);
3. Difference
// In list1 but not in list2: [A, B]
List<String> difference1 = list1.stream()
        .filter(e -> !list2.contains(e))
        .collect(Collectors.toList());
System.out.println("Difference - in list1 but not list2: " + difference1);

// In list2 but not in list1: [E, F]
List<String> difference2 = list2.stream()
        .filter(e -> !list1.contains(e))
        .collect(Collectors.toList());
System.out.println("Difference - in list2 but not list1: " + difference2);

// Symmetric difference (union minus intersection): [A, B, E, F]
List<String> symmetricDiff = Stream.concat(
                list1.stream().filter(e -> !list2.contains(e)),
                list2.stream().filter(e -> !list1.contains(e)))
        .collect(Collectors.toList());
System.out.println("Difference - symmetric (union minus intersection): " + symmetricDiff);
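The difference snippets above call List.contains for every element, which is quadratic. As a minimal sketch of the HashSet-based variant, the following applies the same lookup-set idea used for the intersection. The class name HashSetDifference and its method names are made up for this illustration and are not part of the article's code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, used only for this illustration.
final class HashSetDifference {

    // Elements of 'left' that do not occur in 'right', in left's encounter order.
    static <T> List<T> difference(List<T> left, List<T> right) {
        Set<T> exclude = new HashSet<>(right); // built once
        return left.stream()
                .filter(e -> !exclude.contains(e)) // O(1) average per lookup
                .collect(Collectors.toList());
    }

    // Elements that occur in exactly one of the two lists.
    static <T> List<T> symmetricDifference(List<T> a, List<T> b) {
        return Stream.concat(difference(a, b).stream(), difference(b, a).stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList("A", "B", "C", "D");
        List<String> list2 = Arrays.asList("C", "D", "E", "F");
        List<String> diff = difference(list1, list2);
        List<String> symmetric = symmetricDifference(list1, list2);
        System.out.println(diff);      // [A, B]
        System.out.println(symmetric); // [A, B, E, F]
    }
}
```

Duplicates in the left-hand list are kept as they are; add distinct() to the pipeline if set semantics are needed.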
Set operations on objects (requires overriding equals/hashCode)
1. The User class
import java.util.Objects;

public class User {
    String id;
    String name;
    String dept;

    public User(String id, String name) {
        this.id = id;
        this.name = name;
    }

    public User(String id, String name, String dept) {
        this.id = id;
        this.name = name;
        this.dept = dept;
    }

    // Getters
    public String getId() {
        return this.id;
    }

    public String getName() {
        return this.name;
    }

    public String getDept() {
        return this.dept;
    }

    // Two users are equal when their ids are equal; name and dept are ignored.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        User user = (User) o;
        return Objects.equals(this.id, user.id);
    }

    // Must stay consistent with equals(): hash on id only.
    @Override
    public int hashCode() {
        return Objects.hash(this.id);
    }

    @Override
    public String toString() {
        return String.format("(%s-%s-%s)", id, name, dept);
    }
}
2. Object intersection
// Set operations on objects (requires overriding equals/hashCode)
List<User> users1 = Arrays.asList(
        new User("1", "Alice"),
        new User("2", "Bob"));
List<User> users2 = Arrays.asList(
        new User("2", "Bob"),
        new User("3", "Charlie"));

// Intersection via equals(), i.e. by id: [(2-Bob-null)]
List<User> userIntersection = users1.stream()
        .filter(users2::contains)
        .collect(Collectors.toList());
System.out.println("Object intersection - by id: " + userIntersection);

// Intersection on a chosen field (here: name)
Set<String> names = users2.stream()
        .map(User::getName)
        .collect(Collectors.toSet());
List<User> nameIntersection = users1.stream()
        .filter(u -> names.contains(u.getName()))
        .collect(Collectors.toList());
System.out.println("Object intersection - by name field: " + nameIntersection);
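The field-based comparison above can be generalized to any key. The sketch below assumes a hypothetical helper, intersectBy, that takes a key extractor. It uses plain strings with a case-insensitive key so that it runs without the User class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class, used only for this illustration.
final class KeyIntersection {

    // Keeps the elements of 'left' whose key also appears among the keys of 'right'.
    static <T, K> List<T> intersectBy(List<T> left, List<T> right, Function<T, K> key) {
        Set<K> rightKeys = right.stream()
                .map(key)
                .collect(Collectors.toSet());
        return left.stream()
                .filter(e -> rightKeys.contains(key.apply(e)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("Alice", "BOB", "Carol");
        List<String> b = Arrays.asList("bob", "carol", "Dave");
        // Compare case-insensitively by using the lower-cased string as the key.
        List<String> common = intersectBy(a, b, s -> s.toLowerCase(Locale.ROOT));
        System.out.println(common); // [BOB, Carol]
    }
}
```

With the article's User class, the same call would read intersectBy(users1, users2, User::getName).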
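Complex combined operations
1. Intersection of multiple collections
// Intersection of several lists
List<List<String>> multiLists = Arrays.asList(
        Arrays.asList("A", "B", "C"),
        Arrays.asList("B", "C", "D"),
        Arrays.asList("C", "D", "E"));
System.out.println("Multi-list intersection - multiLists: " + multiLists);

// Start from a mutable copy of the first list, then narrow it with each remaining list.
Set<String> baseSet = new HashSet<>(multiLists.get(0));
System.out.println("Multi-list intersection - baseSet: " + baseSet);
multiLists.stream().skip(1).forEach(baseSet::retainAll);

// Final baseSet: [C]
System.out.println("Multi-list intersection - final baseSet: " + baseSet);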
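The version above mutates baseSet from inside forEach and fails on an empty outer list because of get(0). A possible alternative, sketched here with a hypothetical intersectAll helper, folds the lists with reduce and returns an empty set for empty input. That fallback is an assumption of this sketch, not something the article specifies.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, used only for this illustration.
final class MultiIntersection {

    // Intersects all lists; the result follows the element order of the first list.
    static <T> Set<T> intersectAll(List<List<T>> lists) {
        return lists.stream()
                .<Set<T>>map(LinkedHashSet::new)          // copy each list into its own set
                .reduce((acc, next) -> {
                    Set<T> result = new LinkedHashSet<>(acc); // no mutation of earlier results
                    result.retainAll(next);
                    return result;
                })
                .orElseGet(LinkedHashSet::new);           // no input lists: empty result
    }

    public static void main(String[] args) {
        List<List<String>> multiLists = Arrays.asList(
                Arrays.asList("A", "B", "C"),
                Arrays.asList("B", "C", "D"),
                Arrays.asList("C", "D", "E"));
        Set<String> common = intersectAll(multiLists);
        System.out.println(common); // [C]
    }
}
```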
2. Mixed intersection/union/difference
// Compute (A∪B) ∩ (C-D), where A = list1, B = list2, C = list3, D = list4
List<String> list3 = Arrays.asList("A", "B", "C", "D");
List<String> list4 = Arrays.asList("C", "D");

// C-D
Set<String> set = list3.stream()
        .filter(f -> !list4.contains(f))
        .collect(Collectors.toSet());

// (A∪B), then keep only the elements that are also in (C-D)
List<String> complexOp = Stream.concat(list1.stream(), list2.stream())
        .distinct()
        .filter(set::contains)
        .collect(Collectors.toList());
System.out.println("Mixed operation (A∪B) ∩ (C-D): " + complexOp);
3. Set operations after grouping
List<User> users = Arrays.asList(
        new User("1", "Alice", "DeptA"),
        new User("2", "Bob", "DeptA"),
        new User("3", "Charlie", "DeptA"),
        new User("1", "Alice", "DeptB"),
        new User("2", "Bob", "DeptB"),
        new User("4", "Doge", "DeptB"));
System.out.println("Grouping - users: " + users);

Map<String, List<User>> deptGroups = users.stream()
        .collect(Collectors.groupingBy(User::getDept));
System.out.println("Grouping - deptGroups: " + deptGroups);

// For each department, intersect its users with every other department.
// Users are matched by id, because User.equals() compares ids only.
deptGroups.forEach((dept, usersInDept) -> {
    deptGroups.entrySet().stream()
            .filter(e -> !e.getKey().equals(dept))
            .forEach(otherDept -> {
                Set<User> common = usersInDept.stream()
                        .filter(otherDept.getValue()::contains)
                        .collect(Collectors.toSet());
                System.out.println(dept + " and " + otherDept.getKey() + " intersection: " + common);
            });
});
System.out.println("Grouping - intersection with other departments: " + deptGroups);
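When only the ids matter, one option is to group straight into id sets and intersect those. The sketch below assumes a stripped-down Member class as a stand-in for User, plus two hypothetical helpers, idsByDept and commonIds. It illustrates the idea and is not the article's implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical helper class, used only for this illustration.
final class GroupedIdIntersection {

    // Stand-in for the article's User class: just an id and a department.
    static final class Member {
        final String id;
        final String dept;

        Member(String id, String dept) {
            this.id = id;
            this.dept = dept;
        }

        String getId() { return id; }
        String getDept() { return dept; }
    }

    // Groups member ids by department.
    static Map<String, Set<String>> idsByDept(List<Member> members) {
        return members.stream().collect(Collectors.groupingBy(
                Member::getDept,
                Collectors.mapping(Member::getId, Collectors.toSet())));
    }

    // Ids present in both departments; an unknown department counts as empty.
    static Set<String> commonIds(Map<String, Set<String>> groups, String deptA, String deptB) {
        Set<String> a = groups.getOrDefault(deptA, Collections.emptySet());
        Set<String> b = groups.getOrDefault(deptB, Collections.emptySet());
        Set<String> common = new TreeSet<>(a); // sorted copy, so the grouped sets stay untouched
        common.retainAll(b);
        return common;
    }

    public static void main(String[] args) {
        List<Member> members = Arrays.asList(
                new Member("1", "DeptA"), new Member("2", "DeptA"), new Member("3", "DeptA"),
                new Member("1", "DeptB"), new Member("2", "DeptB"), new Member("4", "DeptB"));
        Map<String, Set<String>> groups = idsByDept(members);
        Set<String> common = commonIds(groups, "DeptA", "DeptB");
        System.out.println(common); // [1, 2]
    }
}
```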
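Parallel optimization and performance comparison
1. Speeding up with parallel streams
// Parallel difference. Build the lookup set once, outside the pipeline, instead of
// creating a new set for every element. A HashSet that is only read after
// construction can be shared by the worker threads without extra synchronization.
Set<String> excluded = new HashSet<>(list2);
List<String> parallelDiff = list1.parallelStream()
        .filter(e -> !excluded.contains(e))
        .collect(Collectors.toList());
System.out.println("Parallel difference: " + parallelDiff);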
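To check that a parallel pipeline of this shape returns the same result as the sequential one, a small comparison along the following lines can help. The class name, the boolean flag, and the 200,000-element test data are choices made for this sketch. It checks correctness only and says nothing about speed.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class, used only for this illustration.
final class ParallelDifferenceSketch {

    static List<Integer> difference(List<Integer> left, List<Integer> right, boolean parallel) {
        Set<Integer> exclude = new HashSet<>(right); // built once, then only read
        return (parallel ? left.parallelStream() : left.stream())
                .filter(e -> !exclude.contains(e))
                .collect(Collectors.toList()); // collect() merges per-thread results safely
    }

    public static void main(String[] args) {
        List<Integer> all = IntStream.range(0, 200_000).boxed().collect(Collectors.toList());
        List<Integer> evens = IntStream.range(0, 200_000)
                .filter(i -> i % 2 == 0)
                .boxed()
                .collect(Collectors.toList());

        List<Integer> sequential = difference(all, evens, false);
        List<Integer> parallel = difference(all, evens, true);

        System.out.println(sequential.size());           // 100000
        System.out.println(sequential.equals(parallel)); // true: encounter order is preserved
    }
}
```

Collecting with collect(Collectors.toList()) is what keeps this safe; adding elements to a shared ArrayList from forEach on a parallel stream would not be.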
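2. Time complexity comparison table
| Operation | Implementation | Time complexity | When to use |
|---|---|---|---|
| Intersection | Direct filter(list::contains) | O(n²) | Small data sets |
| Intersection | Filter against a prebuilt HashSet | O(n) | Recommended for large data sets |
| Union | Stream.concat + distinct | O(n) | General use |
| Difference | HashSet preprocessing | O(n) | Recommended for frequently run operations |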
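Utility class
1. General-purpose set operation utilities
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamColUtils {

    /**
     * Generic intersection.
     *
     * @param list1 first list
     * @param list2 second list
     * @param <T>   element type
     * @return the elements of list1 that also occur in list2
     */
    public static <T> List<T> intersection(List<T> list1, List<T> list2) {
        Set<T> set = new HashSet<>(list2);
        return list1.stream()
                .filter(set::contains)
                .collect(Collectors.toList());
    }

    /**
     * Union of two user lists, deduplicated by id. When two users share an id,
     * the first one encountered is kept. The intermediate map is a HashMap,
     * so the order of the result is not guaranteed.
     *
     * @param list1 first list
     * @param list2 second list
     * @return the deduplicated union
     */
    public static List<User> unionByFields(List<User> list1, List<User> list2) {
        return Stream.concat(list1.stream(), list2.stream())
                .collect(Collectors.collectingAndThen(
                        Collectors.toMap(
                                User::getId,
                                Function.identity(),
                                (existing, replacement) -> existing),
                        map -> new ArrayList<>(map.values())));
    }
}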
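unionByFields is tied to User and to the id field. A more general variant could take the key extractor as a parameter and collect into a LinkedHashMap so that encounter order is kept. The unionBy helper below is a sketch under those assumptions and is not part of StreamColUtils. It also assumes the lists contain no null elements, since Collectors.toMap rejects null values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, used only for this illustration.
final class KeyedUnion {

    // Union of two lists, deduplicated by the extracted key; the first occurrence wins.
    static <T, K> List<T> unionBy(List<T> first, List<T> second, Function<T, K> key) {
        Map<K, T> byKey = Stream.concat(first.stream(), second.stream())
                .collect(Collectors.toMap(
                        key,
                        Function.identity(),
                        (existing, replacement) -> existing, // keep the first occurrence
                        LinkedHashMap::new));                // keep encounter order
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("Alice", "bob");
        List<String> b = Arrays.asList("BOB", "Carol");
        // Deduplicate case-insensitively.
        List<String> union = unionBy(a, b, s -> s.toLowerCase(Locale.ROOT));
        System.out.println(union); // [Alice, bob, Carol]
    }
}
```

For the article's users, unionBy(users1, users2, User::getId) would play the role of unionByFields.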
Full code
package stream;

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/**
 * @author zhangjianlong 2025/8/5 17:58
 */
public class StreamTest {

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList("A", "B", "C", "D");
        List<String> list2 = Arrays.asList("C", "D", "E", "F");

        // Set operations on objects (requires overriding equals/hashCode)
        List<User> users1 = Arrays.asList(
                new User("1", "Alice"),
                new User("2", "Bob"));
        List<User> users2 = Arrays.asList(
                new User("2", "Bob"),
                new User("3", "Charlie"));

        // Intersection
        testIntersection(list1, list2);
        // Union
        testUnion(list1, list2);
        // Difference
        testDiff(list1, list2);
        // Object intersection
        testUsrIntersection(users1, users2);
        // Intersection of multiple collections
        testMultiList();
        // Mixed operation: (A∪B) ∩ (C-D)
        testComplex(list1, list2);
        // Set operations after grouping
        testGroup();
        // Parallel optimization
        testParallel(list1, list2);

        // Utility class
        List<String> list3 = StreamColUtils.intersection(list1, list2);
        System.out.println("Utility - list intersection: " + list3);
        List<User> users3 = StreamColUtils.intersection(users1, users2);
        System.out.println("Utility - object list intersection: " + users3);
    }

    private static void testParallel(List<String> list1, List<String> list2) {
        // Parallel difference. The lookup set is built once and only read afterwards,
        // so it can be shared by the worker threads.
        Set<String> excluded = new HashSet<>(list2);
        List<String> parallelDiff = list1.parallelStream()
                .filter(e -> !excluded.contains(e))
                .collect(Collectors.toList());
        System.out.println("Parallel difference: " + parallelDiff);
    }

    private static void testGroup() {
        List<User> users = Arrays.asList(
                new User("1", "Alice", "DeptA"),
                new User("2", "Bob", "DeptA"),
                new User("3", "Charlie", "DeptA"),
                new User("1", "Alice", "DeptB"),
                new User("2", "Bob", "DeptB"),
                new User("4", "Doge", "DeptB"));
        System.out.println("Grouping - users: " + users);

        Map<String, List<User>> deptGroups = users.stream()
                .collect(Collectors.groupingBy(User::getDept));
        System.out.println("Grouping - deptGroups: " + deptGroups);

        // For each department, intersect its users with every other department
        deptGroups.forEach((dept, usersInDept) -> {
            deptGroups.entrySet().stream()
                    .filter(e -> !e.getKey().equals(dept))
                    .forEach(otherDept -> {
                        Set<User> common = usersInDept.stream()
                                .filter(otherDept.getValue()::contains)
                                .collect(Collectors.toSet());
                        System.out.println(dept + " and " + otherDept.getKey() + " intersection: " + common);
                    });
        });
        System.out.println("Grouping - intersection with other departments: " + deptGroups);
    }

    private static void testComplex(List<String> list1, List<String> list2) {
        // Compute (A∪B) ∩ (C-D), where A = list1, B = list2, C = list3, D = list4
        List<String> list3 = Arrays.asList("A", "B", "C", "D");
        List<String> list4 = Arrays.asList("C", "D");
        Set<String> set = list3.stream()
                .filter(f -> !list4.contains(f))
                .collect(Collectors.toSet());
        List<String> complexOp = Stream.concat(list1.stream(), list2.stream())
                .distinct()
                .filter(set::contains)
                .collect(Collectors.toList());
        System.out.println("Mixed operation (A∪B) ∩ (C-D): " + complexOp);
    }

    private static void testMultiList() {
        // Intersection of several lists
        List<List<String>> multiLists = Arrays.asList(
                Arrays.asList("A", "B", "C"),
                Arrays.asList("B", "C", "D"),
                Arrays.asList("C", "D", "E"));
        System.out.println("Multi-list intersection - multiLists: " + multiLists);
        Set<String> baseSet = new HashSet<>(multiLists.get(0));
        System.out.println("Multi-list intersection - baseSet: " + baseSet);
        multiLists.stream().skip(1).forEach(baseSet::retainAll);
        // Final baseSet: [C]
        System.out.println("Multi-list intersection - final baseSet: " + baseSet);
    }

    private static void testUsrIntersection(List<User> users1, List<User> users2) {
        // Intersection via equals(), i.e. by id: [(2-Bob-null)]
        List<User> userIntersection = users1.stream()
                .filter(users2::contains)
                .collect(Collectors.toList());
        System.out.println("Object intersection - by id: " + userIntersection);

        // Intersection on a chosen field (here: name)
        Set<String> names = users2.stream()
                .map(User::getName)
                .collect(Collectors.toSet());
        List<User> nameIntersection = users1.stream()
                .filter(u -> names.contains(u.getName()))
                .collect(Collectors.toList());
        System.out.println("Object intersection - by name field: " + nameIntersection);
    }

    private static void testDiff(List<String> list1, List<String> list2) {
        // In list1 but not in list2: [A, B]
        List<String> difference1 = list1.stream()
                .filter(e -> !list2.contains(e))
                .collect(Collectors.toList());
        System.out.println("Difference - in list1 but not list2: " + difference1);

        // In list2 but not in list1: [E, F]
        List<String> difference2 = list2.stream()
                .filter(e -> !list1.contains(e))
                .collect(Collectors.toList());
        System.out.println("Difference - in list2 but not list1: " + difference2);

        // Symmetric difference (union minus intersection): [A, B, E, F]
        List<String> symmetricDiff = Stream.concat(
                        list1.stream().filter(e -> !list2.contains(e)),
                        list2.stream().filter(e -> !list1.contains(e)))
                .collect(Collectors.toList());
        System.out.println("Difference - symmetric (union minus intersection): " + symmetricDiff);
    }

    private static void testUnion(List<String> list1, List<String> list2) {
        // Simple concatenation (may contain duplicates): [A, B, C, D, C, D, E, F]
        List<String> unionWithDuplicates = Stream.concat(list1.stream(), list2.stream())
                .collect(Collectors.toList());
        System.out.println("Union - simple concat (may contain duplicates): " + unionWithDuplicates);

        // Deduplicated union: [A, B, C, D, E, F]
        List<String> unionDistinct = Stream.concat(list1.stream(), list2.stream())
                .distinct()
                .collect(Collectors.toList());
        System.out.println("Union - distinct: " + unionDistinct);
    }

    private static void testIntersection(List<String> list1, List<String> list2) {
        // Option 1: filter directly against the list, O(n²)
        List<String> intersection = list1.stream()
                .filter(list2::contains)
                .collect(Collectors.toList());
        System.out.println("Intersection - list filter: " + intersection);

        // Option 2 (recommended): filter against a HashSet, O(n)
        Set<String> set2 = new HashSet<>(list2);
        List<String> optimizedIntersection = list1.stream()
                .filter(set2::contains)
                .collect(Collectors.toList());
        System.out.println("Intersection - HashSet filter: " + optimizedIntersection);
    }
}
Output
Intersection - list filter: [C, D]
Intersection - HashSet filter: [C, D]
Union - simple concat (may contain duplicates): [A, B, C, D, C, D, E, F]
Union - distinct: [A, B, C, D, E, F]
Difference - in list1 but not list2: [A, B]
Difference - in list2 but not list1: [E, F]
Difference - symmetric (union minus intersection): [A, B, E, F]
Object intersection - by id: [(2-Bob-null)]
Object intersection - by name field: [(2-Bob-null)]
Multi-list intersection - multiLists: [[A, B, C], [B, C, D], [C, D, E]]
Multi-list intersection - baseSet: [A, B, C]
Multi-list intersection - final baseSet: [C]
Mixed operation (A∪B) ∩ (C-D): [A, B]
Grouping - users: [(1-Alice-DeptA), (2-Bob-DeptA), (3-Charlie-DeptA), (1-Alice-DeptB), (2-Bob-DeptB), (4-Doge-DeptB)]
Grouping - deptGroups: {DeptB=[(1-Alice-DeptB), (2-Bob-DeptB), (4-Doge-DeptB)], DeptA=[(1-Alice-DeptA), (2-Bob-DeptA), (3-Charlie-DeptA)]}
DeptB and DeptA intersection: [(1-Alice-DeptB), (2-Bob-DeptB)]
DeptA and DeptB intersection: [(1-Alice-DeptA), (2-Bob-DeptA)]
Grouping - intersection with other departments: {DeptB=[(1-Alice-DeptB), (2-Bob-DeptB), (4-Doge-DeptB)], DeptA=[(1-Alice-DeptA), (2-Bob-DeptA), (3-Charlie-DeptA)]}
Parallel difference: [A, B]
Utility - list intersection: [C, D]
Utility - object list intersection: [(2-Bob-null)]
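Notes
- Consistent object comparison
  Set operations on objects require correct equals() and hashCode() overrides. Without them, objects are compared by reference and the results will be wrong.
- Handling null collections
  Wrap the input in Optional to avoid an NPE: Optional.ofNullable(list1).map(l -> l.stream().filter(...)).orElse(Stream.empty()).collect(...);
- When to use parallel streams
  - Consider parallelStream() once the data set exceeds roughly 100,000 elements, and measure before committing to it
  - Avoid modifying non-thread-safe collections that are shared between threads
- Third-party libraries
  For complex scenarios, Apache Commons or Guava can help. Guava, for example, provides a symmetric difference: Sets.symmetricDifference(Sets.newHashSet(list1), Sets.newHashSet(list2));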
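The Optional one-liner in the notes above leaves its filter and collector elided. A runnable sketch of the same idea might look as follows. Treating a null list as empty is a policy chosen for this example, and NullSafeIntersection is a hypothetical name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper class, used only for this illustration.
final class NullSafeIntersection {

    // Treats a null list as an empty list instead of throwing a NullPointerException.
    static <T> List<T> intersection(List<T> left, List<T> right) {
        List<T> safeLeft = Optional.ofNullable(left).orElse(Collections.emptyList());
        List<T> safeRight = Optional.ofNullable(right).orElse(Collections.emptyList());
        Set<T> lookup = new HashSet<>(safeRight);
        return safeLeft.stream()
                .filter(lookup::contains)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList("A", "B", "C", "D");
        List<String> list2 = Arrays.asList("C", "D", "E", "F");
        List<String> both = intersection(list1, list2);
        List<String> withNull = intersection(null, list2);
        System.out.println(both);     // [C, D]
        System.out.println(withNull); // []
    }
}
```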
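Summary
By combining the intermediate and terminal operations of the Stream API, you can implement all the common set operations as well as more complex logic built on them. For very large data sets, process the data in pages or push the work into the database. For in-memory operations, prefer the variants with the better time complexity.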