
A WordCount example with MapReduce

1. Start the cluster

2. Create the project

 The project structure is:

3. The pom.xml file is:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.example</groupId>
  <artifactId>mapReduceTest</artifactId>
  <packaging>war</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>mapReduceTest Maven Webapp</name>
  <url>http://maven.apache.org</url>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-slf4j-impl</artifactId>
      <version>2.12.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>3.1.3</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>3.1.3</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-mapreduce-client-core</artifactId>
      <version>3.1.3</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>3.1.3</version>
      <exclusions>
        <!-- Exclude Log4j 1.x -->
        <exclusion>
          <groupId>log4j</groupId>
          <artifactId>log4j</artifactId>
        </exclusion>
        <!-- Exclude the SLF4J binding for Log4j 1.x -->
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
  <build>
    <finalName>mapReduceTest</finalName>
  </build>
</project>

4. The WordCountMapper code is:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Input: (byte offset of the line, line text). Output: (word, 1).
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    protected void map(LongWritable key1, Text value1, Context context)
            throws IOException, InterruptedException {
        // One line of the input file
        String data = value1.toString();
        // Split the line into words on single spaces
        String[] words = data.split(" ");
        // Emit (word, 1) for every word
        for (String w : words) {
            context.write(new Text(w), new IntWritable(1));
        }
    }
}
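The map step can be tried without a cluster. The following is a minimal plain-Java sketch, not part of the original project: the class MapStepSketch and both method names are hypothetical. It mirrors the split(" ") call above and shows one consequence: two consecutive spaces produce an empty token, so the mapper would also emit a pair with an empty key. The second method is an assumed variant that splits on runs of whitespace and skips empty tokens.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration; it does not use Hadoop.
final class MapStepSketch {

    // Mirrors the mapper: split on a single space, emit (token, 1) for every token.
    static List<Map.Entry<String, Integer>> emitLikeMapper(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : line.split(" ")) {
            out.add(new SimpleEntry<>(w, 1));
        }
        return out;
    }

    // Assumed variant: trim, split on runs of whitespace, skip empty tokens.
    static List<Map.Entry<String, Integer>> emitSkippingBlanks(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : line.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                out.add(new SimpleEntry<>(w, 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "hello  world hello"; // note the double space
        System.out.println(emitLikeMapper(line).size());     // 4 pairs, one with an empty key
        System.out.println(emitSkippingBlanks(line).size()); // 3 pairs
    }
}
```

Whether empty keys matter depends on the input data; for text with irregular spacing, the second form may give cleaner counts.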

5. The WordCountReduce code is:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Input: (word, all the 1s emitted for that word). Output: (word, total count).
public class WordCountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text k3, Iterable<IntWritable> v3, Context context)
            throws IOException, InterruptedException {
        // Sum every value received for this word
        int total = 0;
        for (IntWritable v : v3) {
            total += v.get();
        }
        // Emit (word, total)
        context.write(k3, new IntWritable(total));
    }
}
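Between map and reduce, the framework groups the emitted values by key and sorts the keys, so each reduce call receives one word together with all of its 1s. The plain-Java sketch below imitates that flow in memory, assuming a TreeMap as a stand-in for the shuffle; the class ReduceStepSketch and its methods are hypothetical and use no Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory imitation of shuffle + reduce, for illustration only.
final class ReduceStepSketch {

    // Stand-in for the shuffle: collect a 1 per occurrence, grouped by word, keys sorted.
    static Map<String, List<Integer>> groupByWord(List<String> words) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String w : words) {
            grouped.computeIfAbsent(w, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // Same summation as the reduce method: add up all values for one key.
    static int sum(Iterable<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Runs the grouping and then the summation for every word.
    static Map<String, Integer> count(List<String> words) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : groupByWord(words).entrySet()) {
            result.put(e.getKey(), sum(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("b", "a", "b"))); // {a=1, b=2}
    }
}
```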

6. The WordCountMain code is:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountMain {

    public static void main(String[] args) throws Exception {
        // Create the job and tell Hadoop which jar contains it
        Job job = Job.getInstance(new Configuration());
        job.setJarByClass(WordCountMain.class);

        // Mapper and its output types
        job.setMapperClass(WordCountMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        // Reducer and the final output types
        job.setReducerClass(WordCountReduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Input and output locations on HDFS; the output directory must not exist yet
        FileInputFormat.setInputPaths(job, new Path("hdfs://172.18.0.2:9000/input"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://172.18.0.2:9000/WordCountOutput"));

        // Submit the job, wait for it to finish, and report success or failure via the exit code
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

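7. Test results

Run this main class to execute the job and see its result.

The output can then be inspected with HDFS shell commands, for example by listing and printing the files under /WordCountOutput.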
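Since the job above sets no output format, it uses the default TextOutputFormat, which writes one line per key in the form word, tab, count into files such as part-r-00000. If the result needs to be read back by a program, a line can be parsed as in the sketch below. The class OutputLineSketch is hypothetical, and the sample lines in main are made up for illustration; they are not output of the original job.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical parser for "word<TAB>count" lines, for illustration only.
final class OutputLineSketch {

    // Splits at the last tab so the count is always the final field.
    static Map.Entry<String, Integer> parse(String line) {
        int tab = line.lastIndexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("no tab separator in: " + line);
        }
        String word = line.substring(0, tab);
        int count = Integer.parseInt(line.substring(tab + 1).trim());
        return new SimpleEntry<>(word, count);
    }

    public static void main(String[] args) {
        // Made-up sample lines in the default output layout
        String[] sample = {"hadoop\t3", "hello\t2"};
        for (String line : sample) {
            Map.Entry<String, Integer> e = parse(line);
            System.out.println(e.getKey() + " occurs " + e.getValue() + " time(s)");
        }
    }
}
```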
