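Code
GitHub link; address of the open-source UserAgentParser utility class
Log source
The logs come from the backend of an earlier project, a homework submission system. For local testing, they were first backed up from the database and lightly processed so that only the UserAgent information remains.
Local file test
Use the open-source utility class to parse the information contained in each UserAgent string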
@Test
public void testReadFile() throws Exception {
    String path = "F:\\JAVA Workspace\\hadoopstudy\\access.log";
    int count = 0;
    // Simulates the MapReduce result: browser name -> number of occurrences
    Map<String, Integer> browserMap = new HashMap<>();
    UserAgentParser userAgentParser = new UserAgentParser();
    // try-with-resources closes the reader once the file has been read
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(new File(path))))) {
        String line;
        // Stop when readLine() returns null, so the end-of-file read is not counted
        while ((line = reader.readLine()) != null) {
            count++;
            if (StringUtils.isNotBlank(line)) {
                UserAgent agent = userAgentParser.parse(line);
                // Fields extracted for the test
                String browser = agent.getBrowser();
                String engine = agent.getEngine();
                String engineVersion = agent.getEngineVersion();
                String os = agent.getOs();
                String platform = agent.getPlatform();
                boolean mobile = agent.isMobile();
                // Increment the counter for this browser
                Integer browserValue = browserMap.get(browser);
                if (browserValue != null) {
                    browserMap.put(browser, browserValue + 1);
                } else {
                    browserMap.put(browser, 1);
                }
                // Print the parsed information
                System.out.println(browser + "," + engine + "," + engineVersion + "," + os + "," + platform + "," + mobile);
            }
        }
    }
    System.out.println("Total records: " + count);
    System.out.println("====================================");
    for (Map.Entry<String, Integer> entry : browserMap.entrySet()) {
        System.out.println(entry.getKey() + ":" + entry.getValue());
    }
}
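The get/put tally in the test can be condensed with Map.merge. The following is a minimal sketch using only the JDK, not the project's actual code: the class name BrowserTally is hypothetical, and each non-blank input line is assumed to already be a browser name (in the real test the key would come from agent.getBrowser()).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts how often each non-blank line occurs.
// Assumption: each line stands in for an already-parsed browser name.
class BrowserTally {

    static Map<String, Integer> tally(Reader source) throws IOException {
        // TreeMap keeps the keys sorted, which makes the output deterministic
        Map<String, Integer> counts = new TreeMap<>();
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String key = line.trim();
                if (key.isEmpty()) {
                    continue; // skip blank lines
                }
                // merge() stores 1 for a new key, otherwise adds 1 to the old value
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> counts = tally(new StringReader("Chrome\nFirefox\n\nChrome\n"));
        // Prints "Chrome:2" and then "Firefox:1"
        counts.forEach((browser, n) -> System.out.println(browser + ":" + n));
    }
}
```

A single merge call replaces the null check and both put branches, which leaves less room for the counter to get out of sync with the map.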
Run the unit test
Counting with MapReduce
The code is modeled on WordCount and combined with the code from the local test.
...
boolean mobile = agent.isMobile();
if (mobile) {
    mobileuser = "mobile user";
} else {
    mobileuser = "non-mobile user";
}
// Emit the map result through the context
context.write(new Text(browser), one);
...
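For readers unfamiliar with the WordCount pattern, the flow of such a job can be sketched without Hadoop: the map step emits a (browser, 1) pair per record, the framework groups the pairs by key, and the reduce step sums each group. The sketch below simulates these three phases in plain Java. The class MiniBrowserCount and its methods are hypothetical stand-ins rather than Hadoop APIs, and the input strings are assumed to be already-parsed browser names.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free simulation of the map -> shuffle -> reduce flow.
class MiniBrowserCount {

    // Map phase: emit (browser, 1) per record, mirroring context.write(new Text(browser), one)
    static List<Map.Entry<String, Integer>> map(List<String> browsers) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String browser : browsers) {
            pairs.add(new SimpleEntry<>(browser, 1));
        }
        return pairs;
    }

    // Shuffle phase: group the emitted values by key (done by the framework in a real job)
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for each key
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            totals.put(entry.getKey(), sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> browsers = Arrays.asList("Chrome", "Firefox", "Chrome", "Safari");
        // Prints {Chrome=2, Firefox=1, Safari=1}
        System.out.println(reduce(shuffle(map(browsers))));
    }
}
```

Counting mobile versus non-mobile users would follow the same shape, with the map step emitting the mobileuser label as the key instead of the browser name.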
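Package and run
Package the project with the mvn assembly:assembly command so that its dependencies are bundled into the jar, upload the jar to the virtual machine, and upload the log to the root directory of HDFS.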
[hadoop@localhost testFile]
The result matches the local test.
[hadoop@localhost testFile]