Spark Learning - SparkSQL - 02 - Spark History Server
Configuring and Using the Spark History Server
1. Background of the Spark History Server
Take standalone mode as an example: while a Spark application is running, Spark provides a Web UI that lists the application's runtime information. However, this Web UI shuts down as soon as the application finishes (whether it succeeds or fails), which means that once an application has completed, its history can no longer be viewed.
The Spark History Server was created to address exactly this. With the proper configuration, event log information is recorded while the application executes; after the application finishes, the Web UI can re-render that information and display the application's runtime details.
When Spark runs on YARN or Mesos, the History Server can likewise reconstruct the runtime information of a completed application, provided its event logs were recorded.
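For the History Server to have anything to show, applications must first be configured to write their event logs to a shared location (typically HDFS in a cluster). A minimal sketch of the relevant entries in `$SPARK_HOME/conf/spark-defaults.conf`; the HDFS path mirrors the one used later in this article and is only an example:

```properties
# Enable event logging for every application submitted with this config
spark.eventLog.enabled    true
# Directory applications write their event logs to (must already exist in HDFS)
spark.eventLog.dir        hdfs://mycluster:8020/spark_job_history
```

Without `spark.eventLog.enabled`, finished applications leave no logs behind and the History Server has nothing to render.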
Configuring & Using the Spark History Server
Start the History Server with the default configuration:
cd $SPARK_HOME/sbin
./start-history-server.sh
This fails with an error:
starting org.apache.spark.deploy.history.HistoryServer, logging to /home/spark/software/source/compile/deploy_spark/sbin/../logs/spark-spark-org.apache.spark.deploy.history.HistoryServer-1-hadoop000.out
failed to launch org.apache.spark.deploy.history.HistoryServer:
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:44)
... 6 more
Restarting with an explicit HDFS log directory passed on the command line still fails:
[root@biluos logs]# /opt/moudles/spark-2.2.0-bin-hadoop2.7/sbin/start-history-server.sh hdfs://mycluster:8020/spark_job_history
starting org.apache.spark.deploy.history.HistoryServer, logging to /opt/moudles/spark-2.2.0-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.history.HistoryServer-1-biluos.com.out
[root@biluos logs]# cat spark-root-org.apache.spark.deploy.history.HistoryServer-1-biluos.com.out
Spark Command: /opt/moudles/jdk1.8.0_121/bin/java -cp /opt/moudles/spark-2.2.0-bin-hadoop2.7/conf/:/opt/moudles/spark-2.2.0-bin-hadoop2.7/jars/*:/opt/moudles/hadoop-2.7.3/etc/hadoop/ -Xmx1g org.apache.spark.deploy.history.HistoryServer hdfs://mycluster:8020/spark_job_history
========================================
17/08/03 03:22:18 INFO HistoryServer: Started daemon with process name: 2666@biluos.com
17/08/03 03:22:18 INFO SignalUtils: Registered signal handler for TERM
17/08/03 03:22:18 INFO SignalUtils: Registered signal handler for HUP
17/08/03 03:22:18 INFO SignalUtils: Registered signal handler for INT
17/08/03 03:22:18 WARN HistoryServerArguments: Setting log directory through the command line is deprecated as of Spark 1.1.0. Please set this through spark.history.fs.logDirectory instead.
17/08/03 03:22:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/08/03 03:22:19 INFO SecurityManager: Changing view acls to: root
17/08/03 03:22:19 INFO SecurityManager: Changing modify acls to: root
17/08/03 03:22:19 INFO SecurityManager: Changing view acls groups to:
17/08/03 03:22:19 INFO SecurityManager: Changing modify acls groups to:
17/08/03 03:22:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
17/08/03 03:22:19 INFO FsHistoryProvider: History server ui acls disabled; users with admin permissions: ; groups with admin permissions
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:278)
at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.io.FileNotFoundException: Log directory specified does not exist: hdfs://mycluster:8020/spark_job_history
at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$startPolling(FsHistoryProvider.scala:214)
at org.apache.spark.deploy.history.FsHistoryProvider.initialize(FsHistoryProvider.scala:160)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:156)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:78)
... 6 more
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://mycluster:8020/spark_job_history
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$startPolling(FsHistoryProvider.scala:204)
... 9 more
Solution
[root@biluos logs]# hdfs dfs -mkdir /spark_job_history
The root cause is simply that the log directory did not exist in HDFS. After creating it and restarting, the error is gone.
The resulting Web UI is shown in the figure.
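Note that the startup log above also warned that passing the log directory on the command line has been deprecated since Spark 1.1.0. The recommended approach is to set `spark.history.fs.logDirectory` in `spark-defaults.conf` and start the server without arguments. A sketch, reusing the path from this article:

```properties
# spark-defaults.conf: directory the History Server polls for event logs
spark.history.fs.logDirectory    hdfs://mycluster:8020/spark_job_history
# Optional: port for the History Server web UI (18080 is the default)
spark.history.ui.port            18080
```

With this in place, `sbin/start-history-server.sh` can be run with no arguments, and the UI is reachable at http://&lt;host&gt;:18080.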