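Perhaps because not many people use it anymore, the articles online about Hive 2.1 all say much the same thing, and none of them solved the problems I ran into. This post is for everyone who has not managed to get Hive 2 running; I hope it helps.
Hive needs a working HDFS before you can use it; setting that up is left to you.
1. Download the binary (bin) release of Hive from https://hive.apache.org/
2. Extract the archive to a local directory. I put it in /home/software/apache-hive-2.1.0-bin
3. Download the Hadoop distribution that matches the version your Hadoop cluster runs, and extract it to a local directory. I put it in /home/software/hadoop-2.7.2. Why is the Hadoop software needed on this machine as well? The official admin configuration page ( https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-HiveConfigurationVariables ) lists this configuration variable: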
Variable Name | Description | Default Value |
hadoop.bin.path | The location of the Hadoop script which is used to submit jobs to Hadoop when submitting through a separate JVM. | $HADOOP_HOME/bin/hadoop |
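So, for jobs to be submitted successfully, the server that runs Hive needs the Hadoop software installed.
4. Set the environment variables. Open /etc/profile and add the following:
[code language="text"]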
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el7_2.x86_64
export HIVE_HOME=/home/software/apache-hive-2.1.0-bin
export HADOOP_HOME=/home/software/hadoop-2.7.2
export PATH=$HIVE_HOME/bin:$PATH
[/code]
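These paths differ from system to system, so be sure to use your own. After editing, run source /etc/profile so that the settings take effect immediately in the current shell.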
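A wrong path here tends to surface much later as a confusing startup error, so it may be worth checking the three variables right after sourcing the file. The following is only a sketch: `check_dir_var` is a hypothetical helper name, not something Hive or Hadoop provides, and it checks only that each variable is set and names an existing directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that the variable named by $1 is set
# and points at an existing directory.
check_dir_var() {
  local name="$1" value
  # Indirect lookup: read the variable whose name is stored in $name.
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "$name is not set" >&2
    return 1
  fi
  if [ ! -d "$value" ]; then
    echo "$name points to a missing directory: $value" >&2
    return 1
  fi
  echo "$name OK: $value"
}

# Check the three variables exported in /etc/profile above.
for var in JAVA_HOME HIVE_HOME HADOOP_HOME; do
  check_dir_var "$var" || true
done
```

Any variable reported as unset or missing should be fixed in /etc/profile before continuing.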
5. Edit the Hive configuration file. Hive ships a template, $HIVE_HOME/conf/hive-default.xml.template; copy it and name the copy hive-site.xml. The configuration mainly covers the following items.
A. Metastore configuration, see: https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-RemoteMetastoreDatabase
B. HDFS configuration, see: https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-HiveConfigurationVariables
C. HiveServer2 configuration, see: https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-RunninginHTTPMode
D. HiveWebInterface configuration, see: https://cwiki.apache.org/confluence/display/Hive/HiveWebInterface#HiveWebInterface-WhatIstheHiveWebInterface
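6. Metastore configuration. Hive has a small embedded relational database, Derby, but we generally don't use it. Here I use MariaDB 10 as my Metastore. I won't cover how to install MariaDB, but note one thing: the user Hive connects as must have a password, and many people are used to connecting locally without one.
You also need to put the JDBC connector in the $HIVE_HOME/lib folder. The MariaDB 10 connector can be downloaded from: https://mariadb.com/my_portal/download/java-client
[code language="xml" autolinks="false"]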
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://HIVE:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.mariadb.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
[/code]
7. HDFS configuration. Leaving the defaults aside, you only need to set these two properties:
[code language="xml" autolinks="false"]
<property>
<name>fs.defaultFS</name>
<value>hdfs://Hadoop-I:9000</value>
<description/>
</property>
<property>
<name>mapred.job.tracker</name>
<value>Hadoop-I:9001</value>
<description/>
</property>
[/code]
8. HiveServer2 configuration
[code language="xml" autolinks="false"]
<property>
<name>hive.server2.thrift.bind.host</name>
<value>HIVE</value>
<description>Bind host on which to run the HiveServer2 Thrift service.</description>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
<description>Setting this property to true will have HiveServer2 execute
Hive operations as the user making the calls to it.
</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
[/code]
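I changed doAs to false because running beeline locally kept failing otherwise. See the official documentation for the details of this key; the recommendation there is to set it to true. HIVE is a hostname that I have added to my hosts file, so don't copy it verbatim.
9. HiveWebInterface: I haven't got this working yet; to be added later.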
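10. Set the Hive tmp directories. The official documentation doesn't mention this, but Hive won't run without it. A likely cause is that the values copied from the template contain ${system:java.io.tmpdir} placeholders, which Hive fails to resolve into a usable path.
[code language="xml" autolinks="false"]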
<property>
<name>hive.exec.local.scratchdir</name>
<value>/home/hive/tmp</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/home/hive/tmp</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
[/code]
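The directory named in these two properties has to exist on the local filesystem and be writable by the user that runs Hive. Below is a small sketch for preparing it; `ensure_local_dir` is a hypothetical helper name, and /home/hive/tmp is simply the value used in the configuration above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a local directory if needed and
# confirm that the current user can write to it.
ensure_local_dir() {
  local dir="$1"
  mkdir -p "$dir" || return 1
  if [ ! -w "$dir" ]; then
    echo "$dir exists but is not writable by $(id -un)" >&2
    return 1
  fi
  echo "$dir is ready"
}

# Example (run as the user that will start Hive):
#   ensure_local_dir /home/hive/tmp
```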
That completes the configuration. Next come a few preparatory steps; see: https://cwiki.apache.org/confluence/display/Hive/GettingStarted
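Because hive-site.xml was copied from a very long template, it is easy to edit the wrong occurrence of a property. The sketch below prints the value that actually follows a given property name. `get_property` is a hypothetical helper, and it assumes the layout used in this post, with each `<name>` and `<value>` element on a single line; it is not a general XML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the <value> that follows <name>KEY</name>
# in a Hadoop-style XML configuration file. Prints nothing if the key
# is absent or its value is empty.
get_property() {
  local key="$1" file="$2"
  awk -v key="$key" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && match($0, /<value>.*<\/value>/) {
      # Strip the 7-character <value> and 8-character </value> tags.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
    # Stop at the end of the matching property block.
    found && /<\/property>/ { exit }
  ' "$file"
}

# Show the settings edited in the steps above, if the file is present.
site="${HIVE_HOME:-}/conf/hive-site.xml"
if [ -f "$site" ]; then
  for key in javax.jdo.option.ConnectionURL fs.defaultFS \
             hive.server2.thrift.bind.host hive.exec.local.scratchdir; do
    echo "$key = $(get_property "$key" "$site")"
  done
fi
```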
11. Create the directories on HDFS
[code language="bash"]
$ $HADOOP_HOME/bin/hadoop fs -mkdir -p /tmp
$ $HADOOP_HOME/bin/hadoop fs -mkdir -p /user/hive/warehouse
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse
[/code]
12. Initialize the Metastore
[code language="bash"]
$ $HIVE_HOME/bin/schematool -dbType <db type> -initSchema
[/code]
I am using MariaDB, so dbType is mysql.
That's finally everything. Start the Hadoop cluster first, then run HiveServer2:
[code language="bash"]
# bin/hiveserver2
which: no hbase in (/home/software/apache-hive-2.1.0-bin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/software/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/software/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
[/code]
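HiveServer2 can take a while after printing these lines before it starts accepting connections, and running beeline too early ends in a connection error. The sketch below waits for the port to open. `wait_for_port` is a hypothetical helper, port 10000 is the one used in the beeline URL in this post, and the timeout values are arbitrary. It relies on bash's /dev/tcp feature, so it needs bash rather than plain sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait until HOST:PORT accepts TCP connections,
# or give up after TIMEOUT seconds (default 60, an arbitrary choice).
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-60}" elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # The subshell opens a connection and closes it again on exit.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      echo "$host:$port is accepting connections"
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "gave up waiting for $host:$port after ${timeout}s" >&2
  return 1
}

# Example:
#   wait_for_port HIVE 10000 120 && bin/beeline -u jdbc:hive2://HIVE:10000
```

An open port only shows that the Thrift service is listening; it does not prove that the Metastore connection works, so the first query in beeline is still the real test.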
Then run beeline, and you can start issuing statements:
[code language="bash"]
[root@HIVE apache-hive-2.1.0-bin]# bin/beeline -u jdbc:hive2://HIVE:10000
which: no hbase in (/home/software/apache-hive-2.1.0-bin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
Connecting to jdbc:hive2://HIVE:10000
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/software/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/software/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connected to: Apache Hive (version 2.1.0)
Driver: Hive JDBC (version 2.1.0)
16/10/03 23:40:50 [main]: WARN jdbc.HiveConnection: Request to set autoCommit to false; Hive does not support autoCommit=false.
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 2.1.0 by Apache Hive
0: jdbc:hive2://HIVE:10000> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
+----------------+--+
1 row selected (1.807 seconds)
0: jdbc:hive2://HIVE:10000>
[/code]
Note: renaming hive-log4j2.properties.template in conf to hive-log4j2.properties turns on logging. Remember to change the log file location in that file, and make sure the directory you specify exists.
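The logging note can be sketched as a small script. It assumes the template sets the log location through a line beginning with `property.hive.log.dir`, as the Hive 2.x template does to my knowledge; check your own copy before relying on it. `enable_hive_logging` is a hypothetical helper name, and the log directory in the example is only an illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy the log4j2 template into place, point
# property.hive.log.dir at LOG_DIR, and make sure LOG_DIR exists.
# LOG_DIR must not contain '|' or '&', which are special to sed here.
enable_hive_logging() {
  local conf_dir="$1" log_dir="$2"
  local template="$conf_dir/hive-log4j2.properties.template"
  local target="$conf_dir/hive-log4j2.properties"
  if [ ! -f "$template" ]; then
    echo "template not found: $template" >&2
    return 1
  fi
  mkdir -p "$log_dir" || return 1
  # Rewrite only the log directory line; every other line is copied as-is.
  sed "s|^property\.hive\.log\.dir *=.*|property.hive.log.dir = $log_dir|" \
    "$template" > "$target"
}

# Example (the log directory is an assumption, pick your own):
#   enable_hive_logging "$HIVE_HOME/conf" /home/hive/logs
```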