Installing Hive

    • Hadoop 2.7.2
    • JDK 1.7u79
    • Hive 2.1.0
    • MySQL (installed via apt-get)
    • 192.168.1.166: the MySQL server hosting the Hive metastore database
    • 192.168.1.159: the Hive data warehouse host

    2 Installing Hive, with MySQL as the metastore

    • Install the JDK (omitted here)
    • Install Hadoop (omitted here)
    • Install MySQL (omitted here)

    3 On 192.168.1.166, create the Hive metastore database and user, and grant privileges
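
    A minimal sketch of the MySQL-side setup, assuming the database, user name, and password are all hive (matching the hive-site.xml values used below); MySQL 5.x syntax:

    mysql -u root -p -e "CREATE DATABASE hive;
      GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';
      FLUSH PRIVILEGES;"

    Since the JDBC URL below carries createDatabaseIfNotExist=true, the CREATE DATABASE statement is optional; the grant is what matters, because Hive connects from 192.168.1.159 rather than from localhost.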

    1 Install Hive

    hadoop@hadoopmaster:~$ sudo cp -R apache-hive-2.1.0-bin /usr/local/hive
    hadoop@hadoopmaster:~$ sudo chmod -R 775 /usr/local/hive/
    hadoop@hadoopmaster:~$ sudo chown -R hadoop:hadoop /usr/local/hive

    2 Add the HIVE_HOME variables to /etc/profile

    export HIVE_HOME=/usr/local/hive
    export PATH=$PATH:$HIVE_HOME/bin
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib:/usr/local/hive/lib
    $ source /etc/profile

    3 Copy the template files under hive/conf to new names

    cp hive-env.sh.template hive-env.sh
    cp hive-default.xml.template hive-site.xml

    In hive-env.sh, set HADOOP_HOME:

    HADOOP_HOME=/usr/local/hadoop

    4 Edit hive-site.xml to set the MySQL JDBC driver, database name, user name, and password, as shown below

    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://192.168.1.166:3306/hive?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
        <description>username to use against metastore database</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
        <description>password to use against metastore database</description>
    </property>

    Here:

    • javax.jdo.option.ConnectionURL is the connection string Hive uses to reach the metastore database (192.168.1.166 is the MySQL host from the environment list above);
    • javax.jdo.option.ConnectionDriverName is the driver class name for the JDBC metastore;
    • javax.jdo.option.ConnectionUserName is the database user name;
    • javax.jdo.option.ConnectionPassword is the database password.
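
    Before going further, it is worth verifying from the Hive host (192.168.1.159) that MySQL actually accepts this account remotely; a quick check, assuming the mysql client is installed there:

    mysql -h 192.168.1.166 -u hive -phive -e "SHOW DATABASES;"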

    In the same file, replace the ${system:java.io.tmpdir}/${system:user.name} placeholders with a concrete local path (see Error 2 below for the failure this prevents):

    <property>
        <name>hive.exec.local.scratchdir</name>
        <value>/home/hadoop/iotmp</value>
        <description>Local scratch space for Hive jobs</description>
    </property>
    <property>
        <name>hive.downloaded.resources.dir</name>
        <value>/home/hadoop/iotmp</value>
        <description>Temporary local directory for added resources in the remote file system.</description>
    </property>

    The directory also needs to be created and given the right permissions:

    mkdir -p /home/hadoop/iotmp
    chmod -R 775 /home/hadoop/iotmp

    5 Edit hive-config.sh under hive/bin to set JAVA_HOME and HADOOP_HOME

    export JAVA_HOME=/usr/lib/jvm
    export HADOOP_HOME=/usr/local/hadoop

    6 Download mysql-connector-java-5.1.27-bin.jar and place it in $HIVE_HOME/lib

    It can be downloaded from the official MySQL website. Note that the download is a tar.gz archive, so remember to extract it before copying the jar.
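
    Roughly, assuming the archive has already been downloaded to the current directory:

    tar -zxvf mysql-connector-java-5.1.27.tar.gz
    cp mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /usr/local/hive/lib/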

    1 Create the directories Hive needs in HDFS

    hadoop fs -mkdir /tmp
    hadoop fs -mkdir -p /user/hive/warehouse
    hadoop fs -chmod g+w /tmp
    hadoop fs -chmod g+w /user/hive/warehouse

    Before entering the Hive shell for the first time, the metastore database must be initialized (see Error 1 below for the failure you get otherwise).
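
    For a MySQL-backed metastore, the initialization command (the same one referenced under Error 1 below) is:

    schematool -initSchema -dbType mysql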

    2 Test the Hive shell

    hive
    show databases;
    show tables;

    3 View the files Hive generates in HDFS

    hadoop fs -ls /user/hive/warehouse

    7 Hive shell usage examples

    Before getting into HiveQL proper, it is worth running a few commands at the command line to get a feel for how HiveQL works, and to do a little exploring of your own.

    1 Create some data (tab-separated text)

    ~ vi /tmp/t_hive.txt
    16 2 3
    61 12 13
    41 2 31
    17 21 3
    71 2 31
    1 12 34
    11 2 34
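
    The columns must be separated by real tab characters. To generate the same file from the shell instead of typing it, a sketch:

    printf '16\t2\t3\n61\t12\t13\n41\t2\t31\n17\t21\t3\n71\t2\t31\n1\t12\t34\n11\t2\t34\n' > /tmp/t_hive.txt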

    2 Create a new table

    hive> CREATE TABLE t_hive (a int, b int, c int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
    OK
    Time taken: 0.121 seconds

    3 Load the data into the table

    hive> LOAD DATA LOCAL INPATH '/tmp/t_hive.txt' OVERWRITE INTO TABLE t_hive;
    Loading data to table default.t_hive
    Time taken: 0.609 seconds

    4 List tables

    hive> show tables;
    OK
    t_hive
    Time taken: 0.099 seconds

    5 Match table names against a pattern

    hive> show tables '*t*';
    OK
    t_hive
    Time taken: 0.065 seconds

    6 View table data

    hive> select * from t_hive;
    OK
    16 2 3
    61 12 13
    41 2 31
    17 21 3
    71 2 31
    1 12 34
    11 2 34
    Time taken: 0.264 seconds
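
    A plain SELECT * only reads the files and returns immediately; a query with aggregation is compiled into a MapReduce job under Hive's default execution engine. For example, a sketch:

    hive> SELECT a, SUM(b) FROM t_hive GROUP BY a;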

    7 View the table structure

    hive> desc t_hive;
    OK
    a int
    b int
    c int
    Time taken: 0.1 seconds

    8 Rename a table

    hive> ALTER TABLE t_hive RENAME TO t_hadoop;
    OK
    Time taken: 0.45 seconds
    hive> show tables;
    OK
    t_hadoop
    Time taken: 0.07 seconds

    9 Drop a table

    hive> DROP TABLE t_hadoop;
    OK
    Time taken: 0.767 seconds
    hive> show tables;
    OK
    Time taken: 0.064 seconds
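
    Note that dropping a managed table like this also deletes its data under /user/hive/warehouse. If the data should outlive the table, create it as EXTERNAL instead; a sketch, with /user/hadoop/t_hive as an assumed HDFS location:

    hive> CREATE EXTERNAL TABLE t_hive_ext (a int, b int, c int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/hadoop/t_hive';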

    8 Using Beeline

    HiveServer2 ships with a new command-line tool, Beeline, a JDBC client based on SQLLine; see the SQLLine project documentation for background.

    1 First, copy the JDBC driver into lib

    sudo cp jdbc/hive-jdbc-2.1.0-standalone.jar /usr/local/hive/lib/

    2 Start the hiveserver2 service

    Command-line mode:
    hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10001
    Service mode:
    hiveserver2
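
    To check that HiveServer2 is actually listening before connecting (port 10001 here, matching the --hiveconf above):

    netstat -nltp | grep 10001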

    3 Run a few commands

    % bin/beeline
    Hive version 0.11.0-SNAPSHOT by Apache
    beeline> !connect jdbc:hive2://localhost:10000/default
    !connect jdbc:hive2://localhost:10000/default
    Connecting to jdbc:hive2://localhost:10000/default
    Connected to: Hive (version 0.10.0)
    Driver: Hive (version 0.10.0-SNAPSHOT)
    Transaction isolation: TRANSACTION_REPEATABLE_READ
    0: jdbc:hive2://localhost:10000> show tables;
    show tables;
    +-------------------+
    |     tab_name      |
    +-------------------+
    | primitives        |
    | src               |
    | src1              |
    | src_sequencefile  |
    | src_thrift        |
    | srcbucket         |
    | srcbucket2        |
    | srcpart           |
    +-------------------+
    9 rows selected (1.079 seconds)
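
    The transcript above appears to be the sample session from the Hive documentation (hence the 0.10/0.11 version strings), and it connects to the default port 10000. With HiveServer2 started on port 10001 as configured earlier, the connection can instead be made in one step:

    beeline -u jdbc:hive2://localhost:10001/default -n hadoop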

    Error 1

    hadoop@hadoopmaster:/usr/local/hive/conf$ hive
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-2.1.0.jar!/hive-log4j2.properties Async: true
    Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql))
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:578)
        at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:518)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql))
        at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:226)
        at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:366)
        at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:310)
        at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:290)
        at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:266)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:545)
        ... 9 more
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql))
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3593)
        at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:236)
        at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:221)
        ... 14 more
    Caused by: MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql))
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3364)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3336)
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3590)
        ... 16 more

    Cause: the following command had not been run:

    schematool -initSchema -dbType mysql

    After running it, the problem was resolved.
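
    To confirm the schema is in place, schematool can also report the metastore schema version:

    schematool -info -dbType mysql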

    Error 2

    hadoop@hadoopmaster:/usr/local/hive/lib$ hive
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-2.1.0.jar!/hive-log4j2.properties Async: true
    Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        at org.apache.hadoop.fs.Path.initialize(Path.java:205)
        at org.apache.hadoop.fs.Path.<init>(Path.java:171)
        at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:631)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:550)
        at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:518)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
    Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        at java.net.URI.checkPath(URI.java:1804)
        at java.net.URI.<init>(URI.java:752)
        at org.apache.hadoop.fs.Path.initialize(Path.java:202)
        ... 12 more
    The fix is the hive-site.xml change already shown in step 4: point these properties at a concrete local directory instead of the ${system:java.io.tmpdir} placeholders.

    <property>
        <name>hive.exec.local.scratchdir</name>
        <value>/home/hadoop/iotmp</value>
        <description>Local scratch space for Hive jobs</description>
    </property>
    <property>
        <name>hive.downloaded.resources.dir</name>
        <value>/home/hadoop/iotmp</value>
        <description>Temporary local directory for added resources in the remote file system.</description>
    </property>

    And, as before, the directory needs to exist with the right permissions:
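
    The same commands as in step 4:

    mkdir -p /home/hadoop/iotmp
    chmod -R 775 /home/hadoop/iotmp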
