Thrift JDBC Server

Description

The Thrift JDBC Server is based on the HiveServer2 implementation from Hive 0.12. You can interact with it using the beeline script shipped with Spark or with Hive 0.12. By default, the Thrift JDBC Server listens on port 10000.

Before using the Thrift JDBC Server, note:

1. Copy the hive-site.xml configuration file into the $SPARK_HOME/conf directory.

2. Add the JDBC driver jar to SPARK_CLASSPATH in $SPARK_HOME/conf/spark-env.sh:

    export SPARK_CLASSPATH=$SPARK_CLASSPATH:/home/Hadoop/software/mysql-connector-java-5.1.27-bin.jar

Thrift JDBC Server command-line help:

    cd $SPARK_HOME/sbin
    start-thriftserver.sh --help

    Usage: ./sbin/start-thriftserver [options] [thrift server options]
    Spark assembly has been built with Hive, including Datanucleus jars on classpath
    Options:
      --master MASTER_URL         spark://host:port, mesos://host:port, yarn, or local.
      --deploy-mode DEPLOY_MODE   Whether to launch the driver program locally ("client") or
                                  on one of the worker machines inside the cluster ("cluster")
                                  (Default: client).
      --class CLASS_NAME          Your application's main class (for Java / Scala apps).
      --name NAME                 A name of your application.
      --jars JARS                 Comma-separated list of local jars to include on the driver
                                  and executor classpaths.
      --py-files PY_FILES         Comma-separated list of .zip, .egg, or .py files to place
                                  on the PYTHONPATH for Python apps.
      --files FILES               Comma-separated list of files to be placed in the working
                                  directory of each executor.
      --conf PROP=VALUE           Arbitrary Spark configuration property.
      --properties-file FILE      Path to a file from which to load extra properties. If not
                                  specified, this will look for conf/spark-defaults.conf.
      --driver-memory MEM         Memory for driver (e.g. 1000M, 2G) (Default: 512M).
      --driver-java-options       Extra Java options to pass to the driver.
      --driver-library-path       Extra library path entries to pass to the driver.
      --driver-class-path         Extra class path entries to pass to the driver. Note that
                                  jars added with --jars are automatically included in the
                                  classpath.
      --executor-memory MEM       Memory per executor (e.g. 1000M, 2G) (Default: 1G).
      --help, -h                  Show this help message and exit.
      --verbose, -v               Print additional debug output.

     Spark standalone with cluster deploy mode only:
      --driver-cores NUM          Cores for driver (Default: 1).
      --supervise                 If given, restarts the driver on failure.

     Spark standalone and Mesos only:
      --total-executor-cores NUM  Total cores for all executors.

     YARN-only:
      --executor-cores NUM        Number of cores per executor (Default: 1).
      --queue QUEUE_NAME          The YARN queue to submit to (Default: "default").
      --num-executors NUM         Number of executors to launch (Default: 2).
      --archives ARCHIVES         Comma separated list of archives to be extracted into the
                                  working directory of each executor.

    Thrift server options:
      --hiveconf <property=value> Use value for given property

The meaning of --master is the same as for the Spark SQL CLI.

beeline command-line help:

    cd $SPARK_HOME/bin
    beeline --help

    Usage: java org.apache.hive.cli.beeline.BeeLine
      -u <database url>              the JDBC URL to connect to
      -n <username>                  the username to connect as
      -p <password>                  the password to connect as
      -d <driver class>              the driver class to use
      -e <query>                     query that should be executed
      -f <file>                      script file that should be executed
      --color=[true/false]           control whether color is used for display
      --showHeader=[true/false]      show column names in query results
      --headerInterval=ROWS;         the interval between which headers are displayed
      --fastConnect=[true/false]     skip building table/column list for tab-completion
      --autoCommit=[true/false]      enable/disable automatic transaction commit
      --verbose=[true/false]         show verbose error messages and debug info
      --showWarnings=[true/false]    display connection warnings
      --showNestedErrs=[true/false]  display nested errors
      --numberFormat=[pattern]       format numbers using DecimalFormat pattern
      --force=[true/false]           continue running script even after errors
      --maxWidth=MAXWIDTH            the maximum width of the terminal
      --maxColumnWidth=MAXCOLWIDTH   the maximum width to use when displaying columns
      --silent=[true/false]          be more silent
      --autosave=[true/false]        automatically save preferences
      --outputformat=[table/vertical/csv/tsv]  format mode for result display
      --isolation=LEVEL              set the transaction isolation level
      --help                         display this message

Starting the Thrift JDBC Server and beeline

Start the Thrift JDBC Server (default port 10000):

    cd $SPARK_HOME/sbin
    start-thriftserver.sh

How do you change the Thrift JDBC Server's default listening port? Use --hiveconf:

    start-thriftserver.sh --hiveconf hive.server2.thrift.port=14000

For more on HiveServer2 clients, see: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients

Start beeline:

    cd $SPARK_HOME/bin
    beeline -u jdbc:hive2://hadoop000:10000/default -n hadoop

SQL script for testing:

    SELECT track_time, url, session_id, referer, ip, end_user_id, city_id
    FROM page_views WHERE city_id = -1000 limit 10;

    SELECT session_id, count(*) c
    FROM page_views group by session_id order by c desc limit 10;

Permanent link to this article: http://www.linuxidc.com/Linux/2014-09/106619.htm
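Beyond beeline, any JDBC client can talk to the Thrift JDBC Server programmatically. The sketch below shows one way this might look, assuming a running server at hadoop000:10000 (the same host/port as the beeline example) and the Hive 0.12 hive-jdbc driver jar on the classpath; ThriftServerClient and the buildUrl helper are illustrative names, not part of any Spark or Hive API.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ThriftServerClient {

    // Illustrative helper: assemble a HiveServer2 JDBC URL from its parts,
    // matching the jdbc:hive2://host:port/database form used with beeline above.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC driver class shipped with Hive 0.12;
        // requires the hive-jdbc jar on the classpath at runtime.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Connect as user "hadoop", as in the beeline example; this only
        // succeeds if the Thrift JDBC Server is actually running.
        String url = buildUrl("hadoop000", 10000, "default");
        try (Connection conn = DriverManager.getConnection(url, "hadoop", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT session_id, count(*) c FROM page_views "
                     + "GROUP BY session_id ORDER BY c DESC LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
            }
        }
    }
}
```

This runs the same top-sessions query as the SQL test script; SQL submitted this way goes through the shared Thrift JDBC Server instance, so tables like page_views registered there are visible to every connected client.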