Hive Optimization: Automatically Merging Small Output Files

1. First, define in hive-site.xml what counts as a small file. The value is given in bytes; 536870912 is 512 MiB.
<property>
<name>hive.merge.smallfiles.avgsize</name>
<value>536870912</value>
<description>When the average output file size of a job is less than this number, Hive will start an additional map-reduce job to merge the output files into bigger files. This is only done for map-only jobs if hive.merge.mapfiles is true, and for map-reduce jobs if hive.merge.mapredfiles is true.</description>
</property>

2. Enable merging of small output files for map-only jobs.
<property>
<name>hive.merge.mapfiles</name>
<value>true</value>
<description>Merge small files at the end of a map-only job</description>
</property>

3. Enable merging of small output files for jobs that include a reduce phase.
<property>
<name>hive.merge.mapredfiles</name>
<value>true</value>
<description>Merge small files at the end of a map-reduce job</description>
</property>
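
How the three settings interact can be sketched in a few lines of Python. This is only an illustration of the rule stated in the property description (merge when the average output file size is below the threshold and the switch for that job type is on), not Hive's actual implementation. The function name `should_merge` and the sample file sizes are hypothetical.

```python
from typing import Sequence

# Values mirror the hive-site.xml settings from the steps above.
SMALLFILES_AVGSIZE = 536870912  # hive.merge.smallfiles.avgsize, in bytes (512 MiB)
MERGE_MAPFILES = True           # hive.merge.mapfiles
MERGE_MAPREDFILES = True        # hive.merge.mapredfiles


def should_merge(output_sizes: Sequence[int], map_only: bool) -> bool:
    """Illustrative check (not Hive's code): would an extra merge job run?"""
    if not output_sizes:
        return False  # the job wrote no files, so there is nothing to merge
    # Which switch applies depends on whether the job has a reduce phase.
    enabled = MERGE_MAPFILES if map_only else MERGE_MAPREDFILES
    if not enabled:
        return False
    average = sum(output_sizes) / len(output_sizes)
    return average < SMALLFILES_AVGSIZE


# Hypothetical map-only job that wrote 200 files of 10 MiB each.
small = [10 * 1024 * 1024] * 200
# Hypothetical map-reduce job that wrote 4 files of 1 GiB each.
large = [1024 ** 3] * 4

print(should_merge(small, map_only=True))   # True
print(should_merge(large, map_only=False))  # False
```

With the 512 MiB threshold used here, most jobs that write many small files will trigger the extra merge job, which trades some additional run time for fewer files on HDFS.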