PHP生成word文档的三种实现方式

最近工作遇到关于生成word的问题
现在总结一下生成word的三种方法。
btw：好像只要是标题带PHP的貌似点击量都不是很高（哥哥我标题还是带上PHP了），不知道为什么，估计博客园上net技术大牛比较多吧，如果把java，.net，php比作程序员的女友，那么java是Oracle门下的大家闺秀，.net微软旗下的名门望族，PHP则是草根门下的山村野姑，这让我等PHP草民闷骚男情何以堪情何以堪。。牢骚发完了，正式写吧

PHP生成word原理

利用windows下面的 com组件
利用PHP将内容写入doc文件之中

具体实现：
利用windows下面的 com组件
原理：com作为PHP的一个扩展类，安装过office的服务器会自动调用word.application的com，可以自动生成文档，PHP官方文档手册：http://www.php.net/manual/en/class.com.php

使用官方实例：

<?php// starting word$word = new COM（"word.application"） or die（"Unable to instantiate Word"）;echo "Loaded Word, version {$word->Version}
"; //bring it to front$word->Visible = 1; //open an empty document$word->Documents->Add（）; //do some weird stuff$word->Selection->TypeText（"This is a test..."）;$word->Documents[1]->SaveAs（"Useless test.doc"）; //closing word$word->Quit（）; //free the object$word = null;?>

个人建议：com实例后的方法都需要查找官方文档才知道什么意思，编辑器没有代码提示，非常不方便，另外这个效率也不是很高，不推荐使用
利用PHP将内容写入doc文件之中
这个方法又可以分为两种方法

生成mht格式（和HTML很相似）写入word
纯HTML格式写入word

生成mht格式（和HTML很相似）写入word

/** * 根据HTML代码获取word文档内容 * 创建一个本质为mht的文档，该函数会分析文件内容并从远程下载页面中的图片资源 * 该函数依赖于类MhtFileMaker * 该函数会分析img标签，提取src的属性值。但是，src的属性值必须被引号包围，否则不能提取 ** @param string $content HTML内容 * @param string $absolutePath 网页的绝对路径。如果HTML内容里的图片路径为相对路径，那么就需要填写这个参数，来让该函数自动填补成绝对路径。这个参数最后需要以/结束 * @param bool $isEraseLink 是否去掉HTML内容中的链接 */function getWordDocument（ $content , $absolutePath = "" , $isEraseLink = true ）{ $mht = new MhtFileMaker（）; if （$isEraseLink）$content = preg_replace（"/<as*.*?s*>（s*.*?s*）</a>/i" , "$1" , $content）; //去掉链接$images = array（）; $files = array（）; $matches = array（）; //这个算法要求src后的属性值必须使用引号括起来 if （ preg_match_all（"/<img[.
]*?srcs*?=s*?[""]（.*?）[""]（.*?）/>/i",$content ,$matches ） ） {$arrPath = $matches[1];for （ $i=0;$i<count（$arrPath）;$i++）{ $path = $arrPath[$i]; $imgPath = trim（ $path ）; if （ $imgPath ！= "" ） {$files[] = $imgPath;if（ substr（$imgPath,0,7） == "http://"）{ //绝对链接，不加前缀}else{ $imgPath = $absolutePath.$imgPath;}$images[] = $imgPath; }} } $mht->AddContents（"tmp.html",$mht->GetMimeType（"tmp.html"）,$content）; for （ $i=0;$i<count（$images）;$i++） {$image = $images[$i];if （ @fopen（$image , "r"） ）{ $imgcontent = @file_get_contents（ $image ）; if （ $content ）$mht->AddContents（$files[$i],$mht->GetMimeType（$image）,$imgcontent）;}else{ echo "file:".$image." not exist！<br />";} } return $mht->GetFile（）;}

这个函数的主要功能其实就是分析HTML代码中的所有图片地址，并且依次下载下来。获取到了图片的内容以后，调用MhtFileMaker类，将图片添加到mht文件中。具体的添加细节，封装在MhtFileMaker类中了。
使用方法：远程调用

url= http://www.***.com; $content = file_get_contents（$url）; $fileContent = getWordDocument（$content,"http://www.jb51.net/Music/etc/"）;$fp = fopen（"test.doc", "w"）;fwrite（$fp, $fileContent）;fclose（$fp）;

其中，$content变量应该是HTML源代码，后面的链接应该是能填补HTML代码中图片相对路径的URL地址
本地生成调用：

header（"Cache-Control: no-cache, must-revalidate"）; header（"Pragma: no-cache"）; $wordStr = "http://www.jb51.net/"; $fileContent = getWordDocument（$wordStr）; $fileName = iconv（"utf-8", "GBK", ‘jb51" . "_". $intro . "_" . rand（100, 999））; header（"Content-Type: application/doc"）; header（"Content-Disposition: attachment; filename=" . $fileName . ".doc"）; echo $fileContent;

注意，在使用这个函数之前，您需要先包含类MhtFileMaker，这个类可以帮助我们生成Mht文档。

<?php/***********************************************************************Class:Mht File MakerVersion:1.2 betaDate: 02/11/2007Author:Wudi Description: The class can make .mht file.***********************************************************************/ class MhtFileMaker{ var $config = array（）; var $headers = array（）; var $headers_exists = array（）; var $files = array（）; var $boundary; var $dir_base; var $page_first;function MhtFile（$config = array（））{}function SetHeader（$header）{$this->headers[] = $header;$key = strtolower（substr（$header, 0, strpos（$header, ":"）））;$this->headers_exists[$key] = TRUE; }function SetFrom（$from）{$this->SetHeader（"From: $from"）; }function SetSubject（$subject）{$this->SetHeader（"Subject: $subject"）; }function SetDate（$date = NULL, $istimestamp = FALSE）{if （$date == NULL） { $date = time（）;}if （$istimestamp == TRUE） { $date = date（"D, d M Y H:i:s O", $date）;}$this->SetHeader（"Date: $date"）; }function SetBoundary（$boundary = NULL）{if （$boundary == NULL） { $this->boundary = "--" . strtoupper（md5（mt_rand（））） . "_MULTIPART_MIXED";} else { $this->boundary = $boundary;} }function SetBaseDir（$dir）{$this->dir_base = str_replace（"\", "/", realpath（$dir））; }function SetFirstPage（$filename）{$this->page_first = str_replace（"\", "/", realpath（"{$this->dir_base}/$filename"））; }function AutoAddFiles（）{if （！isset（$this->page_first）） { exit （"Not set the first page."）;}$filepath = str_replace（$this->dir_base, "", $this->page_first）;$filepath = "http://mhtfile" . $filepath;$this->AddFile（$this->page_first, $filepath, NULL）;$this->AddDir（$this->dir_base）; }function AddDir（$dir）{$handle_dir = opendir（$dir）;while （$filename = readdir（$handle_dir）） { if （（$filename！="."） && （$filename！=".."） && （"$dir/$filename"！=$this->page_first）） {if （is_dir（"$dir/$filename"）） { $this->AddDir（"$dir/$filename"）;} elseif （is_file（"$dir/$filename"）） { $filepath = str_replace（$this->dir_base, "", "$dir/$filename"）; $filepath = "http://mhtfile" . $filepath; $this->AddFile（"$dir/$filename", $filepath, NULL）;} }}closedir（$handle_dir）; }function AddFile（$filename, $filepath = NULL, $encoding = NULL）{if （$filepath == NULL） { $filepath = $filename;}$mimetype = $this->GetMimeType（$filename）;$filecont = file_get_contents（$filename）;$this->AddContents（$filepath, $mimetype, $filecont, $encoding）; }function AddContents（$filepath, $mimetype, $filecont, $encoding = NULL）{if （$encoding == NULL） { $filecont = chunk_split（base64_encode（$filecont）, 76）; $encoding = "base64";}$this->files[] = array（"filepath" => $filepath,"mimetype" => $mimetype,"filecont" => $filecont,"encoding" => $encoding）; }function CheckHeaders（）{if （！array_key_exists（"date", $this->headers_exists）） { $this->SetDate（NULL, TRUE）;}if （$this->boundary == NULL） { $this->SetBoundary（）;} }function CheckFiles（）{if （count（$this->files） == 0） { return FALSE;} else { return TRUE;} }function GetFile（）{$this->CheckHeaders（）;if （！$this->CheckFiles（）） { exit （"No file was added."）;}$contents = implode（"
", $this->headers）;$contents .= "
";$contents .= "MIME-Version: 1.0
";$contents .= "Content-Type: multipart/related;
";$contents .= "	boundary="{$this->boundary}";
";$contents .= "	type="" . $this->files[0]["mimetype"] . ""
";$contents .= "X-MimeOLE: Produced By Mht File Maker v1.0 beta
";$contents .= "
";$contents .= "This is a multi-part message in MIME format.
";$contents .= "
";foreach （$this->files as $file） { $contents .= "--{$this->boundary}
"; $contents .= "Content-Type: $file[mimetype]
"; $contents .= "Content-Transfer-Encoding: $file[encoding]
"; $contents .= "Content-Location: $file[filepath]
"; $contents .= "
"; $contents .= $file["filecont"]; $contents .= "
";}$contents .= "--{$this->boundary}--
";return $contents; }function MakeFile（$filename）{$contents = $this->GetFile（）;$fp = fopen（$filename, "w"）;fwrite（$fp, $contents）;fclose（$fp）; }function GetMimeType（$filename）{$pathinfo = pathinfo（$filename）;switch （$pathinfo["extension"]） { case "htm": $mimetype = "text/html"; break; case "html": $mimetype = "text/html"; break; case "txt": $mimetype = "text/plain"; break; case "cgi": $mimetype = "text/plain"; break; case "php": $mimetype = "text/plain"; break; case "css": $mimetype = "text/css"; break; case "jpg": $mimetype = "image/jpeg"; break; case "jpeg": $mimetype = "image/jpeg"; break; case "jpe": $mimetype = "image/jpeg"; break; case "gif": $mimetype = "image/gif"; break; case "png": $mimetype = "image/png"; break; default: $mimetype = "application/octet-stream"; break;}return $mimetype; }}?>

点评：这种方法的缺点是不支持批量生成下载，因为一个页面只能有一个header，（无论远程使用还是本地生成声明header页面只能输出一个header），即使你循环生成，结果还是只有一个word生成（当然你可以修改上面的方式来实现）
2.纯HTML格式写入word
原理：
利用ob_start把html页面先存储起来（解决一下页面多个header问题，可以批量生成），然后在写入doc文档内容利用
代码：

<?phpclass word{ function start（）{ob_start（）;echo "<html xmlns:o="urn:schemas-microsoft-com:office:office"xmlns:w="urn:schemas-microsoft-com:office:word"xmlns="http://www.w3.org/TR/REC-html40">";}function save（$path）{ echo "</html>";$data = ob_get_contents（）;ob_end_clean（）; $this->wirtefile （$path,$data）;} function wirtefile （$fn,$data）{$fp=fopen（$fn,"wb"）;fwrite（$fp,$data）;fclose（$fp）;}}

$html = " <table width=600 cellpadding="6" cellspacing="1" bgcolor="#336699"> <tr bgcolor="White"><td>PHP10086</td><td><a href="http://www.php10086.com" target="_blank" >http://www.php10086.com</a></td> </tr> <tr bgcolor="red"><td>PHP10086</td><td><a href="http://www.php10086.com" target="_blank" >http://www.php10086.com</a></td> </tr> <tr bgcolor="White"><td colspan=2 >PHP10086<br>最靠谱的PHP技术博客分享网站<img src="http://www.php10086.com/wp-content/themes/WPortal-Blue/images/logo.gif"></td> </tr> </table> ";//批量生成 for（$i=1;$i<=3;$i++）{$word = new word（）;$word->start（）;//$html = "aaa".$i;$wordname = "PHP淮北的个人网站--PHP10086.com".$i.".doc";echo $html;$word->save（$wordname）;ob_flush（）;//每次执行前刷新缓存flush（）; }

个人点评：这种方法效果最好，原因有两个：
第一代码比较简洁，很容易理解，第二种支持批量生成word（这个很重要）
第三支持完整的html代码
生成了三个word文档：并且内容支持完整的html代码显示，第三种方法强烈推荐

以上就是本文的全部内容，希望对大家的学习有所帮助，也希望大家多多支持脚本之家。