Defining variables inside vs. outside a loop in C++ (how to write an efficient for loop)

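I wrote this article because of a question I saw on a Q&A platform: in C++, does defining a variable inside an inner loop, rather than outside it, make a big difference to performance?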

For example:

for (int i = 0; i < 999; i++) {
    for (int j = 0; j < 999; j++);
}

How much overhead does it cause to define j every time the inner loop starts?

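The answer I gave was:

It depends on which compiler you use. Mainstream compilers (such as VS and GCC), however, handle this case well and do not allocate the variable repeatedly.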

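Judging by the answers and comments, quite a few people were interested in this, so I decided to measure it and share the results. I wrote the following code for the test: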

#include <cstdio>
using namespace std;

void Test1()
{
  for (int i = 0; i < 2; i++)          // i declared in the for statement
  {
    for (int j = 0; j < 3; j++)        // j declared in the inner for statement
    {
      printf("%d,%d\n", int(i), int(j));
    }
  }

}

void Test2()
{
  int i, j;                            // both declared before the loops

  for (i = 0; i < 2; i++)
  {
    for (j = 0; j < 3; j++)
    {
      printf("%d,%d\n", int(i), int(j));
    }
  }
}

int main()
{
  Test1();
  Test2();
  return 0;
}

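The program is very simple. Test1 and Test2 contain the same double loop and do the same thing: print the values of i and j. The only difference is that one declares the loop variable j outside the loop and the other declares it inside the loop. I compiled with g++ at optimization level -O0 (GCC's default, and also its lowest level):

g++ -O0 -g test.cpp

After compiling, I disassembled the generated Test1 and Test2 functions. Test1 disassembles as follows: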

(gdb) disas /m Test1
Dump of assembler code for function Test1():
5       {
   0x0804841d <+0>:     push   %ebp
   0x0804841e <+1>:     mov    %esp,%ebp
   0x08048420 <+3>:     sub    $0x28,%esp
6         for (int i = 0; i < 2; i++)
   0x08048423 <+6>:     movl   $0x0,-0x10(%ebp)
   0x0804842a <+13>:    jmp    0x804845d <Test1()+64>
   0x08048459 <+60>:    addl   $0x1,-0x10(%ebp)
   0x0804845d <+64>:    cmpl   $0x1,-0x10(%ebp)
   0x08048461 <+68>:    jle    0x804842c <Test1()+15>
7         {
8           for (int j = 0; j < 3; j++)
   0x0804842c <+15>:    movl   $0x0,-0xc(%ebp)
   0x08048433 <+22>:    jmp    0x8048453 <Test1()+54>
   0x0804844f <+50>:    addl   $0x1,-0xc(%ebp)
   0x08048453 <+54>:    cmpl   $0x2,-0xc(%ebp)
   0x08048457 <+58>:    jle    0x8048435 <Test1()+24>
9           {
10            printf("%d,%d\n", int(i), int(j));
   0x08048435 <+24>:    mov    -0xc(%ebp),%eax
   0x08048438 <+27>:    mov    %eax,0x8(%esp)
   0x0804843c <+31>:    mov    -0x10(%ebp),%eax
   0x0804843f <+34>:    mov    %eax,0x4(%esp)
   0x08048443 <+38>:    movl   $0x8048560,(%esp)
   0x0804844a <+45>:    call   0x80482f0 <printf@plt>
11          }
12        }
13
14      }
   0x08048463 <+70>:    leave
   0x08048464 <+71>:    ret

Test2 disassembles as follows:

(gdb) disas /m Test2
Dump of assembler code for function Test2():
17      {
   0x08048465 <+0>:     push   %ebp
   0x08048466 <+1>:     mov    %esp,%ebp
   0x08048468 <+3>:     sub    $0x28,%esp
18        int i, j;
19
20        for (i = 0; i < 2; i++)
   0x0804846b <+6>:     movl   $0x0,-0x10(%ebp)
   0x08048472 <+13>:    jmp    0x80484a5 <Test2()+64>
   0x080484a1 <+60>:    addl   $0x1,-0x10(%ebp)
   0x080484a5 <+64>:    cmpl   $0x1,-0x10(%ebp)
   0x080484a9 <+68>:    jle    0x8048474 <Test2()+15>
21        {
22          for (j = 0; j < 3; j++)
   0x08048474 <+15>:    movl   $0x0,-0xc(%ebp)
   0x0804847b <+22>:    jmp    0x804849b <Test2()+54>
   0x08048497 <+50>:    addl   $0x1,-0xc(%ebp)
   0x0804849b <+54>:    cmpl   $0x2,-0xc(%ebp)
   0x0804849f <+58>:    jle    0x804847d <Test2()+24>
23          {
24            printf("%d,%d\n", int(i), int(j));
   0x0804847d <+24>:    mov    -0xc(%ebp),%eax
   0x08048480 <+27>:    mov    %eax,0x8(%esp)
   0x08048484 <+31>:    mov    -0x10(%ebp),%eax
   0x08048487 <+34>:    mov    %eax,0x4(%esp)
   0x0804848b <+38>:    movl   $0x8048560,(%esp)
   0x08048492 <+45>:    call   0x80482f0 <printf@plt>
25          }
26        }
27      }
   0x080484ab <+70>:    leave
   0x080484ac <+71>:    ret
End of assembler dump.

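In the disassembly of Test1, there is no instruction under the inner for (int j = 0; j < 3; j++) that allocates the variable j. Its slot at -0xc(%ebp) is part of the stack frame that sub $0x28,%esp reserves once in the function prologue; the only per-loop work is the movl that resets it to 0. If we dump just the assembly of Test1 and Test2 and compare them, we find that, apart from the addresses, the two functions consist of exactly the same instructions: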

(gdb) disas Test1
Dump of assembler code for function Test1():
   0x0804841d <+0>:     push   %ebp
   0x0804841e <+1>:     mov    %esp,%ebp
   0x08048420 <+3>:     sub    $0x28,%esp
   0x08048423 <+6>:     movl   $0x0,-0x10(%ebp)
   0x0804842a <+13>:    jmp    0x804845d <Test1()+64>
   0x0804842c <+15>:    movl   $0x0,-0xc(%ebp)
   0x08048433 <+22>:    jmp    0x8048453 <Test1()+54>
   0x08048435 <+24>:    mov    -0xc(%ebp),%eax
   0x08048438 <+27>:    mov    %eax,0x8(%esp)
   0x0804843c <+31>:    mov    -0x10(%ebp),%eax
   0x0804843f <+34>:    mov    %eax,0x4(%esp)
   0x08048443 <+38>:    movl   $0x8048560,(%esp)
   0x0804844a <+45>:    call   0x80482f0 <printf@plt>
   0x0804844f <+50>:    addl   $0x1,-0xc(%ebp)
   0x08048453 <+54>:    cmpl   $0x2,-0xc(%ebp)
   0x08048457 <+58>:    jle    0x8048435 <Test1()+24>
   0x08048459 <+60>:    addl   $0x1,-0x10(%ebp)
   0x0804845d <+64>:    cmpl   $0x1,-0x10(%ebp)
   0x08048461 <+68>:    jle    0x804842c <Test1()+15>
   0x08048463 <+70>:    leave
   0x08048464 <+71>:    ret
End of assembler dump.

(gdb) disas Test2
Dump of assembler code for function Test2():
   0x08048465 <+0>:     push   %ebp
   0x08048466 <+1>:     mov    %esp,%ebp
   0x08048468 <+3>:     sub    $0x28,%esp
   0x0804846b <+6>:     movl   $0x0,-0x10(%ebp)
   0x08048472 <+13>:    jmp    0x80484a5 <Test2()+64>
   0x08048474 <+15>:    movl   $0x0,-0xc(%ebp)
   0x0804847b <+22>:    jmp    0x804849b <Test2()+54>
   0x0804847d <+24>:    mov    -0xc(%ebp),%eax
   0x08048480 <+27>:    mov    %eax,0x8(%esp)
   0x08048484 <+31>:    mov    -0x10(%ebp),%eax
   0x08048487 <+34>:    mov    %eax,0x4(%esp)
   0x0804848b <+38>:    movl   $0x8048560,(%esp)
   0x08048492 <+45>:    call   0x80482f0 <printf@plt>
   0x08048497 <+50>:    addl   $0x1,-0xc(%ebp)
   0x0804849b <+54>:    cmpl   $0x2,-0xc(%ebp)
   0x0804849f <+58>:    jle    0x804847d <Test2()+24>
   0x080484a1 <+60>:    addl   $0x1,-0x10(%ebp)
   0x080484a5 <+64>:    cmpl   $0x1,-0x10(%ebp)
   0x080484a9 <+68>:    jle    0x8048474 <Test2()+15>
   0x080484ab <+70>:    leave
   0x080484ac <+71>:    ret
End of assembler dump.

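Of course, I only tested g++ here; you can try VS yourselves. What is certain for now is that with GCC you do not need to agonize over whether to define the variable outside or inside the loop, because the result is exactly the same. For the sake of readable code, though, define it inside the loop.

So far we have looked at a loop variable of the built-in type int. Let's take this a step further: what happens if the loop variable is not an int but a more complex object? To make object creation easy to see, I added print statements to the class constructors: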

#include <cstdio>
using namespace std;

class MyInt
{
public:
  // Both constructors log, so every object creation shows up in the output.
  MyInt(int i) : m_iValue(i)
  {
    printf("Constructed: MyInt(%d)\n", i);
  }

  MyInt() : m_iValue(0)
  {
    printf("Constructed: MyInt()\n");
  }

  // Postfix form; returns *this rather than a copy, so no extra object is created.
  MyInt &operator++(int)
  {
    m_iValue++;
    return *this;
  }

  bool operator<(const MyInt &another) const
  {
    return m_iValue < another.m_iValue;
  }

  operator int() const
  {
    return m_iValue;
  }

  // Assigning a plain int does not require a temporary MyInt.
  MyInt &operator=(int i)
  {
    m_iValue = i;
    return *this;
  }

private:
  int m_iValue;
};

void Test1()
{
  for (MyInt i = MyInt(0); i < MyInt(2); i++)
  {
    for (MyInt j = MyInt(0); j < MyInt(3); j++)
    {
      printf("%d,%d\n", int(i), int(j));
    }
  }
}

void Test2()
{
  MyInt i, j;
  for (i = MyInt(0); i < MyInt(2); i++)
  {
    for (j = MyInt(0); j < MyInt(3); j++)
    {
      printf("%d,%d\n", int(i), int(j));
    }
  }
}

void Test3()
{
  MyInt i, j;
  for (i = 0; int(i) < 2; i++)
  {
    for (j = 0; int(j) < 3; j++)
    {
      printf("%d,%d\n", int(i), int(j));
    }
  }
}

int main()
{
  printf("Test1---------------------------------\n");
  Test1();
  printf("Test2---------------------------------\n");
  Test2();
  printf("Test3---------------------------------\n");
  Test3();
  return 0;
}

Again compiling with g++ -O0, let's look at the output:

Test1---------------------------------
Constructed: MyInt(0)
Constructed: MyInt(2)
Constructed: MyInt(0)
Constructed: MyInt(3)
0,0
Constructed: MyInt(3)
0,1
Constructed: MyInt(3)
0,2
Constructed: MyInt(3)
Constructed: MyInt(2)
Constructed: MyInt(0)
Constructed: MyInt(3)
1,0
Constructed: MyInt(3)
1,1
Constructed: MyInt(3)
1,2
Constructed: MyInt(3)
Constructed: MyInt(2)
Test2---------------------------------
Constructed: MyInt()
Constructed: MyInt()
Constructed: MyInt(0)
Constructed: MyInt(2)
Constructed: MyInt(0)
Constructed: MyInt(3)
0,0
Constructed: MyInt(3)
0,1
Constructed: MyInt(3)
0,2
Constructed: MyInt(3)
Constructed: MyInt(2)
Constructed: MyInt(0)
Constructed: MyInt(3)
1,0
Constructed: MyInt(3)
1,1
Constructed: MyInt(3)
1,2
Constructed: MyInt(3)
Constructed: MyInt(2)
Test3---------------------------------
Constructed: MyInt()
Constructed: MyInt()
0,0
0,1
0,2
1,0
1,1
1,2

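As you can see, Test3 creates the fewest objects. If the objects are expensive to construct, Test3 is clearly the most efficient way to write the loop. Let's analyze the output of the whole program:

For C++ built-in types, the loop variable simply occupies a slot in the stack frame that is reserved once on function entry, so nothing is allocated repeatedly in the double loop, even at -O0. For a class type, however, the constructor has to run every time an object is created, and the compiler will not optimize that away when the constructor has a visible side effect such as printing. That is why in Test1 we still see j being constructed each time the inner loop starts.
In Test2, j is defined outside the loops, so j itself is constructed only once. However, the assignments i = MyInt(0) and j = MyInt(0) and the conditions i < MyInt(2) and j < MyInt(3) each construct a temporary MyInt, so we still see many objects being created inside the loops. Apart from the two default constructions at the top, Test2 prints exactly the same lines as Test1.
In Test3 we take a different approach: the overloaded operator= assigns an integer directly, which avoids creating a MyInt in the assignment, and the conditions int(i) < 2 and int(j) < 3 avoid creating a MyInt in the comparison. The whole function therefore constructs only the two objects defined outside the loops, which makes it the most efficient of the three.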

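Summary:

When the loop variable is of a built-in type such as int, and you use a mainstream compiler with solid optimization, you do not need to care whether it is defined outside or inside the loop.
When the loop variable is a complex object, it is advisable to define it outside the loop, and to avoid creating objects repeatedly in both the initialization and the condition of the for statement.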


Permanent link to this article: http://www.linuxidc.com/Linux/2015-05/117019.htm