Are some languages greener than others?

<p>The lifecycle of a computer application (expression of needs, specifications, development / qualification cycle, production, end use) is marked by decisions that significantly affect its overall environmental footprint.<br />
<br />
Upstream of the development phase, is the choice of a programming language one of those decisions? In particular, does this choice have a quantifiable impact on a program's energy consumption?<br />
<br />
We answer this last question in the limited context of a case study.</p>
<p>We compare the power consumption of four C++ programs that implement the same functionality with that of the same programs written in Cawen, the language that we (Thomas Samain and Gwena&euml;l Chailleu, members of the Green Code Lab) are currently developing.</p>
<p style="margin-bottom: 0cm">&nbsp;</p>
<!--break-->
<h2>
The test</h2>
<p>Among the cross-language comparative tests available on the web, we chose the one by Google R&amp;D engineer Robert Hundt.</p>
<p>His study, published in 2011, implemented the same operations research algorithm (loop detection in a graph, based on the work of P. Havlak and R. E. Tarjan) in four different languages: <em>C++, Java, Go</em> and <em>Scala</em>.</p>
<p>For each language, the test measured the length of the source code, the compilation time and the size of the generated executable.</p>
<p>At runtime, it also measured the memory consumption and processing time.</p>
<p>The results <a href="https://days2011.scala-lang.org/sites/days2011/files/ws3-1-Hundt.pdf">are available here.</a></p>
<p>&nbsp;</p>
<h2>
The first results</h2>
<p>On the runtime criteria, C++ far outperforms its competitors: the executable produced by the C++ compiler uses 3 to 4 times less memory, and runs 2.5 to 12.5 times faster, than those of the other languages.</p>
<p>With <em>C++</em> also performing well on the static criteria, it was considered the legitimate winner of the contest.</p>
<p>&nbsp;</p>
<h2>
Optimizations and variants</h2>
<p>Robert Hundt submitted the test to some C++ specialists, who optimized the original program by replacing some container objects.</p>
<p>Independently, two other developers, <a href="http://code.google.com/p/benchgraffiti/source/browse/havlak/?r=d856c2f69...">R. Cox</a> (a Go developer, who explains his optimization process <a href="http://blog.golang.org/2011/06/profiling-go-programs.html">here</a>) and <a href="http://howtowriteaprogram.blogspot.fr/2011/10/loop-counting.html">Anthony C. Hay</a>, proposed their own implementations.</p>
<p>&nbsp;</p>
<h2 class="western">
Performance and energy consumption</h2>
<p>We used the <a href="http://www.greencodelab.fr/content/green-plugwise">Green Code Lab software</a> and followed its <a href="http://www.greencodelab.fr/content/mesurer-limpact-energetique-dun-logic...">method for measuring the energy impact of software</a>; with the advice of its contributors, we measured the power consumption while running the different versions of the program.</p>
<p>Our test environment is as follows:</p>
<p><i>gcc 4.5.3 / Cygwin 6.0 / Intel Pentium Dual-Core T4200 / 64-bit / 2 GHz / 4 GB RAM.</i></p>
<p><br />
This is the configuration that we call <i>cygwin</i> <a href="http://www.melvenn.com/fr/google-benchmark/">on our site</a>.</p>
<p>The consumption curves obtained are displayed here:</p>
<p><img src="http://www.melvenn.com/wp-content/uploads/2013/04/G3.jpg" width="700px/" /></p>
<p>And the winners are</p>
<p>R.Cox, A.Hay, R.Hundt 2, then R.Hundt 1.</p>
<p>The energy consumption seems perfectly correlated to the execution time.</p>
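<p>This correlation is what one would expect when the machine draws roughly constant power during a run: energy is the time integral of power, so it grows with duration. A minimal sketch of that computation, assuming a meter that delivers timestamped power readings; <code>PowerSample</code> and <code>energy_joules</code> are hypothetical names for illustration and are not part of the Green Code Lab software.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One reading from a power meter (hypothetical layout).
struct PowerSample {
    double seconds;  // time since the start of the run
    double watts;    // instantaneous power drawn
};

// Trapezoidal integration of power over time: joules = watts * seconds.
double energy_joules(const std::vector<PowerSample>& samples) {
    double joules = 0.0;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        const double dt = samples[i].seconds - samples[i - 1].seconds;
        assert(dt >= 0.0);  // samples must be in chronological order
        joules += 0.5 * (samples[i].watts + samples[i - 1].watts) * dt;
    }
    return joules;
}
```

<p>With a constant draw, a run that lasts twice as long yields twice the energy, which is the behaviour the curves suggest.</p>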
<p>From the most frugal to the greediest, <font color="#008000"><b>the energy consumption varies in a ratio of 1 to 21</b></font><font color="#008000">.</font></p>
<p>We can draw a somewhat expected conclusion:</p>
<p><b>Performance can vary widely between two implementations of the same processing developed in the same language.</b></p>
<p>The performance gains come mainly from replacing some lists and hash tables with arrays.</p>
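<p>To illustrate the kind of container swap involved, here is a minimal sketch of the same membership test written against a hash set and against a plain array. It is not taken from any of the four versions; the function names and the assumption that nodes are identified by non-negative integers are ours.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>
#include <vector>

// Hash-based membership: constant time on average, but every insertion
// allocates a node and every lookup hashes the key.
bool seen_hashed(const std::unordered_set<int>& visited, int node) {
    assert(node >= 0);  // node ids are assumed non-negative
    return visited.count(node) != 0;
}

// Array-based membership: a linear scan over contiguous memory,
// cheap for a handful of values, costly for many.
bool seen_array(const std::vector<int>& visited, int node) {
    assert(node >= 0);  // node ids are assumed non-negative
    return std::find(visited.begin(), visited.end(), node) != visited.end();
}
```

<p>Both functions return the same answer; only the cost profile differs, which is why the better choice depends on how many values are stored.</p>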
<p>&nbsp;</p>
<h2 class="western">
What about changing the input data?</h2>
<p>Surprisingly enough, Robert Hundt&rsquo;s executable runs the same processing 15,000 times on a single arbitrary graph.</p>
<p>Anthony Hay proposed instead to process populations of randomly generated graphs.</p>
<p>We built one set of 100 random graphs with up to 100 vertices and another of 10,000 random graphs with up to 50,000 vertices.</p>
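<p>As a rough idea of how such a population can be produced reproducibly, here is a minimal sketch. The <code>Graph</code> layout, the function names and the edge density (two outgoing edges per vertex) are arbitrary assumptions for illustration; they do not describe the generator actually used for these measurements.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// A directed graph stored as an edge list over vertices 0..vertex_count-1.
struct Graph {
    int vertex_count;
    std::vector<std::pair<int, int>> edges;
};

// Builds one random graph with between 1 and max_vertices vertices.
// The edge density (two outgoing edges per vertex) is an arbitrary choice.
Graph random_graph(std::mt19937& rng, int max_vertices) {
    assert(max_vertices >= 1);
    std::uniform_int_distribution<int> pick_size(1, max_vertices);
    Graph g{pick_size(rng), {}};
    std::uniform_int_distribution<int> pick_vertex(0, g.vertex_count - 1);
    for (int from = 0; from < g.vertex_count; ++from) {
        for (int k = 0; k < 2; ++k) {
            g.edges.emplace_back(from, pick_vertex(rng));
        }
    }
    return g;
}

// Builds a population of graphs; within one build of the program,
// the same seed yields the same graphs, so runs can be compared.
std::vector<Graph> random_population(unsigned seed, int count, int max_vertices) {
    assert(count >= 0);
    std::mt19937 rng(seed);
    std::vector<Graph> population;
    population.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        population.push_back(random_graph(rng, max_vertices));
    }
    return population;
}
```

<p>Fixing the seed matters for energy measurements: every implementation under test should process exactly the same graphs.</p>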
<p><img src="http://www.melvenn.com/wp-content/uploads/2013/04/G4.jpg" width="700px/" /></p>
<p><img src="http://www.melvenn.com/wp-content/uploads/2013/04/G5.jpg" width="700px/" /></p>
<p>The ranking is overturned.</p>
<p>For graphs of up to 10,000 vertices, it becomes: R.Hundt 2, A.Hay, R.Cox, R.Hundt 1</p>
<p>This time, <strong><span style="color:#006400;">the energy consumption varies in a ratio of 1 to 4.1</span></strong>.</p>
<p>For graphs of up to 50,000 vertices: R.Hundt 2, R.Hundt 1, A.Hay, R.Cox</p>
<p>Between the most frugal and the greediest implementations, <span style="color:#006400;"><strong>the power consumption varies in a ratio of 1 to 2.3</strong></span>.</p>
<p>What gave the Cox and Hay versions a competitive advantage in the previous test is now a burden: for a large number of values, searching through an array is much slower than looking up a hash table. With these new inputs, the cost of inserting into and deleting from hash tables is fully offset by the faster search primitive (R.Hundt 1 &amp; 2).</p>
<p>It may be noted as well that the optimizations made to the R.Hundt version have no effect on graphs of up to 50,000 vertices: for this sample, R.Hundt 1 and 2 are equivalent to the original version.</p>
<p>To reduce the power consumption of loop detection in graphs (a niche market if ever there was one), a programmer should favor either the Hundt solution or the Hay and Cox solutions, depending on whether the input graphs resemble our random series.</p>
<p><strong>The performance of an application is closely linked to its use, and in particular to the volume and values of the input data. Optimizing an application requires knowing the conditions under which it runs in production.</strong></p>
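<p>The crossover described here can be observed with a small timing harness. The sketch below is only an illustration under our own assumptions (integer keys, one lookup per stored value, hypothetical names <code>LookupTiming</code> and <code>time_lookups</code>); it is not the code of any measured version, and the timings it reports depend entirely on the machine and compiler options.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <numeric>
#include <unordered_set>
#include <vector>

struct LookupTiming {
    long long hits;         // how many queries were found
    double array_seconds;   // time spent in linear scans
    double hashed_seconds;  // time spent in hash lookups
};

// Stores the values 0..n-1 in both containers, then looks each one up once.
LookupTiming time_lookups(int n) {
    assert(n > 0);
    std::vector<int> values(n);
    std::iota(values.begin(), values.end(), 0);
    std::unordered_set<int> hashed(values.begin(), values.end());

    using steady = std::chrono::steady_clock;
    long long array_hits = 0;
    long long hashed_hits = 0;

    const auto t0 = steady::now();
    for (int q = 0; q < n; ++q) {
        if (std::find(values.begin(), values.end(), q) != values.end()) ++array_hits;
    }
    const auto t1 = steady::now();
    for (int q = 0; q < n; ++q) {
        if (hashed.count(q) != 0) ++hashed_hits;
    }
    const auto t2 = steady::now();

    assert(array_hits == hashed_hits);  // both containers must agree
    return {array_hits,
            std::chrono::duration<double>(t1 - t0).count(),
            std::chrono::duration<double>(t2 - t1).count()};
}
```

<p>Calling <code>time_lookups</code> with a small and then a large <code>n</code> shows how the array scan's cost grows quadratically with the number of stored values while the hash lookups grow roughly linearly.</p>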
<p>&nbsp;</p>
<h2 class="western">
What about changing the language?</h2>
<p>Due to lack of time, we did not measure the power consumption of the Java, Scala and Go versions. Possibly a mission for Green Code Lab members?</p>
<p>We took all the C++ versions presented earlier and translated them into Cawen, the language that we are currently developing.</p>
<p>We tested the C++ and Cawen versions on 5 machines (5 gcc versions / 4 operating systems).</p>
<p><a href="http://www.melvenn.com/fr/google-benchmark/">Results as well as the <em>C++</em>, <em>Cawen</em> and <em>C</em> sources are available here</a>.</p>
<p>Overall, memory consumption, execution time, executable size and source size are all significantly better with Cawen than with C++, and the gains keep growing with the more recent versions developed after this early-stage snapshot.</p>
<p>Precompilation time, which is very high for the moment, is currently being reduced to an acceptable level.</p>
<p>We subsequently developed optimized versions of the original Cawen code. Performance gains were obtained by various methods adapted to the specific profile of each program.</p>
<p>One difference from the tests presented on our site should be noted: for the power consumption measurements we decided, to a very limited extent, to fine-tune the Cawen compiler parameters.</p>
<h2 class="western">
Compiler parameterization</h2>
<p>Once the code is optimized, it is possible to play with compiler options for even shorter execution times, and, to some extent, lower energy consumption.</p>
<ul>
<li>
<b>General optimization parameters</b></li>
</ul>
<p>Robert Hundt compiles his version with the gcc optimization level -O2 rather than -O3.</p>
<p>This is actually a good choice on his test machine and, for example, on our FreeBSD server (12.4 s against 13.4 s). But it is detrimental on our other machine, cygwin... We chose to use -O3 for all the tests presented here.</p>
<ul>
<li>
<b>Choice between gcc and g++</b></li>
</ul>
<p>A peculiarity of the code generated by the Cawen precompiler is that it is compatible with both C99 and C++. Switching from one compiler to the other can pay off: for the A.Hay version with 10,000 random graphs, we spend 4.9 s with g++ and only 2.8 s with gcc.</p>
<ul>
<li>
<b>Unrolling loops</b></li>
</ul>
<p style="margin-bottom: 0cm">In our first series of tests we noted that the A.Hay and R.Cox algorithms spent most of their time in a loop searching through the values of an array. We implemented a software response (templating the search function) that unrolled the loop and significantly improved the performance of the two executables...</p>
<p style="margin-bottom: 0cm">...before realizing that gcc offers the -funroll-loops parameter, which performs exactly the same work at compile time, transparently to the developer. For instance, for the R.Cox / random 10000 test, we get 13.2 s without -funroll-loops and 5 s with it.</p>
<p style="margin-bottom: 0cm">Conversely, our optimized version of the code (see the function govel_contains2 in the file govel_typed.h) is penalized by this parameter: the two optimizations (the -funroll-loops parameter and the hand-written unrolling) do not mix well...</p>
<p style="margin-bottom: 0cm">It is up to you to spot this anomaly in the following charts!</p>
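<p style="margin-bottom: 0cm">For readers unfamiliar with the technique, here is a minimal C++ sketch of a search unrolled through templates. It is a hypothetical analogue written for this illustration, not the Cawen function govel_contains2; the names <code>UnrolledScan</code> and <code>contains_unrolled</code> and the block size of 8 are our assumptions.</p>

```cpp
#include <cassert>
#include <cstddef>

// Compile-time unrolled scan: the recursion on N is expanded by the
// compiler into N comparisons with no loop counter or backward branch.
template <std::size_t N>
struct UnrolledScan {
    static bool contains(const int* values, int wanted) {
        return values[0] == wanted ||
               UnrolledScan<N - 1>::contains(values + 1, wanted);
    }
};

// Base case: nothing left to compare.
template <>
struct UnrolledScan<0> {
    static bool contains(const int*, int) { return false; }
};

// Scans in unrolled blocks of 8, then finishes the tail one by one.
bool contains_unrolled(const int* values, std::size_t count, int wanted) {
    assert(values != nullptr || count == 0);
    std::size_t i = 0;
    for (; i + 8 <= count; i += 8) {
        if (UnrolledScan<8>::contains(values + i, wanted)) return true;
    }
    for (; i < count; ++i) {
        if (values[i] == wanted) return true;
    }
    return false;
}
```

<p style="margin-bottom: 0cm">Because the source is already unrolled by hand, asking the compiler to unroll again may expand the code further without benefit, which is one plausible reading of the penalty observed above.</p>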
<p style="margin-bottom: 0cm">&nbsp;</p>
<ul>
<li style="margin-bottom: 0cm;">
<b>Machine-specific settings</b></li>
</ul>
<p>Most often, the gains obtained from compiler tuning are specific to the platform (OS / processor) and not portable. We have not explored all the possibilities that gcc offers; they alone could provide the matter for more than one article...</p>
<p>But there is clearly still much room for acceleration. For instance, we have not made use of SSE primitives, nor exploited the capabilities of multi-core architectures. Would these speed-ups translate into energy savings?</p>
<p>This remains to be seen.</p>
<p>&nbsp;</p>
<h2>
What is the power consumption?</h2>
<p style="margin-bottom: 0cm;">Here are the data measured for each version coded in C++, in Cawen (a literal translation of the C++ version) and in optimized Cawen (an optimized version of the first Cawen version).<img src="http://www.melvenn.com/wp-content/uploads/2013/04/RH1.jpg" width="700px/" /></p>
<p><img src="http://www.melvenn.com/wp-content/uploads/2013/04/RH2.jpg" width="700px/" /></p>
<p><img src="http://www.melvenn.com/wp-content/uploads/2013/04/AH.jpg" width="700px/" /></p>
<p><img src="http://www.melvenn.com/wp-content/uploads/2013/04/RC.jpg" width="700px/" /></p>
<p>These results can be analyzed in two ways. First, one can consider four independent programs, each coded in two different languages; for each of the three proposed uses (bench, random 10000, random 50000), we measured the following energy savings:</p>
<p style="margin-bottom: 0cm;"><img src="http://www.melvenn.com/wp-content/uploads/2013/04/G2.jpg" width="700px/" /></p>
<p>The other way is to consider that the objective was to determine the least energy-consuming executable for each type of use.</p>
<p style="margin-bottom: 0cm;"><img src="http://www.melvenn.com/wp-content/uploads/2013/04/G1.jpg" width="700px/" /></p>
<p>In this context, in &lsquo;bench&rsquo; mode, the best Cawen program (Cawen RCOX) consumes <span style="color:#006400;"><strong>15% less than the best C++ program</strong></span> (cpp RCOX), and<span style="color:#006400;"><strong> the best optimized Cawen program 30% less</strong></span>.</p>
<p>In &lsquo;graphs up to 10000 vertices&rsquo; mode, the best Cawen program consumes <span style="color:#006400;"><strong>11 times less energy than the best C++ program</strong></span>. The best optimized Cawen program consumes <span style="color:#006400;"><strong>17.7 times less energy than the best C++ program</strong></span>.</p>
<p>In &lsquo;graphs up to 50,000 vertices&rsquo; mode, the best Cawen program consumes <span style="color:#006400;"><strong>73 times less energy than the best C++ program</strong></span>. The best optimized Cawen program consumes <strong><span style="color:#006400;">243 times less than the best C++ program</span></strong>.</p>
<p style="margin-bottom: 0cm;">&nbsp;</p>
<h2 style="margin-bottom: 0cm;">
Conclusion</h2>
<p style="margin-bottom: 0cm">The relationship between program performance and power consumption can be complex. In this experimental setting, performance improvement and energy saving go hand in hand. Criteria for a successful optimization are:</p>
<ul>
<li>
<p>good knowledge, and realistic simulation, of the software's operating conditions</p>
</li>
<li>
<p>mastering the intricacies of standard libraries (why do some tables require resizing while others should be left free to grow and shrink on demand?)</p>
</li>
<li>
<p>a complete understanding of their underlying algorithms (e.g. the choice between hash table and array)</p>
</li>
<li>
<p>a thorough analysis of the program's general pattern of memory usage (why do allocators benefit some programs and penalize others?)</p>
</li>
<li>
<p>taking advantage of the compiler options.</p>
</li>
</ul>
<p>To answer the question raised at the beginning of our article: the choice of a language, where it is not imposed by the functional context, the profile of the developers, or the hype, does indeed have a crucial role to play in software eco-design.</p>
<p>This confirms what Facebook has experienced on a large scale. The success story of the &lsquo;HipHop for PHP&rsquo; project has become a classic argument in favor of IT eco-design: in 2010, the company cut its electricity bill by half by switching from PHP to C++.</p>
<p>&nbsp;</p>
<p>What if they had used Cawen instead?</p>
<p>&nbsp;</p>
<p>Thomas Samain &amp; Gwena&euml;l Chailleu, 12/12/2012</p>
