Useless Microoptimizations Homepage Forum
Don't get confused, this is just my homepage, not really a message board. I implemented it as a forum for reasons you can find here.

crabench - results that concentrate on memory timings and memory bandwidth

 
Useless Microoptimizations
Site Admin

Joined: 09 Feb 2005
Posts: 114
Location: Boston, MA, USA

Posted: Tue Oct 11, 2005 7:43 pm

Normal results for my benchmarks and general explanations are accessible here:
http://cracauer-forum.cons.org/forum/crabench.html

On this page I added a specialized chart that uses just one CPU with different memory timings and memory frequencies. The chart is cut off closely around the reference point to show differences in detail. If you need the higher values, look at the normal crabench page; all these settings are also in the full charts.

The CPU chosen is my X2 3800+ clocked at 2.4 GHz. I also left the 2.4 GHz dual Opteron in.

Please see http://www.cons.org/cracauer/crabench/memory3.user_narrow.html for a standalone page of this chart (black-n-white version for printing also available).
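
The configurations in the summary below are labelled with a memory clock and a timing set such as 3-4-4-8, counted in memory clock cycles. To compare timings across different clocks it can help to convert them to nanoseconds. The following is a minimal sketch of that arithmetic; the function name is made up for illustration, and the usual CL-tRCD-tRP-tRAS order of the four numbers is an assumption, since the post does not spell it out.

```python
def timings_to_ns(clock_mhz, timings):
    """Convert memory timings from clock cycles to nanoseconds.

    clock_mhz is the real memory clock (200 for DDR400), and timings is
    a tuple of cycle counts, assumed here to be (CL, tRCD, tRP, tRAS).
    """
    cycle_ns = 1000.0 / clock_mhz  # length of one memory clock cycle
    return tuple(round(t * cycle_ns, 2) for t in timings)


# Clock and timing combinations taken from the summary headings.
CONFIGS = [
    (200, (1.5, 2, 2, 5)),
    (200, (3, 3, 3, 8)),
    (200, (3, 4, 4, 8)),
    (240, (3, 3, 3, 8)),
    (300, (3, 4, 4, 8)),
]

for clock, timings in CONFIGS:
    print(f"{clock} MHz {timings}: {timings_to_ns(clock, timings)} ns")
```

Seen this way, 3-4-4-8 at 300 MHz has a shorter absolute CAS latency (10 ns) than the same cycle counts at 200 MHz (15 ns), so higher clocks with looser timings are not necessarily slower in absolute terms.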




Summary of results
------------------

User CPU speed relative to A64 X2 2.4 GHz (3800+ Toledo), 2x 200 MHz 1T 3-4-4-8 TCCD Geil One 512 MB, NF4 SLI-DRI [2 CPUs]

A64 X2 2.4 GHz (3800+ Toledo), 2x 200 MHz 1T 1.5-2-2-5 TCCD Geil One 512 MB, NF4 SLI-DRI [2 CPUs]:
            Normal C compilation:  101.8 (  99.6)
                 C++ compilation:  103.4 ( 101.2)
               Linux compilation:  103.1 ( 100.9)
                            gzip:  100.3 (  98.2)
                            Lisp:  102.1 ( 100.0)
                          python:  100.1 (  98.0)
                          php/bs:  102.3 ( 100.2)
              High quality mpeg4:  100.9 (  98.7)
          Constant bitrate mjpeg:  103.6 ( 101.4)
                      Fast mjpeg:  104.0 ( 101.8)
                         Overall:  102.2

A64 X2 2.4 GHz (3800+ Toledo), 2x 200 MHz 1T 3-3-3-8 TCCD Geil One 512 MB, NF4 SLI-DRI [2 CPUs]:
            Normal C compilation:  100.7 (  99.8)
                 C++ compilation:  101.6 ( 100.7)
               Linux compilation:  101.0 ( 100.1)
                            gzip:  100.3 (  99.4)
                            Lisp:  101.1 ( 100.2)
                          python:  100.2 (  99.3)
                          php/bs:  101.0 ( 100.1)
              High quality mpeg4:  100.4 (  99.5)
          Constant bitrate mjpeg:  101.2 ( 100.3)
                      Fast mjpeg:  101.5 ( 100.6)
                         Overall:  100.9

A64 X2 2.4 GHz (3800+ Toledo), 2x 200 MHz 1T 3-4-4-8 TCCD Geil One 512 MB, NF4 SLI-DRI [2 CPUs]:
            Normal C compilation:  100.0 ( 100.0)
                 C++ compilation:  100.0 ( 100.0)
               Linux compilation:  100.0 ( 100.0)
                            gzip:  100.0 ( 100.0)
                            Lisp:  100.0 ( 100.0)
                          python:  100.0 ( 100.0)
                          php/bs:  100.0 ( 100.0)
              High quality mpeg4:  100.0 ( 100.0)
          Constant bitrate mjpeg:  100.0 ( 100.0)
                      Fast mjpeg:  100.0 ( 100.0)
                         Overall:  100.0

A64 X2 2.4 GHz (3800+ Toledo), 2x 240 MHz 1T 3-3-3-8 TCCD Geil One 512 MB, NF4 SLI-DRI [2 CPUs]:
            Normal C compilation:  101.6 (  99.6)
                 C++ compilation:  103.1 ( 101.1)
               Linux compilation:  102.8 ( 100.7)
                            gzip:  100.4 (  98.4)
                            Lisp:  102.1 ( 100.1)
                          python:  100.1 (  98.1)
                          php/bs:  102.0 (  99.9)
              High quality mpeg4:  100.9 (  98.8)
          Constant bitrate mjpeg:  103.6 ( 101.5)
                      Fast mjpeg:  104.0 ( 101.9)
                         Overall:  102.1

A64 X2 2.4 GHz (3800+ Toledo), 2x 300 MHz 1T 3-4-4-8 TCCD Geil One 512 MB, NF4 SLI-DRI [2 CPUs]:
            Normal C compilation:  102.1 (  99.6)
                 C++ compilation:  103.8 ( 101.4)
               Linux compilation:  103.8 ( 101.4)
                            gzip:  100.2 (  97.8)
                            Lisp:  103.0 ( 100.6)
                          python:  100.1 (  97.7)
                          php/bs:   99.3 (  97.0)
              High quality mpeg4:  101.2 (  98.8)
          Constant bitrate mjpeg:  105.1 ( 102.6)
                      Fast mjpeg:  105.8 ( 103.3)
                         Overall:  102.4
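
A note on reading these tables, offered as an observation from the numbers rather than a documented formula: the "Overall" line is consistent with a plain arithmetic mean of the ten scores, and the values in parentheses are roughly each score divided by that configuration's overall, times 100. The sketch below checks this for the first and last configuration; the small gaps that remain are presumably because the published figures are rounded. The dictionary and function names are made up for illustration.

```python
# (score relative to reference, value printed in parentheses), copied
# from the first and the last configuration in the summary above.
RESULTS = {
    "200 MHz 1.5-2-2-5": [
        (101.8, 99.6), (103.4, 101.2), (103.1, 100.9), (100.3, 98.2),
        (102.1, 100.0), (100.1, 98.0), (102.3, 100.2), (100.9, 98.7),
        (103.6, 101.4), (104.0, 101.8),
    ],
    "300 MHz 3-4-4-8": [
        (102.1, 99.6), (103.8, 101.4), (103.8, 101.4), (100.2, 97.8),
        (103.0, 100.6), (100.1, 97.7), (99.3, 97.0), (101.2, 98.8),
        (105.1, 102.6), (105.8, 103.3),
    ],
}
LISTED_OVERALL = {"200 MHz 1.5-2-2-5": 102.2, "300 MHz 3-4-4-8": 102.4}


def overall(pairs):
    """Arithmetic mean of the scores (assumed definition of 'Overall')."""
    return sum(score for score, _ in pairs) / len(pairs)


def largest_gap(pairs):
    """Largest difference between score/overall*100 and the listed value."""
    mean = overall(pairs)
    return max(abs(100.0 * score / mean - listed) for score, listed in pairs)


for name, pairs in RESULTS.items():
    print(f"{name}: mean {overall(pairs):.2f} "
          f"(listed {LISTED_OVERALL[name]}), "
          f"largest gap in parenthesized values {largest_gap(pairs):.2f}")
```

If this reading is right, a parenthesized value above 100 marks a benchmark that gains more than average from the memory setting, and one below 100 marks a benchmark that gains less.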



Last edited by Useless Microoptimizations on Tue Mar 07, 2006 6:30 pm; edited 4 times in total

Useless Microoptimizations
Posted: Tue Oct 11, 2005 7:47 pm

This is the chart with only a "few" results.

A chart with more timings is here:
http://www.cons.org/cracauer/crabench/memory.user_narrow.html



(Or visit http://www.cons.org/cracauer/crabench/memoryshort.user_narrow.html to see this chart with text results at the bottom)




Last edited by Useless Microoptimizations on Tue Oct 25, 2005 8:21 am; edited 3 times in total

Useless Microoptimizations
Posted: Sat Oct 15, 2005 5:45 pm

Benchmarks that compare 1T versus 2T command rate

Athlon 64 X2 at 2.4 GHz:
http://www.cons.org/cracauer/crabench/commandrate.user_verynarrow.html

Athlon 64 E6 Venice at 2.55 GHz:
http://www.cons.org/cracauer/crabench/commandrate_venice.user_verynarrow.html


Last edited by Useless Microoptimizations on Tue Oct 25, 2005 5:18 pm; edited 1 time in total

Useless Microoptimizations
Posted: Sat Oct 15, 2005 6:11 pm

Benchmarks that compare cache size

Two comparisons:
  • Single-CPU: Opteron 939 at 2.55 GHz versus Venice at the same speed with the same RAM, timings and mainboard.
  • Dual-core/CPU: Dual socket 939 Opteron at 2.0 GHz versus a socket 939 X2 at the same speed, simulating the Opteron's RAM timings.


Main results here:
http://www.cons.org/cracauer/crabench/cache.user.html

Except for the Linux kernel compile and some video encoding, the bigger cache is mostly useless. And this is with very slow RAM. If both were on DDR400, the bigger cache would have even less effect, because a cache miss would be cheaper.
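
The point about cheaper cache misses can be illustrated with the textbook average memory access time formula: hit time plus miss rate times miss penalty. This is only a sketch of the reasoning. Every number in it (hit time, miss rates, miss penalties) is a made-up placeholder and not something measured by crabench.

```python
def amat_ns(hit_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: hit time + miss rate * miss penalty."""
    return hit_ns + miss_rate * miss_penalty_ns


# Hypothetical placeholder values, chosen only to show the trend.
HIT_NS = 5.0          # cost of a cache hit
MISS_RATE_512K = 0.04  # miss rate with the smaller cache
MISS_RATE_1M = 0.03    # miss rate with the bigger cache


def saving_ns(miss_penalty_ns):
    """Time saved per access by the bigger cache at a given miss penalty."""
    small = amat_ns(HIT_NS, MISS_RATE_512K, miss_penalty_ns)
    big = amat_ns(HIT_NS, MISS_RATE_1M, miss_penalty_ns)
    return small - big


for label, penalty in (("slow RAM", 100.0), ("faster RAM", 60.0)):
    print(f"{label}: bigger cache saves {saving_ns(penalty):.2f} ns per access")
```

The saving is the difference in miss rates multiplied by the miss penalty, so it shrinks in direct proportion as the RAM gets faster.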

Even if you look at the parallel benchmarks in the wall clock charts, you'll see that, unless you do lots of plain HTTP transfers, it is still pretty useless:
http://www.cons.org/cracauer/crabench/cache.wall.html


BTW, the reason the Linux kernel benefits from the 1024 KB cache is that its include file structure is so insane that compiling even small C files blows the 512 KB CPU cache once the include files are expanded. The FreeBSD kernel, with much more compact and less nested include files, behaves like all the other programs.
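
For anyone who wants to get a rough feel for how much text a single C file drags in through its includes, here is a minimal sketch. It is not part of crabench and not how the numbers above were obtained. It follows `#include` lines recursively and adds up file sizes, counting each header once, and it ignores conditional compilation and macro expansion, so it is only a crude stand-in for what the preprocessor really produces. The function name and the search order are assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches both #include "file.h" and #include <file.h>.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def expanded_size(source, include_dirs, seen=None):
    """Bytes in source plus every header reachable from it, each counted once.

    Headers are searched first next to the including file, then in
    include_dirs. Headers that cannot be found are silently skipped.
    """
    seen = set() if seen is None else seen
    source = Path(source).resolve()
    if source in seen or not source.is_file():
        return 0
    seen.add(source)
    data = source.read_bytes()
    total = len(data)
    for name in INCLUDE_RE.findall(data.decode(errors="replace")):
        for directory in [source.parent, *map(Path, include_dirs)]:
            candidate = directory / name
            if candidate.is_file():
                total += expanded_size(candidate, include_dirs, seen)
                break
    return total
```

Calling `expanded_size("some_file.c", ["include"])` inside a kernel source tree (both paths are placeholders) gives a byte count that can be compared loosely against the 512 KB and 1024 KB cache sizes. The compiler's working set is of course more than just the raw text, so treat the result as a lower bound on the amount of input.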


Last edited by Useless Microoptimizations on Tue Oct 25, 2005 5:21 pm; edited 3 times in total

Useless Microoptimizations
Posted: Mon Oct 17, 2005 10:50 pm

Chart that focuses on RAM bandwidth

My X2 at 2.5 GHz, RAM at 208, 250 and 312 MHz, trying to keep good timings at each.

Chart with CPU time used:
http://www.cons.org/cracauer/crabench/memory2.user_narrow.html

Same chart, but wall clock time so that you can see whether it speeds up parallel benchmarks and throughput:
http://www.cons.org/cracauer/crabench/memory2.wall.html
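
A note on where clocks like 208, 250 and 312 MHz come from: on the Athlon 64 the memory clock is derived from the CPU clock through an integer divider, so at 2.5 GHz these three values correspond to dividers of 12, 10 and 8. The same pattern gives 200, 240 and 300 MHz at 2.4 GHz. The sketch below computes the clocks and the theoretical peak bandwidth. The dividers are inferred from the clocks rather than stated in the post, and dual-channel operation with a 64-bit bus per channel is an assumption.

```python
def memory_clock_mhz(cpu_mhz, divider):
    """Memory clock obtained by dividing the CPU clock by an integer."""
    return cpu_mhz / divider


def peak_bandwidth_gb_s(clock_mhz, channels=2, bus_bytes=8):
    """Theoretical peak DDR bandwidth in GB/s (10**9 bytes per second).

    DDR transfers data twice per clock; channels=2 and bus_bytes=8
    (64 bits per channel) are assumed defaults.
    """
    return clock_mhz * 1e6 * 2 * bus_bytes * channels / 1e9


CPU_MHZ = 2500
for divider in (12, 10, 8):  # inferred dividers, not stated in the post
    clock = memory_clock_mhz(CPU_MHZ, divider)
    print(f"CPU/{divider}: {clock:.1f} MHz, "
          f"peak {peak_bandwidth_gb_s(clock):.2f} GB/s")
```

These are upper bounds on paper only. The charts linked above show how much of the extra bandwidth actually turns into benchmark speed.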

Useless Microoptimizations
Posted: Tue Oct 25, 2005 5:19 pm

I updated the charts linked to above. General cleanup.

Results for my Opteron 939 are coming in; see the cache page.

Useless Microoptimizations
Posted: Mon Nov 21, 2005 3:57 pm

Opteron 939 around 2.9 GHz.

This chart also contains memory with Winbond BH-5 chips (Geil One BH) at 2-2-2-5 above 250 MHz.

User CPU time:
http://www.cons.org/cracauer/crabench/opteron939.user_narrow.html

Wall clock time (with concurrent tests):
http://www.cons.org/cracauer/crabench/opteron939.wall.html

Useless Microoptimizations
Posted: Tue Mar 07, 2006 5:39 pm

I reorganized the charts a little.

The first chart now shows fewer configurations.

Useless Microoptimizations
Posted: Fri Mar 30, 2007 12:54 pm

Numbers comparing a Core2Duo (E6600) at 3.6 GHz, using 800 and 600 MHz RAM, and also comparing the effect of the 1333 versus 1066 MHz FSB strap:

CPU time:
http://www.cons.org/cracauer/crabench/ddr2.user.html

Wall clock time including multithreading:
http://www.cons.org/cracauer/crabench/ddr2.wall.html

All times are GMT - 5 Hours