General OpenPOWER Hardware > General CPU Discussion

Asymmetric CPUs with the Same Core Count

(1/3) > >>

amock:
I recently saw https://github.com/rigtorp/c2clat and ran it on my dual 18-core machine and got an interesting result.  I've attached an image of it, and the part that seems strange is the bottom left quadrant next to the origin where the pattern doesn't match what's in the top right quadrant.  It reminded me that I had earlier looked at my CPU caches and one seemed to have more than the other.


--- Code: --- Package L#0
      L3 L#0 (10MB) + L2 L#0 (512KB)
        L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
          PU L#0 (P#0)
          PU L#1 (P#1)
          PU L#2 (P#2)
          PU L#3 (P#3)
        L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
          PU L#4 (P#4)
          PU L#5 (P#5)
          PU L#6 (P#6)
          PU L#7 (P#7)
      L3 L#1 (10MB) + L2 L#1 (512KB)
        L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
          PU L#8 (P#8)
          PU L#9 (P#9)
          PU L#10 (P#10)
          PU L#11 (P#11)
        L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
          PU L#12 (P#12)
          PU L#13 (P#13)
          PU L#14 (P#14)
          PU L#15 (P#15)
      L3 L#2 (10MB) + L2 L#2 (512KB)
        L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
          PU L#16 (P#16)
          PU L#17 (P#17)
          PU L#18 (P#18)
          PU L#19 (P#19)
        L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
          PU L#20 (P#20)
          PU L#21 (P#21)
          PU L#22 (P#22)
          PU L#23 (P#23)
      L3 L#3 (10MB) + L2 L#3 (512KB)
        L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
          PU L#24 (P#24)
          PU L#25 (P#25)
          PU L#26 (P#26)
          PU L#27 (P#27)
        L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
          PU L#28 (P#28)
          PU L#29 (P#29)
          PU L#30 (P#30)
          PU L#31 (P#31)
      L3 L#4 (10MB) + L2 L#4 (512KB)
        L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8
          PU L#32 (P#32)
          PU L#33 (P#33)
          PU L#34 (P#34)
          PU L#35 (P#35)
        L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9
          PU L#36 (P#36)
          PU L#37 (P#37)
          PU L#38 (P#38)
          PU L#39 (P#39)
      L3 L#5 (10MB) + L2 L#5 (512KB)
        L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10
          PU L#40 (P#40)
          PU L#41 (P#41)
          PU L#42 (P#42)
          PU L#43 (P#43)
        L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11
          PU L#44 (P#44)
          PU L#45 (P#45)
          PU L#46 (P#46)
          PU L#47 (P#47)
      L3 L#6 (10MB) + L2 L#6 (512KB)
        L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12
          PU L#48 (P#48)
          PU L#49 (P#49)
          PU L#50 (P#50)
          PU L#51 (P#51)
        L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13
          PU L#52 (P#52)
          PU L#53 (P#53)
          PU L#54 (P#54)
          PU L#55 (P#55)
      L3 L#7 (10MB) + L2 L#7 (512KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14
        PU L#56 (P#56)
        PU L#57 (P#57)
        PU L#58 (P#58)
        PU L#59 (P#59)
      L3 L#8 (10MB) + L2 L#8 (512KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15
        PU L#60 (P#60)
        PU L#61 (P#61)
        PU L#62 (P#62)
        PU L#63 (P#63)
      L3 L#9 (10MB) + L2 L#9 (512KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16
        PU L#64 (P#64)
        PU L#65 (P#65)
        PU L#66 (P#66)
        PU L#67 (P#67)
      L3 L#10 (10MB) + L2 L#10 (512KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17
        PU L#68 (P#68)
        PU L#69 (P#69)
        PU L#70 (P#70)
        PU L#71 (P#71)

--- End code ---
compared to

--- Code: ---Package L#1
      L3 L#11 (10MB) + L2 L#11 (512KB)
        L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18
          PU L#72 (P#72)
          PU L#73 (P#73)
          PU L#74 (P#74)
          PU L#75 (P#75)
        L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19
          PU L#76 (P#76)
          PU L#77 (P#77)
          PU L#78 (P#78)
          PU L#79 (P#79)
      L3 L#12 (10MB) + L2 L#12 (512KB)
        L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20
          PU L#80 (P#80)
          PU L#81 (P#81)
          PU L#82 (P#82)
          PU L#83 (P#83)
        L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21
          PU L#84 (P#84)
          PU L#85 (P#85)
          PU L#86 (P#86)
          PU L#87 (P#87)
      L3 L#13 (10MB) + L2 L#13 (512KB)
        L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22
          PU L#88 (P#88)
          PU L#89 (P#89)
          PU L#90 (P#90)
          PU L#91 (P#91)
        L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23
          PU L#92 (P#92)
          PU L#93 (P#93)
          PU L#94 (P#94)
          PU L#95 (P#95)
      L3 L#14 (10MB) + L2 L#14 (512KB)
        L1d L#24 (32KB) + L1i L#24 (32KB) + Core L#24
          PU L#96 (P#96)
          PU L#97 (P#97)
          PU L#98 (P#98)
          PU L#99 (P#99)
        L1d L#25 (32KB) + L1i L#25 (32KB) + Core L#25
          PU L#100 (P#100)
          PU L#101 (P#101)
          PU L#102 (P#102)
          PU L#103 (P#103)
      L3 L#15 (10MB) + L2 L#15 (512KB)
        L1d L#26 (32KB) + L1i L#26 (32KB) + Core L#26
          PU L#104 (P#104)
          PU L#105 (P#105)
          PU L#106 (P#106)
          PU L#107 (P#107)
        L1d L#27 (32KB) + L1i L#27 (32KB) + Core L#27
          PU L#108 (P#108)
          PU L#109 (P#109)
          PU L#110 (P#110)
          PU L#111 (P#111)
      L3 L#16 (10MB) + L2 L#16 (512KB)
        L1d L#28 (32KB) + L1i L#28 (32KB) + Core L#28
          PU L#112 (P#112)
          PU L#113 (P#113)
          PU L#114 (P#114)
          PU L#115 (P#115)
        L1d L#29 (32KB) + L1i L#29 (32KB) + Core L#29
          PU L#116 (P#116)
          PU L#117 (P#117)
          PU L#118 (P#118)
          PU L#119 (P#119)
      L3 L#17 (10MB) + L2 L#17 (512KB)
        L1d L#30 (32KB) + L1i L#30 (32KB) + Core L#30
          PU L#120 (P#120)
          PU L#121 (P#121)
          PU L#122 (P#122)
          PU L#123 (P#123)
        L1d L#31 (32KB) + L1i L#31 (32KB) + Core L#31
          PU L#124 (P#124)
          PU L#125 (P#125)
          PU L#126 (P#126)
          PU L#127 (P#127)
      L3 L#18 (10MB) + L2 L#18 (512KB)
        L1d L#32 (32KB) + L1i L#32 (32KB) + Core L#32
          PU L#128 (P#128)
          PU L#129 (P#129)
          PU L#130 (P#130)
          PU L#131 (P#131)
        L1d L#33 (32KB) + L1i L#33 (32KB) + Core L#33
          PU L#132 (P#132)
          PU L#133 (P#133)
          PU L#134 (P#134)
          PU L#135 (P#135)
      L3 L#19 (10MB) + L2 L#19 (512KB)
        L1d L#34 (32KB) + L1i L#34 (32KB) + Core L#34
          PU L#136 (P#136)
          PU L#137 (P#137)
          PU L#138 (P#138)
          PU L#139 (P#139)
        L1d L#35 (32KB) + L1i L#35 (32KB) + Core L#35
          PU L#140 (P#140)
          PU L#141 (P#141)
          PU L#142 (P#142)
          PU L#143 (P#143)

--- End code ---

Does anyone else have asymmetric CPUs with the same core count?  For people with 18-core CPUs, what cache configuration do you have?

ClassicHasClass:
I don't have a system that big, but I don't remember seeing anything like that with my dual-8. Are you sure Hostboot didn't guard anything out? (At a BMC root prompt, try `pflash -P GUARD -c` to ensure there are no guard entries in the PNOR.)

amock:
I'm pretty sure nothing was guarded out.  I just cleared it now and I've cleared out the guards before and it has always been like this.  The 8 core and smaller all have 10MB of cache per core, where the 18 core an larger have 10MB of cache per core pair.  So it seems like one of my CPUs has cores with unpaired cache (with 11 10MB caches like the 22 core CPU would) and one is just the paired caches (so 9 10MB caches) as normal.

I think this also contributes to an issue I had when I was overclocking.  I used the overclock from the Raptor repos and sometimes it would just suddenly die when under very heavy load.  My guess is that the CPU with 20MB of extra cache drew more power than it should have and tripped something.

ejfluhr:
Bonus cache!    Presumably 1 core in each of the pair that share the L2/L3 was bad in some way, and that die did not have enough paired-cache cores to make a matched set of 18, so used some of the remaining unpaired cores to make up the difference.  Given the difficulty of yielding big die on a modern technology node, reducing core count is common industry practice.  It's a little unusual here that 2 core share the cache so when one gets knocked out, the other retains access to the full amount of cache.  If the cache had been reduced any, then that core might perform worse versus the same workload on only 1 of the 2 cores in a paired-cache config, so it gets to keep the full local cache without having to share any...yay!

Raptor identifies this config specifically as "unpaired" in their 4-core and 8-core cpu description, e.g.:  https://raptorcs.com/content/CP9M31/intro.html
   4 cores per package
       3.2GHz base / 3.8GHz turbo (WoF)
       90W TDP
       All Core Turbo capable
       32KB L1 data cache + 32KB L1 instruction cache / core
       512KB unpaired L2 cache / core
       10MB unpaired L3 cache / core

Versus the larger core-count, e.g.:  https://raptorcs.com/content/CP9M36/intro.html
   18 cores per package
       2.8GHz base / 3.8GHz turbo (WoF)
       190W TDP
       32KB L1 data cache + 32KB L1 instruction cache / core
       512KB L2 cache / core pair
       10MB L3 cache / core pair


If you don't want the bonus cache, I am sure someone would trade CPUs with you.  ;-)

ejfluhr:
How do you read the purple and orange plot??

Navigation

[0] Message Index

[#] Next page

Go to full version