K6: The K6 CPU has an ambitious design specification: MMX technology, a larger on-chip first-level cache (32K instruction and 32K data), a deeper pipeline, more instructions processed in parallel, and a higher clock frequency. AMD is undoubtedly very successful in integer operations. Because the K6 has a larger L1 cache, it gains more performance than the Pentium MMX as the clock frequency rises. The K6 falls behind, however, in applications that rely on MMX or FP (floating-point) instructions. It is much slower than a Pentium MMX at the same frequency, and even than a Pentium without MMX, which makes the K6 perform far worse than Intel chips in some 3D games. In addition, AMD's MMX unit can process only one instruction at a time, while Intel's MMX unit can process two. The K6 therefore performs poorly on both MMX and floating-point instructions.
The K6 has a shorter latency than Intel's CPUs for some MMX operations, but the throughput of a single operation is the same. The shorter latency cannot make up for the K6's inability to process two MMX instructions at the same time. Although Intel's MMX CPU can process two MMX instructions at once, its MMX unit contains only one multiplication unit and one shift unit, so two of these key operations cannot be performed simultaneously. In addition, only one MMX instruction at a time can access memory and the integer registers. A similar picture holds for floating-point processing: the K6's latency is still shorter than Intel's, but it can start only one operation every two clock cycles, whereas Intel chips can start one every cycle. The end result is that for many floating-point operations, the throughput of AMD chips reaches only half that of Intel chips.
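The difference between latency and throughput described above can be sketched with a small model. This is only an illustration: the function name and the latency figures are hypothetical and are not taken from AMD or Intel documentation. Only the issue intervals (one start every two cycles versus one every cycle) come from the text.

```python
def cycles_for(n_ops: int, latency: int, issue_interval: int) -> int:
    """Cycles needed to finish n_ops independent pipelined operations.

    The first result is ready after `latency` cycles; every further
    operation can start `issue_interval` cycles after the previous one.
    """
    if n_ops <= 0:
        return 0
    return latency + (n_ops - 1) * issue_interval


# Hypothetical latencies, chosen only to illustrate the argument.
K6_LATENCY, K6_ISSUE = 2, 2        # shorter latency, one start every 2 cycles
INTEL_LATENCY, INTEL_ISSUE = 3, 1  # longer latency, one start every cycle

n = 1000
k6_cycles = cycles_for(n, K6_LATENCY, K6_ISSUE)
intel_cycles = cycles_for(n, INTEL_LATENCY, INTEL_ISSUE)
print(f"K6: {k6_cycles} cycles, Intel: {intel_cycles} cycles")
print(f"ratio: {k6_cycles / intel_cycles:.2f}")
```

For a single operation the shorter latency wins, but over a long run of independent operations the issue interval dominates and the ratio approaches two, which matches the "half the throughput" conclusion.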
The K6 series comes in five frequencies in total: 166MHz, 200MHz, 233MHz, 266MHz and 300MHz. All five models use a 66MHz external frequency, but the later 233MHz, 266MHz and 300MHz models can support a 100MHz external frequency after the motherboard BIOS is upgraded. The clock multipliers range from 2.5 to 4.5, and the core voltages are 2.9V, 3.2V and 2.2V. It is particularly worth mentioning that the first-level cache is increased to 64KB, twice that of the Pentium MMX, which is why the integer performance of the K6 is better than that of the Pentium MMX.
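The relation between external frequency, multiplier and core clock can be checked with a few lines of arithmetic. This is a sketch under two assumptions that the text does not state: the nominal "66MHz" bus is taken as exactly 200/3 MHz, and the marketed frequency is the result rounded down to a whole MHz.

```python
from fractions import Fraction
from math import floor

# Assumption: the nominal 66MHz external frequency is 200/3 MHz.
BUS_MHZ = Fraction(200, 3)

# Multipliers from 2.5 to 4.5 in steps of 0.5.
MULTIPLIERS = [Fraction(5, 2), Fraction(3), Fraction(7, 2), Fraction(4), Fraction(9, 2)]


def core_clock_mhz(bus_mhz: Fraction, multiplier: Fraction) -> int:
    """Core clock = external frequency x multiplier, rounded down to whole MHz."""
    return floor(bus_mhz * multiplier)


clocks = [core_clock_mhz(BUS_MHZ, m) for m in MULTIPLIERS]
print(clocks)
```

Exact fractions are used instead of floats so that products such as 66.67 x 3 do not land just below a whole number before rounding down.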
In mid-1998, AMD's latest K6-2 processor was officially launched. It is the first x86 microprocessor to adopt 3DNow! technology, and it is compatible with the Microsoft Windows operating system. It uses a new chip manufacturing technology (C4 flip chip) developed by IBM. The process is refined to 0.25 micron, and the die size is reduced from the original K6's 168 mm² to 68 mm². At the same time, the number of transistors increased by 500,000 (reaching 9.3 million), while other structures, such as the L1 cache, remained basically the same as in the K6. In addition, the operating voltage dropped from 2.9V/3.2V to 2.2V. With the K6-2, AMD took the lead in introducing 3DNow! floating-point/3D acceleration technology: a 64-bit dual-channel floating-point buffer and 21 new 3DNow! instructions that add SIMD (single instruction, multiple data) operation. With the introduction of the K6-2, 3DNow! immediately won the support of game developers, software vendors and graphics driver writers, and became an important industry standard.
The K6-3 processor uses a 0.25 micron process and consists of 21.3 million transistors. It adopts a three-level cache design. The core of the K6-3 contains a 64K first-level cache (Level 1) and a 256K second-level cache (Level 2), and the motherboard carries the third-level cache (Level 3). The first-level and second-level caches total 320K. Both are built into the processor die and run at the processor's clock frequency, so they operate at full speed. The K6-3's three-level design can support up to 1024K of third-level cache on the motherboard. On a Super 7 motherboard, the third-level cache runs at 100MHz. Compared with the Pentium II, which has only a 32K first-level cache and a 512K half-speed second-level cache, AMD's three-level cache structure increases the cache capacity of the system and improves overall system performance.
The K6-3 processor supports the 3DNow! instruction set. This instruction set is similar to Intel's KNI (Katmai New Instructions): both add instructions to speed up multimedia processing, such as 3D graphics and applications that need a large number of floating-point operations.
Because of cost and yield problems, the K6-3 processor has not been very successful in the desktop market, so it will gradually leave the desktop market and move into the notebook market. AMD will launch a version of the processor, the K6-3+, specifically for notebook computers. The K6-3+ uses a 0.18 micron process and has the second-level cache built into the chip. In addition, the notebook K6-3+ will have a dual-mode function (code-named Gemini by AMD) that automatically raises and lowers the clock frequency, similar to the notebook processor that Intel will launch next. When running on AC power, the K6-3+ runs at its high clock frequency; on battery power, it automatically slows down to extend battery life.
What really makes AMD proud is the Athlon processor, originally code-named K7. Athlon has a RISC core that is superscalar, superpipelined and multi-pipelined. Built on a 0.25μm process, it integrates 22 million transistors on a die area of 184 mm². A more advanced 0.18μm Athlon has now been introduced, and the next step is to adopt copper interconnect technology. AMD has never lagged behind Intel in manufacturing technology. (Figure: athlon.jpg)
Athlon includes three decoders, three integer execution units (IEU), three address generation units (AGU) and three multimedia units (floating-point arithmetic units). Athlon can execute three floating-point instructions in the same clock cycle, and each floating-point unit is a complete pipeline. The three decoders send decoded macro instructions to the instruction control unit, which can track 72 instructions at the same time. (The K7 decodes x86 instructions into macro instructions, converting x86 instructions of different lengths into macro instructions of the same length, which lets the RISC core work at full power.) The instruction control unit then sends each instruction to an integer unit or a multimedia unit. The integer scheduler can schedule 18 instructions at the same time. Each integer unit is an independent pipeline, and the scheduler supports branch prediction and out-of-order execution. The K7's multimedia unit (also called the floating-point unit) has a stack register file that can be renamed. The floating-point scheduler can schedule 36 instructions at the same time, and the floating-point register file holds 88 entries. Of the three floating-point units, one is an adder and one is a multiplier, and both can execute MMX and 3DNow! instructions. The third floating-point unit is responsible for loading and storing data. Because of the K7's powerful floating-point unit, an AMD processor surpassed Intel's in floating-point performance for the first time.
Athlon has a built-in 128KB full-speed first-level cache (L1 cache) and an external 512KB L2 cache that runs at half the core frequency. It can support up to 8MB of L2 cache. A large cache helps deliver the huge data throughput required by server systems.
The packaging and appearance of Athlon are similar to the SECC cartridge of the Pentium II, but Athlon uses the Slot A interface specification. The Slot A interface is derived from the Alpha EV6 bus. Its clock frequency is as high as 200MHz, which gives a peak bandwidth of 1.6GB/s, and on the memory side it remains compatible with the traditional 100MHz bus. PC-100 SDRAM can therefore still be used, which protects users' investment and reduces cost. In the future, higher-performance DDR SDRAM can also be used, with data throughput similar to that of the 800MHz Rambus memory promoted by Intel. The EV6 bus can scale up to 400MHz and supports multiple processors very well, which is a natural advantage: Slot 1 supports only dual processors, while Slot A can support four. Slot A looks very similar to the traditional 242-pin Slot 1, like a Slot 1 rotated by 180 degrees, but the two are completely incompatible in electrical specifications and bus protocols. A Slot 1 or Socket 370 CPU cannot be installed on a Slot A Athlon motherboard, and vice versa.
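The 1.6GB/s peak figure follows from multiplying the transfer rate by the width of the data path. The sketch below assumes a 64-bit data path for the EV6 bus and for SDRAM, and a 16-bit data path for 800MHz Rambus memory. None of these widths are stated in the text, and the function name is only illustrative.

```python
def peak_bandwidth_gb_s(transfers_mhz: float, width_bits: int) -> float:
    """Peak bandwidth in GB/s (10^9 bytes/s) = transfers per second x bytes per transfer."""
    bytes_per_transfer = width_bits / 8
    return transfers_mhz * 1e6 * bytes_per_transfer / 1e9


# Bus widths below are assumptions, not figures from the text.
ev6_200 = peak_bandwidth_gb_s(200, 64)     # Slot A / EV6 bus at 200MHz
pc100 = peak_bandwidth_gb_s(100, 64)       # PC-100 SDRAM
rambus_800 = peak_bandwidth_gb_s(800, 16)  # 800MHz Rambus memory

print(f"EV6 200MHz: {ev6_200:.1f} GB/s")
print(f"PC-100 SDRAM: {pc100:.1f} GB/s")
print(f"Rambus 800MHz: {rambus_800:.1f} GB/s")
```

Under these assumptions PC-100 SDRAM supplies half the bus's peak rate, which is why faster memory such as DDR SDRAM is mentioned as the later upgrade path.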
To further broaden software support for the 3DNow! platform and to close the gap between the original 3DNow! and SSE, AMD gave the Athlon processor Enhanced 3DNow! technology, which adds 24 new instructions. Of these, 19 are fully compatible with the video-processing and fast memory-prefetch instructions that Intel added for the existing 64-bit MMX registers in the Pentium III's SSE instruction set. Software developed for the Pentium III's SSE instruction set can therefore be ported to Athlon with only a few modifications, taking full advantage of the SIMD acceleration of the MMX registers. The other five new instructions let the CPU handle analog/digital signal conversion directly, like a DSP chip. They can be used for soft modems, ADSL network conversion and transmission, and Dolby AC-3 decoding. So far, Intel's CPUs have not provided instructions with similar functions. Clearly, AMD has once again played an innovative role in the development of a new generation of processor instruction sets.
Having said that, how does the Athlon processor actually perform? Comparing a 600MHz Athlon with a 600MHz Pentium III (the Xeon currently reaches only 550MHz), the Athlon's integer performance (CPUmark 99, Winstone 99) is about 10% faster than that of a Pentium III at the same frequency. Its floating-point performance is even more impressive. Although the FPU WinMark score in WinBench 99 is only about 8% higher, the result on the cross-platform industry benchmark SPECfp_base95 is about 38% higher. In 3D performance, the Athlon leads by 36% to 38% in tests such as 3D WinBench's 3D WinMark and 3DMark 99 Max. When running 3D Studio Max R3.0, the Athlon platform renders about 33% faster than the Pentium III. Because the difference between the Pentium III and the Pentium III Xeon lies in the capacity and speed of the L2 cache (the Xeon's L2 cache runs at full speed), an entry-level Xeon with only 512KB of L2 cache is faster than the Pentium III only in integer performance when running most software, and its floating-point performance is exactly the same. So, in the comparison of Athlon against Pentium III and Xeon, the Athlon, whose L2 cache runs at half speed, beat the Xeon, whose L2 cache runs at full speed, in every software test.
Recently AMD introduced an 800MHz Athlon. The 800MHz processor still uses the Slot A form factor, but all new Athlon processors use the K75 core. The 800MHz Athlon is built on a 0.18 micron aluminum process, and its die area is 102 mm². Compared with the old Athlon made on the 0.25 micron process, the 800MHz processor generates less heat.
According to the Athlon and Pentium III performance test data published by AMD, in Business Winstone 99 (Windows NT 4.0) the test values listed for Athlon 800MHz, Athlon 750MHz and Pentium III 733MHz are 41.4 and 41.3 (only two values are given for the three processors). In WinBench 99 CPUmark 99, the Athlon 800MHz scores 71.9, the Athlon 750MHz scores 67.9, and the Pentium III 733MHz scores 65.8. In WinBench 99 FPU WinMark, the Athlon 800MHz scores 4370, the Athlon 750MHz scores 4103.3, and the Pentium III 733MHz scores 3890.
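To put the WinBench 99 scores above in perspective, the relative lead of each Athlon over the Pentium III 733MHz can be computed directly. This is a small sketch using only the CPUmark 99 and FPU WinMark figures quoted above; the dictionary layout and function name are illustrative.

```python
# Scores as quoted in the paragraph above (AMD-published figures).
SCORES = {
    "CPUmark 99": {"Athlon 800": 71.9, "Athlon 750": 67.9, "Pentium III 733": 65.8},
    "FPU WinMark": {"Athlon 800": 4370.0, "Athlon 750": 4103.3, "Pentium III 733": 3890.0},
}


def lead_percent(score: float, baseline: float) -> float:
    """How much higher `score` is than `baseline`, in percent."""
    return (score / baseline - 1.0) * 100.0


for benchmark, results in SCORES.items():
    baseline = results["Pentium III 733"]
    for cpu in ("Athlon 800", "Athlon 750"):
        lead = lead_percent(results[cpu], baseline)
        print(f"{benchmark}: {cpu} leads by {lead:.1f}%")
```

Note that the Athlon 800MHz also has a clock advantage of roughly 9% over a 733MHz part, so part of its lead reflects frequency rather than architecture.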
Therefore, AMD has positioned the Athlon processor at the Xeon level in performance and priced it between the Xeon and the Pentium III, hoping to enter the commercial, high-end workstation and server markets. This should be a very competitive market strategy.
Pentium: This is the famous Pentium processor, a new generation of high-performance processor introduced by Intel in 1993. Its internal code is P54C. The Pentium contains as many as 3.1 million transistors and has a built-in 16K L1 cache. The clock frequency was 60MHz and 66MHz at first and eventually reached 200MHz. Thanks to the Pentium's excellent manufacturing technology, it overclocks well: its clock frequency can be raised by one or two speed grades, which made overclocking gradually popular. At the same time, its floating-point performance surpassed that of competitors Cyrix and AMD, and Intel held the floating-point crown from then until AMD introduced the Athlon. For these reasons, the Pentium won most of the market for 586-class CPUs. Starting with the Pentium 75, the CPU socket officially changed from the earlier Socket 4 to Socket 5 and Socket 7. Socket 7 is still in use today, and AMD later developed it into Super 7, but that is another story.
Pentium Pro: The sixth-generation x86 CPU, introduced by Intel in 1996. The Pentium Pro contains as many as 5.5 million transistors, and its first-level (on-chip) cache consists of 8KB for instructions and 8KB for data. It is worth noting that a Pentium Pro package includes, in addition to the Pentium Pro chip, a 256KB second-level cache chip that runs at the same frequency as the processor. The two chips are connected by a high-bandwidth internal bus that is independent of the system bus. The most striking feature of the Pentium Pro is an innovative technology called "Dynamic Execution", another leap after the Pentium's breakthrough in superscalar architecture. The Pentium Pro is mainly used in servers.
Pentium MMX: At the end of 1996, Intel introduced an improved version of the Pentium series with the internal code P55C, which is what we usually call the Pentium MMX. MMX is a multimedia-enhancing instruction set technology newly invented by Intel, and its full English name can be translated as "multimedia extended instruction set". The Pentium MMX can be said to be the CPU with the highest market share in computers before the beginning of 1999. The Pentium MMX series has only three frequencies: 166MHz, 200MHz and 233MHz. The first-level cache is increased from the Pentium's 16KB to 32KB, the core voltage is 2.8V, and the clock multipliers are 2.5, 3 and 3.5 respectively. All models use Socket 7.
Pentium II: In May 1997, Intel introduced the Pentium II, a product of the same class as the Pentium Pro. The Pentium II has many branches and product series, of which the first generation is the chip code-named Klamath. It runs on a 66MHz bus and has four main frequencies: 233MHz, 266MHz, 300MHz and 333MHz. The Pentium II adopts the same 32-bit core structure as the Pentium Pro, speeds up writes to the segment registers, and adds the MMX instruction set. Intel uses CMOS technology to integrate 7.5 million transistors on a 203 mm² silicon die. On the bus side, the Pentium II adopts a dual independent bus structure, also known as back-side bus technology: one bus connects to the L2 cache and the other connects to memory. To reduce cost, the Pentium II uses an off-chip external cache that runs at half the CPU's clock speed. For the interface, in order to beat competitors and gain greater internal bus bandwidth, the Pentium II adopted the patented Slot 1 interface standard for the first time. It does not use a ceramic package; instead, the CPU and second-level cache are mounted on a printed circuit board, which is packaged in the so-called SEC (Single Edge Contact) cartridge. The Pentium II has a 32KB on-chip L1 cache (16K instruction/16K data), 57 MMX instructions, and eight 64-bit MMX registers. The L2 cache is an off-chip synchronous burst SRAM cache of 512K, four-way set associative.
Celeron: Celeron is a low-cost version of the Pentium II, with the same core technology. Intel launched it specifically to capture the low-priced personal computer market, since it was not willing to let AMD and Cyrix take the profits at the low end. The first Celeron processors are essentially a Pentium II without an L2 cache. Celeron was originally manufactured on a 0.35 micron process with a 66MHz external frequency, in 266MHz and 300MHz versions. A 333MHz version followed, and from then on a 0.25 micron manufacturing process was used. The original Celeron made a mistake: it removed the second-level cache entirely. Its performance was therefore disappointing, and enthusiasts regarded it as a product of little value that was still a pity to throw away. Its only advantage was that it overclocked very well. Intel subsequently corrected this mistake by integrating a 128KB full-speed cache into the Celeron, with all models using the 0.25 micron manufacturing process. To distinguish it from the original Celeron 300, the 300MHz Celeron with the integrated 128KB cache was named the Celeron 300A, and all Celerons introduced later have a built-in 128KB second-level cache.
To reduce cost further, Intel returned to the socket form factor it had abandoned. Because Celeron's second-level cache is built in, there is no external back-side bus, so mounting the Celeron on a circuit board is purely redundant. Intel therefore made the Celeron a socketed part, using not Socket 7 but the brand-new Socket 370. The name means it has 370 pins, 49 more than the 321 pins of a Socket 7 CPU, so the two sockets are incompatible. To let existing Slot 1 users use Socket 370 CPUs, adapter cards from Socket 370 to Slot 1 appeared. Apart from the interface, the Socket 370 Celeron has the same core. Celeron does not support Intel's latest SSE instruction set, but its structure is better than that of the Pentium III. Celeron's L2 cache is only a quarter the size of that in the Pentium II and Pentium III, but it runs at the same speed as the processor. As a result, Celeron is only slightly slower than a Pentium II of the same grade when running applications with a general computing load. Another difference between Celeron and the Pentium II/III is the bus speed: Celeron's bus speed is currently still 66MHz, and Celerons with a 100MHz bus will appear later.
Xeon: The Xeon processor is mainly used in high-end NT servers. It shows that Intel has long coveted the high profits of the server CPU market and hopes to take a share of them. Xeon processors offer capabilities never seen before in the x86 era. The Xeon is compatible with previous generations of Intel microprocessors and, like the Pentium II, adopts the dual independent bus structure and dynamic execution technology of the P6 microarchitecture. The Xeon has a built-in L2 cache of 512KB or even 2MB, which runs at the same speed as the CPU. The Xeon's SEC cartridge is twice as tall as the Pentium II's because it contains the full-speed L2 cache. Xeon supports up to 8 processors, and its interface is no longer Slot 1 but Slot 2. The chipset supporting Xeon is Intel's 440GX.
Intel Pentium III: The clock speed of the Pentium III starts from 450MHz, the top speed of the Pentium II, but this is not why Intel's latest processor is valued. It attracts attention because of its enhanced multimedia performance, which speeds up programs that need intensive processing. The Pentium III adds 70 new instructions that other processors do not have: the SSE instructions, which are specially designed to improve 3D graphics performance, 3D sound effects and speech recognition. In addition, the Pentium III supports MMX instructions, SSE instructions and SIMD floating-point operations, so it gives game developers and other programmers the basis for newer multimedia applications.
The latest Pentium III processor is a CPU code-named Coppermine, built on a 0.18 micron process. It comes in Slot 1 (figure: copperpic.jpg) and new FC-PGA (figure: fcpga1.jpg) versions, with external frequencies of 100MHz and 133MHz respectively. To reduce processor cost, Intel integrates the L2 cache into the chip after switching to the 0.18 micron process, with an L2 cache capacity of 256K. Like the Celeron with its built-in 128K cache, the processor can then use a socket form factor: Coppermine can be packaged as a socketed FC-PGA (flip chip PGA) part to reduce manufacturing cost. According to Intel's product roadmap, in March 2000 all Pentium III processors will move to FC-PGA packages. That is, Pentium III processors will adopt the same Socket 370 architecture as Celeron processors, and the existing Slot 1 Pentium III processors will become history. During the transition from Slot 1 to Socket 370, Intel will still supply new Pentium III processors in Slot 1 form, but the second-level cache no longer sits on the processor circuit board of these new Slot 1 parts; it is fully integrated into the Coppermine core.