simple pipelining and superscalar architecture

Developer https://www.devze.com 2023-03-13 14:27 Source: Internet

Consider this instruction flow diagram:

instruction fetch -> instruction decode -> operand fetch -> instruction execute -> write back

Suppose a processor supports both CISC and RISC instructions, like the Intel 486.

Now, if we issue a RISC instruction, it takes one clock cycle to execute, so there is no problem. But if a CISC instruction is issued, its execution will take longer.

So suppose it takes three clock cycles to execute the CISC instruction, and one clock cycle is taken by each of the stages preceding execution.

Now, in a superscalar structure, the two instructions issued while the first is being processed are diverted to the other functional units available. But no such diversion is possible in simple pipelining, as only one functional unit is available for executing instructions.

So what avoids the congestion of instructions in the simple pipelining case?

Technically speaking, the x86 is not a RISC processor. It's a CISC processor. Some of its instructions take less time than others, but those aren't RISC instructions. I believe that Intel internally turns instructions into RISC-like operations on later x86 designs, but that's not really relevant here.

Instructions that take different amounts of time are typical of a CISC processor. That makes a CISC processor harder to pipeline, though not impossible: the 486 itself has a five-stage pipeline. There are many things that you can do inside the CPU to speed up execution, such as out-of-order execution. In a simple in-order pipeline with one execution unit, though, instructions still execute one after another, so they cannot pile up: while a long instruction occupies the execute stage, the stages behind it stall and hold their instructions until it is done.

"Now, if we issue a RISC instruction, it takes one clock cycle to execute, so there is no problem. But if a CISC instruction is issued, its execution will take longer."

A RISC instruction does not necessarily take one clock cycle. On the classic five-stage MIPS pipeline, each instruction takes 5 cycles from fetch to write-back. However, the point of pipelining is that instructions overlap: once one instruction completes, the next one completes one clock cycle later.
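The latency versus throughput point can be checked with a little arithmetic. The following is a minimal Python sketch, assuming an ideal pipeline with no stalls in which every stage takes exactly one cycle; the function names are made up for illustration.

```python
def unpipelined_cycles(n, stages=5):
    """Cycles for n instructions when each one runs start to finish alone."""
    return n * stages


def pipelined_cycles(n, stages=5):
    """Cycles for n instructions in an ideal pipeline (assumption: no stalls).

    The first instruction needs `stages` cycles to reach write-back;
    every later instruction completes one cycle after the one before it.
    """
    if n <= 0:
        return 0
    return stages + (n - 1)


if __name__ == "__main__":
    for n in (1, 4, 100):
        print(n, unpipelined_cycles(n), pipelined_cycles(n))
```

Each instruction still takes 5 cycles from fetch to write-back, but for long instruction streams the pipelined total approaches one cycle per instruction.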

"Now, in a superscalar structure, the two instructions issued while the first is being processed are diverted to the other functional units available..."

In a two-way superscalar architecture, two instructions can execute and finish at the same time. In a pure superscalar architecture (with no pipelining), the cycle looks like this (F = Fetch, D = Decode, X = eXecute, M = Memory, W = Writeback):

(inst. 1) F D X M W
(inst. 2) F D X M W
(inst. 3)           F D X M W
(inst. 4)           F D X M W

"But no such diversion is possible in simple pipelining, as only one functional unit is available for executing instructions."

Right, so the cycle looks like this:

(inst. 1) F D X M W
(inst. 2)   F D X M W
(inst. 3)     F D X M W
(inst. 4)       F D X M W
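The staggered layout above can be generated mechanically. Below is a small Python sketch; `ideal_diagram` and its `width` parameter are hypothetical names, and it assumes an ideal machine with no hazards or stalls. With `width=1` it reproduces the simple pipeline; with `width=2` it shows what happens when pipelining and two-way issue are combined, which is an assumption beyond the two diagrams drawn in this answer.

```python
STAGES = ["F", "D", "X", "M", "W"]


def ideal_diagram(n, width=1):
    """Draw n instructions on an ideal pipeline that fetches `width` per cycle.

    Assumption: no hazards, so instruction i is fetched in cycle i // width
    (counting from 0) and then moves one stage per cycle.
    """
    lines = []
    for i in range(n):
        start = i // width  # how many cycles this instruction waits before fetch
        lines.append(f"(inst. {i + 1}) " + "  " * start + " ".join(STAGES))
    return "\n".join(lines)


if __name__ == "__main__":
    print(ideal_diagram(4))           # simple pipelining, one instruction per cycle
    print()
    print(ideal_diagram(4, width=2))  # pipelined and two-way superscalar
```

In the `width=2` case, instructions 1 and 2 share a column and instructions 3 and 4 start just one cycle later.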

Now, if we have instructions that take a varying amount of time (a CISC computer), it's harder to pipeline, because there's only one execution unit, and we may have to wait for a previous instruction to finish executing. In this example, instruction 1 takes two execution cycles, instruction 2 takes five, instruction 3 takes two, and instruction 4 takes only one:

(inst. 1) F D X X M W
(inst. 2)         F D X X X X X M W
(inst. 3)                       F D X X M W
(inst. 4)                               F D X M W

Thus, pipelining gains much less on such a processor: in this simplified picture, we must wait for the execute cycles to finish before we can go on to the next instruction. We don't have to do this on MIPS because every instruction spends one cycle in each stage, and it can determine whether an instruction is a branch, and its destination, in the decode phase.
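To make the stall behaviour concrete, here is a minimal Python sketch. It rests on assumptions that go beyond the diagram above: a single in-order five-stage pipeline where fetch and decode may overlap with an earlier instruction's execute cycles, and where a stage simply holds its instruction (drawn as "-") until the next stage is free. The names `schedule` and `render` are made up for illustration.

```python
STAGES = "FDXMW"


def schedule(exec_cycles):
    """For each instruction, return the cycle (1-based) it enters each stage.

    Assumption: every stage holds one instruction at a time, and only the
    X stage takes more than one cycle (exec_cycles[i] for instruction i).
    """
    last = len(STAGES) - 1
    enter = []
    for i, n in enumerate(exec_cycles):
        lat = [n if name == "X" else 1 for name in STAGES]
        row = []
        for s in range(len(STAGES)):
            # Earliest cycle this instruction is done with its previous stage.
            ready = 1 if s == 0 else row[s - 1] + lat[s - 1]
            # Earliest cycle the stage has been vacated by the previous instruction.
            if i == 0:
                free = 1
            elif s < last:
                free = enter[i - 1][s + 1]
            else:
                free = enter[i - 1][s] + 1
            row.append(max(ready, free))
        enter.append(row)
    return enter


def render(exec_cycles):
    """Draw the schedule; '-' marks a cycle spent stalled in a stage."""
    out = []
    for i, row in enumerate(schedule(exec_cycles)):
        cells = [" "] * (row[0] - 1)
        for s, name in enumerate(STAGES):
            work = exec_cycles[i] if name == "X" else 1
            held = row[s + 1] - row[s] if s + 1 < len(STAGES) else 1
            cells += [name] * work + ["-"] * (held - work)
        out.append(f"(inst. {i + 1}) " + " ".join(cells).rstrip())
    return "\n".join(out)


if __name__ == "__main__":
    print(render([2, 5, 2, 1]))  # execute latencies from the example above
```

With the execute latencies from the example (2, 5, 2 and 1), later instructions wait in fetch or decode while the execute stage is busy, and under these assumptions the last write-back lands in cycle 14. The stall is what prevents congestion: nothing enters a stage that is still occupied.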