SIMD: If we consider doing the following:
A=a+b+c+d or A=a*b*c*d
|
No. SIMD instead performs operations like:
a0+=b0;
a1+=b1;
a2+=b2;
a3+=b3;
|
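To make the difference concrete, here is a minimal C++ sketch. The names `add_elementwise` and `horizontal_sum` are made up for illustration. The first function is the lane-by-lane pattern that SIMD adds implement (and that a compiler may auto-vectorize); the second is the a+b+c+d pattern from the question, where each step depends on the previous one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise add: a[i] += b[i] for every i.
// Each lane is independent of the others, so a SIMD instruction
// can process several lanes (typically 4 floats) at once.
inline void add_elementwise(std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] += b[i];
    }
}

// Horizontal sum: v[0] + v[1] + ... + v[n-1].
// Every addition depends on the running total, so this is not
// what a single SIMD add does.
inline float horizontal_sum(const std::vector<float>& v) {
    float total = 0.0f;
    for (float x : v) {
        total += x;
    }
    return total;
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, `add_elementwise(a, b)` leaves a = {11, 22, 33, 44}.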
You said “typically 4”. What does that mean? |
Like I said, these instruction sets are often used in linear algebra applications. 4D vectors are some of the most common.
Is it, let’s say, for this reason: the level 1 cache cannot hold more than x KB/x MB because they can’t manufacture it efficiently for consumers? |
Sort of. The real reason is that fast memory is expensive. We could in principle manufacture RAM consisting entirely of flip-flops (the kind of memory used for CPU registers), but it would be far too expensive.
So we create another memory space adjacent to the CPU? |
Caches L1 and L2 are on-die.
But at what speed do L1, L2, L3, etc. operate? |
To give you a vague idea, RAM works on the order of a few GiB/s, while L2 cache works on the order of a few hundred GiB/s. Here are some latency costs for cache hits:
http://stackoverflow.com/questions/4087280/approximate-cost-to-access-various-caches-and-main-memory
In this analogy, can we say “RAM” is one kind of level-X cache? |
Whether something is a cache or not depends on how it's used. In principle, RAM is not a cache because it doesn't need to be used as one, unlike CPU caches which have no other functionality (they're not programmatically accessible).
Most OSs cache disk accesses, and web browsers often keep caches of previously visited pages or content. Anything can be a cache if it can hold state.
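As a rough illustration of "anything can be a cache if it can hold state", here is a minimal C++ sketch. `SimpleCache` and `slow_fetch` are hypothetical names; `slow_fetch` just stands in for an expensive operation such as a disk read or a page download.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical expensive lookup (stands in for a disk read or page fetch).
inline std::string slow_fetch(int key) {
    return "value-" + std::to_string(key);
}

// Keeps previously fetched results so repeated requests skip slow_fetch.
class SimpleCache {
public:
    const std::string& get(int key) {
        auto it = store_.find(key);
        if (it != store_.end()) {
            ++hits_;  // already stored: served from the cache
            return it->second;
        }
        ++misses_;    // not stored yet: do the slow work once and remember it
        auto result = store_.emplace(key, slow_fetch(key));
        assert(result.second);  // key was absent, so the insert must succeed
        return result.first->second;
    }

    int hits() const { return hits_; }
    int misses() const { return misses_; }

private:
    std::unordered_map<int, std::string> store_;
    int hits_ = 0;
    int misses_ = 0;
};
```

Calling `get(7)` twice costs one miss and one hit. This sketch never evicts anything, which a real cache with limited space would have to do.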
Let’s consider 600 MHz RAM. So it can feed 600M “chunks of data” (buffers or similar?) to the CPU per second? |
More or less. What gets sent in each transfer is a processor word, the natural-sized datum that a CPU works with. For example, a 32-bit processor has a 32-bit word.
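Taking that simplified model at face value (one word per transfer, one transfer per clock), the arithmetic can be sketched like this. `peak_bytes_per_second` is a made-up helper, and real memory buses are often wider than a processor word, so treat the result as a back-of-the-envelope figure only.

```cpp
#include <cstdint>

// Peak throughput under the simplified model above:
// one word of word_bits bits is delivered per transfer.
constexpr std::uint64_t peak_bytes_per_second(std::uint64_t transfers_per_second,
                                              std::uint64_t word_bits) {
    return transfers_per_second * (word_bits / 8);
}

// 600 MHz with a 32-bit word: 600,000,000 * 4 bytes = 2,400,000,000 bytes/s.
static_assert(peak_bytes_per_second(600000000ULL, 32) == 2400000000ULL,
              "600M transfers of 4 bytes each");
```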
Is there any way in C++ to use SIMD? |
Some compilers can automatically generate code that uses SIMD instruction sets, with varying degrees of performance gains. Some compilers have extensions known as "intrinsics" that translate directly to SIMD instructions.
Generally speaking, though, the only real way to use SIMD in a program is to manually code the relevant functions in Assembly.
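For a feel of what the intrinsics route looks like, here is a minimal sketch. It assumes an x86 compiler that provides `<xmmintrin.h>` (SSE); `add4` is a made-up name, and on any other target the code falls back to a plain scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE__)
#include <xmmintrin.h>  // SSE intrinsics (x86 only)
#endif

// Adds b into a, four floats at a time where SSE is available.
// In this sketch n must be a multiple of 4.
inline void add4(float* a, const float* b, std::size_t n) {
    assert(n % 4 == 0);
    for (std::size_t i = 0; i < n; i += 4) {
#if defined(__SSE__)
        __m128 va = _mm_loadu_ps(a + i);           // load 4 floats (no alignment required)
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(a + i, _mm_add_ps(va, vb));  // a0+=b0 ... a3+=b3 in one add
#else
        for (std::size_t k = 0; k < 4; ++k) {      // scalar fallback
            a[i + k] += b[i + k];
        }
#endif
    }
}
```

Whether this beats what the compiler auto-vectorizes depends on the compiler and flags, so measure before committing to it.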
Is there any way in C++ to use the cache efficiently? |
Sure. Effective cache use has little to do with the language and more to do with the choice of algorithm or data structure.
For example:
http://www.cplusplus.com/forum/lounge/86758/
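As a small illustration of how access order matters, here is a sketch with a hypothetical `Matrix` type stored row-major in one contiguous vector. Both functions compute the same sum; only the order in which memory is touched differs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix: element (r, c) lives at data[r * cols + c].
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<int> data;
};

// Walks memory in address order, so each cache line that is fetched
// gets fully used before moving on.
inline long long sum_row_major(const Matrix& m) {
    assert(m.data.size() == m.rows * m.cols);
    long long total = 0;
    for (std::size_t r = 0; r < m.rows; ++r) {
        for (std::size_t c = 0; c < m.cols; ++c) {
            total += m.data[r * m.cols + c];
        }
    }
    return total;
}

// Jumps cols elements between consecutive accesses, so for large
// matrices it tends to miss the cache far more often.
inline long long sum_col_major(const Matrix& m) {
    assert(m.data.size() == m.rows * m.cols);
    long long total = 0;
    for (std::size_t c = 0; c < m.cols; ++c) {
        for (std::size_t r = 0; r < m.rows; ++r) {
            total += m.data[r * m.cols + c];
        }
    }
    return total;
}
```

On a small matrix you won't see a difference; the gap usually shows up once the data no longer fits in cache.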