Simon Madugula Literature Review Samples
One such technique is the use of a very small but very fast memory known as cache memory, which is usually located on the CPU for faster access. Cache memory is useful for data that is accessed regularly. For instance, web browsers store local copies of web pages so that a page loads faster when it is visited again, since the information is fetched from the local computer. Similarly, the CPU memory cache stores some information, although very little compared to RAM, for faster data access.
A discussion of cache memory is not complete without an outline of how a PC runs a software program.
When a program is launched, it is loaded from the hard drive into RAM, and the CPU fetches it from RAM through a memory controller; the CPU cannot load the program directly from the hard drive. Assuming a 64-bit CPU running at 2 GHz, the CPU can consume data at about 16 GB/s. DDR2 RAM, transferring data at 6.4 GB/s, cannot match this rate. The bottom line is that RAM cannot supply data as fast as the CPU can consume it, so the CPU has to wait on RAM for the data it needs for execution. This latency is what cache memory reduces.
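The 16 GB/s figure follows from simple arithmetic: a 64-bit CPU moves 8 bytes per cycle, so at 2 GHz it demands 8 × 2 × 10⁹ bytes per second. A quick sketch of the calculation (the DDR2 figure is the one quoted above):

```python
# Peak data rates from the example above.
word_size_bytes = 8          # a 64-bit CPU moves 8 bytes per cycle
clock_hz = 2_000_000_000     # 2 GHz

cpu_rate = word_size_bytes * clock_hz   # bytes per second the CPU can consume
ddr2_rate = 6.4e9                       # 6.4 GB/s DDR2 transfer rate, from the text

print(cpu_rate / 1e9)        # 16.0 -> the 16 GB/s quoted above
print(cpu_rate / ddr2_rate)  # 2.5  -> CPU demand outpaces DDR2 by 2.5x
```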
Cache memory sits close to the processor, between the processor and RAM, holding data the processor is about to access so that it can be retrieved quickly. The CPU can access the cache in a single processor clock cycle, compared with the multiple clock cycles required for a RAM access.
Although a cache is very small compared to RAM, it makes the computer faster overall. This is due to the way data is chosen for storage in the cache. How does the computer make the most of such a small memory? This is answered in the next section.
Locality of Reference
The principle of locality of reference points to the fact that computer programs do not reference memory locations at random but rather concentrate on specific areas of memory. Programs often contain looping structures that perform particular operations on a set of data over and over again, and so reference the same data set multiple times. Likewise, most of the data structures used in programs, such as arrays, lists, and stacks, use contiguous memory space for storage. The operation of a cache is based on this principle: data is stored in the cache on the expectation of locality, which is what makes faster access by the CPU possible. There are three aspects of locality of reference: temporal locality, spatial locality, and sequential locality.
Temporal locality of Reference
Temporal locality is based on time: a memory location that is accessed at one moment is likely to be referenced again soon. Data stored at such a location is kept in the cache in anticipation of this future use.
Spatial locality of reference
This is based on the high likelihood that a particular memory location will be referenced if a storage location close to it has recently been referenced. Spatial locality is thus based on position in memory.
Sequential Locality of reference
Storage is accessed sequentially, in either ascending or descending order of address. Accessing the elements of an array, for instance, falls into this category.
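All three kinds of locality can be seen in an ordinary loop. In the hypothetical snippet below, `total` is reused on every iteration (temporal locality), while the list elements occupy contiguous storage (spatial locality) and are visited in ascending order (sequential locality):

```python
# A hypothetical loop illustrating the three kinds of locality of reference.
data = list(range(1000))   # contiguous storage, like an array

total = 0
for x in data:             # elements visited in ascending order: sequential locality
    total += x             # 'total' touched on every iteration: temporal locality
                           # neighbouring elements sit in adjacent memory and
                           # share cache lines: spatial locality

print(total)               # 499500
```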
Cache Hits and Misses
Owing to the small size of the cache, it is unlikely that the cache will always already contain the data needed by the CPU. When the CPU references a memory location that is already in the cache, this is referred to as a cache hit. When, however, the CPU references a memory location that is not in the cache, so the data has to be fetched from RAM, this is referred to as a cache miss. The performance of the cache can be expressed by the system's hit ratio, which measures the probability that the desired information is found in the cache and does not require a main memory access. Mathematically, the hit ratio is the ratio of the number of cache hits to the total number of main memory accesses.
Hit ratio = number of hits / total number of main memory accesses

where the total number of main memory accesses is the sum of the number of hits and the number of misses.
The hit ratio is not the same for all systems; it varies with the architecture of the cache, the amount of cache memory present, the kind of program the system is running, and the type of data being manipulated. In practice, caches are known to provide hit ratios greater than 90%.
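As an illustration of the formula, a small helper that computes the hit ratio from hit and miss counts might look like the following (the counts are made-up example numbers, chosen to land in the >90% range quoted above):

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Hit ratio = hits / (hits + misses), i.e. hits over total main memory accesses."""
    total_accesses = hits + misses
    return hits / total_accesses

# 950 hits out of 1000 total accesses -> a 0.95 hit ratio,
# consistent with the >90% figure cited for practical caches.
print(hit_ratio(950, 50))   # 0.95
```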
Since caches store data for quick access by the CPU, sparing it a fetch from RAM, there must be a relationship between memory locations in RAM and those in the cache. This relationship is explained in the following section.
Cache Mapping Strategies
A cache controller manages the operations of a cache. One such operation is the mapping of memory locations in RAM to memory locations in the cache. This mapping is central to the overall operation of the cache: besides recording where a cached line resides in main memory, it enables the system to quickly detect cache hits. There are three main cache mapping strategies in use: associative mapping, direct mapping, and set-associative mapping. The suitability of a strategy hinges on its characteristics relative to the system requirements; each has pros and cons that can make it preferable to the others.
Associative mapping is based on content-addressable memory (CAM). Unlike RAM and sequential-access memory, CAM identifies information by the actual content stored in the memory. In this cache organization, each cache entry has two parts: the address tag and the data or instruction words from the corresponding line in main memory.
In an associative-mapped cache, as shown in figure 1, a line from main memory may be placed in any cache location. For a main memory of size 2^n bytes and a cache line size of 2^m bytes, the address tags are n - m bits long. Because an incoming address may match any line, its tag must be compared against every cache entry simultaneously, which requires a comparator per line and makes the hardware expensive; in return, the fully flexible placement yields a high hit ratio.

Unlike associative mapping, direct mapping aims to reduce the cost of comparing many large tags at once by reducing the number of comparisons made. It achieves this by ensuring that any item from main memory can be stored in only one cache location, determined by part of its main memory address known as its index. Figure 2 shows the architecture of a direct-mapped cache. Determining a cache hit is faster here, since only the tag stored at the indexed line need be compared with the tag portion of the main memory address, so only one comparator is required in the hardware circuitry. In direct mapping, for a main memory of size 2^n bytes and a cache with 2^k lines, the tags need only be n - k - m bits long, since k bits are used for the address index. The significant reduction in hardware, however, comes at the cost of a lower hit ratio than the associative-mapped cache.
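The tag/index/offset split for a direct-mapped cache can be sketched as bit arithmetic. The sizes below are assumptions chosen purely for illustration: a 2^16-byte main memory (n = 16), 2^4 cache lines (k = 4), and 2^4-byte lines (m = 4), giving an n - k - m = 8-bit tag:

```python
# Hypothetical parameters: n = 16 address bits, k = 4 index bits, m = 4 offset bits.
N, K, M = 16, 4, 4

def split_address(addr: int):
    """Split a main-memory address into (tag, index, offset) for a direct-mapped cache."""
    offset = addr & ((1 << M) - 1)          # low m bits: byte within the line
    index = (addr >> M) & ((1 << K) - 1)    # next k bits: which cache line
    tag = addr >> (M + K)                   # remaining n - k - m bits: the stored tag
    return tag, index, offset

tag, index, offset = split_address(0xABCD)
print(hex(tag), hex(index), hex(offset))   # 0xab 0xc 0xd
```

A hit occurs when the tag stored at line `index` equals `tag`; only that single comparison is needed.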
In set-associative mapping, a main memory item can reside only in a subset of cache locations, rather than in one particular location as in a direct-mapped cache or in any location as in an associative-mapped cache. This organization strikes a balance between the advantages of the associative-mapped and direct-mapped organizations. It requires more complex hardware than a direct-mapped cache. A schematic of set-associative mapping is shown in figure 3.
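A minimal sketch of the set-associative idea, assuming a toy 2-way cache with LRU replacement within each set (all sizes and the replacement policy are illustrative assumptions, not taken from the text):

```python
from collections import OrderedDict

class SetAssociativeCache:
    """Toy 2-way set-associative cache with LRU replacement within each set."""
    def __init__(self, num_sets=4, ways=2, line_bytes=16):
        self.num_sets, self.ways, self.line_bytes = num_sets, ways, line_bytes
        self.sets = [OrderedDict() for _ in range(num_sets)]  # per-set: tag -> line
        self.hits = self.misses = 0

    def access(self, addr: int) -> bool:
        line_addr = addr // self.line_bytes
        index = line_addr % self.num_sets     # which set the line may occupy
        tag = line_addr // self.num_sets      # tag compared within that set only
        s = self.sets[index]
        if tag in s:                          # cache hit: refresh LRU order
            s.move_to_end(tag)
            self.hits += 1
            return True
        self.misses += 1                      # miss: fetch line, evicting LRU if full
        if len(s) >= self.ways:
            s.popitem(last=False)
        s[tag] = None
        return False

cache = SetAssociativeCache()
for addr in [0, 64, 0, 128, 0, 64]:   # addresses 0, 64, 128 all map to set 0
    cache.access(addr)
print(cache.hits, cache.misses)       # 2 4
```

Only the lines within one set are searched on each access, so the hardware needs as many comparators as there are ways, rather than one per cache line.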