I have some threaded C code that requires 64-byte alignment of the processed data structure. How will this alignment interact with prefetch instructions like the gcc __builtin_prefetch? Will the effects of prefetching be the same as using a non-aligned array or not?
Note that I am using memalign to obtain the aligned array.
Thanks.
The answer to this one is highly implementation-dependent.
However, on x86 and x86_64, GCC implements __builtin_prefetch as a single instruction from the PREFETCHh family (PREFETCHT0, PREFETCHT1, PREFETCHT2 or PREFETCHNTA, chosen by the locality argument).
According to Intel's documentation (search for "PREFETCH"):
Fetches the line of data from memory that contains the byte specified with the source operand to a location in the cache hierarchy specified by a locality hint:
I am 99% sure the AMD version behaves the same way, but I am too busy to check...
So if the memory operand is unaligned, it will effectively be rounded down to a multiple of 64 bytes and that cache line will be prefetched. In other words, the prefetch itself behaves the same for aligned and unaligned arrays; alignment only determines where your data falls relative to the cache line boundaries. (Well, 64 bytes on all the current CPUs I know of. The instruction set reference only guarantees "a minimum of 32 bytes". Not sure why they bothered saying that; in any situation where it makes sense to use this gadget, you have to be assuming a lot about the particular CPU already.)
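To make the rounding concrete, here is a minimal sketch in C. It assumes 64-byte cache lines, and the names CACHE_LINE, cache_line_base and sum_with_prefetch are made up for illustration; none of them come from GCC or from Intel's documentation. The one-line prefetch distance is just a placeholder, since a useful distance has to be tuned per CPU and workload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on current x86 CPUs. */
#define CACHE_LINE 64u

/* Hypothetical helper: start address of the cache line containing p.
 * This is the address a prefetch of p effectively targets. */
static uintptr_t cache_line_base(const void *p)
{
    return (uintptr_t)p & ~(uintptr_t)(CACHE_LINE - 1);
}

/* Hypothetical helper: sum n doubles, issuing one read prefetch
 * (rw = 0, locality = 3) per cache line's worth of elements,
 * one line ahead of the current position. */
static double sum_with_prefetch(const double *data, size_t n)
{
    assert(data != NULL);
    const size_t per_line = CACHE_LINE / sizeof(double);
    double total = 0.0;

    for (size_t i = 0; i < n; i++) {
        /* The address may sit anywhere inside a line; the CPU
         * fetches the whole line that contains it. */
        if (i % per_line == 0 && i + per_line < n)
            __builtin_prefetch(&data[i + per_line], 0, 3);
        total += data[i];
    }
    return total;
}
```

With a 64-byte-aligned array (from memalign, for instance), each prefetched address is already the start of a line. With an unaligned array the same loop still pulls in one line per prefetch, but a group of eight doubles may straddle two lines.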