Are there any guidelines for encouraging vectorization?

Hi, I've been researching SIMD instructions and learned that the compiler can use them automatically in some cases, given the appropriate compiler flags (/arch:SSE2 for MSVC). As I understand it, it mainly does this when unrolling loops that have no dependency between iterations. A blog post said that, at least in the 2012 version, the loop counter had to be an int or size_t. It also seems it can only use SIMD instructions for math operations if you use the math.h functions.

Are there any other guidelines that you should follow for automatic vectorization?

Edited by Bozemoto on Reason: Initial post
Yes, autovectorization is mostly for loops. MSVC autovectorizes when it can figure out the access pattern. In the simplest case, where the loop processes the i'th element on each iteration, it works pretty well. The operations don't need to be math.h functions; plain arithmetic works fine. Here's an example: https://godbolt.org/g/o4w1nT Lines 35-50 of the assembly contain the vector mul/add.
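As a rough sketch of the "i'th element" pattern described above (this is not the code behind the godbolt link; the name mul_add and its parameters are made up for illustration), a loop like this is the kind of thing an autovectorizer tends to handle:

```cpp
#include <cstddef>

// Hypothetical example: every iteration reads and writes only element i,
// so there is no dependency between iterations.
void mul_add(float* dst, const float* a, const float* b, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        dst[i] = a[i] * b[i] + a[i];  // plain arithmetic, no math.h call
    }
}
```

On MSVC, compiling with /O2 /Qvec-report:1 should print a message for each loop it managed to vectorize, which is an easy way to check without reading the assembly.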

It also works for sin/cos and similar functions (contrary to what Casey said two weeks ago, that MSVC doesn't do vector cos and sin; it does, starting with MSVC2010 if I'm not mistaken): https://godbolt.org/g/QePfyW You can see that line 38 contains a call to the 4-wide cosf builtin function __vdecl_cosf4.
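For reference, the source side of that can be as small as the sketch below. The function name cos_all is hypothetical and this is not the exact code from the link; whether the call actually turns into a 4-wide routine such as __vdecl_cosf4 depends on the compiler version and flags.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical example: element-wise cosine over a float array.
// std::cos on a float argument selects the float overload.
void cos_all(float* out, const float* in, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        out[i] = std::cos(in[i]);
    }
}
```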

Btw, /arch:SSE2 is an invalid switch for the 64-bit architecture. 64-bit Intel architecture always has SSE2 instructions available, so the switch is useless there: SSE2 code generation is always enabled.

Edited by Mārtiņš Možeiko on
Thanks, that helps a lot. I'm using x86 currently since I'm using an insignificant amount of memory at the moment.
Though keep in mind that the compiler must treat some things as dependencies even where you think there really shouldn't be any.

For example, when you pass two pointers to a function, the compiler must assume that any write through one pointer can affect later reads through the other. Strict aliasing lets it assume otherwise for pointers to different types, but that brings various restrictions on what you can do with pointers.
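A small sketch of why the compiler has to be that careful (the function add_one is made up for illustration): if the two pointers overlap, each iteration really does depend on the previous one, and processing four elements at once would change the result.

```cpp
#include <cstddef>

// Hypothetical example: nothing in the signature says dst and src
// point to separate memory, so the compiler must allow for overlap.
void add_one(float* dst, const float* src, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        dst[i] = src[i] + 1.0f;
    }
}

// Overlapping call: dst is src shifted by one element, so iteration i
// reads the value that iteration i - 1 just wrote.
void overlapping_call(float* buf)  // buf must hold at least 5 floats
{
    add_one(buf + 1, buf, 4);
}
```

Starting from five zeros, the scalar loop leaves 0, 1, 2, 3, 4 in the buffer. A version that loaded all four source values before storing anything would produce 0, 1, 1, 1, 1 instead, which is why the compiler either skips vectorizing or adds a runtime overlap check.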

Also, integer overflow can mess with index calculations, so the compiler cannot always unroll loops as you would expect.
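One way this can show up, as a sketch under assumptions (both function names are hypothetical, and whether either loop gets vectorized depends on the compiler, target and flags): an index computed in 32-bit unsigned arithmetic is defined to wrap around, so on a 64-bit target the compiler cannot simply treat i + offset as a steadily increasing address. Using a pointer-sized index type avoids that particular question.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example: i + offset is evaluated in 32 bits and wraps
// modulo 2^32, so consecutive iterations are not guaranteed to touch
// consecutive addresses on a 64-bit target.
void copy_offset_u32(float* dst, const float* src,
                     std::uint32_t count, std::uint32_t offset)
{
    for (std::uint32_t i = 0; i < count; ++i)
    {
        dst[i] = src[i + offset];
    }
}

// Same loop with a pointer-sized index, where the index arithmetic
// matches the width of the address calculation.
void copy_offset_size(float* dst, const float* src,
                      std::size_t count, std::size_t offset)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        dst[i] = src[i + offset];
    }
}
```

For in-range inputs the two functions do the same thing; the difference is only in what the compiler is allowed to assume.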
Do you think the restrict keyword would help with that?
Yes, it will help with that. Here's an example: https://www.twitch.tv/videos/242024723?t=01h54m52s
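In code, that looks roughly like the sketch below. This is not the code from the video; scale_add is a made-up name. Note that restrict itself is a C99 keyword; in C++ the usual spelling is the __restrict extension, which MSVC, GCC and Clang all accept.

```cpp
#include <cstddef>

// Hypothetical example: __restrict is a promise from the caller that
// dst and src never overlap, so the compiler does not have to assume
// a write to dst[i] can change a later src[j].
void scale_add(float* __restrict dst, const float* __restrict src,
               float k, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        dst[i] += k * src[i];
    }
}
```

The promise is not checked: calling this with overlapping buffers is undefined behavior, so it only belongs on functions where the caller can actually guarantee separate memory.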