Yes, vbroadcastsd is a good asm instruction for broadcasting a pair of floats, and _mm256_broadcast_sd plus a cast intrinsic is a safe way to implement it in C.
Note that you aren't dereferencing (in pure C) a double* that points at float objects; you're only passing it to an intrinsic function. _mm256_set1_pd( *(double*)floatp ) would be strict-aliasing undefined behaviour in C, but load/store intrinsics are defined to work regardless of what the pointer is actually pointing at. That is exactly so you can easily do wide loads/stores to whatever data you actually have, not just __int64 or double.
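As a minimal sketch of the broadcast described above: the function names and the TARGET_AVX macro are made up for illustration, and the target attribute is a GCC/Clang-specific assumption used so the file builds without -mavx. It still needs a CPU with AVX to run.

```c
#include <immintrin.h>

// Assumption: GCC/Clang. Enables AVX for these functions only.
// When building with -mavx (or /arch:AVX on MSVC), the attribute is unnecessary.
#define TARGET_AVX __attribute__((target("avx")))

// Broadcast the 8 bytes at `pair` (two adjacent floats) to all four 64-bit lanes.
TARGET_AVX static inline __m256 broadcast_float_pair(const float *pair)
{
    // The pointer cast only changes the argument type. The load happens
    // inside the intrinsic, not through a C dereference of a double*.
    __m256d v = _mm256_broadcast_sd((const double *)pair);
    return _mm256_castpd_ps(v);  // reinterpret the same bits as 8 floats
}

// Hypothetical wrapper that stores the vector so plain C code can inspect it.
TARGET_AVX void broadcast_float_pair_to_array(const float *pair, float out[8])
{
    _mm256_storeu_ps(out, broadcast_float_pair(pair));
}
```

Calling broadcast_float_pair_to_array with a pair {a, b} should fill out with a, b, a, b, a, b, a, b.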
For example, GCC's header defines _mm256_broadcast_sd(const double*) as a wrapper around __builtin_ia32_vbroadcastsd256. And GCC defines _mm_loadl_epi64 to include a dereference of *(__m64_u *)__P, where __m64_u is an unaligned may-alias version of __m64, which it defines as:
typedef int __m64_u __attribute__ ((__vector_size__ (8), __may_alias__, __aligned__ (1)));
(See also Is `reinterpret_cast`ing between hardware SIMD vector pointer and the corresponding type an undefined behavior?)
In general, even load/store intrinsics that take a float* or double* (instead of __m128i*) are alignment and strict-aliasing safe. (Or at least I think they're supposed to be. On some compilers there might be some which aren't actually strict-aliasing safe, so it can be a pain to get them to safely emit vpbroadcastd from a pointer that isn't actually pointing at an int, for example. I forget which intrinsic it was where I found a compiler not respecting possible aliasing.)
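One way to sidestep the question entirely, sketched here as an assumption rather than something the headers guarantee, is to do the scalar load with memcpy and broadcast the value. The function name is hypothetical, and the example uses the 128-bit SSE2 form so it builds with default x86-64 options. Whether a given compiler folds this into a single broadcast load such as vpbroadcastd depends on the compiler and the enabled target features.

```c
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

// Broadcast the 4 bytes at p to every 32-bit lane, whatever type p really points at.
void broadcast_4_bytes(const void *p, int32_t out[4])
{
    int32_t tmp;
    memcpy(&tmp, p, sizeof tmp);          // aliasing-safe, alignment-safe scalar load
    __m128i v = _mm_set1_epi32(tmp);      // replicate the value into all 4 lanes
    _mm_storeu_si128((__m128i *)out, v);  // unaligned store for inspection
}
```

For example, passing the address of a float holding 1.0f should give four copies of its bit pattern, 0x3F800000.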
Your example 2 is not clear. Are you wanting to bit-shift the bit-patterns of floats? Yes, of course you can do that. That's why SIMD cast intrinsics exist: to keep the C compiler happy when you want to reinterpret the same bits as a different vector type.
It's common to do that as part of implementing exp() or log, for example, as in Fastest Implementation of Exponential Function Using AVX.
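A small sketch of that kind of bit manipulation, using 128-bit SSE2 intrinsics so it builds with default x86-64 options (the 256-bit integer shifts would need AVX2). The function name is hypothetical, and the code assumes normal, finite inputs: zeros, denormals, infinities and NaNs are not handled.

```c
#include <immintrin.h>
#include <stdint.h>

// Store the unbiased binary exponent of each of 4 floats into out[0..3].
void float_exponents(const float in[4], int32_t out[4])
{
    __m128  f    = _mm_loadu_ps(in);
    __m128i bits = _mm_castps_si128(f);          // same bits, integer vector type
    __m128i e    = _mm_srli_epi32(bits, 23);     // shift out the 23 mantissa bits
    e = _mm_and_si128(e, _mm_set1_epi32(0xFF));  // mask off the sign bit
    e = _mm_sub_epi32(e, _mm_set1_epi32(127));   // remove the exponent bias
    _mm_storeu_si128((__m128i *)out, e);
}
```

The cast compiles to no instruction; it only tells the compiler to treat the register as integers so the shift intrinsic type-checks.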