We are profiling socket send time and found that a single send() call takes around 5-7 µs for a packet size of 215 bytes.
Profiling code:
[image: screenshot of the profiling code]
Using vma_stats, we confirmed that everything we send is already offloaded, so I assume that calls such as
send(xxx, xxx, xxx, xxx)
are being handled by the Mellanox VMA library rather than by the standard socket implementation declared in <sys/socket.h>?
Questions:
- Are there any optimization techniques to reduce the socket send time?
- Is there any obvious problem with how we use LD_PRELOAD?
We are using VMA_SPEC=latency and running on CentOS.