Hello,
If I understand correctly, you're trying to measure IPoIB latency?
When you offload the IPoIB traffic with VMA, what latency numbers (in usec) are you getting?
Are you using datagram or connected mode?
Has any affinity tuning been done on these servers, e.g. using taskset to bind sockperf to the cores of a specific NUMA node?
What configurations have you tried with ethtool (low-latency settings, etc.)?
Are there any drops on the IPoIB interfaces? You can see them with ethtool -S.
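
If it helps with the drops check, here is a rough Python sketch that filters the output of ethtool -S down to the non-zero drop counters. It assumes the usual "name: value" layout that ethtool -S prints; the interface name ib0 and the counter names in the sample are only placeholders, since the real names depend on your driver.

```python
import subprocess


def read_stats(iface):
    """Run `ethtool -S <iface>` and return its raw text output."""
    result = subprocess.run(
        ["ethtool", "-S", iface],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def nonzero_drop_counters(ethtool_stats):
    """Return {counter_name: value} for drop/discard counters that are > 0."""
    counters = {}
    for line in ethtool_stats.splitlines():
        # Split on the last colon so the numeric value is always on the right.
        name, sep, value = line.strip().rpartition(":")
        if not sep:
            continue  # blank line or a line without "name: value"
        try:
            count = int(value.strip())
        except ValueError:
            continue  # e.g. the "NIC statistics:" header line
        lowered = name.lower()
        if ("drop" in lowered or "discard" in lowered) and count > 0:
            counters[name.strip()] = count
    return counters


# Placeholder sample; real counter names vary by driver.
SAMPLE = """NIC statistics:
     rx_packets: 1024
     tx_packets: 980
     rx_dropped: 0
     tx_dropped: 7
"""

if __name__ == "__main__":
    print(nonzero_drop_counters(SAMPLE))
    # On a real host (ib0 is an assumed interface name):
    # print(nonzero_drop_counters(read_stats("ib0")))
```

Running it before and after a sockperf run and comparing the two results should show whether any drops happen during the test itself.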