If you start naively, without a library that already takes care of it, then memory access is the bottleneck. Have a look at how much effort it takes to work around that, for example with blocking (tiled) algorithms.
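To make the blocking idea concrete, here is a minimal sketch, not how any real BLAS library does it. The names `matmul_naive`, `matmul_blocked` and the `block` parameter are made up for illustration. It only shows the loop structure: the work is split into tiles so that a small piece of each matrix gets reused while it is still in cache. In pure Python you will not see the speedup; the payoff shows up in compiled code, where the tile size is tuned to the cache sizes.

```python
def matmul_naive(a, b):
    """Textbook triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(n)]
    for i in range(n):
        for j in range(cols):
            for k in range(inner):
                # b is walked down a column here, which is the
                # cache-unfriendly access pattern for row-major storage.
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_blocked(a, b, block=64):
    """Same result, but computed tile by tile (block x block pieces).

    `block` is a hypothetical tuning knob; a real implementation would
    pick it so that the tiles in flight fit in cache.
    """
    n, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(n)]
    # Outer three loops choose a tile; inner three loops multiply it.
    for i0 in range(0, n, block):
        for k0 in range(0, inner, block):
            for j0 in range(0, cols, block):
                # min(...) handles edge tiles when sizes are not
                # multiples of the block size.
                for i in range(i0, min(i0 + block, n)):
                    row_c = c[i]
                    for k in range(k0, min(k0 + block, inner)):
                        a_ik = a[i][k]
                        row_b = b[k]
                        for j in range(j0, min(j0 + block, cols)):
                            row_c[j] += a_ik * row_b[j]
    return c
```

Both functions do exactly the same number of multiplications; the blocked one only changes the order in which memory is touched, and that is where the effort goes.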
New, lower values for p are found fairly regularly (maybe once a year). The conjecture is that they will approach 2.0 without ever quite reaching it. Somehow Quanta Mag heard about the new result ...