December 6, 2013

Java: Check your memory access time

Waiting for data is likely the most common thing your CPU does. Even when the CPU is running at 100%, it is likely waiting for data from RAM. Random access to RAM is often the bottleneck in a "CPU-bound" Java program. Such programs should really be called RAM-bound, since their performance is bound by the performance of the RAM, not the CPU.

Out of curiosity, let's measure the performance of memory access from Java. We limit the test to summing the values of a single long[] array. The array occupies 0.8 MB when testing the cache and 400 MB when testing the RAM. The array is accessed either sequentially or in random order. So, we have a total of four test cases: cache with sequential access, cache with random access, RAM with sequential access, and RAM with random access.

My setup: an i7-3687U CPU (2.10 GHz), Ubuntu 12.04 64-bit, 4 MiB cache, Oracle JVM 1.7.0_25 (64-Bit, build 23.25-b01). The "-server" flag is used when running Java. Computer: Toshiba Portege Z930-14L.

My Java code. Consider it public domain, do whatever you like with it. It's all in one class and it has no dependencies besides a JVM. There are deficiencies, of course. Code style has sometimes been sacrificed for performance. Loop unrolling did not improve performance, however. Note that more happens than just accessing memory: computing the next index is a significant part of the cacheRan test.
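
The original class is not reproduced here, but a minimal sketch of the approach could look something like the following. The class name, the exact array lengths, and the stride-based random-index scheme are my assumptions, not the original code; a serious run should also repeat each measurement and keep the best or median time.

    // A rough sketch, not the original benchmark class. Array sizes follow the
    // article (0.8 MB and 400 MB of longs); the stride-based index scheme is
    // an assumption.
    import java.util.Random;

    public class MemAccessBench {

        static final int CACHE_LEN = 100_000;    // 100_000 longs * 8 B = 0.8 MB
        static final int RAM_LEN = 50_000_000;   // 50_000_000 longs * 8 B = 400 MB

        public static void main(String[] args) {
            long[] small = fill(new long[CACHE_LEN]);
            long[] big = fill(new long[RAM_LEN]);

            // Warm-up passes so the JIT has compiled the hot loops before timing.
            sumSeq(small); sumRan(small); sumSeq(big); sumRan(big);

            bench("cacheSeq", small, true);
            bench("cacheRan", small, false);
            bench("ramSeq  ", big, true);
            bench("ramRan  ", big, false);
        }

        static void bench(String name, long[] a, boolean sequential) {
            long t0 = System.nanoTime();
            long sum = sequential ? sumSeq(a) : sumRan(a);
            long t1 = System.nanoTime();
            double nsPerAccess = (double) (t1 - t0) / a.length;
            // The sum is printed so the JIT cannot eliminate the loops as dead code.
            System.out.printf("%s %7.2f ns per access (sum=%d)%n", name, nsPerAccess, sum);
        }

        // Sequential pass: visit every element in order.
        static long sumSeq(long[] a) {
            long sum = 0;
            for (int i = 0; i < a.length; i++) {
                sum += a[i];
            }
            return sum;
        }

        // "Random" pass: visit every element exactly once by stepping with a
        // stride that is coprime to the array length. Note that computing the
        // next index is itself part of the measured work.
        static long sumRan(long[] a) {
            long sum = 0;
            int index = 0;
            int step = coprimeStep(a.length);
            for (int i = 0; i < a.length; i++) {
                sum += a[index];
                index += step;
                if (index >= a.length) {
                    index -= a.length;
                }
            }
            return sum;
        }

        // Pick a step close to 0.618 * len and bump it until it is coprime to
        // the length, so the strided walk forms one full cycle over the array.
        static int coprimeStep(int len) {
            int step = (int) (len * 0.618);
            while (gcd(step, len) != 1) {
                step++;
            }
            return step;
        }

        static int gcd(int a, int b) {
            return b == 0 ? a : gcd(b, a % b);
        }

        static long[] fill(long[] a) {
            Random r = new Random(42);
            for (int i = 0; i < a.length; i++) {
                a[i] = r.nextLong();
            }
            return a;
        }
    }

Run it with enough heap for the 400 MB array, for example: java -server -Xmx1g MemAccessBench.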

    Test        Time per access
    cacheSeq     0.33 ns,  0.69 cycles
    cacheRan     3.16 ns,  6.64 cycles
    ramSeq       0.56 ns,  1.37 cycles
    ramRan      20.07 ns, 42.16 cycles

Sequential access to the long[] array is surprisingly fast. It is interesting to note that sequential access to cached data takes less than one cycle per access. I believe this could be because SSE instructions are used, though I have not confirmed it. It could also be an effect of Intel's Turbo Boost technology. The cycle counts assume a clock frequency of 2.10 GHz.
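
The cycle figures are simply the measured times multiplied by the nominal 2.10 GHz clock, for example:

    0.33 ns/access  × 2.10 cycles/ns ≈  0.69 cycles/access
    20.07 ns/access × 2.10 cycles/ns ≈ 42.1  cycles/access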

Sequential access in RAM is on par with the result for sequential access to cached data. Random access to the cache is significantly slower (3 ns), and random access to the large array in RAM is many times slower still (20 ns).

Cache-aware algorithms are important. More on that later.

Check the memory access time on your computer! Just run the Java class provided. Add the result as a comment, please.

UPDATE: Needless to say (?): it is very seldom that a programmer needs to consider making code cache-aware. Other things (disk/network IO) usually limit application performance, and the value of a product is seldom limited by its CPU performance anyway. Cache-aware algorithms are, however, interesting to me while creating BergDB. I aim at very high performance for data that fits into RAM - it should be on par with databases that assume all data fits into RAM. It is also interesting for those who write code for basic data structures: maps, sets, lists, and so on.



