NONMEM on an SGI R8000
We wanted to buy the R8000 Challenge from SGI. Before buying, we wanted a
benchmark to see how its performance compared to that of our R4400 150 MHz.
It turned out that the R8000 did not deliver the performance we (and SGI) expected.
SGI then had someone look at the sources, who found that the code does not depend
much on floating-point performance. It is more closely tied to integer performance.
Then we had a look at the R4400 200 MHz and found that our run speeds improved two-fold.
This was explained by the bigger secondary cache (R4400 150 MHz: 1 MB, R4400 200 MHz: 4 MB)
and by the higher SPECint rating of this machine compared to the R8000 and the R4400 150 MHz.
Besides this, on SGI machines it can be important to adjust the size of the buffers.
When run under a profiler, NONMEM spends the largest share of its time in fread calls,
a large part of them on NONMEM FILE10. Just by adjusting the NONMEM buffers, we improved
our run speeds by almost a factor of two.
After adjusting the buffers, the time spent in fread dropped from more than 35% to less than 1%.
This probably depends on the secondary cache, since performance got worse on an R3000
(which has no 1 MB secondary cache). We have ordered the R4400 200 MHz with two processors.
Our next steps will be optimizing the NONMEM source and parallel processing.
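
The buffer effect described above can be illustrated in general terms. The sketch
below is not NONMEM code and does not show NONMEM's actual buffer settings, which are
not given in this note. It is a small Python model, with made-up names (CountingRaw,
raw_reads), of a program that reads a file in small records through a buffering layer.
It counts how often that layer has to go back to the underlying file for more data,
which is the kind of work a profiler would attribute to the low-level read call.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical in-memory 'file' that counts low-level read requests."""

    def __init__(self, data):
        self._src = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Each call here stands in for one trip to the operating system.
        self.calls += 1
        return self._src.readinto(b)


def raw_reads(data, record_size, buffer_size):
    """Read all of `data` in records of `record_size` bytes through a buffer
    of `buffer_size` bytes and return the number of low-level reads used."""
    raw = CountingRaw(data)
    reader = io.BufferedReader(raw, buffer_size=buffer_size)
    while reader.read(record_size):
        pass
    return raw.calls


if __name__ == "__main__":
    data = bytes(1 << 20)  # 1 MB of dummy data; the size is an arbitrary choice
    for size in (512, 4096, 65536):
        print(size, raw_reads(data, 128, size))
```

With the same data and the same record size, a larger buffer needs far fewer low-level
reads. Whether this translates into shorter run times on a given machine depends on
other factors as well, such as the cache sizes mentioned above.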