RE: Linear speedup of NONMEM on quad-core CPUs?
Steve,
It really, really should be the case that speedup for multiple
simultaneous runs is linear. Over the many years I have looked at it,
NONMEM execution speed has been consistently proportional to benchmarks
like specfp95. Disc I/O seems to be trivial, since the entire data set
can typically be put into cache on modern machines. I have noted
differences between "cheaper" 2.8 GHz dual core machines (Dell E510)
and "better" 2.8 GHz machines I've gotten (from Gateway). But, if you
look at the specfp95 results ( http://www.spec.org/cpu95/results/cfp95.html),
there are differences between machines using the same CPU - I can't
claim to understand why. Memory should not be an issue - NONMEM
typically uses less than 5 MB of memory.
I have done what you ask (I think) in two stages, but not the whole
thing:
Dual core does increase run speed (1/time) linearly (note that dual core
chips typically run at a slightly slower clock speed) for 2 processes -
this is what I currently run.
A 4 processor machine (single core - a Proliant 4 processor server
running Windows Server 2000) does increase run speed (1/time) linearly
for four processes.
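A check of this kind can be sketched in a few lines of Python. This is only a minimal sketch, not the procedure used for the observations above: the WORKLOAD string is a hypothetical CPU-bound stand-in for a NONMEM run, and the function names are made up for illustration. It starts n independent processes at once, times them, and compares against running the same jobs one after another.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for one single-threaded, CPU-bound run.
# Replace the command with your own NONMEM command line to test real jobs.
WORKLOAD = "s = 0.0\nfor i in range(1, 2000000):\n    s += 1.0 / (i * i)\n"


def time_simultaneous(n_jobs, command=None):
    """Start n_jobs independent processes at once; return wall-clock seconds."""
    if command is None:
        command = [sys.executable, "-c", WORKLOAD]
    start = time.perf_counter()
    procs = [subprocess.Popen(command) for _ in range(n_jobs)]
    for p in procs:
        p.wait()
    return time.perf_counter() - start


def throughput_speedup(n_jobs, t_one, t_all):
    """Speedup relative to running the same n_jobs one after another.

    t_one is the time for a single job run alone, t_all the time for all
    n_jobs run simultaneously. A value close to n_jobs means linear speedup.
    """
    return n_jobs * t_one / t_all


if __name__ == "__main__":
    t_one = time_simultaneous(1)
    for n in (2, 4):
        t_all = time_simultaneous(n)
        print(f"{n} jobs: {t_all:.2f} s, "
              f"speedup {throughput_speedup(n, t_one, t_all):.2f}x")
```

If the speedup printed for n jobs stays near n up to the number of cores, the machine is scaling linearly for independent runs. Timings from a toy workload like this one say nothing definite about NONMEM itself.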
The Intel quad core is just two dual core processors stuck together with
a single front side bus; they don't share cache or registers. This
probably is better for NONMEM than the AMD approach, sharing registers,
since separate NONMEM runs obviously don't need to share anything. (The
Intel approach is worse for games, since latency to cache memory is
worse.)
But, a 4 processor dual core will cost you > $12,000, and will not use
less power than 4 dual core boxes - why go to the quad processor?
(Trust me, it won't make less noise either.) You can buy 4 dual core
boxes, set up a LAN and map the c: drive on one "main" machine to all
the machines (so from the "main" machine, everything looks like it is
happening on the local drive, when in fact execution is happening on
the other machines), and use remote desktop to control all 4 computers
from one monitor/mouse/keyboard. A dual core Dell is about $700. Best
price for quad core right now is about $2000 (i.e., more $/GHz than dual
core).
The current Intel quad core is intended for servers, and is expensive.
The desktop version is due out late this year - should be cheaper and
prices will probably come down when AMD comes out with their quad core
CPU.
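The price comparison above can be put on a per-core basis. This is a rough sketch using only the figures quoted in this message: the core counts are inferred from the descriptions, clock speed is ignored (so this is $/core rather than $/GHz), the server price is a lower bound, and it is an assumption that the quad core figure is comparable to a whole-box price.

```python
# Prices quoted in the message (US dollars); core counts are inferred.
options = {
    "dual core desktop": (700, 2),
    "quad core": (2000, 4),
    "4 processor dual core server": (12000, 8),  # quoted as "> $12,000"
}


def cost_per_core(price, cores):
    """Dollars per core, ignoring clock speed and everything else."""
    return price / cores


for name, (price, cores) in options.items():
    print(f"{name}: ${cost_per_core(price, cores):.0f} per core")

# Eight cores bought as four separate dual core boxes.
print(f"4 dual core boxes: ${4 * 700} for 8 cores")
```

On these numbers the four dual core boxes come to $2800 for eight cores, against more than $12,000 for the eight-core server.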
Brian,
Your observation (if I understand correctly that you are talking
about running only one NONMEM run) is a little surprising, since NONMEM
is single threaded. So the current approach to parallel computing
(multithreading) isn't going to happen for NONMEM. The parallel option on the
Intel compiler can, in theory, "unroll" loops in Fortran. But, in
reality, the code has to be specifically written to do this, and NONMEM
certainly is not. I tried this, in collaboration with Silicon Graphics
about 10 years ago (who claimed to have the best parallel compiler
around, right before they went out of business), and got zero
parallelization for a single run of NONMEM. But this was a long time
ago, maybe Intel figured out something new.
Mark
Mark Sale MD
Next Level Solutions, LLC
www.NextLevelSolns.com
Quoted reply history
> -------- Original Message --------
> Subject: Re: [NMusers] Linear speedup of NONMEM on quad-core CPUs?
> From: Steve Chapel <[EMAIL PROTECTED]>
> Date: Mon, March 12, 2007 11:30 am
> To: [email protected]
>
> That's really not my question. My question was about speedup of multiple
> NONMEM runs, not one NONMEM run. Let me rephrase the question.
>
> Let's say I have eight NONMEM jobs to run each week. Each NONMEM job
> takes one hour to run. I go to a computer and start one NONMEM job,
> and when it is finished, I start another, and so on. After eight hours,
> all eight NONMEM jobs are run.
>
> The next week, I get a great idea. Instead of using one computer, I can
> use eight computers. I start all eight NONMEM jobs at the same time, and
> after only one hour they are all done. I have achieved eightfold
> (linear) speedup in running eight jobs by using eight computers.
>
> The next week, I make a further realization. The computers I was running
> the NONMEM jobs on are dual-core, so I need to use only four computers. I
> start two NONMEM jobs on each of the four computers, and after one hour
> all the jobs are done. The benefit is that this week I needed only four
> computers to be available.
>
> It might occur to me that all I really need is one computer with two
> quad-core processors. I could start all eight NONMEM jobs simultaneously
> on just one computer. The question is, has anyone actually tried this?
> Does it run all eight NONMEM jobs in the same time it would take to run
> one NONMEM job? In other words, has going from one core to eight cores
> enabled an eightfold (linear) speedup in running eight NONMEM jobs? If
> not, how much speedup might I expect from an eight-core computer?
>
> -- Steve
>
>
> Brian M. Sadler wrote:
> > Steve,
> >
> > I have just set up NONMEM 6 on a 4GB Core(2) Quad system running XP64. I
> > don't yet have benchmarks, but I have noted activity on all four CPUs using
> > the "/Qparallel" option with the Intel Fortran Compiler. I look forward to
> > hearing of others' experiences.
> >
> > Cheers... Brian
> >
> >
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> > Behalf Of Steve Chapel
> > Sent: Friday, March 09, 2007 12:08 PM
> > To: [email protected]
> > Subject: [NMusers] Linear speedup of NONMEM on quad-core CPUs?
> >
> > A few years ago there was a post about benchmarking results for NONMEM
> > on a dual-core CPU ( http://huxley.phor.com/nonmem/nm/99nov212005.html).
> > Given the relatively recent release of Xeon quad-core processors I
> > wanted to know if anybody has compared NONMEM runs on a machine with two
> > dual-core processors to NONMEM runs on a quad-core CPU, or even NONMEM
> > runs on a computer with two quad-core CPUs. Has anyone confirmed that
> > having four or eight cores provides linear speedup of running four or
> > eight NONMEM jobs? Alternatively, if anyone has confirmed that the
> > speedup is not linear, what is the approximate speedup, and what was
> > the model number of the CPU(s)?
> >
> > If a similar topic has been discussed recently (in January or February)
> > on this mailing list, could someone please re-post the information? I
> > just joined in March 2007, and the archives seem to contain no messages
> > from 2007.
> >
> > Thanks,
> > Steve