Dear All,
I am testing the new Epyc processors from AMD (comparing with Intel Xeon), and getting different results. Just wondering whether anybody faced the problem of differences between AMD and Intel processors and knows how to solve it. I am using Intel compiler but ready to switch to gfortran or anything else if this would help to get identical results. There were reports of Intel slowing the AMD execution in the past, but in my tests, speed is comparable but the results differ.
Thanks
Leonid
AMD vs Intel
10 messages
6 people
Latest: Nov 20, 2019
Leonid - when you say different. What do you mean? Fixed effect and random
effects? Different OFV?
We did a poster at AAPS a decade or so ago comparing results across different
platforms using the same data and model. We got different results on the
standard errors (which relate to matrix inversion and how it is done on a given
software-hardware configuration). And with overparameterized models we got
different error messages - some platforms converged with no problem while some
did not converge and gave R matrix singularity.
Did your problems go beyond this?
pete
Peter Bonate, PhD
Executive Director
Pharmacokinetics, Modeling, and Simulation
Astellas
1 Astellas Way, N3.158
Northbrook, IL 60062
[email protected]
(224) 205-5855
Details are irrelevant in terms of decision making - Joe Biden.
Hi Leonid,
"A while" back we compared model development trajectories and results
between two computational platforms, Itanium and Xeon, see
https://www.page-meeting.org/?abstract=1188. The results roughly were:
1/3 equal, 1/3 rounding differences and 1/3 real different results. From
discussions with the technical knowledgeable people I worked with at the
time, I recall that there are three levels/sources for those differences:
1) computational (hardware) platform
2) compilers (+ optimization settings)
3) libraries (floating point handling does matter)
Assuming you would like to compare the speed of the platforms wrt
NONMEM, my advice would be to test a large series of different models,
from simple ADVAN1 or 2 to complex ODE, ranging from FO to LAPLACIAN INT
NUMERICAL, while keeping compilers and libraries the same. Also small
and large datasets, as in some instances you might be testing only the
L1/L2/L3 cache strategies and Turbo settings. And with and without
parallelization - as that might determine runtime bottlenecks in practice.
Just having a peek at Epyc - seems interesting (noticed results w gcc7.4
compilation). As long as you are able to hold the computation in cache,
a big if for the 64-core, there might be an advantage.
All in all I am not sure that it is worth the trouble. For any given
PK-PD model there is a lot you can tune to gain speed, but the optimal
settings might be very different for the next and overrule any platform
differences.
Hope this helps,
Jeroen
http://pd-value.com
jeroen
+31 6 23118438
-- More value out of your data!
Quoted reply history
On 18/11/19 6:34 pm, Leonid Gibiansky wrote:
> Thanks Bob and Peter!
>
> The model is quite stable, but this is LAPLACIAN, so requires second
> derivatives. At iteration 0, gradients differ by about 50 to 100%
> between Intel and AMD. This leads to differences in minimization path,
> and slightly different results. Not that different to change the
> recommended dose, but sufficiently different to notice (OF difference
> of 6 points; 50% more model evaluations to get to convergence).
> Thanks
> Leonid
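A minimal sketch, not NONMEM's actual algorithm, of why a method that relies on numerical second derivatives can magnify tiny platform-level rounding differences. It assumes a central finite-difference formula, and `perturbed_exp` is a hypothetical stand-in for a math library whose result differs by one unit in the last place for some arguments (`math.nextafter` needs Python 3.9 or later).

```python
import math


def second_derivative(f, x, h):
    """Central finite-difference estimate of f''(x) with step h."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def perturbed_exp(x):
    """Hypothetical stand-in for another platform's exp(): identical to
    math.exp except that results for x > 1 are one ulp larger."""
    y = math.exp(x)
    return math.nextafter(y, math.inf) if x > 1.0 else y


def platform_gap(h, x=1.0):
    """Absolute difference between the two second-derivative estimates."""
    return abs(second_derivative(math.exp, x, h)
               - second_derivative(perturbed_exp, x, h))


if __name__ == "__main__":
    # The one-ulp difference is divided by h*h, so it grows as h shrinks.
    for h in (1e-3, 1e-5, 1e-7):
        print(f"h={h:g}  gap in f''(1) estimate: {platform_gap(h):.3e}")
```

The same one-bit discrepancy in a single function value is negligible for a large step and substantial for a small one, which is one plausible route from last-bit differences to visibly different gradients at iteration 0.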
I did a similar comparison many years ago (circa 2010) while chasing down a
1-bit difference in an Intel vs AMD floating point result. There is (or at
least was) a slight difference in the 80-bit floating point hardware (Intel
and AMD both use 80-bit registers internally) that can give rise to this, and
I even found a reference to it somewhere in either an Intel or AMD document
(sorry, no longer have it). Apparently there is a compiler switch that can be
used to turn off the internal 80-bit computation - see
https://stackoverflow.com/questions/612507/what-are-the-applications-benefits-of-an-80-bit-extended-precision-data-type
and the answers therein that refer to this compiler option.
Hi all,
The Intel Fortran compiler has a switch, -fp-model strict (/fp:strict on
Windows), that switches off value-unsafe floating-point optimizations.
See also
https://software.intel.com/sites/default/files/comment/1721462/03-fp-consistency-2012.pdf
Hi Leonid,
In that case, have you tried virtualization? Couldn’t help but notice
involvement of VMware in Epyc. If you run from virtual, you have an additional
abstraction layer. You still have to deal with the optimizations of VMware
unfortunately, but it might improve comparability.
Best,
Jeroen
http://pd-value.com
[email protected]
@PD_value
+31 6 23118438
-- More value out of your data!
Quoted reply history
> On 18 Nov 2019 at 23:54, Leonid Gibiansky <[email protected]> wrote:
>
> Hi Jeroen,
>
> Thanks for your input, very interesting. As far as the goal is concerned, I
> am mostly interested in finding options that would give identical results on
> two platforms rather than in speed. So far no luck: 4 combinations of gfortran /
> Intel compilers on Xeon / AMD processors give 4 sets of results that are
> close but not identical.
>
> Related question to the group: has anybody experimented with gfortran
> options (rather than using the defaults provided by the NONMEM distribution)?
> Any recommendations? Same goal: maximum reproducibility across different OSs,
> parallelization options, and processor types.
>
> Thanks
> Leonid
Hi Leonid,
When upgrading from gfortran 4.4.7 to 5.1.1 we ran around 20 models with both
compilers, and also with -ffast-math turned off. The runs were on the same hardware.
The differences in the parameter estimates and OFV were in general small. One
big difference we could see was that the success of the covariance step was
seemingly random. It could succeed on one compiler version, but not the other
and it could also start failing when the option was turned off. I have kept the
runs, so let me know if you would be interested. I also started some
experiments using machine dependent compiler flags, but as our cluster is
heterogeneous I abandoned this testing.
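As a small illustration of why a flag such as -ffast-math can change results: it allows the compiler to reorder floating-point operations (in GCC it enables -fassociative-math), and floating-point addition is not associative. The sketch below uses Python only to show the arithmetic effect; it does not reproduce what gfortran actually does to NONMEM code.

```python
import math

values = [0.1, 0.2, 0.3]

# Left-to-right evaluation, as written in the source code.
left_to_right = (values[0] + values[1]) + values[2]

# A reassociated evaluation, which relaxed floating-point settings permit.
reassociated = values[0] + (values[1] + values[2])

# The two sums differ in the last bit but agree to within rounding.
print(left_to_right == reassociated)
print(math.isclose(left_to_right, reassociated))
```

A last-bit difference like this is harmless on its own, but an iterative minimization can carry it along a slightly different path, which fits the seemingly random covariance step outcomes described above.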
I think that getting identical results could be possible, but that it would be quite
a challenge. There are many components that affect the results: the compiler, the
compiler flags, the libc implementation, the hardware and sometimes the operating
system. To see, for example, where the standard libraries come into play, you can run
nm nonmem on the NONMEM executable (on Linux) to list all symbols compiled in. Some are
functions from external libraries; for example, my exponential function is from libc:
exp@@GLIBC_2.2.5. Even the functions that read in numbers from text strings could
introduce rounding errors, since the text representation is decimal and the internal
floating point number is binary.
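The decimal-to-binary point can be seen in a few lines. This is only a sketch of the general effect in IEEE 754 double precision, shown in Python; it says nothing about how NONMEM or any particular Fortran runtime parses its input.

```python
from decimal import Decimal

# The decimal literal 0.1 has no exact binary representation;
# Decimal(float) exposes the value that is actually stored.
stored = Decimal(0.1)
print(stored)

x = 1.0 / 3.0

# Writing a value with 7 significant digits and reading it back
# does not recover the original double.
short_text = f"{x:.6e}"
print(float(short_text) == x)

# repr() emits enough digits to round-trip a double exactly.
print(float(repr(x)) == x)
```

Every number in a text dataset or control stream passes through such a conversion, so the routine that performs it is one more component that has to match between platforms.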
Best regards,
Rikard Nordgren
--
Rikard Nordgren
Systems developer
Dept of Pharmaceutical Biosciences
Faculty of Pharmacy
Uppsala University
Box 591
75124 Uppsala
Phone: +46 18 4714308
http://www.farmbio.uu.se/research/researchgroups/pharmacometrics/
Thanks to all who shared their experience.
Here is the brief summary of observations:
4 combinations of Intel Fortran or gfortran with Xeon or AMD processors
(of approximately the same base frequency) provided similar speed but
different results. Time comparison is not straightforward as the number
of iterations required for convergence varied between these 4 versions
(FOCEI, LAPLACIAN, and SAEM with ADVAN13 were used for all tests).
The results are numerically different but not meaningfully so: parameter
estimates differ by no more than their respective confidence intervals,
a few percent for the well-defined parameters and more for parameters
with large RSEs. Thus, any of these 4 combinations can be used, but it
is better not to mix them in one analysis. Also it seems to
be a good practice to specify not only OS and compiler with options, but
also processor or at least processor type to ensure exact
reproducibility of results.
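One possible way to capture this information automatically is sketched below. It is an assumption about workflow, not part of NONMEM: the helper names `compiler_version` and `run_environment` are hypothetical, `gfortran` is only the default compiler looked up, and compiler options would still have to be recorded by hand.

```python
import json
import platform
import shutil
import subprocess


def compiler_version(executable="gfortran"):
    """Return the first line of `<executable> --version`, or None if not found."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


def run_environment():
    """Collect the platform details worth storing next to a model run."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        # May be an empty string on some systems, e.g. many Linux installs.
        "processor": platform.processor(),
        "compiler": compiler_version(),
    }


if __name__ == "__main__":
    print(json.dumps(run_environment(), indent=2))
```

Saving this output with each run makes it easier to tell later whether two sets of estimates came from the same processor, OS, and compiler combination.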
Unlike earlier (10+ years ago) reports, the (old, v.11) Intel compiler seems
to provide similar speed on both the new Intel and the new AMD processors.
Thanks!
Leonid
Quoted reply history
On 11/19/2019 4:32 AM, Rikard Nordgren wrote:
> Hi Leonid,
>
> When upgrading from gfortran 4.4.7 to 5.1.1 we ran around 20 models with
> both compilers and turning off the -ffast-math. The runs where on the
> same hardware. The differences in the parameter estimates and OFV were
> in general small. One big difference we could see was that the success
> of the covariance step was seemingly random. It could succeed on one
> compiler version, but not the other and it could also start failing when
> the option was turned off. I have kept the runs, so let me know if you
> would be interested. I also started some experiments using machine
> dependent compiler flags, but as our cluster is heterogeneous I
> abandoned this testing.
>
> I think that getting identical results could be possible, but that it
> would be quite a challenge. There are many components that affect the
> results. The compiler, the compiler flags, the libc implementation, the
> hardware and sometimes the operating system. To see for example where
> the standard libraries comes into play you can do nm nonmem on the
> nonmem executable (in linux) to list all symbols compiled in. Some are
> function from external libraries, for example my exponential function is
> from libc: exp
> from text strings could introduce rounding errors since the text
> representation is decimal and the internal floating point number is binary.
>
> Best regards,
> Rikard Nordgren
>
> --
> Rikard Nordgren
> Systems developer
>
> Dept of Pharmaceutical Biosciences
> Faculty of Pharmacy
> Uppsala University
> Box 591
> 75124 Uppsala
>
> Phone: +46 18 4714308
> www.farmbio.uu.se/research/researchgroups/pharmacometrics/
>
>
>
>
> On 2019-11-18 23:54, Leonid Gibiansky wrote:
>> Hi Jeroin,
>>
>> Thanks for your input, very interesting. As far as the goal is
>> concerned, I am mostly interested to find options that would give
>> identical results on two platform rather than in speed. So far no
>> luck: 4 combinations of gfortran / Intel compilers on Xeon / AMD
>> processors give 4 sets of results that are close but not identical.
>>
>> Related question to the group: have anybody experimented with gfortran
>> options (rather than using default provided by Nonmem distribution)?
>> Any recommendations? Same goal: maximum reproducibility across
>> different OSs, parallelization options, and processor types.
>>
>> Thanks
>> Leonid
>>
>>
>>
>>
>> On 11/18/2019 5:28 PM, Jeroen Elassaiss-Schaap (PD-value B.V.) wrote:
>>> Hi Leonid,
>>>
>>> "A while" back we compared model development trajectories and results
>>> between two computational platforms, Itanium and Xeon, see
>>> https://www.page-meeting.org/?abstract=1188. The results roughly
>>> were: 1/3 equal, 1/3 rounding differences and 1/3 real different
>>> results. From discussions with the technical knowledgeable people I
>>> worked with at the time, I recall that there are three levels/sources
>>> for those differences:
>>>
>>> 1) computational (hardware) platform
>>>
>>> 2) compilers (+ optimization settings)
>>>
>>> 3) libraries (floating point handling does matter)
>>>
>>> Assuming you would like to compare the speed of the platforms wrt
>>> NONMEM, my advice would be to test a large series of different
>>> models, from simple ADVAN1 or 2 to complex ODE, ranging from FO to
>>> LAPLACIAN INT NUMERICAL, while keeping compilers and libraries the
>>> same. Also small and large datasets, as in some instances you might
>>> be testing only the L1/L2/L3 cache strategies and Turbo settings. And
>>> with and without parallelization - as that might determine runtime
>>> bottlenecks in practice.
>>>
>>> Just having a peek at Epyc - seems interesting (noticed results w
>>> gcc7.4 compilation). As long as you are able to hold the computation
>>> in cache, a big if for the 64-core, there might be an advantage.
>>>
>>> All in all I am not sure that it is worth the trouble. For any given
>>> PK-PD model there is a lot you can tune to gain speed, but the
>>> optimal settings might be very different for the next and overrule
>>> any platform differences.
>>>
>>> Hope this helps,
>>>
>>> Jeroen
>>>
>>> http://pd-value.com
>>> jeroen
>>>
>>> +31 6 23118438
>>> -- More value out of your data!
>>>
>>> On 18/11/19 6:34 pm, Leonid Gibiansky wrote:
>>>> Thanks Bob and Peter!
>>>>
>>>> The model is quite stable, but this is LAPLACIAN, so requires second
>>>> derivatives. At iteration 0, gradients differ by about 50 to 100%
>>>> between Intel and AMD. This leads to differences in minimization
>>>> path, and slightly different results. Not that different to change
>>>> the recommended dose, but sufficiently different to notice (OF
>>>> difference of 6 points; 50% more model evaluations to get to
>>>> convergence).
>>>> Thanks
>>>> Leonid
>>>>
>>>>
>>>>
>>>> On 11/18/2019 12:15 PM, Bonate, Peter wrote:
>>>>> Leonid - when you say different. What do you mean? Fixed effect
>>>>> and random effects? Different OFV?
>>>>>
>>>>> We did a poster at AAPS a decade or so ago comparing results across
>>>>> different platforms using the same data and model. We got
>>>>> different results on the standard errors (which related to matrix
>>>>> inversion and how those are done using software-hardware
>>>>> configurations). And with overparameterized models we got different
>>>>> error messages - some platforms converged with no problem while
>>>>> some did not converge and gave R matrix singularity.
>>>>>
>>>>> Did your problems go beyond this?
>>>>>
>>>>> pete
>>>>>
>>>>>
>>>>>
>>>>> Peter Bonate, PhD
>>>>> Executive Director
>>>>> Pharmacokinetics, Modeling, and Simulation
>>>>> Astellas
>>>>> 1 Astellas Way, N3.158
>>>>> Northbrook, IL 60062
>>>>> Peter.bonate
>>>>> (224) 205-5855
>>>>>
>>>>>
>>>>>
>>>>> Details are irrelevant in terms of decision making - Joe Biden.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: owner-nmusers
>>>>> On Behalf Of Leonid Gibiansky
>>>>> Sent: Monday, November 18, 2019 11:05 AM
>>>>> To: nmusers <nmusers
>>>>> Subject: [NMusers] AMD vs Intel
>>>>>
>>>>> Dear All,
>>>>>
>>>>> I am testing the new Epyc processors from AMD (comparing with Intel
>>>>> Xeon), and getting different results. Just wondering whether
>>>>> anybody faced the problem of differences between AMD and Intel
>>>>> processors and knows how to solve it. I am using Intel compiler but
>>>>> ready to switch to gfortran or anything else if this would help to
>>>>> get identical results.
>>>>> There were reports of Intel slowing the AMD execution in the past,
>>>>> but in my tests, speed is comparable but the results differ.
>>>>>
>>>>> Thanks
>>>>> Leonid
>>>>>
>>>>>
>>>>>
>>>>
>>
>
> E-mailing Uppsala University means that we will process your personal
> data. For more information on how this is performed, please read here:
> http://www.uu.se/en/about-uu/data-protection-policy
Thanks to all who shared their experience.
Here is a brief summary of observations:
Four combinations of compiler (Intel Fortran or gfortran) and processor (Xeon or AMD, of approximately the same base frequency) provided similar speed but different results. Timing comparison is not straightforward because the number of iterations required for convergence varied among the four combinations (FOCEI, LAPLACIAN, and SAEM with ADVAN13 were used for all tests). The results are numerically different but not meaningfully so: parameter estimates differ by no more than their respective confidence intervals, a few percent for well-defined parameters and more for parameters with large RSEs. Thus any of the four combinations can be used, but it is better not to mix them within one analysis. It also seems good practice to specify not only the OS and the compiler with its options, but also the processor, or at least the processor type, to ensure exact reproducibility of results.
Unlike earlier (10+ years ago) reports, the Intel compiler (old, v.11) seems to provide similar speed on both new Intel and new AMD processors.
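As a rough illustration of the bookkeeping suggested above (recording the processor along with the OS and compiler), here is a minimal Python sketch. It is not part of NONMEM or of the tests described here: the function names, the dictionary keys, and the default `gfortran --version` call are assumptions for illustration, and the `/proc/cpuinfo` lookup only works on Linux.

```python
import json
import platform
import subprocess


def cpu_model():
    """Return a CPU model string; reads /proc/cpuinfo on Linux if available."""
    try:
        with open("/proc/cpuinfo") as fh:
            for line in fh:
                if line.startswith("model name"):
                    return line.split(":", 1)[1].strip() or "unknown"
    except OSError:
        pass  # not Linux, or file not readable
    return platform.processor() or "unknown"


def describe_environment(compiler_cmd=("gfortran", "--version")):
    """Collect OS, processor, and compiler details to store next to a run."""
    try:
        out = subprocess.run(
            list(compiler_cmd), capture_output=True, text=True, check=False
        )
        lines = out.stdout.splitlines()
        compiler = (lines[0].strip() if lines else "") or "unknown"
    except OSError:
        compiler = "not found"  # compiler executable is not on PATH
    return {
        "os": platform.platform(),
        "machine": platform.machine() or "unknown",
        "processor": cpu_model(),
        "compiler": compiler,
    }


if __name__ == "__main__":
    # Save this output together with the model results.
    print(json.dumps(describe_environment(), indent=2))
```

Compiler options are not captured by this sketch; they would need to be copied from the build configuration by hand.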
Thanks!
Leonid
Quoted reply history
On 11/19/2019 4:32 AM, Rikard Nordgren wrote:
> Hi Leonid,
>
> When upgrading from gfortran 4.4.7 to 5.1.1 we ran around 20 models with both compilers, and also with -ffast-math turned off. The runs were on the same hardware. The differences in the parameter estimates and OFV were in general small. One big difference we could see was that the success of the covariance step was seemingly random. It could succeed on one compiler version but not the other, and it could also start failing when the option was turned off. I have kept the runs, so let me know if you would be interested. I also started some experiments using machine-dependent compiler flags, but as our cluster is heterogeneous I abandoned this testing.
>
> I think that getting identical results could be possible, but that it would be quite a challenge. There are many components that affect the results: the compiler, the compiler flags, the libc implementation, the hardware, and sometimes the operating system. To see, for example, where the standard libraries come into play, you can run nm nonmem on the nonmem executable (on Linux) to list all symbols compiled in. Some are functions from external libraries; for example, my exponential function is from libc: exp@@GLIBC_2.2.5 . Even the functions that read in numbers from text strings could introduce rounding errors, since the text representation is decimal and the internal floating point number is binary.
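
The rounding effects mentioned in the paragraph above can be seen in a few lines of Python. This is only an illustrative sketch: it does not reproduce anything NONMEM does, and the numbers and the helper name `naive_sum` are made up for the example. It shows that a decimal string is rounded when stored as a binary double, and that the order of additions changes the result, which is the kind of reordering an optimizing compiler may apply (for example, -ffast-math allows reassociation in gfortran).

```python
from decimal import Decimal

# 1) Decimal text is rounded to the nearest binary double when it is read.
x = float("0.1")
print(Decimal(x))                  # the exact value stored, slightly above 0.1
print(Decimal(x) == Decimal("0.1"))  # False

# 2) Floating-point addition is not associative.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)               # False: the two groupings differ in the last bit


def naive_sum(values):
    """Add values strictly left to right, as a simple compiled loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


# 3) The same numbers summed in a different order give a different answer.
print(naive_sum([1e16, 1.0, -1e16, 1.0]))  # 1.0 (the first 1.0 is lost to rounding)
print(naive_sum([1e16, -1e16, 1.0, 1.0]))  # 2.0
```

Differences of this size are tiny in a single operation, but an iterative minimization can amplify them into a different search path.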
>
> Best regards,
> Rikard Nordgren
>
> --
> Rikard Nordgren
> Systems developer
>
> Dept of Pharmaceutical Biosciences
> Faculty of Pharmacy
> Uppsala University
> Box 591
> 75124 Uppsala
>
> Phone: +46 18 4714308
> www.farmbio.uu.se/research/researchgroups/pharmacometrics/
>
> On 2019-11-18 23:54, Leonid Gibiansky wrote:
>
> > Hi Jeroen,
> >
> > Thanks for your input, very interesting. As far as the goal is concerned, I am mostly interested in finding options that would give identical results on the two platforms, rather than in speed. So far no luck: 4 combinations of gfortran / Intel compilers on Xeon / AMD processors give 4 sets of results that are close but not identical.
> >
> > Related question to the group: has anybody experimented with gfortran options (rather than using the defaults provided by the Nonmem distribution)? Any recommendations? Same goal: maximum reproducibility across different OSs, parallelization options, and processor types.
> >
> > Thanks
> > Leonid
> >
> > On 11/18/2019 5:28 PM, Jeroen Elassaiss-Schaap (PD-value B.V.) wrote:
> >
> > > Hi Leonid,
> > >
> > > "A while" back we compared model development trajectories and results between two computational platforms, Itanium and Xeon, see https://www.page-meeting.org/?abstract=1188 . The results were roughly: 1/3 equal, 1/3 rounding differences, and 1/3 really different results. From discussions with the technically knowledgeable people I worked with at the time, I recall that there are three levels/sources for those differences:
> > >
> > > 1) computational (hardware) platform
> > >
> > > 2) compilers (+ optimization settings)
> > >
> > > 3) libraries (floating point handling does matter)
> > >
> > > Assuming you would like to compare the speed of the platforms wrt NONMEM, my advice would be to test a large series of different models, from simple ADVAN1 or 2 to complex ODE, ranging from FO to LAPLACIAN INT NUMERICAL, while keeping compilers and libraries the same. Also small and large datasets, as in some instances you might be testing only the L1/L2/L3 cache strategies and Turbo settings. And with and without parallelization - as that might determine runtime bottlenecks in practice.
> > >
> > > Just having a peek at Epyc - seems interesting (noticed results w gcc7.4 compilation). As long as you are able to hold the computation in cache, a big if for the 64-core, there might be an advantage.
> > >
> > > All in all I am not sure that it is worth the trouble. For any given PK-PD model there is a lot you can tune to gain speed, but the optimal settings might be very different for the next and overrule any platform differences.
> > >
> > > Hope this helps,
> > >
> > > Jeroen
> > >
> > > http://pd-value.com
> > > [email protected]
> > > @PD_value
> > > +31 6 23118438
> > > -- More value out of your data!
> > >
> > > On 18/11/19 6:34 pm, Leonid Gibiansky wrote:
> > >
> > > > Thanks Bob and Peter!
> > > >
> > > > The model is quite stable, but this is LAPLACIAN, so requires second derivatives. At iteration 0, gradients differ by about 50 to 100% between Intel and AMD. This leads to differences in minimization path, and slightly different results. Not that different to change the recommended dose, but sufficiently different to notice (OF difference of 6 points; 50% more model evaluations to get to convergence).
> > > >
> > > > Thanks
> > > > Leonid
> > > >
> > > > On 11/18/2019 12:15 PM, Bonate, Peter wrote:
> > > >
> > > > > Leonid - when you say different. What do you mean? Fixed effect and random effects? Different OFV?
> > > > >
> > > > > We did a poster at AAPS a decade or so ago comparing results across different platforms using the same data and model. We got different results on the standard errors (which related to matrix inversion and how those are done using software-hardware configurations). And with overparameterized models we got different error messages - some platforms converged with no problem while some did not converge and gave R matrix singularity.
> > > > >
> > > > > Did your problems go beyond this?
> > > > >
> > > > > pete
> > > > >
> > > > > Peter Bonate, PhD
> > > > > Executive Director
> > > > > Pharmacokinetics, Modeling, and Simulation
> > > > > Astellas
> > > > > 1 Astellas Way, N3.158
> > > > > Northbrook, IL 60062
> > > > > [email protected]
> > > > > (224) 205-5855
> > > > >
> > > > > Details are irrelevant in terms of decision making - Joe Biden.
> > > > >
> > > > > -----Original Message-----
> > > > >
> > > > > From: [email protected] < [email protected] > On Behalf Of Leonid Gibiansky
> > > > >
> > > > > Sent: Monday, November 18, 2019 11:05 AM
> > > > > To: nmusers <[email protected]>
> > > > > Subject: [NMusers] AMD vs Intel
> > > > >
> > > > > Dear All,
> > > > >
> > > > > I am testing the new Epyc processors from AMD (comparing with Intel Xeon), and getting different results. Just wondering whether anybody faced the problem of differences between AMD and Intel processors and knows how to solve it. I am using Intel compiler but ready to switch to gfortran or anything else if this would help to get identical results. There were reports of Intel slowing the AMD execution in the past, but in my tests, speed is comparable but the results differ.
> > > > >
> > > > > Thanks
> > > > > Leonid