NOD32 AntiVirus 12 patch Archives

Re: [AMBER] experiences with EVGA GTX TITAN Superclocked - memtestG80 - UNDERclocking in Linux ?

From: Scott Le Grand <varelse2005.gmail.com>
Date: Wed, 19 Jun 2013 10:03:18 -0700

Hey Jonathan,
Thanks for the 780 numbers! The problem really does seem Titan-specific.
I'd like to see a few more repros of your results before I conclude that,
as a sample size of 1 is intriguing but not conclusive.





On Wed, Jun 19, 2013 at 9:34 AM, Jonathan Gough
<jonathan.d.gough.gmail.com> wrote:

> FWIW I posted GTX 780 results
>
> here http://archive.ambermd.org/201306/0207.html
>
> and here
>
> http://archive.ambermd.org/201306/0211.html
>
>
> If you would like me to test anything else, let me know.
>
> Would Nvidia be willing to trade me a GTX 780 for my Titan?
>
>
>
> On Wed, Jun 19, 2013 at 11:50 AM, Scott Le Grand <varelse2005.gmail.com> wrote:
>
> > Hey Marek,
> > No updates per se. I had a theory about what was going on that proved to
> > be wrong after testing, but I'm still waiting on NVIDIA to report something
> > beyond having reproed the problem.
> >
> > Really really really interested in GTX 780 data right now...
> >
> >
> >
> > On Wed, Jun 19, 2013 at 8:20 AM, Marek Maly <marek.maly.ujep.cz> wrote:
> >
> > > Hi all,
> > >
> > > just a small update from my site.
> > >
> > > Yesterday I received the announcement that CUDA 5.5 is now available
> > > to the public (not just to developers).
> > >
> > > I downloaded it from here:
> > >
> > > https://developer.nvidia.com/cuda-pre-production
> > >
> > > It is still "just" a release candidate (as all Amber/Titan club members
> > > know perfectly well :)) ).
> > >
> > > So I installed this newest release and recompiled the Amber CUDA code.
> > >
> > > I was hoping that some improvement (e.g. in cuFFT) had been "silently"
> > > incorporated, e.g. as a result of Scott's bug report.
> > >
> > > The results of my 100K tests are attached. Compared to my latest tests
> > > with the CUDA 5.5 release candidate from June 3rd (when it was accessible
> > > only to CUDA developers as a *.run binary installer), there is some slight
> > > improvement: e.g. my more stable TITAN was able to finish all the 100K
> > > tests successfully, including Cellulose twice. But there is still an issue
> > > with irreproducible JAC NVE/NPT results. On my "less stable" TITAN the
> > > results are also slightly better than the older ones, but still not
> > > error-free (JAC/CELLULOSE); see the attached file.
> > >
> > > FACTOR IX NVE/NPT again finished with 100% reproducibility on both GPUs,
> > > as usual.
> > >
> > > Scott, do you have any update regarding the "cuFFT"/TITAN issue which you
> > > reported/described to the NVIDIA guys? The latest info from you on this
> > > was that they were able to reproduce the "cuFFT"/TITAN error as well. Do
> > > you have any more recent information? In your opinion, how long might it
> > > take the NVIDIA developers to fully solve such a problem?
> > >
> > > Another thing. It seems that you successfully solved the "GB/TITAN"
> > > problem in the case of bigger molecular systems; here is your relevant
> > > message from June 7th.
> > >
> > > ----------------------------------------------
> > >
> > > Really really interesting...
> > >
> > > I seem to have found a fix for the GB issues on my Titan - not so
> > > surprisingly, it's the same fix as on GTX4xx/GTX5xx...
> > >
> > > But this doesn't yet explain the weirdness with cuFFT so we're not done
> > > here yet...
> > > ----------------------------------------------
> > >
> > > That was already after the latest Amber12 bugfix 18 was released, and no
> > > additional bugfix has been released since then. So will the "GB/TITAN"
> > > patch be released later, maybe as part of some bigger bugfix? Or did you
> > > simply add it to bugfix 18 after its release?
> > >
> > >
> > > My last question maybe deserves a new separate thread, but anyway it
> > > would be interesting to have some information on how "Amber-stable" the
> > > GTX 780 is compared to TITANs (of course based on the experience of more
> > > users, or on testing more than 1 or 2 GTX 780 GPUs).
> > >
> > > Best wishes,
> > >
> > > Marek
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Mon, 03 Jun 2013 01:57:36 +0200 Marek Maly <marek.maly.ujep.cz> wrote:
> > >
> > >
> > >> Hi, here are my results with CUDA 5.5
> > >> (total energy at step 100K (PME) / 1000K (GB); driver 319.23, Amber12
> > >> bugfix 18 applied, CUDA 5.5).
> > >>
> > >>
> > >> No significant differences compared to the previous test with CUDA 5.0
> > >> (I also added those data to the attached table with the CUDA 5.5 test).
> > >>
> > >> Still the same trend: instability in the JAC tests, perfect stability and
> > >> reproducibility in the FACTOR_IX tests (interesting, isn't it? especially
> > >> if we consider 23K atoms in the JAC case and 90K atoms in the FACTOR_IX
> > >> case). Again the same crashes in the CELLULOSE test, now also in the case
> > >> of TITAN_1. Also, in the stable and reproducible FACTOR_IX runs the final
> > >> energy values changed slightly compared to the CUDA 5.0 case.
> > >>
> > >> GB simulations (1M steps) again perfectly stable and reproducible.
> > >>
> > >> So to conclude, Scott we trust you :)) !
> > >>
> > >> If you have any idea what else to try (except GPU BIOS editing, perhaps
> > >> too premature a step at this moment), let me know. I have just one last
> > >> idea, which could be to try changing the random seed and see if it has
> > >> any influence on the actual trends (e.g. JAC versus FACTOR_IX).
> > >>
> > >> TO ET: I am curious about your test in the single-GPU configuration.
> > >> Regarding your Windows tests, in my opinion they are just a waste of
> > >> time. They perhaps tell you something about GPU performance, but not
> > >> about eventual GPU "soft" errors.
> > >>
> > >> If intensive memtestG80 and/or cuda_memtest results were negative, it is
> > >> in my opinion very unlikely that Windows performance testers will find
> > >> any errors, but I am not an expert here ...
> > >>
> > >> Anyway, if you learn which tests the ebuyer is using to confirm GPU
> > >> errors, let us know.
> > >>
> > >> M.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> On Sun, 02 Jun 2013 19:22:54 +0200 Marek Maly <marek.maly.ujep.cz> wrote:
> > >>
> > >>> Hi, so I finally succeeded in compiling the GPU part of Amber under CUDA
> > >>> 5.5 (after "hacking" the configure2 file), with the usual results in the
> > >>> subsequent tests:
> > >>>
> > >>> ------
> > >>> 80 file comparisons passed
> > >>> 9 file comparisons failed
> > >>> 0 tests experienced errors
> > >>> ------
> > >>>
> > >>> So now I am running the 100K(PME)/1000K(GB) repetitive benchmark tests
> > >>> under this configuration: driver 319.23, CUDA 5.5, bugfix 18 installed.
> > >>>
> > >>> When I finish I will report the results here.
> > >>>
> > >>> M.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Sun, 02 Jun 2013 18:44:23 +0200 Marek Maly <marek.maly.ujep.cz> wrote:
> > >>>
> > >>>> Hi Scott, thanks for the update!
> > >>>>
> > >>>> Anyway, does the "cuFFT hypothesis" offer any explanation of why there
> > >>>> are no problems with the GTX 580, GTX 680 or even the K20c?
> > >>>>
> > >>>>
> > >>>> Meanwhile I also tried to recompile the GPU part of Amber with CUDA 5.5
> > >>>> installed beforehand, and I got these errors already in the configure
> > >>>> phase:
> > >>>>
> > >>>> --------
> > >>>> [root.dyn-138-272 amber12]# ./configure -cuda -noX11 gnu
> > >>>> Checking for updates...
> > >>>> Checking for available patches online. This may take a few seconds...
> > >>>>
> > >>>> Available AmberTools 13 patches:
> > >>>>
> > >>>> No patches available
> > >>>>
> > >>>> Available Amber 12 patches:
> > >>>>
> > >>>> No patches available
> > >>>> Searching for python2... Found python2.6: /usr/bin/python2.6
> > >>>> Error: Unsupported CUDA version 5.5 detected.
> > >>>> AMBER requires CUDA version == 4.2 .or. 5.0
> > >>>> Configure failed due to the errors above!
> > >>>> ---------
> > >>>>
> > >>>> So it seems that at the moment Amber can be compiled only with CUDA 4.2
> > >>>> or 5.0, and this part of the configure2 file has to be edited:
> > >>>>
> > >>>>
> > >>>> -----------
> > >>>> nvcc="$CUDA_HOME/bin/nvcc"
> > >>>> sm35flags='-gencode arch=compute_35,code=sm_35'
> > >>>> sm30flags='-gencode arch=compute_30,code=sm_30'
> > >>>> sm20flags='-gencode arch=compute_20,code=sm_20'
> > >>>> sm13flags='-gencode arch=compute_13,code=sm_13'
> > >>>> nvccflags="$sm13flags $sm20flags"
> > >>>> cudaversion=`$nvcc --version | grep 'release' | cut -d' ' -f5 | cut -d',' -f1`
> > >>>> if [ "$cudaversion" == "5.0" ]; then
> > >>>>   echo "CUDA Version $cudaversion detected"
> > >>>>   nvccflags="$nvccflags $sm30flags $sm35flags"
> > >>>> elif [ "$cudaversion" == "4.2" ]; then
> > >>>>   echo "CUDA Version $cudaversion detected"
> > >>>>   nvccflags="$nvccflags $sm30flags"
> > >>>> else
> > >>>>   echo "Error: Unsupported CUDA version $cudaversion detected."
> > >>>>   echo "AMBER requires CUDA version == 4.2 .or. 5.0"
> > >>>>   exit 1
> > >>>> fi
> > >>>> nvcc="$nvcc $nvccflags"
> > >>>>
> > >>>> fi
> > >>>>
> > >>>> -----------
> > >>>>
> > >>>> Would it be OK just to change
> > >>>> "if [ "$cudaversion" == "5.0" ]; then"
> > >>>>
> > >>>> to
> > >>>>
> > >>>> "if [ "$cudaversion" == "5.5" ]; then"
> > >>>>
> > >>>>
> > >>>> or should some more flags etc. be defined here to proceed successfully?
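
One way to picture the change being asked about is sketched below. It assumes, without any confirmation from the Amber developers, that CUDA 5.5 can reuse the same -gencode flags as 5.0. The function name select_nvcc_flags is hypothetical and is not part of configure2.

```shell
#!/bin/sh
# Sketch only: map a CUDA version string to nvcc -gencode flags.
# ASSUMPTION: 5.5 is treated exactly like 5.0; this is not confirmed
# by the thread. select_nvcc_flags is a hypothetical helper name.
select_nvcc_flags() {
    cudaversion="$1"
    sm35flags='-gencode arch=compute_35,code=sm_35'
    sm30flags='-gencode arch=compute_30,code=sm_30'
    sm20flags='-gencode arch=compute_20,code=sm_20'
    sm13flags='-gencode arch=compute_13,code=sm_13'
    nvccflags="$sm13flags $sm20flags"
    case "$cudaversion" in
        5.0|5.5)
            nvccflags="$nvccflags $sm30flags $sm35flags" ;;
        4.2)
            nvccflags="$nvccflags $sm30flags" ;;
        *)
            echo "Error: Unsupported CUDA version $cudaversion detected." >&2
            return 1 ;;
    esac
    # Print the assembled flag list for the caller to use.
    echo "$nvccflags"
}
```

With this sketch, select_nvcc_flags 5.5 prints the same flag list as for 5.0; whether that is enough for a correct build is exactly the open question above.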
> > >>>>
> > >>>>
> > >>>> BTW it seems, Scott, that you are on the way to isolating the problem
> > >>>> soon, so maybe it's better to wait and not lose time with CUDA 5.5
> > >>>> experiments.
> > >>>>
> > >>>> I just thought that CUDA 5.5 might be more "friendly" to Titans :)),
> > >>>> e.g. in terms of the cuFFT functions ....
> > >>>>
> > >>>>
> > >>>> I will keep fingers crossed :))
> > >>>>
> > >>>> M.
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Sun, 02 Jun 2013 18:33:52 +0200 Scott Le Grand
> > >>>> <varelse2005.gmail.com> wrote:
> > >>>>
> > >>>>> PS this *might* indicate a software bug in cuFFT, but it needs more
> > >>>>> characterization... And things are going to get a little stream of
> > >>>>> consciousness from here because you're getting unfiltered raw data, so
> > >>>>> please don't draw any conclusions towards anything yet - I'm just
> > >>>>> letting you guys know what I'm finding out as I find it...
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On Sun, Jun 2, 2013 at 9:31 AM, Scott Le Grand
> > >>>>> <varelse2005.gmail.com> wrote:
> > >>>>>
> > >>>>>> And bingo...
> > >>>>>>
> > >>>>>> At the very least, the reciprocal sum is intermittently inconsistent...
> > >>>>>> This explains the irreproducible behavior...
> > >>>>>>
> > >>>>>> And here's the level of inconsistency:
> > >>>>>> 31989.38940628897399 vs
> > >>>>>> 31989.39168370794505
> > >>>>>>
> > >>>>>> That's an error at the level of 1e-7, or a somehow missed
> > >>>>>> single-precision transaction somewhere...
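
For scale, the relative difference between the two reciprocal sums quoted above can be computed directly. This is a minimal sketch using awk; the helper name rel_diff is hypothetical. Single-precision machine epsilon is about 1.19e-7, which is the magnitude being referred to.

```shell
#!/bin/sh
# Sketch only: relative difference |a - b| / |a| of two energies.
# rel_diff is a hypothetical helper name, not part of Amber.
rel_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        d = a - b; if (d < 0) d = -d      # absolute difference
        m = a;     if (m < 0) m = -m      # magnitude of the reference value
        printf "%.2e\n", d / m
    }'
}

# The two reciprocal-sum values from the message above.
rel_diff 31989.38940628897399 31989.39168370794505   # prints 7.12e-08
```

The result, roughly 7e-8, is just under single-precision epsilon, which fits the remark about a single-precision operation slipping in somewhere.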
> > >>>>>>
> > >>>>>> The next question is figuring out why... This may or may not
> > >>>>>> ultimately explain the crashes you guys are also seeing...
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Sun, Jun 2, 2013 at 9:07 AM, Scott Le Grand
> > >>>>>> <varelse2005.gmail.com> wrote:
> > >>>>>>
> > >>>>>>
> > >>>>>>> Observations:
> > >>>>>>> 1. The degree to which the reproducibility is broken *does* appear to
> > >>>>>>> vary between individual Titan GPUs. One of my Titans breaks within 10K
> > >>>>>>> steps on cellulose; the other one made it to 100K steps twice without
> > >>>>>>> doing so, leading me to believe it could be trusted (until yesterday;
> > >>>>>>> I now see it die between 50K and 100K steps most of the time).
> > >>>>>>>
> > >>>>>>> 2. GB hasn't broken (yet). So could you run myoglobin for 500K and
> > >>>>>>> TRPcage for 1,000,000 steps, and let's see if that's universal.
> > >>>>>>>
> > >>>>>>> 3. Turning on double-precision mode makes my Titan crash rather than
> > >>>>>>> run irreproducibly, sigh...
> > >>>>>>>
> > >>>>>>> So whatever is going on is triggered by something in PME but not GB.
> > >>>>>>> So that's either the radix sort, the FFT, the Ewald grid interpolation,
> > >>>>>>> or the neighbor list code. Fixing this involves isolating this and
> > >>>>>>> figuring out what exactly goes haywire. It could *still* be software at
> > >>>>>>> some very small probability, but the combination of both 680 and K20c
> > >>>>>>> with ECC off running reliably is really pointing towards the Titans
> > >>>>>>> just being clocked too fast.
> > >>>>>>>
> > >>>>>>> So how long will this take? Asking people how long it takes to fix a
> > >>>>>>> bug never really works out well. That said, I found the 480 bug within
> > >>>>>>> a week, and my usual turnaround for a bug with a solid repro is <24
> > >>>>>>> hours.
> > >>>>>>>
> > >>>>>>> Scott
> > >>>>>>>
> > >>>>>>> On Sun, Jun 2, 2013 at 7:58 AM, Marek Maly <marek.maly.ujep.cz> wrote:
> > >>>>>>>
> > >>>>>>>> Hi all,
> > >>>>>>>>
> > >>>>>>>> here are my results after applying bugfix 18 (see attachment).
> > >>>>>>>>
> > >>>>>>>> In principle I don't see any "drastic" changes.
> > >>>>>>>>
> > >>>>>>>> FACTOR_IX is still perfectly stable/reproducible on both cards.
> > >>>>>>>>
> > >>>>>>>> JAC tests: problems with finishing AND/OR reproducibility. The same
> > >>>>>>>> holds for CELLULOSE_NVE, although here it seems that my TITAN_1 has
> > >>>>>>>> no problems with this test (but I saw the same trend before bugfix 18
> > >>>>>>>> as well; see my older 500K-step test).
> > >>>>>>>>
> > >>>>>>>> But anyway, bugfix 18 brought one change here.
> > >>>>>>>>
> > >>>>>>>> The error
> > >>>>>>>>
> > >>>>>>>> #1 ERR written in mdout:
> > >>>>>>>> ------
> > >>>>>>>> | ERROR: max pairlist cutoff must be less than unit cell max sphere radius!
> > >>>>>>>> ------
> > >>>>>>>>
> > >>>>>>>> was replaced with this error/warning(?):
> > >>>>>>>>
> > >>>>>>>> #0 no ERR written in mdout, ERR written in standard output (nohup.out)
> > >>>>>>>> -----
> > >>>>>>>> Nonbond cells need to be recalculated, restart simulation from previous
> > >>>>>>>> checkpoint with a higher value for skinnb.
> > >>>>>>>> -----
> > >>>>>>>>
> > >>>>>>>> Another thing:
> > >>>>>>>>
> > >>>>>>>> recently I started, on another machine with a GTX 580 GPU, a simulation
> > >>>>>>>> of a relatively big system (364275 atoms/PME). The system also contains
> > >>>>>>>> "exotic" molecules like polymers; the ff12SB, gaff and GLYCAM force
> > >>>>>>>> fields are used here. I had problems even with the minimization part,
> > >>>>>>>> with a huge energy at the start:
> > >>>>>>>>
> > >>>>>>>> -----
> > >>>>>>>>    NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
> > >>>>>>>>       1       2.8442E+09     2.1339E+02     1.7311E+04     O       32998
> > >>>>>>>>
> > >>>>>>>>  BOND    =    11051.7467  ANGLE   =    17720.4706  DIHED      =    18977.7584
> > >>>>>>>>  VDWAALS = *************  EEL     = -1257709.6203  HBOND      =        0.0000
> > >>>>>>>>  1-4 VDW =     7253.7412  1-4 EEL =   149867.0207  RESTRAINT  =        0.0000
> > >>>>>>>>
> > >>>>>>>> ----
> > >>>>>>>>
> > >>>>>>>> with no chance to minimize the system even with 50 000 steps in both
> > >>>>>>>> minimization cycles (with constrained and unconstrained solute), and
> > >>>>>>>> hence the NVT heating crashed immediately even with a very small dt. I
> > >>>>>>>> patched Amber12 here with bugfix 18 and the minimization completed
> > >>>>>>>> without any problem with the usual 5000 steps (reaching the target
> > >>>>>>>> energy -1.4505E+06, while the initial one was that written above).
> > >>>>>>>>
> > >>>>>>>> So indeed bugfix 18 solved some issues, but unfortunately not those
> > >>>>>>>> related to Titans.
> > >>>>>>>>
> > >>>>>>>> Here I will try to install CUDA 5.5, recompile the GPU part of Amber
> > >>>>>>>> with this new CUDA version and repeat the 100K tests.
> > >>>>>>>>
> > >>>>>>>> Scott, let us know how your experiment with downclocking the Titan
> > >>>>>>>> turned out. Maybe the best choice here would be to flash the Titan
> > >>>>>>>> directly with your K20c BIOS :))
> > >>>>>>>>
> > >>>>>>>> M.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Sat, 01 Jun 2013 21:09:46 +0200 Marek Maly <marek.maly.ujep.cz> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> first of all, thanks for providing your test results!
> > >>>>>>>>>
> > >>>>>>>>> It seems that your results are more or less similar to mine, maybe
> > >>>>>>>>> with the exception of the FactorIX tests, where I had perfect
> > >>>>>>>>> stability and 100% or close to 100% reproducibility.
> > >>>>>>>>>
> > >>>>>>>>> Anyway, the types of errors which you reported are the same as those
> > >>>>>>>>> I obtained.
> > >>>>>>>>>
> > >>>>>>>>> So let's see whether bugfix 18 will help here (or at least on the NPT
> > >>>>>>>>> tests) or not. As I wrote just a few minutes ago, it seems that it has
> > >>>>>>>>> not yet been uploaded to the server, although its description is
> > >>>>>>>>> already present on the web page (see
> > >>>>>>>>> http://ambermd.org/bugfixes12.html).
> > >>>>>>>>>
> > >>>>>>>>> As you can see, this bugfix also contains changes to the CPU code,
> > >>>>>>>>> although the majority is devoted to the GPU code, so perhaps the best
> > >>>>>>>>> approach is to recompile the whole of Amber with this patch. The patch
> > >>>>>>>>> could perhaps be applied even after just the GPU configure command
> > >>>>>>>>> (i.e. ./configure -cuda -noX11 gnu), but then the subsequent build
> > >>>>>>>>> would update only the GPU binaries. Anyway, I would rather recompile
> > >>>>>>>>> the whole of Amber after this patch.
> > >>>>>>>>>
> > >>>>>>>>> Regarding GPU tests under Linux, you may try memtestG80
> > >>>>>>>>> (please use the updated/patched version from here:
> > >>>>>>>>> https://github.com/ihaque/memtestG80
> > >>>>>>>>> )
> > >>>>>>>>>
> > >>>>>>>>> Just use a git command like:
> > >>>>>>>>>
> > >>>>>>>>> git clone https://github.com/ihaque/memtestG80.git PATCHED_MEMTEST-G80
> > >>>>>>>>>
> > >>>>>>>>> to download all the files and save them into a directory named
> > >>>>>>>>> PATCHED_MEMTEST-G80.
> > >>>>>>>>>
> > >>>>>>>>> Another possibility is to try the perhaps similar (but maybe more
> > >>>>>>>>> up-to-date) test cuda_memtest
> > >>>>>>>>> (http://sourceforge.net/projects/cudagpumemtest/).
> > >>>>>>>>>
> > >>>>>>>>> Regarding the ig value: if ig is not present in mdin, the default
> > >>>>>>>>> value (e.g. 71277) is used; if ig=-1, the random seed is based on
> > >>>>>>>>> the current date and time, and hence will be different for every run
> > >>>>>>>>> (not a good variant for our tests). I simply deleted any ig records
> > >>>>>>>>> from all mdins, so I assume that in each run the default seed 71277
> > >>>>>>>>> was used automatically.
> > >>>>>>>>>
> > >>>>>>>>> M.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Sat, 01 Jun 2013 20:26:16 +0200 ET <sketchfoot.gmail.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi,
> > >>>>>>>>>>
> > >>>>>>>>>> I've put the graphics card into a machine with the working GTX Titan
> > >>>>>>>>>> that I mentioned earlier.
> > >>>>>>>>>>
> > >>>>>>>>>> The Nvidia driver version is: 133.30
> > >>>>>>>>>>
> > >>>>>>>>>> Amber version is:
> > >>>>>>>>>> AmberTools version 13.03
> > >>>>>>>>>> Amber version 12.16
> > >>>>>>>>>>
> > >>>>>>>>>> I ran 50k steps with the Amber benchmark using ig=43689 on both
> > >>>>>>>>>> cards. To discriminate between them, the card I believe (fingers
> > >>>>>>>>>> crossed) is working is called GPU-00_TeaNCake, whilst the other one
> > >>>>>>>>>> is called GPU-01_008.
> > >>>>>>>>>>
> > >>>>>>>>>> *When I run the tests on GPU-01_008:*
> > >>>>>>>>>>
> > >>>>>>>>>> 1) All the tests (across 2x repeats) finish apart from the following,
> > >>>>>>>>>> which have the errors listed:
> > >>>>>>>>>>
> > >>>>>>>>>> ----------------------------------------------
> > >>>>>>>>>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
> > >>>>>>>>>> Error: unspecified launch failure launching kernel kNLSkinTest
> > >>>>>>>>>> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
> > >>>>>>>>>>
> > >>>>>>>>>> ----------------------------------------------
> > >>>>>>>>>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
> > >>>>>>>>>> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
> > >>>>>>>>>>
> > >>>>>>>>>> ----------------------------------------------
> > >>>>>>>>>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
> > >>>>>>>>>> Error: unspecified launch failure launching kernel kNLSkinTest
> > >>>>>>>>>> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
> > >>>>>>>>>>
> > >>>>>>>>>> ----------------------------------------------
> > >>>>>>>>>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
> > >>>>>>>>>> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
> > >>>>>>>>>> grep: mdinfo.1GTX680: No such file or directory
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> 2) The sdiff logs indicate that reproducibility across the two
> > >>>>>>>>>> repeats is as follows:
> > >>>>>>>>>>
> > >>>>>>>>>> *GB_myoglobin:* Reproducible across 50k steps
> > >>>>>>>>>> *GB_nucleosome:* Reproducible till step 7400
> > >>>>>>>>>> *GB_TRPCage:* Reproducible across 50k steps
> > >>>>>>>>>>
> > >>>>>>>>>> *PME_JAC_production_NVE:* No reproducibility shown from step 1,000
> > >>>>>>>>>> onwards
> > >>>>>>>>>> *PME_JAC_production_NPT:* Reproducible till step 1,000. Also the
> > >>>>>>>>>> outfile is not written properly - blank gaps appear where something
> > >>>>>>>>>> should have been written
> > >>>>>>>>>>
> > >>>>>>>>>> *PME_FactorIX_production_NVE:* Reproducible across 50k steps
> > >>>>>>>>>> *PME_FactorIX_production_NPT:* Reproducible across 50k steps
> > >>>>>>>>>>
> > >>>>>>>>>> *PME_Cellulose_production_NVE:* Failure means that both runs do not
> > >>>>>>>>>> finish (see point 1)
> > >>>>>>>>>> *PME_Cellulose_production_NPT:* Failure means that both runs do not
> > >>>>>>>>>> finish (see point 1)
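
The "reproducible till step N" comparison above can be sketched as a small script. This is only an illustration: it assumes the energies from each repeat were already extracted into plain text files with one "STEP ENERGY" pair per line, which is a hypothetical format and not the raw mdout layout. The function name first_divergence is likewise made up.

```shell
#!/bin/sh
# Sketch only: report the first step at which two repeats disagree.
# ASSUMPTION: each input file holds lines of the form "STEP ENERGY"
# (a hypothetical pre-extracted format, not raw mdout output).
first_divergence() {
    # paste puts both runs side by side: step1 energy1 step2 energy2
    paste "$1" "$2" | awk '
        $2 != $4 { print $1; found = 1; exit }   # first mismatching energy
        END      { if (!found) print "none" }    # runs agree everywhere
    '
}
```

Called as first_divergence run1.txt run2.txt, it prints the step number of the first mismatch, or "none" when the two repeats agree on every line.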
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> #####################################################
> > >>>>>>>>>>
> > >>>>>>>>>> *When I run the tests on GPU-00_TeaNCake:*
> > >>>>>>>>>>
> > >>>>>>>>>> 1) All the tests (across 2x repeats) finish apart from the following,
> > >>>>>>>>>> which have the errors listed:
> > >>>>>>>>>> -------------------------------------
> > >>>>>>>>>> JAC_PRODUCTION_NPT - 23,558 atoms PME
> > >>>>>>>>>> PMEMD Terminated Abnormally!
> > >>>>>>>>>> -------------------------------------
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> 2) The sdiff logs indicate that reproducibility across the two
> > >>>>>>>>>> repeats is as follows:
> > >>>>>>>>>>
> > >>>>>>>>>> *GB_myoglobin:* Reproducible across 50k steps
> > >>>>>>>>>> *GB_nucleosome:* Reproducible across 50k steps
> > >>>>>>>>>> *GB_TRPCage:* Reproducible across 50k steps
> > >>>>>>>>>>
> > >>>>>>>>>> *PME_JAC_production_NVE:* No reproducibility shown from step 10,000
> > >>>>>>>>>> onwards
> > >>>>>>>>>> *PME_JAC_production_NPT:* No reproducibility shown from step 10,000
> > >>>>>>>>>> onwards. Also the outfile is not written properly - blank gaps appear
> > >>>>>>>>>> where something should have been written. Repeat 2 crashes with the
> > >>>>>>>>>> error noted in 1.
> > >>>>>>>>>>
> > >>>>>>>>>> *PME_FactorIX_production_NVE:* No reproducibility shown from step
> > >>>>>>>>>> 9,000 onwards
> > >>>>>>>>>> *PME_FactorIX_production_NPT:* Reproducible across 50k steps
> > >>>>>>>>>>
> > >>>>>>>>>> *PME_Cellulose_production_NVE:* No reproducibility shown from step
> > >>>>>>>>>> 5,000 onwards
> > >>>>>>>>>> *PME_Cellulose_production_NPT:* No reproducibility shown from step
> > >>>>>>>>>> 29,000 onwards. Also the outfile is not written properly - blank gaps
> > >>>>>>>>>> appear where something should have been written.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Out files and sdiff files are included as attachments.
> > >>>>>>>>>>
> > >>>>>>>>>> #####################################################
> > >>>>>>>>>>
> > >>>>>>>>>> So I'm going to update my Nvidia driver to the latest version and
> > >>>>>>>>>> patch Amber to the latest version and rerun the tests to see if there
> > >>>>>>>>>> is any improvement. Could someone let me know if it is necessary to
> > >>>>>>>>>> recompile any or all of AMBER after applying the bugfixes?
> > >>>>>>>>>>
> > >>>>>>>>>> Additionally, I'm going to run memory tests and Heaven benchmarks on
> > >>>>>>>>>> the cards to check whether they are faulty or not.
> > >>>>>>>>>>
> > >>>>>>>>>> I'm thinking that there is a mix of hardware error/configuration
> > >>>>>>>>>> (esp. in the case of GPU-01_008) and Amber software error in this
> > >>>>>>>>>> situation. What do you guys think?
> > >>>>>>>>>>
> > >>>>>>>>>> Also, am I right in thinking (from what Scott was saying) that all
> > >>>>>>>>>> the benchmarks should be reproducible across 50k steps but begin to
> > >>>>>>>>>> diverge at around 100K steps? Is there any difference between setting
> > >>>>>>>>>> *ig* to an explicit number and removing it from the mdin file?
> > >>>>>>>>>>
> > >>>>>>>>>> br,
> > >>>>>>>>>> g
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On 31 May 2013 23:45, ET <sketchfoot.gmail.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> I don't need sysadmins, but sysadmins need me as it gives purpose to
> > >>>>>>>>>>> their bureaucratic existence. An evil encountered when working in an
> > >>>>>>>>>>> institution or company, IMO. Good science and individuality being
> > >>>>>>>>>>> sacrificed for standardisation and mediocrity in the interests of
> > >>>>>>>>>>> maintaining a system that focusses on maintaining the system and not
> > >>>>>>>>>>> the objective.
> > >>>>>>>>>>>
> > >>>>>>>>>>> You need root to move fwd on these things, unfortunately, and ppl
> > >>>>>>>>>>> with root are kinda like your parents when you try to borrow money
> > >>>>>>>>>>> from them at age 12 :D
> > >>>>>>>>>>> On May 31, 2013 9:34 PM, "Marek Maly" <marek.maly.ujep.cz> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Sorry, why do you need sysadmins :)) ?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> BTW here is the most recent driver:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> http://www.nvidia.com/object/linux-display-amd64-319.23-driver.html
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I do not remember anything easier than installing a driver
> > >>>>>>>>>>>> (especially in the case of a binary (*.run) installer) :))
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> M.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Fri, 31 May 2013 22:02:34 +0200 ET <sketchfoot.gmail.com> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> > Yup. I know. I replaced a 680 and the everknowing sysadmins are
> > >>>>>>>>>>>> > reluctant to install drivers not in the repository as they are
> > >>>>>>>>>>>> > lame. :(
> > >>>>>>>>>>>> > On May 31, 2013 7:14 PM, "Marek Maly" <marek.maly.ujep.cz> wrote:
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> As I already wrote to you,
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> the first driver which properly/officially supports Titans
> > >>>>>>>>>>>> >> should be 313.26.
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> Anyway, I am curious mainly about your 100K repetitive tests with
> > >>>>>>>>>>>> >> your Titan SC card, especially in the case of those tests
> > >>>>>>>>>>>> >> (JAC_NVE, JAC_NPT and CELLULOSE_NVE) where my Titans SC randomly
> > >>>>>>>>>>>> >> failed or succeeded. In the FACTOR_IX_NVE and FACTOR_IX_NPT tests
> > >>>>>>>>>>>> >> both my cards are perfectly stable (independently of the driver
> > >>>>>>>>>>>> >> version) and the runs are also perfectly or almost perfectly
> > >>>>>>>>>>>> >> reproducible.
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> Also, if your tests crash, please report any errors.
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> Up to this moment I have this library of errors from my Titan SC
> > >>>>>>>>>>>> >> GPUs.
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> #1 ERR written in mdout:
> > >>>>>>>>>>>> >> ------
> > >>>>>>>>>>>> >> | ERROR: max pairlist cutoff must be less than unit cell max sphere radius!
> > >>>>>>>>>>>> >> ------
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> #2 no ERR written in mdout, ERR written in standard output (nohup.out)
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> ----
> > >>>>>>>>>>>> >> Error: unspecified launch failure launching kernel kNLSkinTest
> > >>>>>>>>>>>> >> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
> > >>>>>>>>>>>> >> ----
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> #3 no ERR written in mdout, ERR written in standard output (nohup.out)
> > >>>>>>>>>>>> >> ----
> > >>>>>>>>>>>> >> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
> > >>>>>>>>>>>> >> ----
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> Another question regarding your Titan SC: is it also from EVGA,
> > >>>>>>>>>>>> >> as in my case, or from another manufacturer?
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> Thanks,
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> M.
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> On Fri, 31 May 2013 19:17:03 +0200, ET <sketchfoot.gmail.com>
> > >>>>>>>>>>>> >> wrote:
> > >>>>>>>>>>>> >>
> > >>>>>>>>>>>> >> > Well, this is interesting...
> > >>>>>>>>>>>> >> >
> > >>>>>>>>>>>> >> > I ran 50k steps on the Titan on the other machine with driver
> > >>>>>>>>>>>> >> > 310.44 and it passed all the GB steps, i.e. totally identical
> > >>>>>>>>>>>> >> > results over two repeats. However, it failed all the PME tests
> > >>>>>>>>>>>> >> > after step 1000. I'm going to update the driver and test it
> > >>>>>>>>>>>> >> > again.
> > >>>>>>>>>>>> >> >
> > >>>>>>>>>>>> >> > Files included as attachments.
> > >>>>>>>>>>>> >> >
> > >>>>>>>>>>>> >> > br,
> > >>>>>>>>>>>> >> > g
> > >>>>>>>>>>>> >> >
> > >>>>>>>>>>>> >> >
> > >>>>>>>>>>>> >> > On 31 May 2013 16:40, Marek Maly <marek.maly.ujep.cz> wrote:
> > >>>>>>>>>>>> >> >
> > >>>>>>>>>>>> >> >> One more thing,
> > >>>>>>>>>>>> >> >>
> > >>>>>>>>>>>> >> >> can you please check at which frequency your Titan is running?
> > >>>>>>>>>>>> >> >>
> > >>>>>>>>>>>> >> >> As the base frequency of normal Titans is 837MHz and the Boost
> > >>>>>>>>>>>> >> >> one is 876MHz, I assume that your GPU is also automatically
> > >>>>>>>>>>>> >> >> running at its boost frequency (876MHz).
> > >>>>>>>>>>>> >> >> You can find this information e.g. in the Amber mdout file.
> > >>>>>>>>>>>> >> >>
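
When many repeat runs are made, the three error signatures quoted in the thread above can be checked for automatically. What follows is only a minimal sketch in Python. It assumes the messages appear in mdout or nohup.out with the wording quoted in the thread. The function name `classify_errors` and the labels are hypothetical and are not part of Amber.

```python
# Error signatures taken verbatim from the thread (errors #1 to #3).
# The labels on the left are illustrative names, not Amber terminology.
SIGNATURES = [
    ("#1 pairlist cutoff (mdout)",
     "max pairlist cutoff must be less than unit cell max sphere radius"),
    ("#2 kernel launch failure (nohup.out)",
     "unspecified launch failure launching kernel"),
    ("#3 download failure (nohup.out)",
     "cudaMemcpy GpuBuffer::Download failed"),
]


def classify_errors(log_text):
    """Return the labels of all known signatures found in log_text.

    Whitespace is collapsed first so that a message that was hard-wrapped
    across lines (as in the quoted emails) still matches.
    """
    normalized = " ".join(log_text.split())
    return [label for label, needle in SIGNATURES if needle in normalized]


if __name__ == "__main__":
    sample = (
        "Error: unspecified launch failure launching kernel kNLSkinTest\n"
        "cudaFree GpuBuffer::Deallocate failed unspecified launch failure\n"
    )
    print(classify_errors(sample))
```

In practice the text passed in would be the contents of one run's mdout or nohup.out file. An empty list means none of the three quoted messages was found, which does not by itself show that the run was reproducible.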
Source: [https://torrent-igruha.org/3551-portal.html]

[KB2885] Download and install ESET offline or install older versions of ESET Windows home products

Issue

  • You receive an installation error when attempting to install your product using ESET Live Installer
  • You need to install using ESET offline installer(s)
  • You need to install ESET on a computer with no Internet connection
  • You need to download the installation file (.exe) for a previous version of your ESET Windows home product

Solution

If you receive an installation error when using the ESET Live Installer, follow the instructions below to download and install your ESET Windows home product using the offline installer.
Last Updated: Jul 30, 2020

© 1992 - 2019 ESET, spol. s r.o. - All rights reserved. Trademarks used therein are trademarks or registered trademarks of ESET, spol. s r.o. or ESET North America. All other names and brands are registered trademarks of their respective companies.

How to Activate your ESET Windows home product (10.x)

Jul 9, 2017 | ESET Software

This video demonstrates how to activate your ESET Windows home product. Steps: 1. Open your Windows ESET product. 2. Click Help and support → Change License. 3. Type, or copy/paste, your ESET-issued License Key into the License Key field and then click Activate. Make...

Google Patches Critical Android Vulnerabilities in July Update

Jul 7, 2017 | ESET Mobile Security, ESET Software

Google has quietly pushed out its July update for Android, and 11 of the patches that come in the update are rated as ‘Critical’ – meaning they allow remote code execution and/or privilege escalation. What does that mean? They were bugs which...

Download, Install and License ESET Antivirus for Android

Jun 28, 2017 | ESET Mobile Security, ESET Software

Here is a quick video from the ESET knowledgebase team on how to download, install and activate your paid license for ESET Antivirus for Android! The Android antivirus product’s premium features can be bought as a standalone license or you can use one of your...

ESET Smart Security 9 received the highest rating in the AV-Comparative File Detection Test

Apr 16, 2016 | Awards, ESET Smart Security

April 16, 2016 ESET Smart Security version 9 – March, 2016 ESET Smart Security 9 received the highest rating in the AV-Comparative File Detection Test and was awarded with the File Detection, Advanced + Award. The File Detection Test assesses the ability of antivirus...

ESET Wins PC Magazine Parental Controls Award

Apr 16, 2016 | Awards, ESET Mobile Security

April 5, 2016 ESET Parental Control – April, 2016 PC Magazine awarded ESET Parental Control an “Excellent” rating, recommending it for parents, and noting that it is now available with ESET Multi-Device Security. CLICK TO READ “ESET Parental Control for Android is a...

ESET assists law enforcement in Mumblehard takedown

Apr 11, 2016 | ESET Software

One year after the release of the technical analysis of the Mumblehard Linux botnet, we are pleased to report that it is no longer active. ESET, in cooperation with the Cyber Police of Ukraine and CyS Centrum LLC, has taken down the Mumblehard botnet, stopping all...