STOP RECEIVED FROM...

From: Timm Essigke <essigke_at_email.domain.hidden>
Date: Thu, 10 Oct 2002 00:08:33 +0200

Dear ADF-List,

I am trying to run a large system (112 atoms) in parallel (30 CPUs) with PVM,
and I get the following error:

=====
 S C F
 =====
 

 CYCLE 1

 STOP RECEIVED FROM 1835016
 MESSAGE: ppchec missing kid2

 *******************************************************************************

 ADF EXIT called
 STOP RECEIVED from 1835016 , tag= 718

 *******************************************************************************
 *******************************************************************************

                             A D F E X I T
 pp info: terminating timer Cycl.cycle p
 pp info: terminating timer Cycl.Fcky.focktr p
 pp info: terminating timer Cycl.Fcky.ftrans p
 pp info: terminating timer Cycl.Fcky.ftrans
 pp info: terminating timer Cycl.Fcky.focktr
 pp info: terminating timer Cycl.cycle
 STOP RECEIVED from 1835016 , tag= 718

 Current Execution Stack has 5 elements
 Last to be Executed : ADF

 Stack of Active SubPrograms:
 ----------------------------
 FOCKTR
 AFOCKY
 CYCLE
 AMOL
 ADF
...

*******************************************************************************************
(LOGFILE)
 <Oct09-2002> <23:21:13> ADF 2002.02 RunTime: Oct09-2002 23:21:13
 <Oct09-2002> <23:21:13> Title
 <Oct09-2002> <23:21:17> RunType : SINGLE POINT
 <Oct09-2002> <23:21:18> Net Charge: -3 (Nuclei minus Electrons)
 <Oct09-2002> <23:21:19> Spin polar: 1 (Spin_A minus Spin_B electrons)
 <Oct09-2002> <23:21:19> Symmetry : NOSYM
 <Oct09-2002> <23:21:19> >>>> FRAGM
 <Oct09-2002> <23:21:24> >>>> CORORT
 <Oct09-2002> <23:21:33> >>>> FITINT
 <Oct09-2002> <23:22:00> >>>> CLSMAT
 <Oct09-2002> <23:23:34> >>>> ORTHON
 <Oct09-2002> <23:28:17> >>>> CRTP12
 <Oct09-2002> <23:28:22> >>>> GENPT
 <Oct09-2002> <23:28:31> Acc.Num.Int.= 4.000
 <Oct09-2002> <23:28:33> Block Length= 128
 <Oct09-2002> <23:31:00> >>>> PTCOR
 <Oct09-2002> <23:31:01> >>>> PTBAS
 <Oct09-2002> <23:35:58> >>>> CYCLE
 <Oct09-2002> <23:42:12> STOP RECEIVED from 1835016 , tag= 718
 <Oct09-2002> <23:42:12> WARNING: not all scratch files were closed
 <Oct09-2002> <23:42:12> END
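
The logfile timestamps show how long each stage ran before the stop arrived. The following is a minimal sketch in Python for differencing them, assuming the `<MonDD-YYYY> <HH:MM:SS>` layout shown above and an English locale for the month abbreviation. The helper name `stage_durations` is hypothetical and not part of ADF.

```python
from datetime import datetime
import re

# Matches lines such as " <Oct09-2002> <23:35:58> >>>> CYCLE"
LINE_RE = re.compile(r"<(\w{3}\d{2}-\d{4})>\s+<(\d{2}:\d{2}:\d{2})>\s+(.*)")


def stage_durations(log_text):
    """Return (message, seconds until the next timestamped entry) pairs."""
    entries = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m:
            stamp = datetime.strptime(
                m.group(1) + " " + m.group(2), "%b%d-%Y %H:%M:%S"
            )
            entries.append((stamp, m.group(3).strip()))
    # Pair each entry with its successor and take the time difference.
    return [
        (msg, (nxt - stamp).total_seconds())
        for (stamp, msg), (nxt, _) in zip(entries, entries[1:])
    ]


# Three lines copied from the logfile above.
sample = """\
 <Oct09-2002> <23:31:01> >>>> PTBAS
 <Oct09-2002> <23:35:58> >>>> CYCLE
 <Oct09-2002> <23:42:12> STOP RECEIVED from 1835016 , tag= 718
"""

for msg, secs in stage_durations(sample):
    print(f"{msg:<12} {secs:6.0f} s")
```

Applied to these lines, it gives 297 s for PTBAS and 374 s for CYCLE, so the stop arrived a little over six minutes into the first SCF cycle.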

A smaller system works fine, so I suspect some kind of timeout problem
during communication.
I set SCM_MAX_RCV_TIMEOUTS=20 and SCM_RCV_TIMEOUT=18000, but that does not
help. The same input file runs fine (but slowly ;-) ) on a single CPU.

Has anybody experienced the same problem and found a solution?

Thanks in advance!

Timm
Received on 2002-10-10 01:14:12
