Opened 6 weeks ago

Last modified 5 weeks ago

#307 accepted Defect

Flexpart-WRF crashes due to array boundary violation in outgrid_init_reg.f90

Reported by: srakesh
Owned by: pesei
Priority: major
Milestone:
Component: FP coding/compilation
Version: FLEXPART-WRF
Keywords:
Cc:

Description (last modified by pesei)

Dear all,

I am using FLEXPART-WRF version 3.3.2, compiled for MPI (cray-mpich/7.6.3) with the GNU compilers (version 7.2.0) on a Cray system.
The model ran successfully for two simulations, but the next simulation fails with a segmentation fault:

---------------------------- Running Flexwrf ----------------------------------
CPU: n180 N9

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x2aaaac8b693f in ???
#1  0x44baaf in ???
#2  0x4a830f in ???
#0  0x2aaaac8b693f in ???
#1  0x44baaf in ???
#2  0x4a830f in ???
#3  0x2aaaaace5aad in gomp_thread_start
        at ../../../cray-gcc-7.2.0-201709081833.7aac99f36ce61/libgomp/team.c:120
#4  0x2aaaab17c723 in ???
#5  0x2aaaac96bc9c in ???
#6  0xffffffffffffffff in ???
_pmiu_daemon(SIGCHLD): [NID 00069] [c0-0c1s1n1] [Fri Aug 13 12:56:57 2021] PE RANK 0 exit signal Segmentation fault
[NID 00069] 2021-08-13 12:56:57 Apid 182238106: initiated application termination.
--------------------------------------------------------------------------

These are my flags for compilation:

GNU_FFLAGS  =  -O2 -m64 -mcmodel=medium -fconvert=little-endian -finit-local-zero -fno-range-check -fbacktrace
GNU_LDFLAGS = -O2 -m64 -mcmodel=medium -fconvert=little-endian -finit-local-zero -lnetcdff -fno-range-check

I have used:

 ulimit -s unlimited
 ulimit -c unlimited

before running the model.

Any idea how I can fix this problem?

Attachments (1)

outgrid_init_reg.f90 (14.2 KB) - added by pesei 5 weeks ago.
Test version of offending subroutine


Change History (6)

comment:1 Changed 6 weeks ago by pesei

  • Description modified (diff)

I don't know what is causing this. As a next step, I would recompile with -g -fcheck=all, which enables run-time checks and should give more information about where the error occurs.

comment:2 Changed 6 weeks ago by srakesh

Hi,
As you suggested, I added these flags and recompiled the model.
The error appears to come from the array 'oro' in outgrid_init_reg.f90:

At line 210 of file outgrid_init_reg.f90
Fortran runtime error: Index '-1' of dimension 1 of array 'oro' below lower bound of 0
At line 210 of file outgrid_init_reg.f90

#1  0x45d7e3 in flexwrf_mpi
        at Src_flexwrf_v3.3.2/flexwrf_mpi.f90:243

How can I resolve this?

Thanks, Rakesh
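
The run-time check above reports that an index of -1 was used for the first dimension of 'oro', whose lower bound is 0. The code at line 210 is not reproduced in this ticket, so the following is only a minimal sketch in Python of one way such an index could arise and be caught. It assumes, hypothetically, that the index is computed from a coordinate relative to the grid origin. All names (cell_index, lookup_orography, x0, dx, ...) are illustrative and are not taken from the FLEXPART-WRF source.

```python
import math


def cell_index(coord, origin, spacing):
    """Index of the grid cell containing coord, counting from 0 at origin."""
    return math.floor((coord - origin) / spacing)


def lookup_orography(oro, x, y, x0, y0, dx, dy):
    """Return the value of oro for the cell containing (x, y).

    oro is a list of rows indexed from 0 (hypothetical layout).
    Raises IndexError when the point lies outside the grid, instead of
    reading an invalid element.
    """
    ix = cell_index(x, x0, dx)
    iy = cell_index(y, y0, dy)
    nx, ny = len(oro), len(oro[0])
    if not (0 <= ix < nx and 0 <= iy < ny):
        raise IndexError(
            f"point ({x}, {y}) maps to index ({ix}, {iy}), "
            f"outside 0..{nx - 1} x 0..{ny - 1}"
        )
    return oro[ix][iy]
```

In this sketch, a point slightly to the west of the grid origin yields a first index of -1, which the explicit check rejects. Without such a check, Python would silently return the last element for index -1, much as an unchecked out-of-bounds access in a compiled program may read unrelated memory or crash.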

Changed 5 weeks ago by pesei

Test version of offending subroutine

comment:3 follow-up: Changed 5 weeks ago by pesei

  • Owner set to pesei
  • Status changed from new to accepted

Please try to rerun with the attached source file and send me the compressed standard output to petra.seibert @ univie.ac.at. Hope it compiles, haven't tested.

comment:4 Changed 5 weeks ago by pesei

  • Component changed from FP other to FP coding/compilation
  • Summary changed from Flexpart-WRF showing segmentation fault after running for some time to Flexpart-WRF crashes due to array boundary violation in outgrid_init_reg.f90

comment:5 in reply to: ↑ 3 Changed 5 weeks ago by pesei

Replying to pesei:

Please try to rerun with the attached source file and send me the compressed standard output to petra.seibert @ univie.ac.at. Hope it compiles, haven't tested.

If there is a lot of output, probably the last part will suffice.
