Zoltan: Parallel Partitioning, Load Balancing and Data-Management Services

Frequently Asked Questions


  1. What does the following message mean during compilation of Zoltan?
    Makefile:28: mem.d: No such file or directory
  2. On some platforms, why do Zoltan partitioning methods RCB and RIB use an increasing amount of memory over multiple invocations?
  3. Why does compilation of the Fortran90 driver zfdrive fail in fdr_const.f90?
  4. During runs (particularly on RedStorm), MPI reports that it is out of resources or too many messages have been posted. What does this mean and what can I do?
  5. Realloc fails when there is plenty of memory. Is this a Zoltan bug?



  1. What does the following message mean during compilation of Zoltan?
    Makefile:28: mem.d: No such file or directory
    Every time Zoltan is built, gmake looks for a dependency file filename.d for each source file filename.c. The first time Zoltan is built for a given platform, the dependency files do not exist. The dependency files are also removed by "gmake clean." Don't worry, though; after producing this warning, gmake will create the dependency files it needs and continue compilation.
  2. On some platforms, why do Zoltan partitioning methods RCB and RIB use an increasing amount of memory over multiple invocations?

    Zoltan partitioning methods RCB and RIB use MPI_Comm_dup and MPI_Comm_split to recursively create communicators with subsets of processors. Some implementations of MPI (e.g., the default MPI on Sandia's Thunderbird cluster) do not correctly release memory associated with these communicators during MPI_Comm_free, resulting in growing memory use over multiple invocations of RCB or RIB. An undocumented workaround in Zoltan is to set the TFLOPS_SPECIAL parameter to 1 (e.g., Zoltan_Set_Param(zz,"TFLOPS_SPECIAL","1");), which causes an implementation that doesn't use MPI_Comm_split to be invoked.
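
    For example, a minimal sketch in C (enable_tflops_special is a hypothetical helper name; the handle zz is assumed to have been created earlier with Zoltan_Create, with LB_METHOD set to RCB or RIB elsewhere):

    #include "zoltan.h"

    /* Sketch only: enable the TFLOPS_SPECIAL workaround on an existing
     * Zoltan handle, assumed to have been created earlier, e.g.
     *   zz = Zoltan_Create(MPI_COMM_WORLD);                            */
    static int enable_tflops_special(struct Zoltan_Struct *zz)
    {
      int ierr = Zoltan_Set_Param(zz, "TFLOPS_SPECIAL", "1");
      if (ierr != ZOLTAN_OK && ierr != ZOLTAN_WARN) {
        /* handle the error */
      }
      return ierr;
    }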


  3. Why does compilation of the Fortran90 driver zfdrive fail in fdr_const.f90?

    The Fortran90 driver zfdrive uses user-defined data types for a mesh data structure. It passes these data types to the Zoltan query functions through the void *data argument. Strict type checking in Fortran90 requires that the query interface have these user-defined data types compiled into the interface. The solution is as follows:

    cd Zoltan/fort
    mv zoltan_user_data.f90 zoltan_user_data.f90.zoltan
    ln -s ../fdriver/zoltan_user_data.f90 .
    cd ..
    touch fdriver/zoltan_user_data.f90 fort/zoltan_user_data.f90
    gmake zfdrive
    See the Fortran90 API description in the User's Guide and instructions for using zfdrive in the Developer's Guide for more details.
  4. During runs (particularly on RedStorm), MPI reports that it is out of resources or too many messages have been posted. What does this mean and what can I do?

    Some implementations of MPI (including RedStorm's implementation) limit the number of message receives that can be posted simultaneously. Some communications in Zoltan (including hashing of IDs to processors in the Zoltan Distributed Data Directory) can require messages from large numbers of processors, triggering this error on certain platforms.

    To avoid this problem, Zoltan contains logic to use all-to-all communication instead of point-to-point communication when a large number of receives is needed. The maximum number of simultaneous receives allowed can be set as a compile-time option to Zoltan. In the native Zoltan build environment, add -DMPI_RECV_LIMIT=# to the DEFS line of zoltan/src/Utilities/Config/Config.<platform>, where # is the maximum number of simultaneous receives allowed. In the Autotools build environment, the option --enable-mpi-recv-limit=# sets the same limit. The default value is 2000.
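
    For example, to allow up to 1000 simultaneous receives (the value 1000 is purely illustrative), the Autotools configuration would be

    ./configure --enable-mpi-recv-limit=1000

    and the corresponding addition to the DEFS line in the native build would be -DMPI_RECV_LIMIT=1000.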


  5. Realloc fails when there is plenty of memory. Is this a Zoltan bug?

    This problem has been observed on several Linux clusters running parallel applications built with various MPI and C++ libraries: realloc fails where an equivalent malloc call succeeds. The source of the error has not been identified, but it is not a Zoltan bug. The workaround is to compile Zoltan with the flag -DREALLOC_BUG; Zoltan then replaces every realloc call with a malloc followed by a memcpy and a free.
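
    The replacement pattern is illustrated by the sketch below (this is not the Zoltan source; grow_buffer and its arguments are hypothetical names, and note that, unlike realloc, the copy requires knowing the old allocation size):

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical illustration of the -DREALLOC_BUG workaround:
     * replace realloc(old_ptr, new_size) with malloc + memcpy + free. */
    static void *grow_buffer(void *old_ptr, size_t old_size, size_t new_size)
    {
      void *new_ptr = malloc(new_size);
      if (new_ptr != NULL && old_ptr != NULL) {
        size_t n = (old_size < new_size) ? old_size : new_size;
        memcpy(new_ptr, old_ptr, n);   /* copy the existing contents */
        free(old_ptr);                 /* release the old block      */
      }
      return new_ptr;
    }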


Copyright (c) 2000-2007, Sandia National Laboratories.
The Zoltan Library and its documentation are released under the GNU Lesser General Public License (LGPL). See the README file in the main Zoltan directory for more information.