# What is Dendro-4.0?

Dendro is a distributed-memory solver for partial differential equations (PDEs) that supports numerical methods such as the finite difference method, the finite element method, and wavelet methods. It uses adaptive octree meshes for the geometric discretization.

# How do we partition the octree among p processors?

We use a flexible partitioning scheme based on space-filling curves (SFCs) to partition the adaptive octree among `p` processors. The current implementation of Dendro-4.0 supports both the Hilbert curve and the Morton curve. We have compared the partitions produced by the two curves and found that, at large scale, Hilbert-curve partitioning gives more energy-efficient and communication-efficient partitions than Morton-curve partitioning.
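The idea behind SFC-based partitioning can be illustrated with a minimal Python sketch. This is not Dendro's C++ implementation: it assumes a uniform grid of equal-sized cells rather than an adaptive octree, uses the Morton curve only (the Hilbert curve needs a more involved key computation), and the names `morton3d` and `partition_by_sfc` are hypothetical. Each cell is mapped to a one-dimensional key, the cells are sorted by that key, and the sorted sequence is cut into `p` contiguous chunks.

```python
def morton3d(x, y, z, bits):
    """Interleave the low `bits` bits of x, y, z into one Morton (Z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)       # x occupies bit 3i
        key |= ((y >> i) & 1) << (3 * i + 1)   # y occupies bit 3i + 1
        key |= ((z >> i) & 1) << (3 * i + 2)   # z occupies bit 3i + 2
    return key


def partition_by_sfc(cells, p, bits):
    """Sort cells along the curve and cut the order into p contiguous chunks."""
    ordered = sorted(cells, key=lambda c: morton3d(c[0], c[1], c[2], bits))
    n = len(ordered)
    # Chunk r holds indices [r*n//p, (r+1)*n//p), so sizes differ by at most one.
    return [ordered[r * n // p:(r + 1) * n // p] for r in range(p)]


# Demo: a uniform 4x4x4 grid (2 bits per coordinate) split among 4 "processors".
cells = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
parts = partition_by_sfc(cells, p=4, bits=2)
for rank, part in enumerate(parts):
    print(rank, len(part))
```

Because each chunk is a contiguous piece of the curve, cells that are close in the ordering tend to be close in space, which is what keeps the communication between partitions low.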

# How to run Dendro-4.0

You can clone the repository using `git clone https://github.com/paralab/Dendro4.git`.

## How to build Dendro-4.0?

You need to install PETSc 3.5.4 (http://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-3.5.4.tar.gz) in order to use Dendro-4.0. Support for the latest PETSc version will be provided soon.

You need CMake to build Dendro. Create a build directory with `mkdir build`, enter it with `cd build`, and then run `ccmake ..` to generate the makefiles. You can build Dendro-4.0 with the following options:

- ALLTOALLV_FIX: OFF. This needs to be turned off.
- DIM_2: OFF. Turn this on to run Dendro-4.0 in 2D. The default, OFF, assumes a 3D domain.
- HILBERT_ORDERING: ON. This specifies which SFC is used to partition the data. ON uses the Hilbert curve; OFF uses the Morton curve.
- PROFILE_TREE_SORT: OFF.
- NUM_NPES_THRESHOLD: the square root of P, where P is the number of processors.
- SPLITTER_SELECTION_FIX: ON. This performs the data exchange in the octree partitioning in stages. It is mandatory when you run Dendro at very large scale.
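If you prefer a non-interactive configuration over `ccmake`, the options above can be passed to `cmake` as `-D<NAME>=<VALUE>` cache entries. The Python sketch below only assembles such a command line; the helper name `cmake_args` is hypothetical, and rounding the square root down for non-square processor counts is an assumption, since the list above only says "square root of P".

```python
import math


def cmake_args(num_procs, dim_2=False, hilbert=True):
    """Build a cmake argument list from the options listed above."""
    def on_off(flag):
        return "ON" if flag else "OFF"

    return [
        "cmake", "..",
        "-DALLTOALLV_FIX=OFF",
        "-DDIM_2=" + on_off(dim_2),
        "-DHILBERT_ORDERING=" + on_off(hilbert),
        "-DPROFILE_TREE_SORT=OFF",
        # Assumption: floor of sqrt(P) when P is not a perfect square.
        "-DNUM_NPES_THRESHOLD=%d" % math.isqrt(num_procs),
        "-DSPLITTER_SELECTION_FIX=ON",
    ]


# Command line for a 3D, Hilbert-ordered build targeting 262144 processors.
print(" ".join(cmake_args(262144)))
```

Run the printed command from inside the build directory, then build with `make` as usual.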

## What can you run?

See the code in the example folder for how to get started.

# Scalability studies on Dendro-4.0

We have performed octree generation and partitioning on up to 262144 cores of ORNL's Titan supercomputer. We managed to partition 1.3x10^12 octants among 262144 processors within 4 seconds.

[SC16 Poster](http://sc16.supercomputing.org/sc-archive/tech_poster/poster_files/post245s2-file2.pdf)
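
To put the figures above in perspective, the short Python calculation below derives the average load per core and the aggregate rate. It uses only the numbers quoted in this section and treats the 4 seconds as an upper bound, so the rate is a lower bound.

```python
octants = 1_300_000_000_000   # 1.3x10^12 octants
cores = 262144                # 2^18 cores
seconds = 4.0                 # upper bound on the partitioning time

per_core = octants / cores    # average octants owned by each core
rate = octants / seconds      # aggregate octants partitioned per second (lower bound)

print("octants per core: %.2e" % per_core)
print("aggregate octants per second: %.2e" % rate)
```

That works out to roughly 5 million octants per core and at least 3.25x10^11 octants partitioned per second across the machine.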