Vite is an MPI+OpenMP implementation of Louvain method for (undirected)
graph clustering or community detection, that supports a number of heuristics
to improve the scalability. Vite can use both real-world and synthetic graph
(uses an in-memory random geometric graph generator).
We recently added an option to improve the distribution to
be more edge-balanced rather than standard vertex-based ("-b"),
which can avoid communication and improve execution time.
Relevant discussion in the following paper:
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8916299
If quality is important for you, please consider the coloring
options. Coloring will increase the overall execution time and
communication, but it is possible to combine coloring with
other heuristics to improve the performance. Relevant discussion
in the following paper:
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8547534
Vite can be downloaded from (please don't use the past
PNNL/PNL link to download Vite mentioned in the paper, the
current GitHub link is the correct one):
https://github.com/Exa-Graph/vite
This code requires an MPI library (preferably MPI-3 compatible)
and C++11 compliant compiler for building.
Please contact the following for any queries or support:
Sayan Ghosh, PNNL (sg0 at pnnl dot gov)
Mahantesh Halappanavar, PNNL (hala at pnnl dot gov)
Antonino Tumeo, PNNL (antonino.tumeo at pnnl dot gov)
Please '*' this repository on GitHub if the code is useful to you.
COMPILATION
Pass -DDEBUG_PRINTF if detailed diagonostics is required along
program run. This program requires OpenMP and C++11 support,
so pass -fopenmp (for g++)/-qopenmp (for icpc) and -std=c++11/
-std=c++0x. For Cray systems, use CC.
Pass -DUSE_32_BIT_GRAPH if number of nodes in the graph are
within 32-bit range (2 x 10^9), else 64-bit range is assumed.
Pass -DDONT_CREATE_DIAG_FILES if you dont want to create 2 files
per process with detail diagonostics.
Upon building, the program will generate two binaries:
bin/graphClustering (parallel) and bin/fileConvert (serial).
Please use bin/fileConvert for input graph conversion from
native formats to a binary format that bin/graphClustering will
be able to read.
Compiling on Intel KNL:
We made some modifications to the code to port it to Cray XC systems
with Intel KNL. We use lib-memkind to allocate some heavily used data
structures to KNL MCDRAM (for usage in flat-mode).
Please ensure lib-memkind module is loaded and pass
-DUSE_AUTOHBW_MEMALLOC, and pass -xmic-AVX512 instead of -xHost in
the Makefile.
Experimental device offload:
The modularity computation is the only arithmetic logic in this code.
Even though, due to the limited computation associated with the modularity
computation, it is doubtful how much GPU can help in improving the
performance relative to using OpenMP host pragmas. However, we have added
an option to enable OpenMP offload only for the modularity computation
kernel. Vite must be built with -DOMP_TARGET_OFFLOAD and a compiler that
supports OpenMP 4.5. The default compiler options in the Makefile needs to
be updated as well, for instance, on Summit, for enabling OpenMP offload
in the XL compiler, one would need to pass: "-qsmp=omp -qoffload" currently.
Initial testing suggests that for certain graphs, it is possible to achieve
about 10-15% improvement in performance relative to host OpenMP.
INPUT GRAPH CONVERSION
Convert input files to binary from various formats.
Possible options (can be combined):
-n : Input graph in SNAP format
-m : Input graph in Matrix-market format
-e : Input graph in METIS format
-p : Input graph in Pajek format
-d : Input graph in DIMACS format. Pass 0 or 1
to indicate undirected/directed graph
-s : Input graph is directed edge list
-u : Input graph is undirected edge list
(Can be used for Graph Challenge official
datasets - http://graphchallenge.mit.edu/)
-x : Read a number of files with edge list
information, usage e.g.:
-x " "
Requires conformant file names.
-i : Accept weights as is from the file. If this
option is not passed, then by default the
absolute value of the original weight is
chosen.
-r : Create random edge weights
-w : Initialize edge weights to 1.0
-o : Output binary file with full path
-z : If the index of input graph is 1-based,
then this option makes it 0-based
Note: DIMACS file could be directed or undirected, so
pass 0 if input is directed, or a number > 0 input
graph is undirected. Also, we expect DIMACS inputs to
have weights, therefore passing -r has no effect.
bin/./fileConvert -d 0 -f sample.gr -o sample.bin
Same as the one before, except initializes all weights
to 1, no matter what is in the input file.
Note: We consider files containing edge list, as 'simple'
formatted files (option "-s" for directed, and option "-u"
for undirected). If you have a weighted simple format file
(i.e., is u v w format) then you have to additionally
pass "-i".
If there are weights in such an edge-list file and for some
reason you do not want to consider them, pass "-w" or "-r"
in addition to "-s" or "-u" such that we can assign one-weights
or random-weights internally.
By default, we assume edge list files to not have any weights
(as in, just u v per line).
The -s or -u option is just to tell the implementation about
your native graph format (as in if you pass -s, we assume in
your edge list just u v exists and not its complement; whereas
when you pass -u, we will assume that both u v and v u
exists).
Regardless of the native format, internally, we always use an
undirected graph representation, because the algorithm itself
assumes an undirected input, and accordingly prints the number of
edges as double if your native graph was directed (on the other
hand, if your native graph was undirected, and for e.g., you used
-u to convert to binary, the #edges printed by the implementation
should exactly match with the #edges of your native graph).
Apart from the simple edge list formats and matrix-market formats,
we do not actively check the correctness of the other formats.
SYNTHETIC GRAPH GENERATION
RGG
Apart from real world graphs, users can use specific options
to generate a Random Geometric Graph (RGG) in parallel.
RGGs have been known to have good community structure:
https://arxiv.org/pdf/1604.03993.pdf
The way we have implemented a parallel RGG generator, vertices
owned by a process will only have cross edges with its logical
neighboring processes (each process owning 1x1/p chunk of the
1x1 unit square). If MPI process mapping is such that consecutive
processes (for e.g., p and p+1) are physically close to each other,
then there is not much communication stress in the application.
Therefore, we allow an option to add extra edges between randomly
chosen vertices, whose owners may be physically far apart. Relevant
discussion in paper:
https://ieeexplore.ieee.org/abstract/document/8641631
We require the total number of processes to be a power of 2 and
total number of vertices to be perfectly divisible by the number of
processes when parallel RGG generation options are used.
An n-D random geometric graph (RGG), is generated by randomly placing N
vertices in an n-D space and connecting pairs of vertices whose Euclidean
distance is less than or equal to d. We only consider 2D RGGs contained
within a unit square, [0,1]^2. We distribute the domain such that each
process receives N/p vertices (where p is the total number of processes).
Each process owns (1 * 1/p) portion of the unit square and d is computed
as (please refer to Section 4 of above paper for details):
d = (dc + dt)/2;
where, dc = sqrt(ln(N) / piN); dt = sqrt(2.0736 / piN)
Therefore, the number of vertices (N) passed during miniVite execution on p
processes must satisfy the condition -- 1/p > d.
Please note, the default distribution of graph generated from the in-built random
geometric graph generator causes a process to only communicate with its two
immediate neighbors. If you want to increase the communication intensity for
generated graphs, please use the "-e" option to specify an extra percentage of edges
that will be generated, linking random vertices. As a side-effect, this option
significantly increases the time required to generate the graph.
We also have bindings to NetworKit[*], that can help generate random and scale
free graphs, but this option is actively supported and may be purged from future
Vite versions. This is mostly serial, but NetworKit internally may use OpenMP.
Note: We have only tested with NetworKit v4.2, and have no plans on supporting
other NetworKit versions (we will most probably remove this option from
future releases).
-p <...> : Probability for edge creation (for ER graph)
-k <...> : Number of clusters for RGG (each unit square is divided into
clusters where edges are created within) OR attachments per
node for Barabasi-Albert graph
-m <...> : Initial nodes attached for Barabasi-Albert
Input file conversion to Vite binary format from the respective native
file formats: convert the input file to binary format using fileConvert.
Set the desired number of processes and threads.
Run distributed parallel Louvain algorithm (graphClustering binary)
using the binary file created in Step #0:
E.g., pass a binary file using "-f" option and include other heuristics:
mpiexec -n 2 bin/./graphClustering -i -f karate.bin
E.g., generate a random geometric graph by passing #vertices=64:
mpiexec -n 2 bin/./graphClustering -n 64
E.g., generate a random geometric graph by passing #vertices=64
and store the resultant binary graph:
mpiexec -n 2 bin/./graphClustering -n 64 -s rgg64.bin
E.g., read a previously generated random geometric binary graph:
mpiexec -n 2 bin/./graphClustering -f rgg64.bin
-c " <single-color-iteration?>" :
Enable coloring, adjacent vertices are processed
in different order. Synchronization happens once
per , so this option can significantly
increase execution time.
We use incomplete iterative
coloring algorithm (Jones-Plassmann), so passed colors
is just a hint, also pass whether a single-color-iteration
is expected, by default multiple iterations are invoked.
-d " <single-color-iteration?>" :
Enable coloring, adjacent vertices are processes
in different order. No communication overhead like
above, only local overhead of coloring subgraph.
-i : Threshold cycling, threshold changes dynamically
in every phase of Louvain algorithm.
-t 1 : If a vertex is in the same community for past 3
iterations, then consider it inactive.
-t 3 : If a vertex is in the same community for past 3
iterations, then consider it inactive. Also,
adds a communication step to gather inactive
vertices and terminate Louvain if >= 90% vertices
at an iteration are inactive.
-t 2 -a <0-1> : Early termination* using probability alpha.
-t 4 -a <0-1> : Early termination* using probability alpha. Also,
adds a communication step to gather inactive
vertices and terminate Louvain if >= 90% vertices
at an iteration are inactive.
-b : Only valid for real-world inputs. Attempts to
distribute approximately equal number of edges among
processes. Irregular number of vertices owned by a
particular process. Increases the distributed graph
creation time due to serial overheads, but may
significantly improve the overall execution time.
-o : Output communities into a file. This option will result
in Vite dumping the communities (community-per-vertex in
each line, total number of lines == number of vertices)
in a text file named .communities in
the same path as the input binary file.
-r : This is used to control the number of aggregators in MPI
I/O and is meaningful when an input binary graph file is
passed with option "-f".
naggr := (nranks > 1) ? (nprocs/nranks) : nranks;
-g : Pass a ground truth file for community comparison. We
expect the ground truth file to contain N lines (equal to
the total #vertices in the graph), while each line containing
a distinct vertex ID and associated community ID, separated by
a space or tab. Ground truth community comparison is performed
in a single node, and it uses OpenMP to parallelize. It may take
a substantial amount of time for large files.
-z : Only applicable if "-g " option is passed. This tells us
that the passed ground truth file is 1-based. If this option is
not passed, we assume the ground truth to the 0-based.
-n <|V|> : Generate graph in memory (uses a parallel Random Geometric Graph
generator).
-e <%> : Used in conjunction with the "-n <|V|>" option to generate RGG.
This option tells the percentage of edges to be added, randomly
connecting vertices across processes. Currently, the maximum number
of randomly added edges to RGG cannot exceed INT_MAX (there is no
check to determine this, so exercise caution).
-p : Run a single phase of Louvain.
-s : Only applicable to the generated graphs (for e.g., -n <|V|>),
output is a binary file as per the output file path.
18: -j : Just process (read or generate) the graph and exit without running
community detection.
By default, the coloring method is run for multiple iterations, to make it run for
just a single iteration, pass "true". For e.g.:
mpiexec -n 2 bin/graphClustering -c "4 true" -f karate.bin
*Note: Option to update inactive vertices percentage is defined as
macro ET_CUTOFF in louvain.hpp, and the default is 2%.
COMPARING COMMUNITIES WITH GROUND TRUTH DATA
Generate benchmark graphs with ground truth
community information. For e.g., we have tested
with LFR benchmark.
(https://sites.google.com/site/santofortunato/inthepress2)
LFR generates an edge list (index 1-based), along with community
membership of each vertex.
Convert the generated network to binary using
fileConvert binary.
Execute the program as shown in the e.g. below:
mpiexec -n 2 bin/./graphClustering -r 1 -f lfrtest.bin -z -g community.dat
The difference between standard way to execute the program and this way
is that one needs to pass the ground truth vertex-community mapping
using the "-g" option. Also, the "-z" option tells the program that
the input ground truth file -- community.dat is 1-based. If "-z" is
not passed, then we assume that the file is 0-based.
We return some statistics such as F-score, Gini-coefficient,
false positives/negatives and true positives. We expect that the isolated
vertices will be placed into their respective communities, and not into
a single one.
However, extra communication per phase is required when these metrics
are to be computed, so the program execution time will increase
significantly. Plus, the community comparison part is currently
multithreaded (uses OpenMP), so everything is brought to a single
node/PE and then communities are compared.
OUTPUT RESULT
If -DDONT_CREATE_DIAG_FILES is passed during compilation,
then output is send to stdout.
Otherwise, the output result is dumped per process on files
named as dat.out.. Check dat.out.0 to review
program diagonostics.
Output files are cleared with: make clean.
FAQs
Unknown package
Package last updated on 04 Nov 2021
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.