Graph computation

Engines for calculating things on or about graphs. Trivially, that covers everything, but usually when we talk about graph computation we mean problems that are more simply or more elegantly represented as graphs, which usually implies some kind of sparsity in the edges.

For specific applications of such computations, see e.g. graphical models, complex networks, or inference on social graphs. Sometimes we use a linear algebra representation of a graph connectivity pattern, e.g. a graph Laplacian, and in that context some graph algorithms can be represented as matrix factorisation or similar problems.

A prime example of such a thing is Laplacians.jl (Julia):

Laplacians is a package containing graph algorithms, with an emphasis on tasks related to spectral and algebraic graph theory. It contains (and will contain more) code for solving systems of linear equations in graph Laplacians, low stretch spanning trees, sparsification, clustering, local clustering, and optimization on graphs.

All graphs are represented by sparse adjacency matrices. This is both for speed, and because our main concerns are algebraic tasks. It does not handle dynamic graphs. It would be very slow to implement dynamic graphs this way.
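
To make the sparse-matrix view concrete, here is a minimal pure-Python sketch of a graph Laplacian L = D - A stored row by row as dictionaries, so only the nonzero entries take up memory. This is not the Laplacians.jl API; the names `laplacian_from_edges`, `matvec` and `quadratic_form` are hypothetical helpers invented for this illustration, and the graph is assumed to be undirected with no self-loops.

```python
from collections import defaultdict


def laplacian_from_edges(n, edges):
    """Build the Laplacian L = D - A of an undirected weighted graph.

    n is the number of vertices, labelled 0..n-1. edges is an iterable of
    (i, j, weight) triples. The result is a list of sparse rows, each a dict
    mapping column index -> value (an assumed storage format for this sketch).
    """
    lap = [defaultdict(float) for _ in range(n)]
    for i, j, w in edges:
        if i == j:
            raise ValueError("self-loops are not supported in this sketch")
        lap[i][j] -= w  # off-diagonal entries hold -A
        lap[j][i] -= w
        lap[i][i] += w  # diagonal entries hold the weighted degree D
        lap[j][j] += w
    return lap


def matvec(lap, x):
    """Multiply the sparse Laplacian by a dense vector x."""
    return [sum(w * x[j] for j, w in row.items()) for row in lap]


def quadratic_form(lap, x):
    """Compute x^T L x, which equals the sum of w_ij * (x_i - x_j)^2 over edges."""
    return sum(xi * yi for xi, yi in zip(x, matvec(lap, x)))


# Path graph 0 - 1 - 2 - 3 with unit edge weights.
path_edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
L = laplacian_from_edges(4, path_edges)
```

Constant vectors lie in the null space of a Laplacian, so `matvec(L, [1.0, 1.0, 1.0, 1.0])` is all zeros. The sketch also hints at why dynamic graphs are awkward in real sparse formats: compressed column or row storage packs the entries contiguously, so inserting an edge generally means rebuilding arrays rather than updating one dictionary entry.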

The other options that follow are more general creatures than Laplacians.jl.

  • Cytoscape

    Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data. A lot of Apps are available for various kinds of problem domains, including bioinformatics, social network analysis, and semantic web.

  • SNAP.py is engineered to do Stuff To Large Networks in Python. It plugs into snapvx, a ‘nearly-convex’ solver for graph-like data.

  • Apache Giraph is a large scale graph processor.

    Apache Giraph is an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their connections. Giraph originated as the open-source counterpart to Pregel, the graph processing architecture developed at Google and described in a 2010 paper. Both systems are inspired by the Bulk Synchronous Parallel model of distributed computation introduced by Leslie Valiant. Giraph adds several features beyond the basic Pregel model, including master computation, sharded aggregators, edge-oriented input, out-of-core computation, and more. With a steady development cycle and a growing community of users worldwide, Giraph is a natural choice for unleashing the potential of structured datasets at a massive scale.

  • neo4j is the previous season’s hot graph query system.

  • graphlab, now a Turi product, has “graph” in its name. I think it tries to solve general non-graphish ML problems but also… solves graphish ones? Do they also run a cloud? Or something? Maybe someone should actually read their website. I clearly didn’t.

  • ASAP presents some background on graph pattern mining versus graph analysis.

    Today, a deluge of graph processing frameworks exist, both in academia and open-source. These frameworks typically provide high-level abstractions that make it easy for developers to implement many graph algorithms. A vast majority of the existing graph processing frameworks however have focused on graph analysis algorithms. These frameworks are fast and can scale out to handle very large graph analysis settings: for instance, GraM can run one iteration of page rank on a trillion-edge graph in 140 seconds in a cluster. In contrast, systems that support graph pattern mining fail to scale to even moderately sized graphs, and are slow, taking several hours to mine simple patterns.

  • Gephi is a classic graph visualizer, which I have played with a lot, although I never found a use for it outside of pretty pictures.

    Gephi is the leading visualization and exploration software for all kinds of graphs and networks. Gephi is open-source and free.
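
The Bulk Synchronous Parallel model behind Pregel and Giraph, and the "one iteration of page rank" mentioned in the ASAP quote, can be illustrated with a single-process toy. The sketch below is not Giraph's API (Giraph is a Java system) and says nothing about how GraM is implemented; `pregel_pagerank` is a hypothetical name, the damping factor of 0.85 is a conventional default chosen here, and every vertex is assumed to appear as a key and to have at least one out-edge.

```python
def pregel_pagerank(out_edges, damping=0.85, supersteps=30):
    """PageRank in a vertex-centric, bulk-synchronous style.

    out_edges maps each vertex to a list of its out-neighbours. Assumption
    for this sketch: every vertex is a key of out_edges and has at least one
    out-edge, so dangling vertices need no special handling.
    """
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    for _ in range(supersteps):
        # Phase 1: every vertex sends an equal share of its rank
        # along each of its out-edges.
        inbox = {v: [] for v in out_edges}
        for v, targets in out_edges.items():
            share = rank[v] / len(targets)
            for t in targets:
                inbox[t].append(share)
        # Barrier: all messages are delivered before any vertex updates.
        # Phase 2: every vertex recomputes its value from its own inbox only.
        rank = {
            v: (1.0 - damping) / n + damping * sum(msgs)
            for v, msgs in inbox.items()
        }
    return rank


# A tiny directed graph: a -> b, a -> c, b -> c, c -> a.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
toy_ranks = pregel_pagerank(toy_graph)
```

Each pass of the loop is one superstep. In a distributed engine the two phases run in parallel across machines and the barrier is a real synchronisation point, which is where most of the engineering effort goes.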

Otasek, David, John H. Morris, Jorge Bouças, Alexander R. Pico, and Barry Demchak. 2019. “Cytoscape Automation: Empowering Workflow-Based Network Analysis.” Genome Biology 20 (1): 185. https://doi.org/10.1186/s13059-019-1758-4.

Shannon, Paul, Andrew Markiel, Owen Ozier, Nitin S. Baliga, Jonathan T. Wang, Daniel Ramage, Nada Amin, Benno Schwikowski, and Trey Ideker. 2003. “Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks.” Genome Research 13 (11): 2498–2504. https://doi.org/10.1101/gr.1239303.