Visualizing networks with ggplot2 in R

When I had to visualize some network data for my social network analysis class last semester, I wasn’t happy with the plot function in R’s sna package. It is not very flexible and offers little control over the appearance of the figure. So I decided to write a small function that visualizes network data with the ggplot2 engine.

The biggest challenge in network visualization is usually coming up with coordinates for the nodes in two-dimensional space. The sna package provides a set of layout functions that compute coordinates that are optimal with respect to some criterion. Two of the most prominent algorithms (Fruchterman and Reingold’s force-directed placement and Kamada and Kawai’s) are implemented in the sna functions gplot.layout.fruchtermanreingold and gplot.layout.kamadakawai. Both can be used in my function below.
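
As a quick illustration of what these layout functions return, here is a minimal sketch. The 10-node random graph built with sna's rgraph() is only a stand-in for real data, and the seed and tie probability are arbitrary choices.

```r
library(sna)

set.seed(1)
# Random undirected graph as a 10 x 10 adjacency matrix (illustrative data only)
m <- rgraph(10, tprob = 0.3, mode = "graph")

# Each layout function takes an adjacency matrix and a list of layout
# parameters (NULL = defaults) and returns one row of (x, y) per node
fr <- gplot.layout.fruchtermanreingold(m, NULL)
kk <- gplot.layout.kamadakawai(m, NULL)

dim(fr)
dim(kk)
```

Both calls return a two-column matrix, which is why the function below can swap one algorithm for the other without touching the rest of the code.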

In the first part of the function, the layout function calculates the coordinates of every node in two-dimensional space. In lines 14 to 18, the function combines the node coordinates with the edge list to obtain the coordinate pairs that define the edges of the network.
In the middle part, the data are passed to ggplot and used to plot the nodes (a set of points) and the edges (a set of segments). In lines 26 to 30, I discard the default grid and other default layout elements of the ggplot figure. The last part of the code generates a random network and passes it to the plot function.
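
The step that turns node coordinates and an edge list into segment endpoints can be sketched in base R alone. The four-node layout and the three edges here are made-up toy values, not output from a layout algorithm.

```r
# Toy layout: one row of (x, y) coordinates per node (values made up)
coords <- data.frame(x = c(0, 1, 1, 0), y = c(0, 0, 1, 1))

# Toy edge list: each row holds the ids of the two nodes an edge connects
edgelist <- matrix(c(1, 2,
                     2, 3,
                     3, 4), ncol = 2, byrow = TRUE)

# Index the coordinates by the first and the second column of the edge list
# and bind the two results side by side: one row per edge
segments <- data.frame(coords[edgelist[, 1], ], coords[edgelist[, 2], ])
colnames(segments) <- c("X1", "Y1", "X2", "Y2")
rownames(segments) <- NULL

print(segments)
```

Each row of segments now holds the start point (X1, Y1) and the end point (X2, Y2) of one edge, which is the shape geom_segment() expects.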

library(network)
library(ggplot2)
library(sna)
library(ergm)


plotg <- function(net, value=NULL) {
	m <- as.matrix.network.adjacency(net) # get sociomatrix
	# get coordinates from Fruchterman and Reingold's force-directed placement algorithm.
	plotcord <- data.frame(gplot.layout.fruchtermanreingold(m, NULL)) 
	# or get them from Kamada-Kawai's algorithm: 
	# plotcord <- data.frame(gplot.layout.kamadakawai(m, NULL)) 
	colnames(plotcord) = c("X1","X2")
	edglist <- as.matrix.network.edgelist(net)
	edges <- data.frame(plotcord[edglist[,1],], plotcord[edglist[,2],])
	plotcord$elements <- as.factor(get.vertex.attribute(net, "elements"))
	colnames(edges) <-  c("X1","Y1","X2","Y2")
	edges$midX  <- (edges$X1 + edges$X2) / 2
	edges$midY  <- (edges$Y1 + edges$Y2) / 2
	pnet <- ggplot()  + 
			geom_segment(aes(x=X1, y=Y1, xend = X2, yend = Y2), 
				data=edges, size = 0.5, colour="grey") +
			geom_point(aes(X1, X2,colour=elements), data=plotcord) +
			scale_colour_brewer(palette="Set1") +
			scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL) +
			# discard default grid + titles (theme()/element_*() replace the removed opts()/theme_*())
			theme(panel.background = element_blank()) + theme(legend.position="none")+
			theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +
			theme( legend.background = element_rect(colour = NA)) + 
			theme(panel.background = element_rect(fill = "white", colour = NA)) + 
			theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
	return(print(pnet))
}


g <- network(150, directed=FALSE, density=0.03)
classes <- rbinom(150,1,0.5) + rbinom(150,1,0.5) + rbinom(150,1,0.5)
set.vertex.attribute(g, "elements", classes)

plotg(g)

I was too lazy to make this function more general (and user-friendly). For most practical purposes it therefore needs to be modified to produce pretty visualizations, but I hope it nevertheless provides a useful jumping-off point for others. Some of my plots from the class are below. I included them to show the flexibility gained by using the ggplot2 engine instead of sna’s default plot function. Unfortunately I can’t post the data for these networks.

Update: Another interesting approach to visualizing geocoded network data in R is explained on the FlowingData blog.


26 Comments on “Visualizing networks with ggplot2 in R”

  1. Adam says:

    This is a great function. I wish you’d pursue it. The igraph, SNA, and network packages all lack the beauty of ggplot2. I’d like to see a pure R alternative to Gephi, actually. =)

  2. Adam says:

    I don’t suppose you could quickly walk me through how you added vertex labels, could you?
    I’d understand the setting part as…

    set.vertex.attribute(g, "label", labelvector)
    or
    set.vertex.attribute(g, "names", labelvector)

    But what needs to change in the plotg(g) function call?
    plotg(g, labels=???)?

  3. mm says:

    adam, there is no way to inject labels into the function above. you need to modify it slightly if you want to see labels in the plot. a simple way would be to add a text layer after the point layer (after the geom_point(...) line), e.g. geom_text(aes(X1, X2, label=seq(1,length(X2))), data=plotcord). this will plot the IDs on top of the nodes. if you want to supply labels directly, substitute the seq(...) in geom_text() with a variable and pass it through plotg()
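
    The text-layer suggestion above could look roughly like this. It is a sketch on a toy data frame standing in for plotcord; the label column and its values are hypothetical, and in plotg() the labels would have to arrive through an extra argument.

```r
library(ggplot2)

# Toy stand-in for the plotcord data frame built inside plotg()
plotcord <- data.frame(X1 = c(0, 1, 0.5), X2 = c(0, 0, 1))
# Hypothetical labels; inside plotg() these would come from a new argument
plotcord$label <- c("a", "b", "c")

p <- ggplot() +
	geom_point(aes(X1, X2), data = plotcord) +
	# text layer added after the point layer; vjust = -1 shifts labels upwards
	geom_text(aes(X1, X2, label = label), data = plotcord, vjust = -1)

print(p)
```

    Storing the labels as a column of the data frame keeps them aligned with the node coordinates.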

  4. Ana says:

    Hi,
    I am using your code but I need to make a slight alteration. First, I need to scale the sizes of the vertices using a vector of probabilities, and then use a different probability vector to modify the thickness of the edges. Unfortunately, none of the ggplot2 scaling options seem to do the job. I always get a very big mismatch between edge and vertex sizes (scale_area, scale_size etc. did not work). In the end I got something, but my vertex sizes are not scaled well, that is, the difference between the smallest and largest vertex is almost lost. I really do not know how to fix this. Do you have any ideas?

  5. sumtxt says:

    ana, i think your problem is more about ggplot2 than about the function above. did you check the new package documentation website http://docs.ggplot2.org/current/?

    • Ana says:

      Hi, thanks for your reply. It is fixed now. The frustrating thing with ggplot2 is that it has many many additions but such poor explanations and sometimes it is not nice to spend hours and hours on one plot. However, the fact that I can easily assign text where I want it is the reason why I am still sticking to ggplot2 ;-)
      Anyways, I am happy to report that I got there in the end. The plot is so cute and I am really pleased with it. Big thank you for sharing this code. It was really a life saver for me.

  6. Fr. says:

    Here’s my take at the function. There’s room for improvement, e.g. support other placement modes, improve how arguments are passed, better labels, etc. Supporting more than one placement mode should be achievable through do.call(). I would gladly take suggestions on how to straighten up the arguments.

    • sumtxt says:

      great job!

      • Fr. says:

        Thanks! I’m working on making the function robust to node coloring (not yet done). It is already robust to node weighting and now supports all placement methods.

        With your agreement, and after a few more edits, I can try submitting it for inclusion into the GGally package, with both of us as authors. Let me know :)

      • sumtxt says:

        sure, absolutely, do that. :-) i am currently super busy, but I try to take a closer look at the code in the next days. however, that shouldn’t stop you from going ahead. thx for your efforts, i am sure that there are many potential users out there.

      • Fr. says:

        Okay, will do!

  7. […] cover the simplest cases with simulated data and clearly show the lineage from Moritz Marbach’s original code […]

  8. Fr. says:

    Quick update: I have submitted the function to GGally, with two working examples using parliamentary networks. There are more examples in the documentation and at the Polit’bistro blog (see pingback link, in French).

    The function has been starred a number of times on GitHub and has generated some helpful comments. Someone has suggested supporting bipartite networks, which is more or less what your midpoint edge labels seem to do, but I have too little experience with bipartite data to make that work (although this post explains a lot of stuff).

    My hope is that the GGally maintainer will accept the pull request and push the new version of his package to CRAN. His work on GGally is slow-paced, though.

    • Fr. says:

      Update: ggnet is part of the GGally package. It’s on GitHub and should get to CRAN at some point. Please let me know if you want me to change anything to it. I have not worked on edge labels or weights, but I have worked on making it work with bipartite plots (with the help of Pedro Jordano).

      It’s a nice function, and there’s already a few stargazers on GitHub :)

  9. Reblogged this on pmakui and commented:
    Great article of visualizing networks with ggplot2

  10. How would one use the faceting capability of the ggplot2 package with this function?

    • sumtxt says:

      I never used the faceting… You might have to modify the function a little. Pls share your code if you figured it out.

      • Thank you for the prompt reply. Working to see if it works since it will help me to facet my network data in one plot according to regions.
        Your code is really interesting. Thank you for sharing.

      • Fr. says:

        Not sure how you might facet a network object: you’d have to extract the subgraph first. It’s doable, but it’s probably not something that ggplot2 can do on its own from the information that is passed to it by the function, and even less so from a facet() formula.

  11. Just played with the above function by adding facet_wrap(~elements), and the results were interesting: it creates four different network maps, each containing the elements that belong to one group, but the edges are not affected.

    • Fr. says:

      That makes sense — the node placement comes from a single data frame, so you can facet that, but the edges are in another data frame.

      Furthermore, the node placement in your faceted graphs are going to be incorrect: they reflect the whole network, not the subgraph in the facet.

      Perhaps you want to play with grid, store four different graphs, and then have them arranged in squares?

      • Thanks for the insight. However, the grid.arrange() function from the gridExtra package only plots ggplot objects; after using the above function to plot the graphs individually and passing them to grid.arrange(), it gives the error “inputs must be grobs”.

      • Fr. says:

        Try the vplayout approach used in this plot function.
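
        The plot function linked above is not reproduced here, but the vplayout pattern usually refers to a small helper around grid viewports. The following is a sketch under that assumption, with two toy ggplot objects in place of network plots.

```r
library(ggplot2)
library(grid)

# Helper conventionally named vplayout: selects one cell of the current grid layout
vplayout <- function(row, col) {
	viewport(layout.pos.row = row, layout.pos.col = col)
}

# Two toy plots standing in for two network plots
d <- data.frame(x = 1:3, y = c(2, 1, 3))
p1 <- ggplot(d, aes(x, y)) + geom_point()
p2 <- ggplot(d, aes(x, y)) + geom_line()

# Open a fresh page, split it into a 1 x 2 layout, and print each plot into a cell
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(p1, vp = vplayout(1, 1))
print(p2, vp = vplayout(1, 2))
```

        To use this with plotg(), the function would presumably have to return the ggplot object without printing it, so that each plot can be printed into its own viewport.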

  12. […] Marbach has a great post here explaining how to easily get ggplot2 up and running with network data. It was still one of the […]

