IR dyad-year dataset with 10 lines of dplyr + tidyr code
Posted: July 29, 2015
Directed (country)-dyad-year datasets are quite common in International Relations (IR) research, and from time to time one needs an empty one. Creating one used to be a hassle requiring more than 50 lines of code (see example in Perl or R). Luckily, with tidyr, dplyr, and piping, building such a dataset takes at most 10 lines of code:
system <- read.csv("./raw data/states2011.csv")
system <- system %>% select(ccode, styear, endyear)
system <- system %>% expand(statea=ccode, stateb=ccode, year=seq(1816,2011)) %>%
  filter(statea!=stateb) %>%
  left_join(., system, by=c("statea"="ccode")) %>%
  filter(year >= styear & year <= endyear) %>%
  select(-styear,-endyear) %>%
  left_join(., system, by=c("stateb"="ccode")) %>%
  filter(year >= styear & year <= endyear) %>%
  select(-styear,-endyear)
First, the code block creates all possible country-country-year pairings (line 3) and then filters out the dyad-years which are inadmissible either because a) the dyad involves the same country (line 4) or b) at least one of the dyad-members does not exist in a particular year (lines 5-10).
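To see the filtering logic at work without downloading anything, here is a minimal sketch that runs the same pipeline on a tiny made-up membership table. The data frame `toy`, its three country codes, and its years are hypothetical and only serve as illustration; the column names (`ccode`, `styear`, `endyear`) mirror the ones used above, and the sketch assumes dplyr and tidyr are installed.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy membership table (invented codes and years, not COW data):
# state 1 exists 2000-2002, state 2 exists 2001-2002, state 3 exists in 2000 only.
toy <- data.frame(
  ccode   = c(1, 2, 3),
  styear  = c(2000, 2001, 2000),
  endyear = c(2002, 2002, 2000)
)

dyads <- toy %>%
  # all country-country-year combinations (3 x 3 x 3 = 27 rows)
  expand(statea = ccode, stateb = ccode, year = seq(2000, 2002)) %>%
  # drop dyads that pair a country with itself
  filter(statea != stateb) %>%
  # keep only years in which state A is a system member
  left_join(toy, by = c("statea" = "ccode")) %>%
  filter(year >= styear & year <= endyear) %>%
  select(-styear, -endyear) %>%
  # keep only years in which state B is a system member
  left_join(toy, by = c("stateb" = "ccode")) %>%
  filter(year >= styear & year <= endyear) %>%
  select(-styear, -endyear)

print(dyads)
```

With this toy table, six directed dyad-years should remain: states 1 and 3 in 2000, and states 1 and 2 in 2001 and 2002, each in both directions.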
Data Source: Correlates of War System Membership 2011