What is EUGene, anyway?


EUGene is a program designed primarily for political scientists.  It has two purposes.  First, EUGene generates data for variables used to test Bruce Bueno de Mesquita and colleagues’ version of an expected utility theory of war and dispute initiation (Bueno de Mesquita, 1981, 1985;  Bueno de Mesquita and Lalman, 1992).  Second, EUGene serves as a data management tool for creating data sets for use in the quantitative analysis of international relations, with the country-year, directed-dyad-year, non-directed-dyad-year, or directed-dispute-dyad-year as the unit of analysis.  Until now, these data have been unavailable, and these data management tasks have frequently been cumbersome and difficult.  A paper exploring EUGene's capabilities and rationale is available here.

EUGene is an acronym for Expected Utility Generation and data management program.  EUGene is freeware, but is copyrighted.  A full download (see the download page) contains the program, expected utility data, documentation, and source code.


Key Features:


Expected Utility Theory Data Generation

Bueno de Mesquita and Lalman’s so-called expected utility theory of war has become one of the most important theories of international conflict.  Following the most recent explication, a more appropriate label might be “the War and Reason game-theoretic theory of war,” since War and Reason (Bueno de Mesquita and Lalman 1992) explicitly models game-theoretic interactions between states, and since the “international interaction” game in War and Reason represents only one of many possible games of international conflict that could be constructed.

Whatever the label, the testing of expected utility theory has lagged behind its theoretical development.  Even though it is one of the most widely cited theories of international relations, in its most sophisticated formulation the theory has been tested only on 707 dyad-years, all drawn from Europe between 1815 and 1970.  Testing has been limited because the data needed for wider analysis (namely, risk attitude scores and utility values for all states and years) have not been available.

EUGene generates expected utility data for all dyads and years.  EUGene combines the 1992 methodology of War and Reason with an easy-to-use program to calculate expected utility values.  The program both generates expected utility data and predicts the International Interaction game equilibrium in any given dyad-year.  In addition, the program provides users with options for modifying expected utility calculations.  EUGene is the first program to implement Bueno de Mesquita and Lalman's (1992) methodology to generate data for the full population of cases in which we are interested as international relations scholars.

Data Management capabilities

EUGene simplifies a number of cumbersome tasks associated with building data sets for the quantitative analysis of international relations, especially data sets created with the directed dyad-year as the unit of analysis.  An example of a directed dyad-year is the US vs. the USSR in 1946.  Scholars have increasingly used data sets based on the dyad-year (both directed and non-directed) to conduct quantitative analyses, both because dyadic interaction is believed to be at the heart of strategic international behavior and because dyadic data make it possible to combine explanations from multiple levels of analysis in one quantitative study.  Nevertheless, creating dyadic data sets is an onerous task for most researchers.  On the independent variable side, creating dyadic data sets involves merging data and renaming variables from multiple monadic data sets.  On the dependent variable side, the most common data sets with international conflict events (the Correlates of War Militarized Interstate Dispute data set and Interstate War data set) are not organized in dyadic form and must be converted into dyadic interactions; such conversions are not always straightforward.
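The merge-and-rename step on the independent variable side can be illustrated with a minimal Python sketch.  This is not EUGene's own code; the function name `directed_dyad_years`, the state labels, the variable name `cap`, and its values are all made up for illustration.  It shows how country-year (monadic) records expand into directed dyad-years, with each monadic variable copied once for each side of the dyad.

```python
from itertools import permutations

# Made-up monadic (country-year) records. The "cap" values are placeholders,
# not real capability scores, and the state labels are illustrative only.
monadic = [
    {"state": "USA", "year": 1946, "cap": 0.30},
    {"state": "USSR", "year": 1946, "cap": 0.20},
    {"state": "UK", "year": 1946, "cap": 0.10},
]


def directed_dyad_years(records, variables):
    """Expand country-year records into directed dyad-year rows.

    Every ordered pair of distinct states in the same year becomes one row,
    so (USA, USSR) and (USSR, USA) are separate cases. Each monadic variable
    is renamed with a _1 or _2 suffix for the first and second state.
    """
    by_year = {}
    for rec in records:
        by_year.setdefault(rec["year"], []).append(rec)

    rows = []
    for year in sorted(by_year):
        for first, second in permutations(by_year[year], 2):
            row = {"state1": first["state"], "state2": second["state"], "year": year}
            for var in variables:
                row[f"{var}_1"] = first[var]
                row[f"{var}_2"] = second[var]
            rows.append(row)
    return rows


dyads = directed_dyad_years(monadic, ["cap"])
for row in dyads:
    print(row)
```

Three states in one year yield six directed dyad-years.  A non-directed version would keep only one row per unordered pair, for example by using `itertools.combinations` in place of `permutations`.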

EUGene reads, merges, and outputs data from several of the most important data sets in international relations.  In addition to simply merging the data files, EUGene outputs those data in a uniform format that can be imported into any statistical analysis package with ease.  Some of the input data sets have the country-year as the unit of analysis (e.g., the Correlates of War national capability data, Gurr Polity data).  Other data sets have the dyad as the unit of analysis, such as data about the physical distance between states, or the Correlates of War contiguity data set.  Still other data come in a hybrid form, such as the Correlates of War alliance data set, which has the country-year as its unit of analysis in the data set structure, but which really contains data that are dyadic and annual in their underlying form.  Finally, some of the input data sets use multiple data set structures, such as the Correlates of War militarized interstate dispute data set, which comes as three files, one containing country-dispute level records, and two containing dispute-level records.  EUGene carries out the necessary conversions between the formats, file structures, and differing units of analysis of these data sets.

EUGene also allows users to specify subsets of countries, years, and a variety of variables for output.  Data sets are saved in a text format that can be easily read into other programs for statistical analysis.  EUGene also creates command files that SPSS, Stata, and LIMDEP can execute to read in the data sets it creates.
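As a rough illustration of this kind of case and variable selection, the Python sketch below filters a few directed dyad-year rows by year and writes the chosen variables as delimited text.  It is only a sketch under stated assumptions: the helper names `select_rows` and `to_text`, the comma delimiter, the column names, and all values are hypothetical and do not describe EUGene's actual output layout.

```python
import csv
import io

# Made-up directed dyad-year rows; the values are placeholders, not real data.
rows = [
    {"state1": "USA", "state2": "USSR", "year": 1946, "cap_1": 0.30, "cap_2": 0.20},
    {"state1": "USSR", "state2": "USA", "year": 1946, "cap_1": 0.20, "cap_2": 0.30},
    {"state1": "USA", "state2": "USSR", "year": 1947, "cap_1": 0.31, "cap_2": 0.21},
]


def select_rows(rows, states=None, first_year=None, last_year=None):
    """Keep rows whose states and year fall inside the requested subset."""
    selected = []
    for row in rows:
        if states is not None and not {row["state1"], row["state2"]} <= set(states):
            continue
        if first_year is not None and row["year"] < first_year:
            continue
        if last_year is not None and row["year"] > last_year:
            continue
        selected.append(row)
    return selected


def to_text(rows, variables):
    """Write only the chosen variables as comma-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=variables, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


subset = select_rows(rows, first_year=1946, last_year=1946)
text = to_text(subset, ["state1", "state2", "year", "cap_1"])
print(text)
```

A plain delimited file with a header row is one format that statistical packages can generally import, which is why the sketch uses it; the same rows could be written with a different delimiter by passing `delimiter` to `csv.DictWriter`.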

The tasks that EUGene carries out can be (and have been) executed in other software programs.  However, it is cumbersome and time-consuming to repeatedly program large numbers of merge and case selection commands in other programs.  For some (namely those not familiar with SPSS or Stata syntax), data merging becomes a manual process of cut and paste.  We believe that the set of options provided with EUGene significantly simplifies the task of building data sets containing information from multiple inputs, allowing analysts to spend less time merging data and more time performing analysis.


EUGene Copyright

EUGene Copyright © 1997-2016, D. Scott Bennett and Allan C. Stam.  All Rights Reserved