JPDA: Just Plain Data Analysis
(this section is under construction)
Just plain data analysis is the compilation and presentation of numerical information to support and illustrate arguments about politics and public issues.
In recent years the discipline of political science has seen a resumption of the battles waged during the behavioral revolution of the 1950s. In the spring of 2000, an email message written under the nom de guerre “Mr. Perestroika” initiated an attack on what was seen as the "hegemony" of hard science methodology and the suppression of case study and qualitative analyses in the discipline's journals and many of its graduate programs. The most serious charges were that the emphasis on hard quantitative science in the discipline's most prestigious journals and many of its graduate programs did not reflect the diversity or pluralism of the discipline, and that the research often addressed only apolitical and trivial subject matters, those most amenable to the hard methodology. In political science, “hard science” has come to mean two somewhat disparate research enterprises: quantitative research taking an increasingly econometric and statistical approach, and formal modeling (a.k.a. public choice theory, rational choice, or, disparagingly, “rat choice”).
While Perestroikans have pursued the cause of methodological pluralism (a concept laden with many meanings in the writings of political scientists), those on the hard science side have struggled to find a unifying methodology that would make the discipline more like, say, economics or psychology. The two most recent attempts have been Gary King’s effort to articulate the common principles of quantitative and qualitative research (King 1989; King, Keohane, and Verba 1994) and the development of the “Empirical Implications of Theoretical Models” (EITM) approach designed to encompass both quantitative and rational choice methods (National Science Foundation, 2002).
The stakes in this conflict are mostly academic, which is to say trivial, but they do involve issues that are crucially important to many political scientists: which research gets published, which scholars get hired for tenure-track positions, and what types of courses are required of students in graduate and undergraduate political science study.
What most of this debate misses, however, is that there is a quantitative political science methodology, at least a century old (cf. Allen 1906), common to almost every field of political science, practiced by political scientists in academia, government, and the private sector, and found in most political science textbooks and in much of the most widely read literature in the discipline. I call it Just Plain Data Analysis (JPDA).
What is JPDA?
JPDA is, simply, the
compilation and presentation of numerical information to support and illustrate
arguments about politics and public issues.
JPDA differs from
what is commonly regarded as quantitative political science methodology in that
it usually does not involve formal tests of theories, hypotheses or null
hypotheses. It abjures measures of association and statistical significance.
Rather than relying on statistical analysis of a single dataset, JPDA -- at its
best -- involves compiling all the relevant evidence from multiple data
sources. Although the utility of any methodology or statistical procedure is best judged in the context of the research questions and issues being addressed, JPDA has many advantages over standard hard quantitative analysis, and there are many instances where political scientists would be better off using just plain data analysis and forgoing elaborate regression-based analyses of a single dataset. The foremost advantage of JPDA is that it is accessible to a broad audience: while most regression-based analyses of empirical models rest on elaborate and often unexamined assumptions, plain data analysis is transparent.
This is not to say that plain data analysis does not
involve empirical analysis of political science theories.
Examples of JPDA in political science:
Just plain data analysis has a long tradition in political science; it was essential to what political scientists meant when they first used the term "science" to define what their discipline was all about.
[Figure I.1: Allen (1906)]
Perhaps the first analysis of the impact of technology on elections was Philip Loring Allen’s 1906 study of the effects of variations in state balloting procedures. Allen developed measures, still in use today, of split-ticket voting and what would later be called “roll off” to evaluate the fairness and reliability of a variety of voting procedures. Writing in the tradition of the progressive reform movement that was a force in both American politics and American political science at the time, Allen recognized that voting technology was not necessarily politically neutral and was especially critical of voting mechanisms that favored the “spoilsmen” over the independent voter, while acknowledging that these technologies had disparate impacts on the educated and the uneducated.
[Figure I.2: Nie et al. (1976)]
Much of the earliest "hard" quantitative research in political science involved analyses of American voting behavior, made possible by the availability of the American National Election Surveys and the development of the computer equipment and software used to analyze them in the 1960s and 1970s. One of the classic studies of American voting behavior, The Changing American Voter (1976), an excellent example of just plain data analysis, summarized previous quantitative analyses and defined the context for much of the quantitative research that was to come. The high quality of the book's tabular and graphical presentations of data, at a time when charts and graphs were drawn by hand, set a standard for the discipline.
[Figure I.3: Putnam (2000)]
Robert Putnam's Bowling Alone (2000), a work that has probably inspired more conference papers and journal articles across more of the discipline's subfields than any other piece of political science, is another good example of plain data analysis. Almost all of Putnam’s analysis is grounded in some kind of presentation of quantitative data, drawn from a wide variety of sources and presented in charts and graphs. Putnam describes his strategy as attempting to “triangulate among as many independent sources of information as possible” based on the “core principle” that “no single source of data is flawless, but the more numerous and diverse the sources, the less likely that they could all be influenced by the same flaw” (415). Although almost all of the data are based on public opinion surveys, the data presentations rarely require measures of statistical significance and are offered as illustrations of the general theory rather than statistical tests of hypotheses. Unfortunately, Putnam's graphics leave much to be desired; many, if not most, of his figures violate fundamental principles of graphic design.
[Figure I.4: Perestroikan JPDA]
Just plain data analysis permeates almost every field of political science. Consider the Perestroikans' own attack on "hard" political science: a Perestroikan-sponsored symposium in the American Political Science Association’s newsletter-journal, PS: Political Science and Politics, contained five essays decrying the hegemony of quantitative research (and its companion, non-quantitative formal modeling). Every one supports its argument with reference to quantitative data derived from systematic empirical surveys of the discipline's journals and graduate program curricular requirements. The results are presented in bar charts and time series charts (Bennett, Barth, and Rutherford 2003, 375-6), in tabulations (Schwartz-Shea 2003, 380-5), and in textual discussions, such as this from Yanow (2003, 397):
"[In] 1991–2000, research based on statistics and modeling accounted for 74% of all published articles (53% and 21%, respectively), political theory garnered 25% of journal space, and qualitative research captured 1% (one article each in 1992, 1993, 1995, 1996, 1997)."
Perestroikans are not opposed to the use of quantitative data.
[Figure I.5: Sniderman and Carmines (1997)]
Doing JPDA well involves an appreciable set of skills and, although JPDA is the most pervasive form of quantitative political science analysis, it is generally not taught to students in research methods courses. Six basic skills are involved:
- Understanding key social and political indicators.
- Finding meaningful data.
- Constructing appropriate statistical measures.
- Assessing data reliability and validity.
- Presenting data in tables and charts.
- Using spreadsheet charting software (a minimal sketch follows this list).
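The last two skills can be sketched in code. The example below uses Python's matplotlib rather than the spreadsheet software the text discusses (an assumption made only to keep the example self-contained), and the country figures are illustrative placeholders, not actual OECD data:

```python
# A scripted stand-in for spreadsheet charting, using Python/matplotlib.
# The text itself discusses MS Excel; this is not the author's workflow.
import matplotlib.pyplot as plt

# Hypothetical values for "General Government Receipts, % of GDP" --
# placeholders only, not actual OECD figures.
countries = ["United States", "Canada", "Germany", "Sweden"]
receipts_pct_gdp = [27.0, 33.0, 37.0, 43.0]

fig, ax = plt.subplots(figsize=(6, 3.5))
ax.barh(countries, receipts_pct_gdp, color="steelblue")  # sorted, unadorned bars
ax.set_xlabel("Government receipts, % of GDP (illustrative)")
ax.set_title("Government Receipts as a Share of GDP")
plt.tight_layout()
plt.show()
```

The design choices here (a plain horizontal bar chart, sorted values, a labeled axis, no decoration) reflect the basic principles of graphic display the section's examples praise and fault.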
References
Allen, Philip Loring. 1906. "Ballot Laws and Their Workings." Political Science Quarterly 21 (1): 38-58.
Bennett, Andrew, Aharon Barth, and Kenneth R. Rutherford. 2003. "Do We Preach What We Practice? A Survey of Methods in Political Science Journals and Curricula." PS: Political Science and Politics 36 (July): 381-386.
King, Gary. 1989. Unifying Political Methodology: The Likelihood Theory of Statistical Inference. University of Michigan Press.
King, Gary, Robert O. Keohane, and Sidney Verba. 1994. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton University Press.
National Science Foundation. 2002. EITM: Empirical Implications of Theoretical Models (Workshop Report). Political Science Program, Directorate for Social, Behavioral and Economic Sciences.
Nie, Norman H., Sidney Verba, and John R. Petrocik. 1976. The Changing American Voter. Cambridge: Harvard University Press.
Putnam, Robert D. 2000. Bowling Alone. New York: Simon and Schuster.
Sniderman, Paul M., and Edward G. Carmines. 1997. Reaching Beyond Race. Cambridge: Harvard University Press.
Count, Divide, and Compare (C-D-C)*
Social indicators consist of numerical COUNTS of political, social, and economic phenomena and a DIVISOR (or denominator) used to form the rates, ratios, or proportions that permit meaningful COMPARISONS between persons, groups, places, entities, or units of time.
It is critical to pay attention to both the count and the divisor (the numerator and the denominator) used to construct a statistic. Consider what the following statistics might mean (a short sketch follows the list):
The divorce rate:
- The percentage of marriages that end in divorce.
- The number of divorces divided by the number of marriages in a given year.
- The number of divorces divided by the number of married couples.
- The number of divorces per 1,000 of population.
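This sketch uses made-up counts (every number below is hypothetical) to show how the choice of divisor changes the resulting "divorce rate":

```python
# Sketch: the same "divorce rate" computed three ways from made-up counts,
# to show how the choice of divisor changes the statistic.
divorces = 1_000            # hypothetical: divorces granted in a year
marriages = 2_000           # hypothetical: marriages performed that year
married_couples = 20_000    # hypothetical: existing married couples
population = 100_000        # hypothetical: total population

per_marriages = divorces / marriages              # 0.50 -- often misread as
                                                  # "half of marriages fail"
per_couples = divorces / married_couples          # 0.05 per existing couple
per_1000_pop = divorces / population * 1_000      # 10 per 1,000 population

print(f"Divorces per marriage in the same year: {per_marriages:.2f}")
print(f"Divorces per existing married couple:   {per_couples:.3f}")
print(f"Divorces per 1,000 population:          {per_1000_pop:.1f}")
```

Note that the first definition in the list, the percentage of marriages that eventually end in divorce, cannot be computed from a single year's counts at all; dividing one year's divorces by that same year's marriages is a common way of overstating it.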
General Government Tax and Nontax Receipts, % of GDP
Count: total of tax and nontax receipts
Divisor: Gross Domestic Product
Comparisons: OECD Nations
Immigration Rate, 1940-1999
Count: number of legal immigrants
Divisor: total US population
Comparisons: years, annual
Corporation Income Taxes as a Percent of Individual and Corporate Income Taxes
Count: total corporate income taxes
Divisor: total income taxes
Comparisons: years, annual
Poverty Rates for Children and the Elderly, 1959-2003
Counts: number of persons in poverty in each age group
Divisor: number of children and number of elderly persons
Comparisons: age group and years, annual
Victimization Rates, 1973-2003
Count: number of violent crimes reported (in the NCVS survey)
Divisor: number of persons, in thousands (in the NCVS survey)
Comparison: years, annual
(Note: this rate measures the number of crimes per 1,000 persons; it is not the rate of persons victimized by crime.)
Murder Rates in Ten Largest US Cities, 1995-98
Count: number of murders (and non-negligent manslaughters)
Divisor: number of persons (in 100,000s)
Comparisons: ten cities, 1995 and 1998
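Each of these examples follows the same count-divide-compare pattern. The sketch below implements the pattern generically; the city names and figures are hypothetical placeholders, not the actual 1995-98 murder data:

```python
# Generic count-divide-compare helper: scale a count by a divisor to get
# a rate per some base, then compare the rate across units.
def rate(count: float, divisor: float, per: float = 100_000) -> float:
    """Return count per `per` units of the divisor (e.g., murders per 100,000)."""
    return count / divisor * per

# Hypothetical placeholder figures -- NOT the actual 1995-98 city data.
cities = {
    "City A": (600, 3_500_000),   # (murders, population)
    "City B": (300, 1_200_000),
    "City C": (150, 2_000_000),
}

# Compare: list cities from highest to lowest murder rate.
for city, (murders, pop) in sorted(cities.items(),
                                   key=lambda kv: rate(*kv[1]),
                                   reverse=True):
    print(f"{city}: {rate(murders, pop):5.1f} murders per 100,000")
```

The choice of `per` does the same work as the divisors in the examples above: per 1,000 persons for victimization, per 100,000 for murders, per 100 (a percentage) for receipts as a share of GDP.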
*The C-D-C framework and its acronym were developed by epidemiologists at the Centers for Disease Control.