Just Plain Data Analysis: Companion Website
Gary Klass
Department of Politics and Government
Illinois State University

Chapter 7: Finding the Data 

General Data Sources

International Data

- - Data Sources Used in this Book:

  • Corruption
  • Education
  • Global Warming
  • Health Expenditures
  • Millennium Development Goals
  • Political Participation
  • Poverty—Developing Nations
  • Poverty—Wealthy Nations
  • Voter Turnout and Election Systems
  • Notes on Data Formats

    Other Data Resource Guides


    U.S. Data
  • The U.S. Statistical Abstract
  • Federal Statistical Agencies
  • Public Opinion Polling Data
  • - - Data Sources Used in this Book:

  • Crime Rates
  • Dow Jones Industrial Average
  • Educational Achievement—National Data
  • Educational Achievement—State NCLB Data
  • Education—Higher Education
  • Federal Budgets
  • Gas prices
  • Homeownership
  • Income
  • Inflation
  • The Misery Index
  • National Debt
  • Presidential Elections
  • Political Corruption
  • Poverty
  • Presidential Approval
  • Unemployment
  • Social Capital Index
  • Voter Turnout—Voting Age Population
    Measures
  • Voting Turnout—Voting Eligible Population
     Measures
  • War Casualties


    This concluding chapter addresses what is logically the first stage of any data-based research project: before you present and interpret your data, you have to find them. The chapter comes at the end because a productive data search best begins with a clear understanding of what kinds of data are likely to be available and how other investigators have used those data to define and address the critical questions you might wish to study. A subtext of the preceding chapters has been to share with the reader an understanding of the variety and wealth of social indicator data available from governmental and nongovernmental sources for those who would use numerical evidence to address questions of public policy and political affairs. This chapter takes that a step further, offering a more detailed description of the data sources used for this book and the other data available from these and related sources.

    Almost all social indicators in this book—with the exception of data derived from agency records such as the FBI Uniform Crime Report and the U.S. Federal Budget—were originally derived from surveys, usually regular periodic surveys, conducted by either a governmental agency or a survey organization. Each indicator was first constructed from “raw” data files containing each individual response to the questions in the surveys. Thus, the Census Bureau calculates monthly employment indicators, annual poverty and income estimates, and biennial voter turnout measures from the raw responses to its monthly Current Population Survey.

    Much of the data used in the charts and tables in this book were obtained from primary sources, often from the Census Bureau or other statistical agency websites. In some cases secondary sources were used, either to demonstrate how particular researchers used the data, because the secondary source had improved on the raw data, or because the secondary source formatted the data more conveniently. Many publications and websites repackage primary source data, combining data from different sources on related topics in ways that provide added value for researchers. On education, for example, the journal Education Week produces an annual publication and database, Quality Counts, that reports a series of indicators on state educational achievement, school climate, state policies, and fiscal resources. The journal also scores states on several indexes related to school finance and student achievement.[i] Similarly, the annual Kids Count Data Book compiles data on the status of American children.[ii]

    Some data were not available on any statistical website but were obtained from published research articles and, in one case, by asking the author of a published study to forward a copy of the data. Finding data on the timing of state and national laws is a particularly challenging task. For the analysis of Election Day registration in chapter 4, this information came from a published study and an internet search that revealed the timing of Oregon’s law. To fill in missing data on the civil registries in Greece and Luxembourg, I phoned the Luxembourg embassy in Washington and emailed a graduate school friend working with a Greek political campaign organization.

    For most research endeavors, a good literature review is not only essential to defining the critical issues but also reveals crucial information on the kinds of data and the primary data sources that have been used to address the issue in the past.  Using both library and internet search engines, I have often found it useful to include the words “table” or “figure” with the search terms to find data-based research studies.

    General Data Sources

    The U.S. Statistical Abstract

    The U.S. Statistical Abstract, published annually by the Census Bureau since 1878, serves as a comprehensive source for American social indicator data and as a reference guide to a wide variety of U.S. governmental, international, and a few private data sources. [iii] The 2007 edition of the Abstract contains 1,376 data tables organized by thirty topical chapters (such as Population, Vital Statistics, Elections, Education, Agriculture, and Foreign Commerce). Each chapter begins with a general discussion of the data, how the data were collected, and the reliability and validity of the primary data sources.

    The Statistical Abstract website (www.census.gov/compendia/statab) contains every edition of the Abstract, downloadable in Adobe .pdf format. The website for the 2007 edition contains each chapter of the printed edition in .pdf format and links to spreadsheet files for each of the Abstract’s tables. The spreadsheet files often contain more data than are shown in the printed edition (for previous editions, these spreadsheet files were available on a CD-ROM). This is especially true of tables containing time series data. The historical poverty and income data in the printed Abstract tables, for example, cover the years from 1980 to 2005, while the spreadsheet files contain annual data back to the beginning of the income series in 1959. Note, however, that the Abstract’s data can be as much as a year old, and the primary agency website will contain both the most recently released data and more detailed tabulations. Most of the printed Abstract tables indicate the original agency providing the data, and the files contain direct links to the agency websites.

    Federal Statistical Agencies

    Almost every federal department, and many federal agencies, have a statistical bureau or agency website providing program operations data (such as characteristics of the targeted population of the agency’s programs), performance indicators (educational achievement data, highway fatality data), data collected from ongoing agency surveys, and data published in special studies and reports (see table 7.1). Many of the statistical agency websites also provide related international data and contain studies and reports addressing the reliability and validity of the data the department collects. The Fedstats.gov website provides a comprehensive set of links to federal agency statistics, organized alphabetically by topic.

    Table 7.1 Federal Statistical Agencies

    | Department/Agency | Statistical Agency/Product | Data provided |
    |---|---|---|
    | Agriculture | National Agricultural Statistics Service | Agricultural commodities production and prices; quinquennial Census of Agriculture |
    | Commerce | Bureau of Economic Analysis | GDP and national accounts data, corporate profits |
    |  | Economics and Statistics Administration | Corporate profits, gross domestic product, housing construction, U.S. international transactions |
    |  | Census Bureau | Decennial population census, annual Current Population Survey, income and economic statistics, international trade |
    | Treasury | Internal Revenue Service | Federal personal and corporate income tax data |
    | Defense | Statistical Information Analysis Division (SIAD) | Military casualties, armed forces personnel |
    | Education | National Center for Education Statistics | NAEP data |
    | Health and Human Services | National Center for Health Statistics | The federal government's principal source for health statistics |
    |  | Social Security Administration, Office of Research, Evaluation, & Statistics | Social Security and Medicare program data |
    |  | Environmental Protection Agency | Statistics and data on various environmental issues, such as toxic chemical releases |
    |  | Interagency Forum on Child & Family Statistics | Statistics dealing with children and families |
    | Homeland Security | Office of Immigration Statistics | Mostly legal immigration and naturalization data |
    | Housing and Urban Development | Office of Policy Development & Research | Department reports and data on housing and on community and economic development |
    | Interior | Bureau of Indian Affairs | Data on American Indian tribes are difficult to obtain. The BIA website currently provides no information whatsoever because of a lawsuit concerning the agency's mismanagement of Indian trust assets. The Department of the Interior website does provide limited data on national parks and land management |
    | Energy | Energy Information Administration | Nuclear, electricity, coal, natural gas, and petroleum |
    | Justice | Bureau of Justice Statistics | Crime, criminal victims, and judicial system operations; death penalty and incarceration data |
    |  | Federal Bureau of Investigation | Uniform Crime Reports, Hate Crime Statistics |
    | Labor | Bureau of Labor Statistics | Employment, unemployment, and consumer and producer prices |
    |  | AgingStats.Gov | Federal aging statistical sources; publishes the annual Older American Update |
    | Office of the President | Office of Management & Budget | Federal Budget |
    | Transportation | National Center for Statistics and Analysis | Highway safety statistics |
         

     

    The Organisation for Economic Co-operation and Development

    The OECD is an international organization of thirty nations and is the best source of comparable international social indicator data for the world’s developed democracies. The data include a broad range of economic indicators, government finance and program statistics, and education and health care data. Most convenient are the data provided through the SourceOECD website, a subscription service available through many colleges and universities, in predefined spreadsheet tables or in tables created through an interactive database query. The OECD provides free access to “frequently requested data” and the data contained in its main statistical reports (The OECD in Figures, The OECD Factbook, Society at a Glance, and Education at a Glance) through OECD.Stat, the OECD’s central data warehouse at stats.oecd.org/wbos.

    World Bank, International Monetary Fund, and United Nations

    The World Bank, the International Monetary Fund (IMF) and the United Nations are primary sources of international financial, trade, social, and economic development indicators. Each provides some data concerning its own programs, such as IMF lending and financial data, and the organizations share much of the economic data. Although the United Nations data are freely accessible, the World Bank and IMF charge fees for access to some of their databases.

    For data on developing countries, the World Bank provides for online queries of an archive of Millennium Development Goal indicators, international poverty estimates, and its World Development Indicator database. The IMF provides access to the same data and data on international trade and commodity prices at www.imf.org/external/data.htm. The United Nations Statistics Division provides access to data from three of its annual publications: Demographic Yearbook, Population and Vital Statistics Report, and Human Development Report. The Human Development Report database contains the most comprehensive set of indicators.

    Be warned that many of the social indicator data series for developing nations, particularly African nations, have a great deal of missing data.

    Public Opinion Polling Data

    The Survey Documentation and Analysis center at the University of California, Berkeley, provides online database queries from two time series surveys, the American National Election Studies (ANES) and the National Opinion Research Center’s General Social Survey. The ANES, conducted every two years since 1948, contains a broad set of public opinion and political behavior questions asked in the biennial pre- and postelection surveys. The General Social Survey, conducted annually from 1972 to 1994 and biennially from 1994 to 2004, contains a wide range of social, political, behavioral, and demographic questions. To construct time series indicators from these datasets using the center’s interface (sda.berkeley.edu), create a table by selecting “year” as the row variable and the relevant survey question as the column variable, specifying row percentages. The online query system also allows users to recode variables and to create demographic and other breakdowns of the questions.
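The row-percentage table described above can also be built offline once respondent-level records are in hand. The following is a minimal sketch only: the function name `row_percentages`, the field names, and the six sample records are hypothetical and are not drawn from the ANES or the General Social Survey.

```python
from collections import Counter, defaultdict


def row_percentages(records, row_key, col_key):
    """Cross-tabulate records; return {row: {col: percent}}, each row summing to 100."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec[row_key]][rec[col_key]] += 1
    table = {}
    for row, counter in sorted(counts.items()):
        total = sum(counter.values())
        table[row] = {col: 100.0 * n / total for col, n in sorted(counter.items())}
    return table


# Hypothetical respondent-level records (illustration only, not real survey data).
responses = [
    {"year": 2000, "trust": "most of the time"},
    {"year": 2000, "trust": "some of the time"},
    {"year": 2000, "trust": "some of the time"},
    {"year": 2000, "trust": "some of the time"},
    {"year": 2004, "trust": "most of the time"},
    {"year": 2004, "trust": "some of the time"},
]

# "year" is the row variable, the survey question is the column variable.
trend = row_percentages(responses, "year", "trust")
for year, shares in trend.items():
    print(year, shares)
```

Reading down any one column of the result gives the time series for that response category. Real survey files usually also require sampling weights, which this sketch ignores.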

    Major polling organizations usually maintain online archives of at least the aggregate response tallies to questions asked in most of their regularly administered polls. For the most part, however, the organizations provide only limited public access to their data and require paid subscriptions to access the full archive. Of these, the Gallup organization’s archive provides access to the longest running and most comprehensive set of polls, often in a convenient time series format. Over one hundred American universities and colleges provide their students free access to the Roper Center for Public Opinion Research archives containing polling data from twenty-seven polling organizations in the form of question-level responses, some time series or “trended” survey response data, and raw datasets. More universities subscribe to the LexisNexis® Academic Universe, which provides single question response data from a similar polling archive.

    For online access to international surveys, see Political Participation in the following section on international data sources.


    Notes on Data Sources Used in this Book

    International Data

  • Corruption
  • (figure 6.2) Transparency International’s Corruption Perceptions Index is compiled from surveys, conducted by other organizations, of international businesspeople and regional and country experts. Transparency International also sponsors its own public survey on corruption in sixty-nine countries to calculate the Global Corruption Barometer, and surveys businesses in exporting countries to construct the Bribe Payers Index. Related cross-national indicators, employing a similar methodology, are Freedom House’s annual index (since 1973) of political rights and civil liberties and the Heritage Foundation’s Index of Economic Freedom, an annual index (since 1995) based on measures of ten “economic freedoms” related to taxation, protection of private property, and economic regulation. Transparency International’s Corruption Perceptions Index is one of the ten measures.
  • Education
  • (figures 5.1–5.3 and tables 5.1–5.3) The United States participates in several international educational achievement studies: the Trends in International Mathematics and Science Study, the Progress in International Reading Literacy Study, the OECD’s Program for International Student Assessment, and the OECD’s Adult Literacy and Lifeskills Survey. Each of these studies has its own website, but the U.S. National Center for Education Statistics provides a single site containing the cross-national data for each of the surveys and evaluations of the survey methodology and data reliability. Generally, the data consist of both the national scores on the assessment tests and data on family background, student behavior, and school characteristics. The OECD’s annual publication Education at a Glance summarizes the results of the international tests and provides data on a variety of indicators related to school conditions, staffing, and finance.
  • Global Warming
  • (figure 3.24) The Goddard Institute for Space Studies data on global temperature anomalies are the most commonly cited evidence in the debates over global warming. The data are derived from worldwide meteorological station temperature records since the 1880s. The data measure departures from the normal monthly temperature at each station and are adjusted to account for localized urban warming, date, and time of day. The National Climatic Data Center provides the most extensive collection of global and regional weather data, including long-term reconstructions of historical temperature data based on tree-ring analysis and other methods.
  • Health Expenditures
  • (figure 1.4 and table 1.1) These data were obtained from the OECD’s central data warehouse at stats.oecd.org/wbos.
  • Millennium Development Goals
  • (table 6.1) The World Bank’s Global Data Monitoring Information System provides for online queries of its Millennium Development Goals (MDG) database and the United Nations provides for a similar data query of the forty-eight MDG indicators.
  • Political Participation
  • (figures 4.1, 4.2) Since 1986, the International Social Survey Programme (ISSP) has conducted annual cross-national surveys (for as many as thirty-nine nations) on topical issues including citizen participation, the environment, religion, social inequality, gender roles, and the role of government. The data in figure 4.2 were obtained using the site’s online cross-tabular analysis of the survey data. To calculate country-level measures, cross-tabulate the country code id against a substantive variable. The Comparative Study of Electoral Systems (CSES) provides for similar online queries of cross-national election surveys (including ANES 2004 data for the United States) for most of the OECD members and a few other nations. At this writing, the site provides online tabulation only for election surveys conducted from 1996 to 2001. For similar general cross-national survey data, see the World Values Survey website.
  • Poverty—Developing Nations
  • (figures 6.1, 6.2, 6.3) The $1 and $2 a day poverty indicators are contained in the MDG database, but the World Bank also provides somewhat more flexible database query access to regional and national poverty measures through its PovcalNet website.
  • Poverty—Wealthy Nations
  • (figures 3.10, 6.4) The Luxembourg Income Study compiles a variety of cross-national indicators related to income inequality and poverty using an archive of national income surveys obtained from thirty nations. It also provides cross-national data on wealth and characteristics of national social welfare programs.
  • Voter Turnout and Election Systems
  • (table 4.1) The International Institute for Democracy and Electoral Assistance (IDEA) website is a good source of cross-national turnout data and information about national election systems, but the data have not yet been updated for elections after 2001. The IDEA also provides data on women’s electoral participation and representation in national legislatures. The Administration and Cost of Elections Project provides a very comprehensive global survey of national election systems, including procedures for redistricting, voter registration, and vote counting.
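The Global Warming entry above describes temperature anomalies as departures from the normal monthly temperature at each station. A minimal sketch of that basic arithmetic follows. It is not the Goddard Institute's actual procedure, which also applies the adjustments mentioned in that entry; the station name, baseline years, and readings are all hypothetical.

```python
from collections import defaultdict


def monthly_normals(readings, base_start, base_end):
    """Mean temperature per (station, month) over the baseline years, inclusive."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for station, year, month, temp in readings:
        if base_start <= year <= base_end:
            sums[(station, month)] += temp
            counts[(station, month)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def anomalies(readings, normals):
    """Departure of each reading from its station's normal for that month."""
    return [
        (station, year, month, temp - normals[(station, month)])
        for station, year, month, temp in readings
        if (station, month) in normals
    ]


# Hypothetical July readings (degrees Celsius) for one made-up station "A".
readings = [
    ("A", 1951, 7, 20.0),
    ("A", 1952, 7, 22.0),
    ("A", 2005, 7, 22.5),
]

normals = monthly_normals(readings, 1951, 1952)  # assumed baseline period
result = anomalies(readings, normals)
for row in result:
    print(row)
```

Because each station is compared only with its own history, stations in warm and cold climates can be averaged together without the warm ones dominating the result.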

    U.S. Data

  • Crime Rates
  • (figures 1.5, 1.10, 1.11, 1.12 and table 1.5) In addition to the National Crime Victimization Survey data used in several of the figures in chapter 1, the Bureau of Justice Statistics (BJS) provides a series of reports and data compilations on sentencing and imprisonment, capital punishment, drugs, and firearms. The BJS website seems to provide most of the FBI Uniform Crime Report data, although the FBI provides its data in a variety of formats. The FBI also collects hate crime statistics, but inconsistent local reporting of these crimes results in serious reliability problems.
  • Dow Jones Industrial Average
  • (figure 3.29) While the DJIA is the oldest and most recognized measure of stock market performance, it indexes the stock prices of only thirty companies. The Dow Jones Wilshire 5000 Composite Index is a much broader measure of stock market performance. The Yahoo! Finance website is a convenient source of these and other stock market related data.
  • Educational Achievement—National Data
  • (figures 5.5–5.9) The National Center for Education Statistics (NCES) website provides several means of accessing National Assessment of Educational Progress (NAEP) data. Many NAEP tables are published in the annual Condition of Education and the NCES website provides access to all (over 400) tables in spreadsheet format. In addition, the NAEP Data Explorer, an online data-query tool, permits users to create their own tabulations from the NAEP database. The NCES’s annual Digest of Education Statistics provides enrollment, staffing, finance, educational attainment, higher education, and international data.
  • Educational Achievement—State NCLB Data
  • (figures 5.9, 5.10, 5.11 and table 5.3) Except in the case of special reports (for example, figure 5.9), the NCES website does not provide access to the data derived from statewide No Child Left Behind (NCLB) testing. Generally, state NCLB data are made available on each state’s department (or board) of education website. Commonly, the state websites provide very easy access to individual school report cards containing school, school district, and state test score, demographic, and expenditure data. Using the larger files containing data for all the schools and school districts can be more cumbersome: just the codebook listing all the data items for the Illinois Report Card data file was several hundred pages long and the 2007 version of Excel (but not the 2003 version) had difficulty processing the large data file.
  • Education—Higher Education
  • (figures 3.15, 3.16, 3.17) The higher education data in chapter 3 were compiled by the Illinois Board of Higher Education and are made readily available on the Board’s website. Most states have similar governing boards for higher education, but the governance structure varies from state to state. Most multi-institution governing boards and most colleges and universities have an institutional research department responsible for compiling data and preparing reports on enrollments, tuition and fees, staffing, expenditures, and student academic performance. Often the data are presented in an annual data profile. The NCES provides some higher education data, mostly data concerning enrollments, tuition, and programs. Although American universities are currently going through an “assessment” fad, there are no reliable measures of educational achievement for higher education in the United States.
  • Federal Budgets
  • (figures 1.8, 1.9 and 3.6, 3.8, 3.11, 3.14, 3.22, 3.23) The president’s Office of Management and Budget submits the proposed federal budget for each fiscal year (beginning October 1) to Congress in January of each year. The last section of each budget, the Historical Tables, contains an extensive set of time series tables, following the same table numbering and format in each year’s volume. When using the federal budget data, be aware of the distinction between spending by function and by agency. Not all education spending, for example, is in the Department of Education’s budget, some Defense spending is in the Department of Energy budget, and the Department of Agriculture budget includes the food stamp program. Usually, the budget data defined by functional categories (function and subfunction) are more meaningful. The actual budget documents and spreadsheet files are available on the White House, the Office of Management and Budget, and the Government Printing Office websites.
  • Gas prices
  • (figure 3.29) The Department of Energy’s Energy Information Administration provides weekly gas price data and data related to all aspects of energy production and consumption. The agency’s website also provides data on renewable energy sources and worldwide and international greenhouse gases and emissions.
  • Homeownership
  • (figures 1.6, 1.7) The Census Bureau conducts a decennial Census of Housing and also includes a series of questions on housing conditions and homeownership in its quarterly Current Population/Housing Vacancy Survey. The Census Bureau provides more convenient access to these data than does the Department of Housing and Urban Development.
  • Income
  • (figure 6.8 and table 2.13) Using a single set of responses to the March Current Population Survey, the Census Bureau calculates a large number of income-related economic indicators (and poverty data) on this website. Annual mean and median income (the broader measure), earnings, and wages-and-salary data are reported for households, families, full-time year-round workers, and all persons. The time series data are reported in current and constant (inflation adjusted) dollars.
  • Inflation
  • (figures 1.2, 3.29) The Bureau of Labor Statistics is the primary source for consumer and producer price indexes. The consumer price index measures price changes in a market basket of goods and services that consumers typically purchase and is often used to adjust monetary time series data to constant dollars. The Bureau also provides several related inflation indexes and measures for specific sectors of the economy, such as energy and retail food. Note that the inflation rate is a complex statistic and the indicator may underestimate or overestimate the true inflation rate in several different ways. To adjust aggregate government expenditures for inflation, the Gross Domestic Product Deflator is the better measure. It is most conveniently found in the U.S. Budget Historical Tables, table 1.10.
  • The Misery Index
  • (figure 1.1) The Misery Index data used in figure 1.1 were obtained from a secondary website at miseryindex.us. The Bureau of Labor Statistics is the primary source of data for both unemployment and inflation.
  • National Debt
  • (figure 1.9) See section 7 of the Federal Budget.
  • Presidential Elections
  • (figure 3.28) The United States may be one of the few democracies where no single national governmental agency maintains official elections records, although the Federal Election Commission does maintain a database on campaign finance reports for federal elections and the Clerk of the House of Representatives does publish the vote counts (in a somewhat clumsy format) for each federal election since 1920. For the most part, official election records are maintained by each state’s Secretary of State office. Congressional Quarterly, Inc., a privately owned publishing company, collects almost all of the data related to the votes-cast turnout measures and results of gubernatorial and federal elections, published in its biennial, America Votes. The elections outcome data reported in the U.S. Statistical Abstract are mostly obtained from Congressional Quarterly, but the Abstract is the more accessible source of the data. The Interuniversity Consortium for Political and Social Research’s United States Historical Election Returns series contains congressional, presidential, and gubernatorial election return data at the state and county level for elections from 1788 through 1990. Many universities and colleges are members of the ICPSR, which provides an extensive library of raw data from surveys and research studies.
  • Political Corruption
  • (figures 3.25, 3.26) Political corruption is generally not included among the crimes reported on the Bureau of Justice Statistics website. To obtain the state data on prosecution rates of public officials, I e-mailed one of the authors of the study cited, Kenneth Meier, and he graciously sent them to me. He developed the measure based on data obtained from an annual (since 1978) report submitted to the Congress by the Department of Justice Public Integrity Section, which details convictions for political corruption for each U.S. Attorney’s office. Of course, there is a fundamental validity question involved in using the “number of officials caught” as a measure of political corruption.
  • Poverty
  • (figures 6.5, 6.6, 6.7, 6.9) In addition to the annual March Current Population Survey (CPS) that has been used to measure income and poverty since 1959, income and poverty estimates are also derived from the decennial census and, since 1998, from the Census Bureau’s monthly American Community Survey (ACS). The ACS is a much larger survey, sampling three million households each year, versus fewer than 100,000 for the CPS. Because the three surveys are conducted at different times of the year and use slightly different definitions of the target populations and adjustments for inflation, they produce slightly different estimates. ACS income estimates tend to be about four percent higher than those derived from the decennial census. The ACS also includes questions about housing, immigration, citizenship, and employment and will eventually replace the decennial census long-form questionnaire that has been administered to one out of six households. All these data are available at the Census Bureau’s Poverty site.
  • Presidential Approval
  • (figure 3.24) The standard presidential approval ratings are based on one of two questions. Since 1937 the Gallup Poll has asked, “Do you approve or disapprove of the job [president’s name] has done as president?” The alternative question, first used by the Harris Poll, asks, “How would you rate [president’s name] performance on the job: excellent, good, fair or poor?” Many other polling firms now use one or the other of these questions, or a slight variation on them. The Gallup Poll website has the most complete historical data on presidential approval and would be the best source for comparing several administrations’ approval data, but access to its data requires a subscription fee. The most complete collection of presidential approval data for each administration, not including the Zogby data shown in figure 3.24, is available (for nonsubscribers) from the Roper Center website. The Professor PollKatz Poll of Polls website contains time series charts (but not the actual data) on presidential approval surveys conducted by fifteen polling organizations. The Pollingreport.com website is also an excellent source for political polling data on upcoming state and national election races.
  • Unemployment
  • (figures 1.2, 1.11, 3.24, 3.29) The Bureau of Labor Statistics provides a comprehensive set of monthly employment and unemployment statistics that are easily downloaded from their website. Note that the unemployment rate is a complex statistic with many issues involving the counts both of the workers and members of the labor force. The Bureau’s publication “How the Government Measures Unemployment” provides an excellent summary of how the indicator is constructed.
  • Social Capital Index
  • (figures 3.25, 4.7) The data for Robert Putnam’s Social Capital Index are available online through the Interuniversity Consortium for Political and Social Research (ICPSR) at www.icpsr.umich.edu.
  • Voter Turnout—Voting Age Population Measures
  • (figures 4.3, 4.4, 4.5) The Census Bureau’s postelection Current Population Survey data provide reported voter turnout data at the state level and for several demographic categories such as education, age, race, ethnicity, and gender. The data can be obtained from the Bureau’s Voting and Registration website. The votes-cast measures are reported in the Statistical Abstract and can also be obtained from the United States Elections Project website. In recent years, the Census Bureau has begun reporting turnout rates based on estimates of the voting-age citizen population.
  • Voting Turnout—Voting Eligible Population Measures
  • (figures 1.26, 4.6, 4.7 and tables 4.2, 4.3) Michael McDonald maintains the United States Elections Project website that provides detailed state-level data used in calculating the voter-eligible turnout measures.
  • War Casualties

  • (table 2.11) The Iraq Casualties website provides accessible data on the Iraq War casualties and includes fatality data for the other coalition partners. It also includes, and documents, some casualty data based on news sources that the Defense Department has not yet confirmed. For official U.S. armed forces casualties, the Defense Department’s Statistical Information Analysis Division (SIAD) Military Casualty Information website provides a considerable amount of data on U.S. war casualties back to the Revolutionary War. For recent wars, the casualties are categorized by race, ethnicity, gender, service, home state, and circumstances.

    For the most comprehensive set of data on the Iraq war, see the Brookings Institution's Iraq Index: Tracking Reconstruction and Security in Post-Saddam Iraq (which, unfortunately, is in .pdf rather than Excel format).
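The Inflation entry above notes that a price index is often used to adjust monetary time series data to constant dollars. The usual arithmetic multiplies each nominal amount by the ratio of the base-year index to that year's index. The sketch below assumes that approach; the function name `to_constant_dollars` and every index and income value are hypothetical, not actual BLS or Census figures.

```python
def to_constant_dollars(nominal, index_by_year, base_year):
    """Convert {year: nominal amount} to base-year dollars using a price index."""
    base_index = index_by_year[base_year]
    return {
        year: amount * base_index / index_by_year[year]
        for year, amount in nominal.items()
    }


# Hypothetical price index and income series (illustration only).
price_index = {1990: 100.0, 2000: 125.0, 2005: 150.0}
median_income = {1990: 30000.0, 2000: 40000.0, 2005: 45000.0}

real_income = to_constant_dollars(median_income, price_index, base_year=2005)
for year in sorted(real_income):
    print(year, round(real_income[year], 2))
```

In this made-up series, nominal income rises by half between 1990 and 2005, yet in 2005 dollars the 1990 and 2005 values are equal. That is the kind of pattern a current-dollar chart would hide.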

    Notes on Data Formats

    The data from the statistical websites come in a variety of formats, even at a single agency website, and some agencies and websites make their social indicator data more accessible than others do. Ideally, the agency will provide the data in a spreadsheet or spreadsheet-compatible format (such as .csv), but data are also made available in web page (html) tables and in .pdf format. Data stored in a web page table transfer easily to a spreadsheet, either by cutting and pasting or by opening the URL from the spreadsheet program. Some data are available only in Adobe .pdf files, particularly data contained in agency reports and research studies. The newest versions of the Adobe document viewer have some special features for copying tables and columns of data that work well with some tables but not others. Sometimes, all the columns from a .pdf data table (or data in plain text format on a web page) will “paste” into a single spreadsheet column. Depending on the table format, the “text to columns” function can often sort these data out.
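When pasted rows land in a single column, the same splitting that a spreadsheet's “text to columns” function performs can be approximated in a few lines of code. This is a sketch under one assumption: columns are separated by a tab or by two or more spaces, so that single spaces inside a label survive. The function name `text_to_columns` and the sample rows are hypothetical.

```python
import re


def text_to_columns(lines, delimiter=r"\s{2,}|\t"):
    """Split each non-empty line into cells on tabs or runs of 2+ whitespace characters."""
    rows = []
    for line in lines:
        line = line.strip()
        if line:
            rows.append(re.split(delimiter, line))
    return rows


# Hypothetical lines as they might paste from a .pdf table (values are made up).
pasted = [
    "State        Turnout   Year",
    "Illinois     61.2      2004",
    "New York     58.0      2004",
    "",
]

rows = text_to_columns(pasted)
for row in rows:
    print(row)
```

Tables whose columns are separated by only a single space cannot be split reliably this way. Fixed-width column positions are the usual fallback for those.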

     

    Other Data Resource Guides



    Notes

    [i]. “Quality Counts at 10: A Decade of Standards-Based Reform,” Education Week 25, no. 17 (January 5, 2006).

    [ii]. Annie E. Casey Foundation, 2007 Kids Count Data Book (Baltimore: Annie E. Casey Foundation, 2007) at www.kidscount.org/sld/databook.jsp.

    [iii]. U.S. Census Bureau, Statistical Abstract of the United States: 2007, 126th ed., 2006, at www.census.gov/statab/www/.