Content Analysis and Qualitative Research
Overview
Content analysis is the manual or automated coding of documents, transcripts, newspapers, or even audio or video media to obtain counts of words, phrases, or word-phrase clusters for purposes of statistical analysis. Typically the researcher creates a dictionary which clusters words and phrases into conceptual categories for purposes of counting. Various constraints may filter the count, such as the constraint that one concept occur, or not occur, within a set number of words of another concept. While content analysis is normally focused on the analysis of print media and media transcripts, it is applicable to any form of communication, as, for instance, in the study by DuRant et al. (1997) on "Tobacco and alcohol use behaviors portrayed in music videos."
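As a rough illustration of dictionary-based counting with a proximity constraint, the Python sketch below may help. The dictionary contents, the function names, and the sample sentence are all hypothetical and chosen only for illustration; real content dictionaries are far larger and usually handle phrases and word senses as well.

```python
import re
from collections import Counter

# Hypothetical concept dictionary: category -> words counted under it.
DICTIONARY = {
    "tobacco": {"cigarette", "cigarettes", "smoking", "cigar"},
    "alcohol": {"beer", "wine", "liquor", "drinking"},
}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def code_tokens(tokens, dictionary):
    """Return (position, category) pairs for every token found in the dictionary."""
    lookup = {word: cat for cat, words in dictionary.items() for word in words}
    return [(i, lookup[tok]) for i, tok in enumerate(tokens) if tok in lookup]

def category_counts(tokens, dictionary):
    """Count how many tokens fall under each conceptual category."""
    return Counter(cat for _, cat in code_tokens(tokens, dictionary))

def count_within(tokens, dictionary, target, other, window):
    """Count `target` occurrences that have an `other` occurrence within `window` words."""
    coded = code_tokens(tokens, dictionary)
    other_positions = [i for i, cat in coded if cat == other]
    return sum(
        1
        for i, cat in coded
        if cat == target and any(abs(i - j) <= window for j in other_positions)
    )

text = "He lit a cigarette and ordered a beer. Later, smoking alone, he left."
tokens = tokenize(text)
counts = category_counts(tokens, DICTIONARY)
near = count_within(tokens, DICTIONARY, "tobacco", "alcohol", window=5)
print(counts, near)
```

Narrowing the window makes the constraint stricter, so fewer occurrences pass the filter. A "not within" constraint could be sketched the same way by replacing `any` with `not any`.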
There are many reasons for conducting content analysis, a number of them enumerated by Berelson (1952) over half a century ago:
- To describe trends in content over time
- To describe the relative focus of attention for a set of topics
- To compare international differences in content
- To compare group differences in content
- To compare individual differences in communication style
- To trace conceptual development in intellectual history
- To compare actual content with intended content
- To expose use of biased terms in propaganda research
- To test hypotheses about cultural and symbolic use of terms
- To code open-ended survey items
Related information is contained in the sections on case study research and on ethnography.
Key Concepts
Krippendorff (2004) identifies five key processes inherent to content analysis:
- Unitizing. The researcher must establish the unit of analysis (word, meaning, sentence, paragraph, article, news clip, document, etc.).
- Sampling. Usually the universe of interest is too large to study the content of all units of analysis, and instead units must be sampled. Sampling involves counting, which may require the researcher to develop thesauruses (so different terms with like meanings will be counted under the same construct) and expert systems or other rule engines (so the proper contextual valence is assigned to each counted construct).
- Reducing. Content data must be reduced in complexity, usually by employing conventional summary statistical measures. Coding and statistical analysis are covered by Hodson (1999).
- Inferring. Contextual phenomena must be analyzed to provide the context for findings.
- Narrating. Conclusions in the content analytic tradition are usually communicated using narrative traditions and discursive conventions.
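The first three processes (unitizing, sampling, and reducing) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a reconstruction of any particular author's procedure: the sentence-splitting rule, the thesaurus entries, and all function names are hypothetical.

```python
import random
import re
from collections import Counter

# Hypothetical thesaurus: surface term -> construct it is counted under.
THESAURUS = {
    "physician": "doctor",
    "doctor": "doctor",
    "doctors": "doctor",
    "clinic": "facility",
    "hospital": "facility",
}

def unitize(document):
    """Unitizing: split a document into sentence units (a deliberately crude rule)."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]

def sample_units(units, k, seed=0):
    """Sampling: draw a simple random sample of k units without replacement."""
    rng = random.Random(seed)
    return rng.sample(units, min(k, len(units)))

def code_unit(unit, thesaurus):
    """Count constructs in one unit, mapping like terms to the same construct."""
    words = re.findall(r"[a-z]+", unit.lower())
    return Counter(thesaurus[w] for w in words if w in thesaurus)

def reduce_counts(coded_units):
    """Reducing: total mentions per construct and share of units mentioning it."""
    totals, units_with = Counter(), Counter()
    for coded in coded_units:
        totals.update(coded)
        units_with.update(coded.keys())
    n = len(coded_units)
    shares = {construct: units_with[construct] / n for construct in units_with}
    return totals, shares

document = (
    "The physician arrived late. The hospital was full. "
    "Doctors at the clinic were tired. It rained."
)
units = unitize(document)
sampled = sample_units(units, k=4)  # k equals the number of units, so all are kept
totals, shares = reduce_counts([code_unit(u, THESAURUS) for u in sampled])
print(totals, shares)
```

With a smaller `k`, the totals become estimates based on the sample rather than the full set of units. Assigning contextual valence, inferring, and narrating are left to the researcher and are not attempted here.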
Software Resources
- ATLAS.ti is software for text analysis and model building. It handles graphical, audio, and video data files as well as text. With this package one can code and/or annotate text or media segments in a variety of ways, search/select segments by code (using proximity, Boolean, or semantic thesaurus methods), create hotlinks connecting segments, and display relationships among segments in diagrammatic format. An automatic coding mode codes all similar segments according to defined patterns. Video segments can be as small as frames and likewise audio segments can be detailed. Network diagrams, created with the built-in semantic network editor, can be exported to graphics and word processing packages and a built-in HTML generator creates web pages for sharing work with collaborators. Visually, annotations and links are made in a margin area of the computer display. Data can be generated in SPSS format for further analysis. However, ATLAS.ti is not a content analysis package per se, but rather a text management package lacking fundamental content analysis statistical functions. The ATLAS.ti website is at http://www.atlasti.com.
- The General Inquirer is the classic package for content analysis, now web-enabled by psychologist Phil Stone (Harvard University). It contains large content dictionaries (Lasswell Value Dictionary; Harvard Psycho-Sociological Dictionary) which are used in conjunction with text scanning software to establish patterns in the meaning of words. The General Inquirer is now being distributed by the Zentrum fuer Umfragen, Methoden, und Analysen (ZUMA, Mannheim); for more information, contact Dr. Peter Ph. Mohler, O05@DHDURZ2.
- Intext and TextQuest. TextQuest is the Windows version of the Intext content analysis software developed by Harald Klein, with a website at http://www.intext.de. The software produces word lists, word sequence lists, word permutations, and cross-references, and provides basic content analysis functions.
- NUD*IST has been a leading content analysis package, discussed by Richards and Richards (1991). It allows authors to establish lexical and conceptual relations among words, to index text files, and to conduct pattern matching and searching operations using Boolean co-occurrences of nodes in the text. NUD*IST can be used in conjunction with grounded theory to create and analyze theories and provide a framework for understanding. At this writing, the latest version is called simply N6. It can handle coding categories and sub-categories, supporting hierarchical indexing; browse and code documents and indexing databases; search for words and word patterns and combine them in indexes; "memo link" emerging codes and categories with their associated documents; and create new indexing categories out of existing ones. NUD*IST is referenced in numerous articles but has been replaced by nVivo and xSight software from QSR, the publisher of both.
- nVivo is qualitative analysis software, produced by QSR, that is useful for coding themes embedded in transcripts. It allows the researcher to import, sort, and analyze audio files, videos, digital photos, Word, PDF, rich text, and plain text documents; work with transcripts or without them, analyzing material straight from audio and video files, or create transcripts or text files in the software while working; import and code documents, including those that contain tables and images; work with material in any language and choose an English, Simplified Chinese, Spanish, or Japanese user interface; query the data with a powerful search engine; graphically display project information, connections, and findings in real time using models and charts; merge separate projects and still identify which work was completed by which person, as well as view the notes and analysis completed by each team member; and more.
- xSight is also qualitative analysis software from QSR, but with an orientation toward conceptual mapping. It allows users to capture ideas visually with 'maps', just as one would on a whiteboard or large flip chart; zoom in, zoom out, or drill down in the map; query data with a powerful search engine; use highlighter pens to capture or highlight information; work in multiple languages; and create reports and presentations.
- QUALRUS is a general-purpose qualitative analysis program which supports text and multimedia sources. It offers intelligent suggestions throughout the coding process and comes with a number of tools to help with analysis of data once it has already been coded. Users can customize and automate many tasks by taking advantage of Qualrus's scripting language. A free, functional demo version is available. More information on Qualrus is available at its homepage, http://www.qualrus.com.
- TextSmart is SPSS's module for coding and analyzing open-ended survey questions. It supports text management, searching, and some forms of text analysis. Its "Import Wizard" brings text data into a tab-delimited ASCII file format, filtering responses on the fly by automated stemming (a linguistic engine which identifies word stems to combine terms), aliasing (grouping synonyms), and excluding trivial words. The automatic categorization option clusters terms that tend to occur together in responses to create meaningful categories. Some categorization parameters are user-controllable and the researcher can create his or her own categories by combining categories using Boolean logic. Output can be to an SPSS or a tab-delimited ASCII file, and categorization parameters can be saved for future TextSmart runs. Because TextSmart is "dictionary-free," the researcher is freed of the burden of creating a coding scheme or concept dictionary prior to beginning analysis. By the same token, if the control which comes with a user-defined dictionary is wanted, TextSmart is not the appropriate tool. Online information is available from SPSS, Inc.
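To make the filtering steps described for open-ended responses more concrete (excluding trivial words, stemming, aliasing, and looking for terms that occur together), here is a small Python sketch. It is not TextSmart's algorithm, which is not documented here. The stop-word list, the suffix-stripping rule, the alias table, and the sample responses are all assumptions made for illustration, and a real linguistic stemmer would be considerably more careful.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical lists; a real project would tune these to its own data.
STOPWORDS = {"the", "a", "and", "was", "were", "too", "on", "us", "but", "it"}
ALIASES = {"cost": "price", "expensive": "price"}  # synonym -> canonical term

def stem(word):
    """Strip one common suffix; a crude stand-in for a linguistic stemmer."""
    for suffix in ("ing", "ed", "s"):
        if (
            word.endswith(suffix)
            and not word.endswith("ss")
            and len(word) - len(suffix) >= 3
        ):
            return word[: -len(suffix)]
    return word

def preprocess(response):
    """Tokenize, drop trivial words, stem, then map aliases to one term."""
    words = re.findall(r"[a-z]+", response.lower())
    kept = [w for w in words if w not in STOPWORDS]
    stems = [stem(w) for w in kept]
    return [ALIASES.get(s, s) for s in stems]

def cooccurrence(responses):
    """Count, for each pair of terms, the responses in which both appear."""
    pairs = Counter()
    for response in responses:
        terms = sorted(set(preprocess(response)))
        pairs.update(combinations(terms, 2))
    return pairs

responses = [
    "Prices were too high and the waiting was long.",
    "The staff waited on us quickly, but it costs a lot.",
    "Expensive, and long waits.",
]
pairs = cooccurrence(responses)
print(pairs.most_common(3))
```

Pairs with high counts are candidates for being grouped into one category. An actual clustering step would go further, for example by normalizing for how common each term is overall.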
Assumptions
- Sampling. Content analysis is subject to all the usual biases and problems of sampling.
- Contextual bias. Particularly in automated content analysis, the context that is crucial to interpreting word and meaning counts may be lost or misjudged, so that the counts themselves are flawed.
Frequently Asked Questions
- What is the address for the online discussion list about content analysis?
The CONTENT list is at content@sphinx.gsu.edu. To join, send the message "subscribe CONTENT yourfirstname yourlastname" (without quotes) to listproc@listproc.gsu.edu. The list editor has been William Evans, who maintains an archive site with additional resources at http://www.gsu.edu/~wwwcom/content.html.
- Where else can I find out about content analysis software?
Bibliography
- Berelson, B. (1952). Content Analysis in Communication Research. Glencoe, Ill: Free Press.
- DuRant, R. H., E. S. Rome, M. Rich, E. Allred, S. J. Emans, and E. R. Woods (1997). Tobacco and alcohol use behaviors portrayed in music videos: A content analysis. American Journal of Public Health, 87(7): 1131-1135.
- Franzosi, Roberto (1990). Computer-assisted coding of textual data. An application to semantic grammars. Sociological Methods and Research, 19/2: 225-257.
- Gottschalk, Louis A. (1995). Content analysis of verbal behavior: New findings and clinical applications. Hillsdale, NJ: Lawrence Erlbaum.
- Hodson, Randy (1999). Analyzing documentary accounts. Thousand Oaks, CA: Sage Publications. Quantitative Applications in the Social Sciences Series No. 128. Describes random sampling of ethnographic field studies as a basis for applying a meta-analytic schedule. Hodson covers both coding issues and subsequent use of statistical techniques.
- Ilo, Saidat (2005). Research in public administration: A content analysis of applied research projects completed from 1999-2005 at Texas State University in the Masters of Public Administration Program. San Marcos, TX: Texas State University. Retrieved 9/27/07 from http://ecommons.txstate.edu/cgi/viewcontent.cgi?article=1010&context=arp. Provides a review of the use of content analysis in relation to analyzing MPA education.
- Klein, Harald (1991). "INTEXT/PC - A program package for the analysis of texts in the humanities and social sciences." Literary and Linguistic Computing, 6/2: 108-111.
- Krippendorff, Klaus (2004). Content analysis: An introduction to its methodology. 2nd ed. Thousand Oaks, CA: Sage Publications.
- Neuendorf, Kimberly A. (2002). The content analysis handbook. Thousand Oaks, CA: Sage Publications. Covers history of content analysis, sampling message units, handling variables, reliability, and use of NEXIS for text acquisition. Also covers PRAM, software for reliability assessment with multiple coders.
- Nissan, Ephraim, and Klaus Schmidt, eds. (1995). From information to knowledge: Conceptual and content analysis by computer. London: Intellect.
- Phillips, Nelson and Cynthia Hardy (2002). Discourse analysis: Investigating processes of social construction. Thousand Oaks, CA: Sage Publications. Perhaps the first full-length book on discourse analysis.
- Popping, Roel (1999). Computer-assisted text analysis. Thousand Oaks, CA: Sage.
- Richards, Thomas J. and Lyn Richards (1991), "The NUD.IST qualitative data analysis system", Qualitative Sociology 14(4), 307-24.
- Riffe, Daniel, Stephen Lacy, and Frederick G. Fico (1998). Analyzing media messages: Using quantitative content analysis in research. Mahwah, NJ: Lawrence Erlbaum.
- Roberts, Carl W. and Roel Popping (1993). "Computer-supported content analysis: Some recent developments." Social Science Computer Review, 11: 283-291.
- Roberts, Carl W., ed. (1997). Text analysis for the social sciences: Methods for drawing inferences from texts and transcripts. Mahwah, NJ: Lawrence Erlbaum.
- Smith, Charles P., ed. (1992). Motivation and personality: Handbook of thematic content analysis. New York: Cambridge University Press.
- Stemler, Steve (2001). An overview of content analysis. Practical Assessment, Research & Evaluation 7(17). Available online: http://edresearch.org/pare/getvn.asp?v=7&n=17.
- Stone, Philip J., Dexter C. Dunphy, Marshall S. Smith, and Daniel M. Ogilvie (1966). General Inquirer: A computer approach to content analysis. Cambridge, MA: MIT Press. The original work popularizing The General Inquirer.
- Weber, Robert P. (1990). Basic content analysis. Second ed. Newbury Park, CA: Sage Publications. A standard introductory overview.
- Weitzman, Eben A. and Matthew B. Miles (1998). Computer programs for qualitative data analysis: A software sourcebook. Second ed. Thousand Oaks, CA: Sage Publications.
Copyright 1998, 2008 by G. David Garson.
Last updated 11/15/08.