By Shooshan Danagoulian and Kosali Simon
This is part of a series of articles that the ASHecon Newsletter will feature on health data topics.
As political and economic forces continue to change the financial landscape of the U.S. health care industry, hospitals face new payment structures and incentives to alter their practices and organization. Though hospitals account for the largest share of national health expenditure (33% in 2017, per NHEA 2019), the hospital sector may be a challenging setting for new researchers to study because of a lack of familiarity with data resources. Here, we describe four fairly widely available datasets on hospital finances, operational structure, and other characteristics that economists can use to study changing financial incentives, mergers and acquisitions, and other operations of hospitals. Note that this review is not intended to be exhaustive. In addition, researchers will often want to use these data in combination with hospital claims or discharge records; a follow-on piece will describe these other kinds of hospital data. For each resource, we describe how to obtain the data, the information they contain, notable features and challenges, and some recent publications using those data (see Table 1). In the sections below, we synthesize and condense descriptions by researchers who have used or created each database.
Stay tuned for a Twitter discussion #hospitaldata (@ashecon and @Shooshan5) that we hope will lead to crowd-sourced comments on other possible hospital-based data resources.
Table 1: Download URLs for Datasets
|Dataset|Source and URL|
|The Healthcare Cost Report Information System (HCRIS)|CMS: https://www.cms.gov/Research-Statistics-Data-and-Systems/Downloadable-Public-Use-Files/Cost-Reports/|
||Ian McCarthy’s GitHub Repo: https://github.com/imccart/HCRIS|
||Adam Sacarny’s GitHub Repo: https://github.com/asacarny/hospital-cost-reports|
|Hospital Ownership Transitions in the United States|https://healthcarepricingproject.org|
||Direct link to data: https://healthcarepricingproject.org/sites/default/files/20190108_mergerdata.zip|
|Hospital Compare|CMS Hospital Compare Archive: https://data.medicare.gov/data/archives/hospital-compare|
|AHA Hospital Survey|www.ahadata.com|
||Survey Questionnaire: https://www.ahadata.com/wp-content/uploads/AHA_Annual_Survey.pdf|
The Healthcare Cost Report Information System (HCRIS)
By Ian McCarthy and Sayeh Nikpay
As part of the Social Security Act, hospitals are required to submit an annual reimbursement settlement using Form 2552. Integrating the information provided on the form, the Centers for Medicare and Medicaid Services (CMS) compiles the Healthcare Cost Report Information System (HCRIS) on an annual basis. Data start in 1996 and are currently available through the second quarter of 2019. Hospitals are identified using the CMS Medicare Record Number (MRN), so multiple hospitals on the same campus may be incorporated in a single record.
HCRIS includes the most detailed publicly available financial data on U.S. hospitals. The financial information includes income statements, the balance sheet, and the statement of cash flow. Income statements represent the accounting of revenue and costs related to operations; the balance sheet includes accounting of all short and long-term assets and liabilities; and the statement of cash flow is the accounting of cash coming into and out of the hospital. In addition to financial data, HCRIS includes basic hospital characteristics such as available services, bed allocations, and some specialized staffing information such as nurse employment and wages.
Because of its long duration and explicit hospital identifiers, the HCRIS is particularly suited for health policy studies of the hospital industry, especially in settings where the policy effect varies geographically. Some researchers have also used these data to estimate prices, profit margins, and labor inputs. The degree of detail allows the data to be used for studies of many types of hospital behaviors.
HCRIS data should be used with great caution, however. The cost reports do not inform us about hospital quality, organizational structure (e.g., a hospital’s relationship with physicians), or the relationship with insurers. Furthermore, the HCRIS reports are not audited and are often not standardized; not all hospitals use the same reporting conventions. Because these reports were not intended for research use, be alert for timing inconsistencies. While the data are reported annually, the relevant time period is a fiscal year as defined by the hospital, which is only sometimes aligned with the federal fiscal year (October of the previous year through September of the named year). Users should note that the data are labeled with the federal fiscal year in which the data were received rather than the hospital’s actual fiscal year.
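The timing issue can be made concrete with a minimal Python sketch. It assumes each cost report carries the start and end dates of the hospital's own fiscal year; the argument names `fy_start` and `fy_end` are hypothetical and are not actual HCRIS variable names. The function `federal_fiscal_year` follows the definition given above (October of the previous year through September of the named year).

```python
from datetime import date


def federal_fiscal_year(day: date) -> int:
    """Return the federal fiscal year (October through September) containing `day`."""
    return day.year + 1 if day.month >= 10 else day.year


def describe_report(fy_start: date, fy_end: date) -> dict:
    """Summarize how a hospital's reporting period lines up with the federal fiscal year.

    `fy_start` and `fy_end` are hypothetical names for the reporting period dates.
    """
    starts_oct_1 = (fy_start.month, fy_start.day) == (10, 1)
    ends_sep_30 = (fy_end.month, fy_end.day) == (9, 30)
    same_ffy = federal_fiscal_year(fy_start) == federal_fiscal_year(fy_end)
    return {
        "days_covered": (fy_end - fy_start).days + 1,  # inclusive of both endpoints
        "ffy_of_start": federal_fiscal_year(fy_start),
        "ffy_of_end": federal_fiscal_year(fy_end),
        "matches_federal_fy": starts_oct_1 and ends_sep_30 and same_ffy,
    }


# A hospital with a July-to-June fiscal year straddles two federal fiscal years.
summary = describe_report(date(2012, 7, 1), date(2013, 6, 30))
```

Comparing `ffy_of_start` and `ffy_of_end` with the year label on the downloaded file is one way to see which reports need to be reassigned before building a hospital-year panel.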
Another reason for lack of consistency in defining a fiscal year is that any hospital undergoing mergers, acquisitions, or splits may have a shorter fiscal year and, as a result, may send duplicate reports to HCRIS. To gain more context, one may have to refer to the Hospital Ownership Transitions in the United States dataset described in the next section for independent confirmation of changes.
Furthermore, because hospitals often do not report data as instructed, researchers using these data should exercise additional caution. Data may reflect a hospital’s own interpretation of relevant information, resulting in discrepancy. To get context on a specific question, it may be helpful to examine blog posts or slide decks from consultants to see how hospitals are advised to fill out the questionnaire.
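Two reporting discrepancies mentioned in the notes to this section are bad debt entered with opposite signs by different hospitals, and ownership that appears to change for a single year and then revert. The following Python sketch shows one way a researcher might screen for both. It rests on assumptions: that the sign of bad debt carries no information, and that a one-year reversal is more likely a reporting error than a real transaction. The function names and the input layout are hypothetical.

```python
def normalize_bad_debt(value):
    """Return bad debt as a non-negative amount.

    Assumes the sign reflects only a reporting convention, not a true negative.
    """
    return None if value is None else abs(value)


def flag_ownership_blips(history):
    """Return years in which reported ownership differs from both neighboring
    years while those neighbors agree with each other.

    `history` maps year -> reported ownership type (hypothetical layout).
    Flagged years are candidates for manual review, not confirmed errors.
    """
    years = sorted(history)
    flagged = []
    for prev, cur, nxt in zip(years, years[1:], years[2:]):
        if history[prev] == history[nxt] != history[cur]:
            flagged.append(cur)
    return flagged


reported = {2005: "nonprofit", 2006: "for-profit", 2007: "nonprofit", 2008: "nonprofit"}
suspect_years = flag_ownership_blips(reported)
```

Flagged years are best checked against an independent source before being overwritten.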
In 2010, the HCRIS questionnaire was changed, impacting the composition and location of many questions. In addition to disrupting the continuity of reporting, this change in form also resulted in duplicate reporting by some hospitals, as they submitted both old and new versions of the form. If the hospital fiscal year does not correspond to the federal fiscal year, a duplication of the hospital record may occur. In such a case, the hospital may choose to report the end of the old fiscal year as a separate record from the beginning of the new one corresponding to the federal reporting fiscal year.
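One possible way to handle multiple or partial-year reports is to collapse them into a single annualized value per hospital and year, weighting by the number of days each report covers. The Python sketch below illustrates the idea under stated assumptions: the dictionary keys (`provider`, `fy_start`, `fy_end`, `value`) are hypothetical, the variable is a flow (such as revenue) rather than a stock, and reports are grouped by the calendar year in which the reporting period ends. It is not the method used by any particular cleaned version of the data.

```python
from collections import defaultdict
from datetime import date


def annualize_reports(reports):
    """Collapse cost reports into one 365-day-equivalent value per (provider, year).

    Values and days covered are summed within each group and then rescaled,
    so two exact duplicates yield the same result as a single copy.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (provider, year) -> [sum of value, sum of days]
    for report in reports:
        days = (report["fy_end"] - report["fy_start"]).days + 1
        key = (report["provider"], report["fy_end"].year)
        totals[key][0] += report["value"]
        totals[key][1] += days
    return {key: value * 365 / days for key, (value, days) in totals.items()}


reports = [
    # Two partial-year reports for the same hospital, e.g. around a change in ownership.
    {"provider": "A", "fy_start": date(2012, 1, 1), "fy_end": date(2012, 6, 30), "value": 182.0},
    {"provider": "A", "fy_start": date(2012, 7, 1), "fy_end": date(2012, 12, 31), "value": 184.0},
    # A single full-year report.
    {"provider": "B", "fy_start": date(2011, 1, 1), "fy_end": date(2011, 12, 31), "value": 1000.0},
]
annualized = annualize_reports(reports)
```

Because both the values and the days are summed before rescaling, the result is a days-weighted daily rate multiplied by 365. Balance sheet items, which are stocks, would need different treatment.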
Finally, HCRIS does not follow Generally Accepted Accounting Principles. As a result, a healthcare accounting or revenue cycle management textbook, such as Finkler and Ward (1999) or Herbert (2012), is a useful reference for those working with this dataset. To better understand the mismatch between the Generally Accepted Accounting Principles and the HCRIS cost reports, see Kane and Magnus (2001).
Those interested in working with HCRIS may wish to consult the following recent studies: Nikpay, Buchmueller, and Levy (2015) and Dranove, Garthwaite, and Ody (2016, 2017) for careful treatment of uncompensated care data; Nikpay (2019), who shows that measurement error in many financial outcomes is mean reverting; Dafny (2009) and Dafny, Ho, and Lee (2019) for hospital mergers; Friedman, Owen, and Perez (2016) for hospital closures; and Darden, McCarthy, and Barrette (2018) and Lin, McCarthy, and Richards (2019) for hospital pricing and physician practice.
HCRIS is publicly available and free to download (see Table 1 for download URLs) directly from CMS or from a dedicated NBER portal. Two alternate sources of HCRIS data are GitHub collections created by Ian McCarthy and by Adam Sacarny, which include both raw and cleaned versions of the data, code, and extensive tips and instructions for use.
Hospital Ownership Transitions in the United States
By Zack Cooper and Stuart Craig
For their study, Cooper, Craig, Gaynor, and Van Reenen (2019) created a database of hospital ownership transitions using the American Hospital Association (AHA) Annual Survey of Hospitals. The data are aggregated at the annual level, from 2001 to 2014, and include hospital mergers and acquisitions. Each hospital site is identified using AHA hospital identifiers, along with geographic latitude and longitude based on hospital address.
The data include information on hospital ownership structure at the system level, correcting for irregularities in reporting in the AHA Hospital Survey. First, once a merger has been executed, a single hospital often answers the survey on behalf of multiple, previously separate hospitals, so the survey typically combines these hospitals into a single observation. In these cases, the authors disaggregate the transaction and impute observations for the “absorbed” hospitals in the years following the merger, and instead track the ownership change using the system ID variable. Second, hospital ownership is typically recorded with a lag in the survey, while other variables are reported with respect to the reference year (typically one year prior to the answering of the survey). The authors undo these discrepancies and use several ancillary data sources on merger and acquisition activity (Irving-Levin Associates, Factset, and SDC Platinum) to verify the existence and date of each merger.
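Because ownership changes are tracked through the system ID, a user of a hospital-year panel can read transitions off that variable by comparing each hospital's system ID with its value in the previous observed year. The Python sketch below illustrates this on a hypothetical layout of `(hospital_id, year, system_id)` tuples, with `None` standing for an independent hospital. It shows how one might use such a variable; it does not reproduce how the authors constructed or verified the database.

```python
from collections import defaultdict


def ownership_transitions(records):
    """List (hospital_id, year, old_system, new_system) for every year in which
    a hospital's system ID differs from its previous observed year.

    `records` is an iterable of (hospital_id, year, system_id) tuples (hypothetical layout).
    """
    by_hospital = defaultdict(dict)
    for hospital_id, year, system_id in records:
        by_hospital[hospital_id][year] = system_id

    transitions = []
    for hospital_id, history in by_hospital.items():
        years = sorted(history)
        for prev, cur in zip(years, years[1:]):
            if history[prev] != history[cur]:
                transitions.append((hospital_id, cur, history[prev], history[cur]))
    # Sort on hospital and year only, so None system IDs are never compared.
    return sorted(transitions, key=lambda t: (t[0], t[1]))


panel = [
    ("H1", 2005, None), ("H1", 2006, None), ("H1", 2007, "SYS_A"),
    ("H2", 2005, "SYS_A"), ("H2", 2006, "SYS_A"), ("H2", 2007, "SYS_B"),
    ("H3", 2005, "SYS_C"), ("H3", 2006, "SYS_C"),
]
changes = ownership_transitions(panel)
```

In this toy panel, H1 joins a system in 2007, H2 moves between systems in 2007, and H3 never changes.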
The Hospital Ownership Transition data are well suited for studying the changing organization of hospitals around mergers and acquisitions. However, the authors caution that while the data capture the universe of hospital ownership transitions between 2001 and 2014, the data are restricted to general surgical acute care hospitals. They do not include long-term care facilities or other specialty facilities (e.g., children’s hospitals, cardiac hospitals, or other specialty care facilities).
For those interested in working with the data, the primary reference is the study by Cooper, Craig, Gaynor, and Van Reenen (2019), which examines the impact of hospital market structure on spending on privately insured patients. Additionally, Craig, Grennan, and Swanson (2018) examine the role of mergers in hospital buying power.
The data are available for free download from the Health Care Pricing Project website (see Table 1). The website includes the data, programming code, and a codebook, along with a thorough description of the data, the identification of mergers, and the imputations.
Hospital Compare
By Cong Gian
As part of the Hospital Quality Initiative, CMS provides public data on hospital quality and performance to inform patient choice. The archives of these reports are available from 2005 to 2017. CMS publishes the data quarterly; however, some variables are measured for a quarter (with a year-over-year comparison) and updated quarterly, while others are measured at an annual rate and updated annually. Hospitals are identified using the National Provider Identifier (NPI), and geographic identifiers include zip code, county name (not FIPS), city, and state. These data include information on over 4,000 Medicare-certified hospitals, including 130 Veterans Affairs (VA) medical centers.
Though the data are collected primarily to enable patient choice of hospital, some financial information is included here. Payment is reported by patient type, specific clinical category, and some major diagnosis related groups. More particularly, average payment is reported for Medicare spending per patient at the hospital level. Payments for specific clinical categories include those for aortic aneurysm, kidney/urinary tract infection, and spinal fusion. Payments are also reported for major diagnosis related groups such as heart attacks, heart failure, pneumonia, and hip/knee replacement. Another financial measure collected by Hospital Compare is the Hospital Value-Based Purchasing index, which reflects payment adjustments to Medicare reimbursements for critical access hospitals.
In addition, Hospital Compare includes a wide range of patient experience indicators. Information includes hospital safety culture, timeliness and effectiveness of care, complications rate and unplanned hospital visits, and psychiatric unit services. Among variables reflecting the hospital safety culture are use of electronic lab results, use of checklists for safe surgical practices, and a survey of safety culture. Timeliness and effectiveness of care is reported for specific conditions, most notably heart attack, sepsis, flu vaccination, pregnancy delivery, and medical imaging. Furthermore, timing of emergency department care includes median time from arrival to admission, from admission decision to inpatient care, from arrival to discharge, and the percentage of patients who left without being seen.
Furthermore, Hospital Compare includes a Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) module, which contains data from patient surveys about critical aspects of patients’ hospital experiences (communication with nurses and doctors, the responsiveness of hospital staff, the cleanliness and quietness of the hospital environment, pain management, communication about medicines, and discharge information).
The data are well suited for studying the quantity and quality of medical care in response to policy changes. The limited but reliable reimbursement information also allows for studies of price changes. CMS also uses the Hospital Compare data to construct a quality index that “ranks” hospitals on a five-star scale. Another way to use the data, therefore, is to combine Hospital Compare with other hospital-level data to explore how hospital characteristics (such as production capacity) affect hospital performance (such as ratings). Nonetheless, the data cannot be used to evaluate hospital operation per se, as they contain limited information on revenue and on capacity, such as beds and staffing.
Because not all variables are collected throughout the length of the survey, researchers interested in Hospital Compare should first identify availability of variables of interest during the period of study. Furthermore, the questionnaire was changed in the third quarter of 2014, resulting in discontinuity in some questions. Data should be processed separately before and after the switch, and differences reconciled when merging.
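Processing the two periods separately and then reconciling them might look like the following Python sketch, which maps each period's column names onto a common schema before the files are stacked. All column names here are hypothetical placeholders rather than the actual Hospital Compare headers, and treating the third quarter of 2014 itself as the first post-switch release is an assumption to verify against the archive files.

```python
# Hypothetical column names for each period; check the actual archive files.
PRE_SWITCH_COLUMNS = {"Provider Number": "provider_id", "Measure Code": "measure", "Score": "score"}
POST_SWITCH_COLUMNS = {"Provider ID": "provider_id", "Measure ID": "measure", "Score": "score"}


def is_post_switch(year, quarter):
    """Assume releases from 2014 Q3 onward use the new questionnaire."""
    return (year, quarter) >= (2014, 3)


def harmonize(rows, year, quarter):
    """Rename one release's columns to the common schema and tag its period."""
    post = is_post_switch(year, quarter)
    mapping = POST_SWITCH_COLUMNS if post else PRE_SWITCH_COLUMNS
    harmonized = []
    for row in rows:
        # Columns missing from a release become None rather than raising an error.
        record = {new: row.get(old) for old, new in mapping.items()}
        record.update(year=year, quarter=quarter, era="post" if post else "pre")
        harmonized.append(record)
    return harmonized


old_release = [{"Provider Number": "010001", "Measure Code": "M1", "Score": "98"}]
new_release = [{"Provider ID": "010001", "Measure ID": "M1", "Score": "97"}]
stacked = harmonize(old_release, 2013, 4) + harmonize(new_release, 2015, 1)
```

Keeping the `era` tag in the stacked data makes it easy to check whether a measure's level shifts at the switch for definitional rather than substantive reasons.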
For researchers starting to work with these data, some recent publications to consider include Akinleye, McNutt, and Lazariu (2019) on the association of hospital finances and quality of care; Dor, Encinosa, and Carey (2015, 2016), which examine the effect of the Hospital Compare disclosure on hospital pricing; and Perez and Freedman (2018), who compare Hospital Compare quality ratings to social media hospital ratings, finding overall consistency in evaluation, with some inconsistency in risk-adjusted patient safety and clinical quality. Most studies use the dataset in combination with other data.
The Hospital Compare data (current and archival) can be downloaded for free from the CMS website (see Table 1).
American Hospital Association Hospital Survey
By Shooshan Danagoulian
The American Hospital Association conducts an annual survey of its members, generating the AHA Hospital Survey. Data are available from 1980 to the present on an annual basis for the universe of hospitals in the US. Hospitals are identified with an AHA ID and, depending on the centralization of services and physician arrangements within the system, a hospital is identified as a stand-alone entity or part of a group. Also, hospitals are identified according to their geographic location including address, zip code, county, city, and state.
The AHA Hospital Survey offers the most complete overview of hospital operation. The survey includes financial information about insurance and alternative payment models, Medicare and Medicaid utilization, detailed financial revenue and expenses, uncompensated care, revenue by payer type, financial performance, fixed assets, total capital expenses, and IT cybersecurity expenses. Beyond financial information, the AHA Hospital Survey includes detailed hospital characteristics, including organization structure, facilities and services, beds and utilization, as well as detailed staffing information for physicians, hospitalists, intensivists, physician extenders, and nurses by service.
The most recent FY2018 survey includes new modules and modifications. The Population Health module includes information about hospital engagement in remote patient monitoring, diabetes prevention programs, and other community engagement services. The Facilities and Services module has additional information about co-located specialty services, air ambulance services, on- and off-campus emergency departments, and outpatient clinical sites. The Physician-Organization Arrangement module includes ownership share, organization of physician practice, and the proportion of primary care versus specialty care physician practice. The Alternative Payment Models module includes additional information about bundled payment arrangements, payer type, accountable care organizations, and the proportion of patient revenue represented by payer type. The survey also delves deeper into hospital affiliation with insurers, with additional questions about provider-owned health plans, hospital partnerships with insurers/health plans, and self-administered employee health plans.
The AHA Hospital Survey is not a free resource, and must be purchased directly from the AHA. Though the complete survey is available for purchase and will be priced at academic rates, a report can be generated at a much lower cost on specific variables and topics. Interested researchers should contact the AHA by completing an interest form (see Table 1 for link) for a direct price quote.
Shooshan Danagoulian is Assistant Professor of Economics at Wayne State University.
Kosali Simon is the Associate Vice Provost for Health Sciences and the Herman B Wells Endowed Professor of Public and Environmental Affairs at Indiana University.
Ian McCarthy is an Assistant Professor of Economics at Emory University.
Sayeh Nikpay is an Assistant Professor of Health Policy at Vanderbilt University.
Zack Cooper is an Associate Professor of Health Policy and Management at Yale University.
Stuart Craig is a Doctoral Candidate in Healthcare Management and Economics at the University of Pennsylvania.
Cong Gian is a Doctoral Candidate in Public and Environmental Affairs at Indiana University.
 The authors would like to acknowledge Deborah Freund, Martin Gaynor, and Michael Richards for helpful comments.
 National Health Expenditure Accounts (NHEA) available at https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/NationalHealthExpendData/NationalHealthAccountsHistorical.html
 Specifically §§1815(a), 1833(e), and 1861(v)(1)(A) of the Social Security Act.
 For example, some hospitals may report bad debt as a negative value, while others may report it as a positive value. Another example is that some hospitals appear to report ownership inconsistently across the years, resulting in apparent changes in ownership where there were none.
 Notably, the statements do not include footnotes. Cost reports’ S-10 worksheets contain uncompensated care, which is the sum of charity care and bad debt. They have been reliable since 2011. Cost reports’ E Part A worksheets contain the Medicare DSH payments and DSH patient percentage, used to determine eligibility for both the 340B program and the Medicare DSH program.
 Finkler, Steven A., and David Marc Ward. Essentials of cost accounting for health care organizations. Jones & Bartlett Learning, 1999.
Herbert, Kyle. Hospital Reimbursement: Concepts and Principles. Productivity Press, 2012.
 Kane, Nancy M., and Stephen A. Magnus. “The Medicare cost report and the limits of hospital accountability: improving financial accounting data.” Journal of Health Politics, Policy and Law 26, no. 1 (2001): 81-106.
 Also see section on Hospital Ownership Transitions in the United States data below.
 Dafny, Leemore. 2009. Estimation and Identification of Merger Effects: An Application to Hospital Mergers. Journal of Law and Economics, 52(3), 523-550.
Dafny, Leemore, Ho, Kate, & Lee, Robin S. 2019. The price effects of cross-market mergers: theory and evidence from the hospital industry. The RAND Journal of Economics, 50(2), 286-325.
Darden, Michael, McCarthy, Ian, & Barrette, Eric. 2018. Who Pays in Pay-for-Performance? Evidence from Hospital Pricing. Working Paper w24304. National Bureau of Economic Research.
Dranove, David, Garthwaite, Craig, & Ody, Christopher. 2017. How do nonprofits respond to negative wealth shocks? The impact of the 2008 stock market collapse on hospitals. The RAND Journal of Economics, 48(2), 485-525.
Schmitt, Matt. 2018. Multimarket Contact in the Hospital Industry. American Economic Journal: Economic Policy, 10(3), 361-87.
Lin, Haizhen, McCarthy, Ian, & Richards, Michael. 2019. Hospital pricing following integration with physician practices. Working paper.
Dranove, David, Craig Garthwaite, and Christopher Ody. “Uncompensated care decreased at hospitals in Medicaid expansion states but not at hospitals in nonexpansion states.” Health Affairs 35, no. 8 (2016): 1471-1479.
Nikpay, Sayeh, Thomas Buchmueller, and Helen Levy. “Early Medicaid expansion in Connecticut stemmed the growth in hospital uncompensated care.” Health Affairs 34, no. 7 (2015): 1170-1179.
Nikpay, Sayeh. “Characterizing Measurement Error in Hospital Cost Reports.” American Society of Health Economists Meeting. 2018. Available at: https://ashecon.confex.com/ashecon/2018/meetingapp.cgi/Paper/6775
Friedman, Ari B., D. Daphne Owen, and Victoria E. Perez. 2016. “Trends in Hospital ED Closures Nationwide and across Medicaid Expansion, 2006-2013.” The American Journal of Emergency Medicine, 34(7), 1262-64.
 Zack Cooper, Stuart V. Craig, Martin Gaynor, and John Van Reenen, (2019). “The Price Ain’t Right? Hospital Prices and Health Spending on the Privately Insured.” Quarterly Journal of Economics, 134(1): 51-107.
 Stuart V. Craig, Matthew Grennan, and Ashley Swanson (2018). “Mergers and Marginal Costs: New Evidence on Hospital Buyer Power.” NBER Working Paper 24926.
 Akinleye, D., McNutt, L., Lazariu, V. (2019). Correlation between Hospital Finances and Quality and Safety of Patient Care, PLoS One, 14(8)
Dor, A., Encinosa, W., Carey, K. (2016). Do Good Reports Mean Higher Prices? The Impact of Hospital Compare Ratings on Cardiac Pricing. NBER Working Paper 22858
Dor, A., Encinosa, W., Carey, K. (2015). Medicare’s Hospital Compare Quality Reports Appear To Have Slowed Price Increases For Two Major Procedures, Health Affairs, Vol 34(1)
Perez, V. and S. Freedman. (2018) “Do Crowdsourced Hospital Ratings Coincide with Hospital Compare Measures of Clinical and Nonclinical Quality?” Health Serv Res, 53(6), 4491-506.
 Including whether the hospital or system operates or is part of its own insurance plan.
 Specifically, operating margin.