Nursing workload, nurse staffing methodologies and tools: A systematic scoping review and discussion

This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Abstract

Background

The importance of nurse staffing levels in acute hospital wards is widely recognised, but evidence for tools to determine staffing requirements, although extensive, has been reported to be weak. Building on a review of reviews undertaken in 2014, we set out to give an overview of the major approaches to assessing nurse staffing requirements and to identify recent evidence in order to address unanswered questions, including the accuracy and effectiveness of tools.

Methods

We undertook a systematic scoping review. Searches of Medline, the Cochrane Library and CINAHL were used to identify recent primary research, which was reviewed in the context of conclusions from existing reviews.

Results

The published literature is extensive and describes a variety of uses for tools including establishment setting, daily deployment and retrospective review. There are a variety of approaches including professional judgement, simple volume-based methods (such as patient-to-nurse ratios), patient prototype/classification and timed-task approaches. Tools generally attempt to match staffing to a mean average demand or time requirement despite evidence of skewed demand distributions. The largest group of recent studies reported the evaluation of (mainly new) tools and systems, but provides little evidence of impacts on patient care and none on costs. Benefits of staffing levels set using the tools appear to be linked to increased staffing with no evidence of tools providing a more efficient or effective use of a given staff resource. Although there is evidence that staffing assessments made using tools may correlate with other assessments, different systems lead to dramatically different estimates of staffing requirements. While it is evident that there are many sources of variation in demand, the extent to which systems can deliver staffing levels to meet such demand is unclear. The assumption that staffing to meet average need is the optimal response to varying demand is untested and may be incorrect.

Conclusions

Despite the importance of the question and the large volume of publication, evidence about nurse staffing methods remains highly limited. There is no evidence to support the choice of any particular tool. Future research should focus on learning more about the use of existing tools rather than simply developing new ones. Priority research questions include how best to use tools to identify the required staffing level to meet varying patient need and the costs and consequences of using tools.

Tweetable abstract

Decades of research on tools to determine nurse staffing requirements is largely uninformative. Little is known about the costs or consequences of widely used tools.

Keywords: Patient Classification Systems, Nurse staffing, Nursing workload, Hospital administration, Workforce planning, Personnel Staffing and Scheduling, Nursing administration research, Operations research, Patient safety, Quality of health care, Validation studies, Workload, Costs and cost analysis, Health care economics and organisations, Hospital information systems, Nursing Staff, Hospital

What is already known about the topic?

There are many studies showing adverse effects of low nurse staffing on patient outcomes.

There has been a longstanding interest in developing systems to determine the required staffing level.

Despite decades of research and a large number of tools, previous reviews have highlighted limited evidence about their use.

What this paper adds

Recent years continue to see reports of new staffing tools and systems. Important sources of variability are neglected in published reports.

Benefits are associated with increased staffing levels but the costs and benefits of using a tool, as opposed to simply increasing staffing, remain unknown.

1. Introduction

Multiple reviews of research have established that higher registered nurse staffing levels in hospitals are associated with better patient outcomes and improved care quality, including lower risks of in-hospital mortality, shorter lengths of stay and fewer omissions of necessary care (e.g. Brennan et al., 2013; Griffiths et al., 2016, 2018b; Kane et al., 2007; Shekelle, 2013). However, beyond providing an injunction to invest in ‘more’ staff, such studies rarely indicate directly how many staff are required. The ability to determine the ‘right’ number of staff, both to employ and to deploy on any given shift, is an imperative from the perspective of both quality and efficiency of care (Saville et al., 2019). In this paper, we consider the evidence base for approaches to measuring nursing workload and tools used to determine the number of nurses that are required for general acute-care hospital wards.

1.1. Nurse staffing levels and outcomes

Low nurse staffing is associated with omissions of essential nursing care (Griffiths et al., 2018b), identified as a key mechanism leading to adverse patient outcomes (Recio-Saucedo et al., 2018). Building on the extensive evidence from cross-sectional studies, recent studies have shown associations at a patient- rather than hospital- or unit-level (Griffiths et al., 2018a, 2019; Needleman et al., 2011b). These include studies involving direct observation of care delivery (Bridges et al., 2019) and studies showing that omissions in care mediate associations between staffing levels and outcomes (Ball et al., 2018; Bruyneel et al., 2015; Griffiths et al., 2018a). While cause and effect cannot be directly inferred from observational studies, the case for a conclusion that low nurse staffing causes harm to patients is increasingly compelling. Perhaps the case is best made by considering the alternative proposition. It seems highly unlikely that there are no adverse outcomes caused by low nurse staffing levels.

Partly as a response to this evidence, policies of mandatory staffing minimums have been much discussed and implemented in a number of jurisdictions, most notably California, USA (Donaldson and Shapiro, 2010; Mark et al., 2013; Royal College of Nursing, 2012). Yet, even where mandatory staffing policies are implemented, patient care needs that cannot be met by the minimum must be identified, and staffing adjusted accordingly. The question of how best to identify the required nurse staffing level remains unanswered.

1.2. Staffing tools and methodologies

Determination of appropriate nurse staffing levels and measurement of workload have been studied since the earliest days of research into nursing (e.g. Lewinski-Corwin, 1922). Over the years, there have been many reviews focussing on methods for determining nurse staffing requirements. All have highlighted major deficits in the evidence. The problem is not a simple lack of published literature. One early review of nurse staffing methodologies, published in 1973, included a bibliography of over 1000 studies (Aydelotte, 1973). However, finding no evidence concerning the relative costs or effectiveness of different staffing methods and little evidence for validity or reliability, the authors concluded “Although the intent of the methodologies is admirable, all are weak” (p. 57) (Aydelotte, 1973).

Subsequent reviews have had to embrace an ever-growing body of research and an increasing number of systems. A review undertaken for the then Department of Health and Social Services (DHSS) in the UK in 1982 identified over 400 different systems for determining staffing requirements (DHSS Operational Research Service, 1982). Despite the volume of writing, evidence to judge the merits of these systems has remained elusive. Writing in 1994, Edwardson and Giovannetti noted the absence of published scientific evidence for a number of systems, such as GRASP or Medicus, which were in widespread use in North America (Edwardson and Giovannetti, 1994). They also noted that although different systems tended to produce results that were highly correlated, they could nonetheless produce substantially different estimates of the required level of nursing staff for a given patient or unit (Edwardson and Giovannetti, 1994).

Fasoli and Haddock reviewed 63 sources (primary research, theoretical articles and reviews) and again found that there was insufficient evidence for the validity of many current systems for measuring nursing workload and staffing requirements, concluding that systems are not sufficiently accurate for resource allocation or decision-making (Fasoli et al., 2011; Fasoli and Haddock, 2010). Other reviews reinforce this pervasively negative picture of the evidence (Arthur and James, 1994; Butler et al., 2011; Hurst, 2002; Twigg and Duffield, 2009). The field is dominated by descriptive reports of locally developed approaches and none of these reviews found any evidence for the impact of implementation of a tool on outcomes for quality of care, patients or staff (Griffiths et al., 2016).

However, the topic remains important. Identifying low staffing as a significant contributor to “conditions of appalling care”, a key recommendation of the Francis Inquiry into the failings of the Mid Staffordshire General Hospital in the United Kingdom was the development of guidance for nurse staffing including:

“…evidence-based tools for establishing what each service is likely to require as a minimum in terms of staff numbers and skill mix.” (p. 1678) (Francis, 2013)

In this paper we aim to give an overview of approaches to measuring nurse staffing requirements for general acute hospital wards, drawing primarily on existing reviews, before presenting a more comprehensive overview of more recent primary research to determine whether (and how) evidence has changed in recent years.

2. Review methods and scope

2.1. Search strategy and approach to review

The sheer volume of material and unanswered questions identified in other reviews makes this a daunting area to summarise. We describe the current review as systematic in the sense that we aim to be explicit about the approach to identification and selection of literature. However, as we primarily aim to map the literature, identifying recent developments, key features and areas of relative strength and weakness, without necessarily giving each study an in-depth critical appraisal, we consider this a scoping review, serving to summarise findings and identify gaps in the knowledge (Arksey and O'Malley, 2005).

We draw selectively on older authoritative sources and reviews to give a general overview and background to the evidence (including the reviews already cited), using the results of our comprehensive searches and review of reviews undertaken for the National Institute for Health and Care Excellence, NICE (Griffiths et al., 2014) as a key source.

In order to identify more recent studies, we searched Medline, CINAHL (key word only) and The Cochrane Library using the terms “Workload”[key word, MESH] or “Patient Classification”[key word] AND “Personnel Staffing and Scheduling” AND “Nurs*”[key word] or “Nursing”[MESH] and limited results using the OVID Medline sensitive limits for reviews, therapy, clinical prediction guides, costs or economics. We checked the sensitivity of this search, which was designed to be specific, using the results of our earlier more comprehensive search (Griffiths et al., 2014) as a test set. We performed additional searches for citations to existing reviews and for other works by the authors of those reviews (since such reviews might be conducted as a prelude to new empirical research). We also undertook focussed searches on databases for works by key authors and searched the World Wide Web using the names of widely used tools. Searches were completed in mid-December 2018. We looked specifically for new reviews published after 2014 (when searches for our 2014 review of reviews were completed) and primary studies published from 2008 onwards, because the most recent review in our review of reviews was published in 2010 (Fasoli and Haddock, 2010). After removing duplicates, we had 392 recent sources to consider.

2.2. Selection of primary research

Consistent with the aims of a scoping review, we took a liberal approach to inclusion for material to review. We included primary studies that described the development, reliability or validity testing of systems/tools for measuring nursing workload/predicting staffing requirements; studies that compared the workload as assessed by different measures, or which used a tool as part of a wider study in such a way that it might provide some insight into the validity of tools or another aspect of the determination of nurse staffing requirements; and studies that reported the costs and/or consequences of using a tool, including the impact on patient outcomes. We also included descriptive papers that might not merit the label ‘study’, provided that they included some data. We only included studies that were of direct relevance to staffing on general acute adult inpatient units and so excluded studies focussing exclusively on (for example) intensive or maternity care. However, had we identified material that demonstrated a significant methodological advance or other insight we were open to including it for illustrative purposes.

3. Results

3.1. Overview of approaches to determining nurse staffing levels

There are many methods for determining nurse staffing requirements described in the literature. They are generally classified into several broad types (Fig. 1), although the distinction between these approaches is less absolute than it may appear and terminology varies.

Fig. 1

Major approaches for determining nurse staffing requirements.

Telford's professional judgement method (Telford, 1979), first formally described in the UK in the 1970s, provides a way of converting the shift-level staffing plan, decided using expert opinion, into the number of staff to employ. The method describes calculation of the number of nurses to employ (generally referred to in the UK literature as the nursing ‘establishment’) in order to reliably fill the daily staffing plan (planned roster), making allowance for holidays, study leave and sickness/absence. Conversely, this method can be used to infer the daily staffing plan from the whole time equivalent staff employed by a ward, as illustrated by Hurst (2002). The full ‘Telford’ method provides a framework for wider deliberation, but the judgement of required staffing does not require the use of objective measures to determine need (Arthur and James, 1994), hence it is an example of a ‘professional judgement’-based approach. In recent years, this deliberative approach without formal measurement is reflected in the United States Veteran's Administration staffing methodology (Taylor et al., 2015).
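The conversion between a daily staffing plan and an establishment described above can be sketched in a few lines. This is a minimal illustration only, not Telford's published procedure: the function names, the 37.5-hour contracted week, the 22% uplift and the example shift pattern are all assumptions chosen for the example, and the allowance for holidays, study leave and sickness/absence is simply modelled as a single multiplicative uplift.

```python
def establishment_wte(staff_per_shift, shift_hours, contracted_hours=37.5,
                      uplift=0.22, days_per_week=7):
    """Whole time equivalents (WTE) needed to fill a fixed daily staffing plan.

    staff_per_shift  -- nurses planned on each shift of the day (hypothetical input)
    shift_hours      -- paid length of each of those shifts, in hours
    contracted_hours -- assumed weekly contracted hours for one WTE
    uplift           -- assumed combined allowance for holidays, study leave
                        and sickness/absence, applied multiplicatively
    """
    daily_hours = sum(n * h for n, h in zip(staff_per_shift, shift_hours))
    weekly_hours = daily_hours * days_per_week
    return weekly_hours / contracted_hours * (1 + uplift)


def daily_hours_from_wte(wte, contracted_hours=37.5, uplift=0.22, days_per_week=7):
    """Inverse calculation: nursing hours per day that an establishment can
    fill once the same assumed uplift has been removed."""
    return wte / (1 + uplift) * contracted_hours / days_per_week


# Illustrative plan: 5 nurses on an early, 5 on a late and 3 on a night shift.
plan_staff = [5, 5, 3]
plan_hours = [7.5, 7.5, 10.0]
wte = establishment_wte(plan_staff, plan_hours)
print(round(wte, 2), round(daily_hours_from_wte(wte), 1))
```

The second function corresponds to the converse use noted above, inferring the daily staffing plan from the whole time equivalent staff employed by a ward. Both results depend entirely on the uplift assumed.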

‘Benchmarking approaches’ involve using expert judgements to identify suitable comparators, with the staffing levels compared between similar units to establish requirements. For many years this approach was used by the Audit Commission in the UK (Audit Commission, 2001) to compare nursing establishments and expenditure between units across hospitals. Although characterised by Hurst (2002) as a distinct method, like professional judgement, benchmarking does not involve any formal assessment of patient requirements for nursing care. Rather, consensus methods and expert professional judgement are often used in selecting appropriate benchmarks and so it could be characterised as a particular form of the professional judgement approach, although such characterisation requires that such a judgement is applied. Furthermore, while the process of comparison with similar wards gives the appearance of objectivity, much depends on how the initial staffing levels were arrived at, and there is ample evidence that perceptions of staffing requirements are often anchored to historical staffing levels (Ball et al., 2019; Twigg and Duffield, 2009).

While accounts of professional judgement and benchmarking exercises often focus on determining establishments, both can also be used to determine a daily staffing plan or shift-level nurse-patient ratio or equivalent (such as nursing hours per patient). In this way they assign a target number of nursing staff or hours per patient or bed (Hurst, 2002), informing staff deployment decisions. Such approaches specify unit types to which a particular staffing level applies, although categories tend to be broad (e.g. intensive care, general medical surgical and rehabilitation). Some more recent approaches to monitoring workload (see below) extend this approach to take a wider view of activity, for example adding in admissions and discharges over and above the patient census, and therefore we term these patient-nurse ratio approaches ‘volume-based’ approaches.

Approaches that appear to set minimum staffing levels per patient, an example of a volume-based approach, are sometimes explicit in stating that additional staffing may be required to meet peaks in demand. For example, the legislation that established mandatory nurse-patient ratios in California includes a stipulation that hospitals also use a system for determining individual patient care requirements to identify the need for staffing above the specified minimum (State of California, 1999). Thus, approaches which seek to determine staffing requirements accounting for individual patient variation in need or other factors driving workload can be used as alternatives to, or in conjunction with, minimum staffing levels based purely on patient volumes.

Whereas volume-based approaches measure variation in workload determined by patient counts, other approaches recognise that patients in a given type of ward may have different care requirements. Edwardson and Giovannetti (1994) offer a typology of three main approaches for determining individual patient need: prototype, task and indicator systems. Hurst also describes three main types: Patient Classification Systems, timed-task and regression-based (Hurst et al., 2002).

Prototype or Patient Classification Systems group patients according to their nursing care needs and assign a required staffing level for each (Fasoli and Haddock, 2010; Hurst, 2002). They use either pre-existing categorisations, e.g. diagnosis-related groups (Fasoli and Haddock, 2010), or bespoke categorisations, e.g. classifications based on levels of acuity and/or dependency groups. The Safer Nursing Care Tool (The Shelford Group, 2014), the most widely used method for determining staffing requirements in England (Ball et al., 2019), is one such system. Patients are allocated to one of five acuity/dependency categories with a weighting (described as a ‘multiplier’) to indicate the required staff to employ associated with patients in each category.
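The arithmetic behind a multiplier-based classification system of this kind can be illustrated as follows. This is a sketch under stated assumptions: the category labels and multiplier values below are invented for the example and are not the Safer Nursing Care Tool's published multipliers, and the function name is hypothetical.

```python
# Hypothetical multipliers: staff to employ (WTE) per patient in each of five
# acuity/dependency categories. These values are invented for illustration.
HYPOTHETICAL_MULTIPLIERS = {
    "cat_1": 1.0,
    "cat_2": 1.5,
    "cat_3": 2.0,
    "cat_4": 2.5,
    "cat_5": 5.0,
}


def required_establishment(census, multipliers=HYPOTHETICAL_MULTIPLIERS):
    """Sum of (patients in category x category multiplier) across categories.

    census -- mapping of category label to the number of patients counted in
              that category; an unknown label raises KeyError.
    """
    return sum(count * multipliers[category] for category, count in census.items())


# Illustrative ward census by category.
ward_census = {"cat_1": 10, "cat_2": 8, "cat_3": 4, "cat_4": 2, "cat_5": 0}
print(required_establishment(ward_census))
```

Because the result is a weighted count, two wards with the same number of patients can generate different requirements when their case mix differs, which is the feature that distinguishes this approach from a purely volume-based one.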

In task (or timed-task) approaches, a detailed care plan, consisting of specific ‘tasks’, is constructed for each new patient and used to determine the required staffing (Hurst, 2002). Each task is assigned an amount of time. The commercial GRASP system, still widely used in the United States, is an example of such a system (Edwardson and Giovannetti, 1994).
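A timed-task calculation can be sketched in the same way. The task names, the minutes assigned to each task and the optional percentage allowance for non-contact time are all hypothetical values for illustration; they are not taken from GRASP or any other named system.

```python
# Hypothetical standard times, in minutes, for each task type.
TASK_MINUTES = {
    "medication_round": 10,
    "vital_signs": 5,
    "wound_dressing": 20,
    "assisted_mobilisation": 15,
}


def patient_hours(care_plan, task_minutes=TASK_MINUTES, indirect_allowance=0.0):
    """Nursing hours implied by one patient's care plan.

    care_plan          -- mapping of task name to number of times it is planned
    indirect_allowance -- assumed fixed fraction added for non-contact time
                          (0.2 means 20% on top of the timed tasks)
    """
    minutes = sum(task_minutes[task] * times for task, times in care_plan.items())
    return minutes / 60 * (1 + indirect_allowance)


def ward_hours(care_plans, **kwargs):
    """Total nursing hours across the care plans of all patients on a ward."""
    return sum(patient_hours(plan, **kwargs) for plan in care_plans)


# Illustrative care plan: 4 medication rounds, 6 sets of vital signs, 1 dressing.
example_plan = {"medication_round": 4, "vital_signs": 6, "wound_dressing": 1}
print(patient_hours(example_plan), ward_hours([example_plan, example_plan]))
```

Each task contributes its average time regardless of the patient or the circumstances, so the output is only as good as the standard times and the completeness of the care plan.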

As with prototype approaches, indicator approaches ultimately assign patients to categories, in this case based upon ratings across a number of factors that are related to the time required to deliver patient care. These can include broad assessments of condition (e.g. ‘unstable’), states (e.g. ‘non ambulatory’), specific activities (e.g. complex dressings) or needs (e.g. for emotional support or education) (Edwardson and Giovannetti, 1994). The Oulu Patient Classification, part of the RAFAELA system, is one such example. Patients are assigned to one of four classifications, representing different amounts of care required, based upon a weighted rating of care needs across six dimensions (Fagerström and Rainio, 1999). However, the inclusion of some specific activities in Edwardson and Giovannetti's definition of indicator approaches makes it clear that the distinction from task/activity-based systems is not an absolute one. Typically, though, task-based systems take many more elements into account: over 200 in some cases (Edwardson and Giovannetti, 1994).

Hurst also identified regression-based approaches, which model the relationship between patient-, ward- and hospital-related variables, and the establishment in adequately-staffed wards (Hurst, 2002). To obtain the recommended establishment for a particular ward, coefficients derived from the regression models are used to estimate the required staffing. There are relatively few examples, although Hoi and colleagues provide one recent example, the Workload Intensity Measurement System (Hoi et al., 2010). In some respects, regression-based models simply represent a particular approach to allocating time across a number of factors within an indicator-based system, rather than directly observing or estimating time linked to specific activities or patient groups. The RAFAELA system, widely used in the Nordic countries, although based on a relatively simple indicator system, uses a regression-based approach to determine the staffing required to deliver an acceptable intensity of nursing work for a given set of patients in a given setting (Fagerström and Rainio, 1999; Fagerström and Rauhala, 2007; Rauhala and Fagerström, 2004).
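The logic of a regression-based approach can be shown with a deliberately simplified sketch. It fits an ordinary least squares line with a single predictor, whereas the approaches described use several patient-, ward- and hospital-related variables; the reference-ward data are invented and chosen to lie exactly on a line so that the arithmetic is easy to follow. None of this reproduces the Workload Intensity Measurement System or RAFAELA.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_establishment(mean_daily_patients, intercept, slope):
    """Recommended establishment (WTE) for a ward, from fitted coefficients."""
    return intercept + slope * mean_daily_patients


# Invented data from wards assumed to be adequately staffed:
# mean daily patient count and current establishment in WTE.
reference_patients = [20, 24, 28, 32]
reference_wte = [22.0, 26.0, 30.0, 34.0]

intercept, slope = fit_line(reference_patients, reference_wte)
print(predict_establishment(26, intercept, slope))
```

The recommendation for a new ward is only as sound as the judgement that the reference wards were adequately staffed in the first place, since the model reproduces whatever staffing pattern they exhibit.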

In these more tailored approaches, the method for determining the required times for patient groups or tasks varies. The literature describes the use of both empirical observations and expert opinion to determine the average time associated with tasks or patient classifications (De Cordova et al., 2010; Myny et al., 2014; Myny et al., 2010). In some cases, there is an explicit attempt to make workload/time allocations based on reaching some threshold of quality. For example, wards contributing to the database from which the multipliers for the Safer Nursing Care Tool are derived must meet a predefined standard for care quality (Smith et al., 2009). Non-patient contact time, for example care planning and documentation or other activities that take place away from the bedside (which are not always easily attributable to individual patients), is dealt with in different ways. All approaches consider this, often assigning a fixed percentage time allocation over and above direct care that has been measured.

While some approaches appear to be more precise than others, using detailed patient care plans at one extreme (timed-task) and apparently assuming all patients have similar needs (volume-based) at the other, all use average time allocations, with an unstated assumption that when summed across tasks and patients, individual variation can be accommodated.
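What staffing to an average implies when demand is skewed can be illustrated with a small calculation. The ten shift-level demand figures below are invented for the example and are not drawn from any study; the function name is hypothetical.

```python
from statistics import mean, median


def shortfall_summary(demand_hours, staffed_hours):
    """Fraction of shifts where demand exceeds the staffed hours, and the
    total unmet nursing hours across those shifts."""
    gaps = [d - staffed_hours for d in demand_hours if d > staffed_hours]
    return len(gaps) / len(demand_hours), sum(gaps)


# Invented, right-skewed demand (nursing hours required on each of ten shifts):
# most shifts cluster at 80 to 95 hours, with two much heavier shifts.
demand = [80, 82, 85, 85, 88, 90, 92, 95, 130, 173]

staffed_at_mean = mean(demand)
fraction_short, unmet_hours = shortfall_summary(demand, staffed_at_mean)
print(staffed_at_mean, median(demand), fraction_short, unmet_hours)
```

In this invented series the mean sits well above the median, so a roster set at the mean exceeds demand on most shifts while leaving a large shortfall on the few heaviest ones. Whether that trade-off is acceptable is exactly the kind of question that summing average times does not address.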

3.1.1. Staffing decisions and the use of tools

A number of different decisions can be made using staffing systems and tools, with decisions operating in different time frames (Table 1). Nursing managers must decide in advance how many nursing staff to employ (often referred to as the nursing establishment) and how many nursing staff to deploy each shift, either as a fixed daily staffing plan or in response to immediate demand. Accounts of indicator and task approaches often focus on measuring immediate need (and implicitly deploying staff to meet such need) rather than determining an establishment to fill planned rosters. These are separate but inter-related decisions, which all rely on being able to quantify nursing workload. The distinction is sometimes unclear in published accounts and the relationship between these uses tends to be implicit rather than explicit.

Table 1

Uses of staffing systems and tools.