Developing criteria for Cesarean Section using the RAND appropriateness method

Background: Cesarean section rates are increasing worldwide, and a rapid increase has been observed in Iran. Clinicians disagree about when to use cesarean section. We aimed to identify the appropriateness criteria for the use of cesarean section in Iran.
Method: A consensus development study using a modified version of the RAND Appropriateness Method (RAM). We generated scenarios from valid clinical guidelines and expert opinions. A panel of experts participated in consensus development: first round via mail (12 members), second round face-to-face (9 members). We followed the RAM recommendations for the development of the scenario lists, rating scales, and statistical analyses.
Results: 294 scenarios relevant to cesarean section were identified. 191 scenarios were considered appropriate, of which 125 scenarios were agreed upon. The panel found cesarean inappropriate for 21% of scenarios, and 'equivocal' for 14% of scenarios.
Conclusion: RAM is useful for identifying stakeholder views in settings with limited resources. The participants' views on the appropriateness of certain indications differed from the available evidence. The large number of scenarios without agreement may partly explain why it has been difficult to curb the growth in cesarean section rate.


Background
Healthcare systems face significant challenges in response to changes in population needs and increasing costs. Studies show that a large proportion of healthcare offered may be inappropriate or unnecessary, ranging from 15 to 40 per cent in different countries and healthcare settings [1,2].
Different methods and tools have been developed to measure the appropriateness of care and develop valid criteria and recommendations for healthcare providers [3][4][5]. Most evidence originates from high income countries and it may not be possible for low and middle income countries to generate the expertise and resources required for the development of valid clinical guidelines [6]. Human interpretation plays an important role in the development of guidelines, and cultural and health system characteristics influence the way the evidence is interpreted and put into recommendation [4,7].
One of the most commonly performed surgical interventions is cesarean section. According to the WHO, a maximum of 15 per cent of deliveries have medical indications for cesarean section [8] and rates above this are unsuitable and unnecessary, imposing financial burden and clinical risks on patients and healthcare systems. The cesarean section rate has risen considerably over the past few decades: from less than 7% in the 1970s to over 25% in 2003, causing major concerns for health policy makers [9]. Cesarean section comprised around 25, 26, 31, 31, 32 and 35 per cent of all deliveries in China, Canada, Australia, the United States, Taiwan and Italy respectively [10][11][12][13]. Figures are higher in South America: cesarean section comprises more than 50% of all deliveries performed in private hospitals in Chile, Argentina, Brazil and Paraguay [14]. The increase is multi-factorial and medical, legal, cultural, and economic concerns may have played roles in it [11,15,16].
In Iran, data published in 2005 suggested that cesarean section constituted 47% of all deliveries (around one million) in the country, 52% of deliveries in Tehran and 64% of deliveries in the private sector [17]. This was a significant increase over the 35% section rates reported for 2000 [18].
Little has been previously done in Iran for developing evidence-based criteria or consensus statements for the management of cesarean section. The most notable example was a protocol published by the Ministry of Health and Medical Education in 2004, but it lacked details and information expected in clinical guidelines [19]. It did not change the trend and the cesarean section rate in the country continued to rise despite the protocol. Recently, a quality improvement intervention in a Tehran hospital resulted in reduction of cesarean sections [20], suggesting that systematic approaches may yield positive outcomes.
Considering the magnitude of the problem in low and middle income countries in general, and in Iran in particular, we aimed to identify the appropriateness criteria for the use of cesarean section from the viewpoint of professional stakeholders in the country. The results of this study may benefit policy makers and clinicians in Iran as well as in other low and middle income countries.

Method
The RAND Appropriateness Method (RAM) is an explicit approach for the assessment of the appropriateness of care. The method involves developing sets of clinical scenarios or criteria. Decisions are then made about the level of care or service that is appropriate for those scenarios and criteria. The method was designed in the 1980s by RAND and the University of California, Los Angeles (UCLA) and has been used in many studies in North America and Europe [4,21]. It has been frequently used for the development of appropriateness criteria in surgical care and investigative procedures [4][22][23][24]. We conducted our study in 2008 in Tehran. We followed a modified version of the RAM as explained below.

Generating scenarios
First we searched the literature to identify available clinical practice guidelines and evidence summaries. We identified fourteen documents with relevant scopes [25][26][27][28][29][30][31][32][33][34][35][36][37][38]. We screened these documents and selected three guidelines and evidence summaries that provided a relatively comprehensive coverage of issues relevant to cesarean section [34][35][36]. The three selected documents had been developed as part of an established guideline development program or by recognized institutions. Then one author (AR) conducted a short workshop on development and appraisal of clinical guidelines. Three panel members and three authors used the validated Farsi translation of the AGREE (Appraisal of Guidelines for Research and Evaluation) tool (Additional file 1: Table S1) [39,40] to appraise the guidelines with comprehensive coverage and selected two clinical guidelines as a result [34,35]. In the next step, two obstetricians & gynecologists (BHR, SH), one midwife (FH) and a health service researcher (RO) extracted potential scenarios about cesarean section from the selected clinical guidelines. We aimed to develop as short a list of scenarios as possible which were 'comprehensive', 'mutually exclusive' and 'homogeneous' [21]. Each scenario comprised a few words or a short sentence capturing the main clinical features that identified the patients. The scenarios were categorized within the main potential indications for cesarean section for further analyses and tabulation purposes.

Panel members and setting
The panel comprised twelve individuals: nine obstetricians & gynecologists and three midwives. The panel members were from the Ministry of Health and Medical Education (one), four different public medical universities (nine), a non-profit medical school (one), and a private hospital (one). One participant exclusively worked in the private sector, and one exclusively in the public sector. The rest worked both in public and private hospitals.
Out of twelve invited members, ten responded to the first round. Those ten were then invited to participate in the second round, out of which nine participated.

Consensus development
In the first round we sent the list of scenarios along with a summary of the clinical guidelines, the scoring system and the definitions to the panel members. The panel members were asked to give their opinions about each scenario for cesarean section on a scale ranging from one (totally inappropriate) to nine (totally appropriate). They were asked to consult the scientific resources provided for them while giving their opinions. The purpose of the literature review and the provision of evidence-based clinical guidelines and evidence summaries to the panel members was to provide them with an up-to-date summary of the best available evidence about the indications for cesarean section. The panels were not limited to the guidelines as they were required to use their own professional judgments as well as the evidence presented to them.
We then collated the views of the panel members and summarized them in a format suitable for feedback, so that each member received a summary of the panel view as well as a reminder of the scores that the member had assigned to each scenario. The panel members were then invited to a second round (a face-to-face panel meeting) to view the feedback and to review and discuss their opinions. Nine members attended the second round (a one-day meeting from 8.00 to 17.00) and all the scenarios were reviewed and discussed. The panel members decided to add a limited number of further scenarios at this stage. Final decisions were recorded in specific forms.
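The feedback step described above can be illustrated with a minimal sketch. The exact layout of the feedback forms is not described here, so the summary fields (panel median, minimum and maximum), the data structure and the function name `feedback_for_member` are assumptions made purely for illustration.

```python
from statistics import median


def feedback_for_member(ratings, member):
    """Build one member's feedback: a panel summary plus the member's own score.

    `ratings` maps scenario -> {member name -> score on the 1-9 scale}.
    The choice of summary fields is an assumption, not the study's form.
    """
    report = {}
    for scenario, by_member in ratings.items():
        scores = list(by_member.values())
        report[scenario] = {
            "panel_median": median(scores),
            "panel_min": min(scores),
            "panel_max": max(scores),
            # None if this member did not rate the scenario
            "your_score": by_member.get(member),
        }
    return report


# Hypothetical first-round ratings for two made-up scenarios
example_ratings = {
    "scenario 1": {"member A": 7, "member B": 8, "member C": 9},
    "scenario 2": {"member A": 2, "member B": 5, "member C": 3},
}
member_b_feedback = feedback_for_member(example_ratings, "member B")
```

Each member would receive only their own report, which keeps the other members' individual scores anonymous while still showing the spread of the panel's views.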
As recommended by the RAM, we asked the panel members to rate the appropriateness of each indication based on their own professional judgment, rather than on what they perceived to be the views of other respected clinicians, while considering the possible outcomes resulting from their decisions. The panel members were asked to consider average pregnant women presenting to average clinicians in usual settings of care relevant to each indication discussed [21].

Statistical analysis
We used the median scores for reporting the results of the first panel to the second panel. The scores were divided into three groups: appropriate (score = 7-9), equivocal (score = 4-6), and inappropriate (score = 1-3). If the median score fell into any of the above groups, the scenario was classified accordingly (e.g. if a scenario's median score was 8, the scenario was considered 'appropriate'). A further condition had to be met in order to reach agreement: if the minimum and maximum scores were ignored, all other scores had to fall in the same scoring group. Additionally, we compared the rate of agreement between the two panels using weighted kappa values and frequency charts.
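The classification and agreement rules above can be expressed as a short sketch. The function names are illustrative. The handling of half-point medians (3.5 and 6.5, which can occur with an even number of raters) is not stated in the text; assigning them to the higher group is an assumption made here.

```python
from statistics import median


def score_group(score):
    """Map a score (or median) on the 1-9 scale to its appropriateness group."""
    if not 1 <= score <= 9:
        raise ValueError("scores must lie between 1 and 9")
    if score >= 6.5:   # 7-9; a median of 6.5 is placed here by assumption
        return "appropriate"
    if score >= 3.5:   # 4-6; a median of 3.5 is placed here by assumption
        return "equivocal"
    return "inappropriate"  # 1-3


def classify_scenario(scores):
    """Return (group of the median score, whether agreement was reached).

    Agreement: after ignoring one minimum and one maximum score,
    all remaining scores fall in the same scoring group.
    """
    if len(scores) < 3:
        raise ValueError("at least three scores are needed")
    ordered = sorted(scores)
    trimmed = ordered[1:-1]  # drop one minimum and one maximum
    groups = {score_group(s) for s in trimmed}
    return score_group(median(ordered)), len(groups) == 1


# Hypothetical ratings from a nine-member panel
print(classify_scenario([8, 9, 7, 8, 2, 9, 7, 8, 9]))  # ('appropriate', True)
print(classify_scenario([1, 5, 7, 8, 8, 9, 9, 9, 3]))  # ('appropriate', False)
```

In the first example a single outlying score of 2 does not prevent agreement because it is the ignored minimum; in the second, the remaining scores span all three groups, so the scenario is 'appropriate' without agreement.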

Results
We generated 276 scenarios for cesarean section in the scenario generation phase of the study. We divided the scenarios into thirteen indications as used in the selected guidelines [34,35] to make the assessment task easier for the panel members (Table 1) [21]. Table 1 shows that preterm delivery and mother characteristics comprised the highest proportion of potential scenarios of cesarean section. In total over 60 per cent of the scenarios were considered appropriate, but agreement was reached on only 88 of them (31.9% of all scenarios). Only 31 scenarios were considered inappropriate at this stage. Comparing the indications, 'abnormal presentation' and 'cephalic pelvic disproportion' contained the highest number of appropriate scenarios with agreement.
The panel members added a further 18 scenarios to the list in the second consensus development round, all of which were relevant to the 'chronic distress' indication. Table 2 shows the rate of appropriateness and agreement of panel members in the second round. Among the 294 scenarios of cesarean section, 191 (65%) were considered appropriate, amongst which 125 (42.5%) scenarios were agreed upon. Table 3 Table 4 comprises only those 276 scenarios that were considered in both rounds and shows the effects of the second panel on the decisions made (weighted kappa value = 0.53). Comparing the round 2 results with round 1, we observed a reduction in the proportion of 'equivocal' scenarios (from 28% to 15%) with a similar increase in the proportion of 'inappropriate' scenarios (from 11% to 23%). In the case of appropriate scenarios, however, the difference was small (from 168 to 174 scenarios, Table 4).
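For readers unfamiliar with the statistic, a weighted kappa for a round 1 by round 2 cross-tabulation can be computed as in the minimal sketch below. Linear disagreement weights are an assumption, since the weighting scheme is not stated in the text, and the counts in the example table are hypothetical: they are not the study data and will not reproduce the reported value of 0.53.

```python
def weighted_kappa(table):
    """Weighted kappa for a square cross-tabulation of ordered categories.

    table[i][j] = number of scenarios placed in category i in round 1
    and category j in round 2. Linear disagreement weights are assumed.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed += weight * table[i][j] / n
            expected += weight * row_totals[i] * col_totals[j] / (n * n)

    if expected == 0:
        raise ValueError("kappa is undefined when only one category is used")
    return 1.0 - observed / expected


# Hypothetical counts; category order: inappropriate, equivocal, appropriate
hypothetical_table = [
    [20, 5, 0],
    [3, 10, 4],
    [0, 2, 6],
]
kappa = weighted_kappa(hypothetical_table)
```

A value of 1 means the two rounds classified every scenario identically, while a value near 0 means the rounds agreed no more often than expected by chance.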

Discussion
In evidence-based medicine the question arises about what should be done if there is insufficient evidence for a procedure routinely performed in practice [41]. Formal consensus development (including RAM) provides a timely and efficient solution when evidence is insufficient [42] while questions remain about the validity of recommendations based on such methods [43]. Evidence-based clinical guidelines often lack flexibility and may not provide enough details for clinicians when making decisions about individual patients [44]. In our study, using RAM, we tried to overcome this limitation by developing scenarios representative of the patients seen by clinicians in practice.
To the best of our knowledge this is the first study that has used the RAND Appropriateness Method for cesarean section indications. In our study, cesarean section was considered to be appropriate in 191 (65%) potential scenarios, of which agreement was reached for 125 (42.5%) scenarios. As expected, the participants agreed on the appropriateness of using cesarean section in a large proportion of scenarios presenting as chronic distress, abnormal presentation, hemorrhage and premature rupture of membrane, and fetal anomaly. They also agreed on the appropriateness of over half of repeat cesarean scenarios.
After two rounds of consensus development there were still 106 scenarios in which no agreement was reached or the results remained equivocal. This may demonstrate the ambiguous nature of decision making on whether cesarean section is indicated for an individual pregnant woman. It may partly explain why it has been so difficult to curb or slow the growth in cesarean section rate around the world [8][11][12][13][14][15][17][18].

Advantages and limitations
Tan et al [43] have described the complexities and limitations of using RAM. Many of their criticisms equally apply to other consensus development methods, such as Delphi [45]. It should be noted that consensus methods are most useful where there are disagreements or variation in practice, and reliable evidence is limited. In these circumstances formal consensus methods are valuable and their use is inevitable.
There were heated discussions in our panel on whether cesarean section is indicated for women with a previous cesarean. Based on their personal experiences, members believed that the evidence on the benefits of trial of labor over repeat cesarean depends on a context with good quality pre-hospital and hospital care and may not apply to Iran's conditions. This was in contrast with the results of a systematic review of relevant evidence, which concluded that repeat cesarean was not indicated as a routine practice, mostly based on non-randomized trials originating from high resource countries [46]. In our opinion, while high quality evidence on the issue is lacking, the panel members may have been somewhat justified in doubting the benefits of trial of labor after previous cesarean section. A qualitative study in Canada reported that clinicians had similar concerns about trial of labor after previous cesarean section [16]. Further research originating from low and middle income countries is required.
We spent time familiarizing the panel members with the method and securing their cooperation. Membership of the panel involved open discussion of personal views and practices, and acceptance that certain practices might not be supported by evidence or by other panel members. We also selected the members from different backgrounds and settings to improve the comprehensiveness of the views [47,48]. As an advantage, we used the AGREE tool for selection of the evidence sources [39]. The AGREE tool has been used extensively for appraising clinical practice guidelines for different conditions, including obstetric care [49]. Using the AGREE tool gave the panel a chance to reach a shared understanding of the evidence before embarking on consensus building.
RAM usually results in a long list of scenarios [48]. To ease the use of its results it may be possible to develop user-friendly software, or to categorize the scenarios into indications and packages in the format of clinical guidelines. The RAM guidelines suggest that on average 150-200 scenarios can be rated in an hour once the panel members are used to the process [21]. In our experience a dedicated team of panel members reviewed and rated 294 scenarios in one long-day second-round meeting.

Dystocia of soft tissues
Arrest of dilatation: delayed long phase (more than 3 hours in nullipara and more than 1 hour in multipara) with appropriate contraction, no CPD (A)
Arrest of dilatation (more than 2 hours in nullipara & multipara) with appropriate contraction, no CPD (I)
Failure to progress: nulliparous, insufficient dilatation of cervix (< 1.2 cm/hr), appropriate contraction, no CPD (I)
Failure to progress: multiparous, insufficient dilatation of cervix (< 1.5 cm/hr), appropriate contraction, no CPD (A)

We used the weighted kappa values to compare the results of the two rounds of the study. It should be noted that the interpretation of kappa values here differs from other agreement studies. In a consensus development study, the investigators seek to improve agreement via changes in the views of the panel members. Hence high values of kappa are not sought. Still, its measurement is useful as it quantifies how the views have changed during the study.

Implications
Our results will help decision makers in identifying misconceptions on the benefits of cesarean section and focusing their efforts on changing the views of the clinicians. For example, the participants agreed that cesarean section was appropriate for many repeat cesarean and cephalic-pelvic disproportion scenarios, despite recent evidence and the recommendations of some evidence-based guidelines against the use of cesarean section in these scenarios [34,35,46]. While our study focused on developing criteria for wider use in the country, our approach may also help obstetricians and midwives working in hospitals to develop corporate strategies. It will require methodological support and group work, while it will help in generating shared views and understanding. In a way, the process will be similar to 'participatory guideline development' that has been shown to be effective in changing professional practice [50].
Other variables such as payment method, medico-legal issues and patient preferences may affect provider practice and views on conducting cesarean section [16]. According to previous studies, different factors cause high cesarean rates in Iran. They include factors that affect women's preferences (e.g. women's increasing level of education, employment and age at marriage, and the decreasing intended number of deliveries) [18][51][52][53], provider behavior and clinical factors (e.g. repeat cesarean, dystocia, CPD and physician preference) [51][52][53] and health system factors (e.g. health insurance coverage, delivery at private hospital) [51].
In Iran, as in many low and middle income countries, health system regulatory mechanisms are insufficient, and the fee-for-service payment provides further income if cesarean section is performed. It is also generally perceived that women prefer cesarean section over vaginal birth. In such a context it is hardly surprising to see the current high rates of cesarean section. There is also a growing culture of suing doctors for malpractice, and this may fuel cesarean section rates as a form of 'defensive' medical practice. Hence our panel members may have been more lenient towards cesarean section (e.g. advocating repeat cesarean) than might be observed in other countries.
The results of this study can contribute to the development of national guidelines for use in the country. Obviously the impact of implementing such guidelines will depend on many factors, including using effective implementation strategies [44]. The guidelines more likely affect provider and health system factors. They may also help clinicians in effective communication with pregnant women when they request cesarean section.
For certain scenarios, however, agreements on appropriateness may not result in reducing variation in practice. For example it may be easy to agree with cephalic-pelvic disproportion scenarios, but these are difficult to measure and implement in practice.

Conclusions
The RAM should be used more widely in low and middle income settings and in other areas of healthcare or other patient groups where controversies exist or the practice varies. It also has the added value of developing a level of ownership by the providers if they see that their peers and relevant stakeholders are adequately represented in the process. The results of this study can be used for developing national guidelines, conducting research to assess whether the criteria are followed in practice, and whether their application can curb the growing rate of cesarean section in all countries.

Funding
The study received funding from the Deputy of Research of Tehran University of Medical Sciences, contract number 86-04-27-6592.

Ethics approval
The study was approved by the Ethics Committee of the Tehran University of Medical Sciences.

Additional material
Additional file 1: Table S1: AGREE criteria for appraising the quality of clinical practice guidelines