Content validation of educational materials on maternal depression in Nigeria

Background This study describes the content validation of previously developed educational materials on maternal depression in Nigeria: a poster and a leaflet (each in English and Yoruba) and a song (Yoruba only).
Methods This cross-sectional study is part of a larger study on the training and supervision of primary health care workers. It used health professionals' judgement for content validation and maternal-child health clients' evaluation for face validation, both with the Suitability Assessment of Materials (SAM). Six bilingual professionals per material validated the English and Yoruba versions (the song has only a Yoruba version), and 50 clients evaluated each Yoruba material. The validity index was calculated by formula, and inter-rater agreement among the professionals' ratings was analyzed using the intra-class correlation coefficient (ICC). The ICC, the t test and Pearson correlation were used to compare the professionals' ratings with those of six randomly selected clients. Descriptive statistics and Fisher's exact test were used for the other statistical analyses, with SPSS version 25.
Results The mean age of the professionals was 44.3 ± 6.0 years for the poster, 39.8 ± 7.2 years for the leaflet and 43.8 ± 8.4 years for the song. The mean age of the maternal-child health clients was 30.7 ± 5.4 years for the poster, 31.3 ± 5.2 years for the leaflet and 29.0 ± 5.1 years for the song. The validity indices from the bilingual professionals' validation were 0.94 for the English leaflet and poster, 0.94 for the Yoruba leaflet and poster, and 1.00 for the Yoruba song. More than 80% of clients rated the suitability of each material as superior. On Fisher's exact test, there was no significant relationship between clients' sociodemographic characteristics and their ratings across the content, literacy demand and cultural appropriateness domains of the three materials. The inter-rater agreement among the professionals was excellent for the leaflet and song (ICC > 0.8) but weak for the poster (ICC < 0.6). There was no inter-rater agreement between the professionals' ratings and the randomly selected clients' ratings on any of the three Yoruba materials, although a negative linear correlation between the two sets of ratings was found for the leaflet. The t test found no statistically significant difference between the ratings of the professionals and the clients for the song only.
Conclusion This study shows the process of validation of the English and Yoruba versions of the educational materials. This process should be leveraged in the content validation of other maternal-child health education materials in Africa.

Professionals in the field of interest are recommended to participate in content validation, while end users are recommended to participate in face validation [2]. The agreement of ratings among the professionals is the strength of the validation index. Agreement can be assessed by quantifying consensus (content validity ratio) [3], by measuring proportional agreement (content validity index) [4], by measuring inter-rater agreement among experts (Cohen's kappa coefficient (k) or intra-class correlation) [5], or by using the Delphi technique [6]. Content validation has been found to be relevant in most studies on Information, Education and Communication (IEC) materials [7,8], and it is commonly conducted in studies from high-income countries [9].
In low- and middle-income countries (LMICs), print educational materials are widely used to address prevailing health conditions [10][11][12], but studies showing their development and validation processes are scarce. Although the use of targeted educational materials for health education is emphasized in the primary health care guideline in Nigeria [13], not all prevailing health conditions are addressed in routine primary health care health education. Maternal depression is one of them [14]. Maternal depression is a common condition which requires health education, and intense enlightenment through health communication has been recommended to address it in LMICs [15][16][17]. In line with this recommendation, and given the misconceptions about maternal depression that are common among the Yoruba community in Nigeria [18], there was a need to develop and validate materials which provide correct health education on maternal depression. So far, standard guidelines for the validation of educational materials are scarce, but guidelines on the validation of scales and research instruments are available [2]. Studies which use methodological research designs are also available. They detail the process of developing educational materials and validating them, with a focus on suitability [19][20][21], relevance [7], content and readability or understanding of audio-visual materials [22], or adequacy [8]. These studies have used various instruments, including the Suitability Assessment of Materials (SAM) [23], the Comprehensibility Assessment of Materials (CAM) [24], an adapted questionnaire [25] and the Patient Education Materials Assessment Tool (PEMAT) [22]. Studies from Nigeria often end the development of educational materials at the pretest stage; the materials are rarely validated [26].
In our study, we carried out content validation as found in literature [2,7], and we validated print and song materials on maternal depression. The primary goal is to show the content validation process, and to make validated health education materials on maternal depression available for health educators in Nigeria and elsewhere.

Study design and setting
This is a cross sectional study embedded in a parent study which implemented and evaluated training and supervision of health talk delivery on maternal depression. The parent study was carried out among primary health care workers in Ibadan, Nigeria. This study was set to validate educational materials used for the parent study.
This validation study took place in two of the five local government areas (LGAs) in Ibadan metropolis. Primary health clinics (PHCs) in the two LGAs were used. Primary health clinics offer outpatient services only; they do not offer laboratory testing, 24-h service, minor surgery, or admission. Each clinic serves up to 5000 people and acts as a referral service for mobile health posts.

The materials for validation
Three educational materials on maternal depression, a poster and a leaflet (Yoruba and English versions) and a song (Yoruba version only), were developed by researcher A. O and are shown in Figs. 1, 2, 3, 4, 5, 6 and 7. Figure 3 shows the researcher's name written on the back page. Written consent for publication was obtained for any identifying images used in this study. During the development of the materials, maternal-child health clients with fifth- to seventh-grade education pretested them and evaluated them in terms of understanding, clarity, and cultural appropriateness. The content of the materials was built on the health belief model [27], with a clear definition of maternal depression, risk factors (which address perceived susceptibility) and the consequences of maternal depression. The content validation process in the present study was guided by "The best practice for developing and validating scales for health, social and behavior research" [1,2].

Content of song 1
Every pregnant woman and nursing mother!!!! Listen to this instruction, an instruction from the health workers.
Depressed mood in pregnancy and after delivery carries a great risk for mother and child.
Fatigue, isolation, loss of interest in pleasurable things. It carries a great risk for mother and child.
Quickly see the health workers for help.

Content of song 2
May depression not disrupt my joy. I will not allow depression to disrupt my joy. It is dangerous for me and my child.
I will quickly see the health workers for help.

Characteristics of study participants and selection procedure
The study participants included health professionals and maternal child health service users/clients (nursing mothers and pregnant women). To be included as a professional participant, an individual had to have one of the following professional backgrounds: mental health, maternal-child health, public health nursing in mental health, health promotion, child health, or health communication. Professionals who were not bilingual in Yoruba and English were excluded. Eighteen professionals (six per material) were recruited through snowballing [8]. The recommended number of professionals for validation is > 5; the more professionals, the more robust the rating [1], but this depends on the availability of professionals in the field of interest. To be included as a maternal-child health user, a client had to be accessing routine care at the selected primary health clinics, be able to read Yoruba, and be available in the waiting area to attend the education sessions. Those who were in a hurry to leave or had a sick child were excluded.
A convenience sample of 50 clients per material was taken. This provided a power of 93% at a 95% confidence level with a 5% margin of error, assuming a 10% loss to follow-up, to estimate that 90 ± 20% of women would rate the materials as suitable [7].
In each of the two selected LGAs, the three clinics with the highest patient load according to the clinic records during the study period were purposively selected: clinics A, B and C from Ibadan North LGA and clinics X, Y and Z from Ibadan Northeast LGA. The clinic names were written on paper that was rolled into balls and placed inside a container. Three empty boxes were labelled 'poster', 'leaflet' and 'song'. Someone who was not part of the study was invited to randomly pick the first set (A, B, C) into each of the empty boxes. The same process was repeated for the second set (X, Y, Z). The 50 clients for each material (25 clients from each clinic) were selected randomly as follows: paper balls labelled 'yes' and 'no' (50 each) were placed in a box, and each eligible client at the clinics was asked to pick one. The clients who picked 'yes' each day and who consented to participate were recruited. This process was repeated on every clinic day until 25 clients per clinic (50 clients for each material) were recruited.
Another set of six professionals rated English leaflet and Yoruba leaflet. Another set of six professionals rated the Yoruba song (the song has no English version). The professionals were given copies of the print materials and the tool (Suitability Assessment of Materials) for 1 week. They used the SAM tool to score the English print educational materials on the six domains and submitted their ratings. To limit rating bias, 1 month after submission the professionals were given the Yoruba-translated SAM for 1 week to rate the Yoruba version of the print educational materials. The Yoruba maternal depression educational song was sent to the designated professionals on their mobile phones.
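The clinic allocation and the ballot-based recruitment described above can be sketched in code. This is only an illustration under stated assumptions, not the procedure as actually run: the function names, the seed, and the choice to draw ballots with replacement are hypothetical, since the text does not say whether a picked paper ball was returned to the box.

```python
import random

MATERIALS = ["poster", "leaflet", "song"]

def allocate_clinics(clinics, rng):
    """Randomly assign one clinic of an LGA to each material."""
    shuffled = list(clinics)
    rng.shuffle(shuffled)  # stands in for the blind pick into labelled boxes
    return dict(zip(MATERIALS, shuffled))

def recruit(eligible_clients, target, rng, n_yes=50, n_no=50):
    """Each eligible client draws a ball; 'yes' draws are recruited up to target."""
    box = ["yes"] * n_yes + ["no"] * n_no
    recruited = []
    for client in eligible_clients:
        if len(recruited) == target:
            break
        # Assumption: the ball is returned to the box after each draw.
        if rng.choice(box) == "yes":
            recruited.append(client)
    return recruited

rng = random.Random(2016)  # hypothetical seed, for reproducibility only
north = allocate_clinics(["A", "B", "C"], rng)
northeast = allocate_clinics(["X", "Y", "Z"], rng)
clients = [f"client-{i}" for i in range(200)]  # hypothetical attendance list
sample = recruit(clients, target=25, rng=rng)
print(north, northeast, len(sample))
```

Each material ends up with one clinic from each LGA, and recruitment at a clinic stops once 25 clients have been reached.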

Face validation of Yoruba version maternal depression educational materials by maternal-child health users
Face validity is the "degree that respondents or end users [or lay persons] judge that the items of an assessment instrument are appropriate to the targeted construct and assessment objectives" [1]. Three trained Research Assistants (RAs) were each assigned to one of the three Yoruba materials in Figs. 2, 5, 6, and 7. They were trained in consent taking and in the administration of the SAM tool using cognitive interviews. The RAs used the SAM to ask maternal-child health users questions and recorded their ratings and comments on the materials. The process lasted 1 week for the poster and the song, but up to 2 weeks for the leaflet because it has more content and pages.

Measurement and data processing
The primary outcome variables for this study are the validity index of the materials (poster, leaflet, and song) as rated by professionals, and the intra-class correlation (agreement) among the professionals' ratings. For face validation, the outcome is the rating of adequacy of the three educational materials by maternal child health service users, with the analysis focused on the association between the clients' sociodemographic characteristics and their ratings of content, literacy demand, and cultural appropriateness. The agreement (intra-class correlation) between the professionals' ratings and the clients' ratings was also assessed.
Validity index measurements The validity index was computed from the professionals' ratings of the educational materials on the 6 items of the SAM (content, literacy, graphics, layout, stimulation, and cultural appropriateness), as applicable to each material. The SAM originally has a rating of 2 for superior, 1 for adequate and 0 for not suitable. For the computation of the validity index, ratings of 1 and 2 were recoded as 1 = adequate, and a rating of 0 as 0 = not adequate [8]. There are two kinds of Content Validity Index (CVI) [29], the item-level index (I-CVI) and the scale-level index (S-CVI), calculated by the following formulas [28]: I-CVI = number of professionals who rated an item as adequate / total number of professionals; S-CVI/UA = number of items with an I-CVI of 1 / total number of items.
Face validation among maternal-child health clients Descriptive analysis was used to summarize the frequencies of the sociodemographic characteristics of the client participants and their ratings on the suitability assessment of the Yoruba version of the materials. The SAM score is rated as ≥ 70% (superior: 2), 40% to < 70% (adequate: 1) and < 40% (not suitable: 0). Fisher's exact test was used to assess the association between the sociodemographic characteristics of clients and their suitability ratings on content, literacy demand, and cultural appropriateness, with the significance level set at p = 0.05. According to [30], these three domains of the SAM (content, literacy demand, and cultural appropriateness) are regarded as the most important. If all other domains are scored superior but these three domains are not suitable, the materials are expected to be revised.
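The index formulas and SAM scoring bands above can be illustrated with a short sketch. The ratings below are hypothetical and are not the study data; the function names are illustrative, and the handling of scores that fall exactly on a band boundary is an assumption. S-CVI/Ave (the mean of the I-CVIs) is included because it appears in the abbreviations list.

```python
def to_binary(sam_rating):
    """Recode a SAM rating: 1 (adequate) or 2 (superior) -> 1, 0 -> 0."""
    return 1 if sam_rating >= 1 else 0

def item_cvi(ratings_for_item):
    """I-CVI: share of professionals who rated the item as adequate."""
    recoded = [to_binary(r) for r in ratings_for_item]
    return sum(recoded) / len(recoded)

def scale_cvi(ratings):
    """Return (I-CVIs, S-CVI/UA, S-CVI/Ave) for {item: [one rating per rater]}."""
    i_cvis = {item: item_cvi(scores) for item, scores in ratings.items()}
    n_items = len(i_cvis)
    s_cvi_ua = sum(1 for v in i_cvis.values() if v == 1.0) / n_items
    s_cvi_ave = sum(i_cvis.values()) / n_items
    return i_cvis, s_cvi_ua, s_cvi_ave

def sam_category(percent_score):
    """Map a SAM percentage score to its suitability band."""
    if percent_score >= 70:
        return "superior"
    if percent_score >= 40:
        return "adequate"
    return "not suitable"

# Hypothetical ratings from six professionals on six SAM items.
example = {
    "content": [2, 2, 1, 2, 2, 2],
    "literacy demand": [2, 1, 2, 2, 1, 2],
    "graphics": [2, 2, 2, 0, 2, 1],  # one rater scored this item 0
    "layout": [1, 2, 2, 2, 2, 2],
    "stimulation": [2, 2, 2, 2, 1, 2],
    "cultural appropriateness": [2, 2, 2, 2, 2, 2],
}
i_cvis, ua, ave = scale_cvi(example)
print(i_cvis["graphics"], ua, ave, sam_category(88))
```

In this made-up example one item falls short of universal agreement, so S-CVI/UA is 5/6 while S-CVI/Ave stays higher, which shows why the two scale-level indices can differ for the same ratings.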
Inter-rater agreement among professionals Inter-rater reliability analysis was carried out in SPSS using the intra-class correlation (ICC) among the ratings of the six professionals for each material (English and Yoruba versions). The agreement among professionals on each material is an indication of validity [25]. An ICC > 0.8 shows very strong agreement.
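For readers unfamiliar with the ICC, a minimal sketch of one common form follows. The text does not state which ICC model was selected in SPSS, so the two-way random-effects, absolute-agreement, single-measure form used here is an assumption, and the function name and example ratings are hypothetical.

```python
def icc_two_way_absolute(table):
    """ICC (two-way random, absolute agreement, single measure).

    table: one row per rated item, one column per rater.
    Raises ZeroDivisionError if every rating in the table is identical.
    """
    n = len(table)     # number of rated items
    k = len(table[0])  # number of raters
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical ratings: three raters who agree exactly on three items.
perfect = [[2, 2, 2], [1, 1, 1], [0, 0, 0]]
print(icc_two_way_absolute(perfect))
```

The coefficient can be negative when raters disagree more than chance would suggest, which is how a negative ICC such as those reported between professionals and clients can arise.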
Inter-rater agreement among the professionals and clients' ratings Inter-rater reliability using the intra-class correlation (ICC) was carried out in SPSS 25 on the ratings of each Yoruba material by the professionals and the clients. Six clients for each material were randomly selected from the clients' data. This comparison is possible because both groups used the SAM tool to rate the materials. An ICC > 0.8 shows very strong agreement. In the study of [31], inter-rater reliability using the intra-class coefficient was used to check the agreement between parent-teacher pairs in their ratings of pupils. The present study also used the t test (p < 0.05 signifies a statistically significant difference) and Pearson correlation (r < 0.5 is a weak correlation, while p > 0.05 signifies no significant linear relationship) to analyze the agreement between professionals' and clients' ratings.
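The two additional comparisons can be sketched as follows. This is a simplified illustration with hypothetical scores, not a reproduction of the SPSS analysis: it assumes an equal-variance independent-samples t test, and it returns only the test statistic and degrees of freedom, leaving the p value to statistical software or a t table.

```python
from math import sqrt
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def independent_t(x, y):
    """Student's t statistic (equal variances assumed) and degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))
    return t, nx + ny - 2

# Hypothetical SAM percentage scores for six professionals and six clients.
professionals = [78, 82, 75, 80, 85, 79]
clients = [92, 88, 95, 90, 86, 91]
r = pearson_r(professionals, clients)
t, df = independent_t(professionals, clients)
print(round(r, 3), round(t, 3), df)
```

With six raters per group the test has 10 degrees of freedom, so a two-sided difference is significant at the 0.05 level when the absolute t value exceeds about 2.23.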

Ethical consideration
This study is part of the parent study "Effect of training and supervision of maternal depression inclusive health education delivery among primary health care workers in Ibadan, Nigeria", which received ethical review approval from the Ministry of Health, Oyo State, Nigeria (ref. no AD 13/479/2016). Written consent was obtained from the professionals and the maternal child health service users who participated in the study. The consent form contained information about the study and the voluntary nature of participation. It also assured participants of confidentiality and data protection. No names of individuals were collected; codes were used as identifiers on the measuring instruments.

Socio demographic characteristics of maternal child health clients who evaluated educational materials
Table 2 shows the sociodemographic characteristics of the maternal child health clients who participated in the suitability assessment of the poster, song, and leaflet on maternal depression. Their mean age was 30.7 ± 5.4 years for the poster, 31.3 ± 5.2 years for the leaflet and 29.0 ± 5.1 years for the song. Fifty participants rated each material. Pregnant women and nursing mothers were distributed equally across the 3 materials. All participants for the poster were of the Yoruba tribe, while other tribes were represented for the song and leaflet, although the majority were Yoruba (88 and 86%, respectively). The majority of the participants had attained a post-secondary school educational level (post grade 9): 38 (76%) for the poster, 30 (60%) for the leaflet and 28 (56%) for the song.

Sociodemographic characteristics of professionals for content validation of maternal depression education materials
The rating of adequacy of the content, literacy demand and cultural appropriateness of the Yoruba version educational materials among the maternal-child health clients
Table 3 shows the Fisher's exact analysis of the sociodemographic background of the maternal-child health client participants against the domains of content, literacy demand and cultural appropriateness. Most clients rated each material as superior: poster 44 (88%), leaflet 45 (90%), and song 50 (100%). There is no significant relationship between the domains of content, literacy demand and cultural appropriateness and the sociodemographic characteristics of the participants across the poster, leaflet, and song. Regardless of their sociodemographic characteristics, the clients rated the materials as superior.
In the comment column of the SAM, clients made no suggestions for improving the materials, but they expressed what they liked about the materials as follows. "The leaflet needs patience to read it. It has a
Table 4 shows the Item-level Content Validity Index (I-CVI) for the poster, leaflet, and song from the different groups of six bilingual professionals who rated each material. The professionals' ratings of the English and Yoruba versions of the leaflet and poster are the same, even though the ratings were carried out 1 month apart. The song has the highest I-CVI of 1 (excellent content validity). The Scale-level Content Validity Index/Universal Agreement (S-CVI/UA) among the professionals is 0.83 (> 0.8) for all the materials (excellent content validity). The professionals rated the suitability of all the materials as superior, and on the scale of 0-10 (as stated on the SAM tool) the materials received an average rating of 8 for the poster, 9 for the song and 8 for the leaflet.

Interrater agreement among mental health professionals on the English and Yoruba versions of print educational materials and Yoruba version of song using intra-class correlation (ICC)
The mental health professionals rated the English and Yoruba versions of the poster and leaflet the same; hence the inter-rater agreement for the English version of each print material is the same as for the Yoruba version. Table 5 shows that the agreement among the raters for the three materials is statistically significant (p < 0.01). All the raters had an inter-rater agreement of ICC above 0.75 on the adequacy of the leaflet, while there was weak inter-rater agreement on the poster. There was no absolute agreement among all the raters on the adequacy of the poster: the agreement between experts 2 and 5, and between expert 6 and experts 1, 2, 3 and 4, had an intra-class coefficient < 0.6. For the song and leaflet, there was no disagreement in the suitability ratings of the professionals; the agreement across all the professionals is very strong, with ICC ≥ 0.8.
Table 6 shows the agreement between the ratings of six randomly selected clients and the ratings of the six fixed professionals for each of the Yoruba poster, leaflet, and song. It shows no inter-rater agreement between the professionals and clients on the poster, with an intra-class correlation of 0.30 and a p value > 0.05. On the leaflet and song, there is also no agreement in their ratings: the intra-class correlation coefficient is negative and not significant (p > 0.05).
Table 7 shows that, on the independent t test, there is a statistically significant difference between the ratings of professionals and clients on the poster and leaflet, while no significant difference was found in their ratings of the song. This means that professionals and clients rated the song in the same way.
Table 8 shows the Pearson correlation between the ratings of the professionals and the ratings of the clients. On the poster, there is no significant relationship between their ratings (p = 0.174). On the leaflet, there is a significant relationship between the ratings of the professionals and the clients (p < 0.05), but the correlation is a strong negative linear relationship (r = −0.837). On the song, there is no significant relationship between the ratings of the two groups (p = 0.679).

Discussion
This study validated the English versions of the previously developed poster and leaflet, and the Yoruba versions of the poster, leaflet, and song on maternal depression. It also shows the content validation process. All these materials have an excellent validity index (> 0.8) based on the professionals' ratings. Likewise, based on the clients' evaluation, > 80% rated each material's suitability as superior using the same SAM tool as the professionals. On Fisher's exact analysis, sociodemographic characteristics did not make clients' ratings differ on the SAM domains of content, literacy demand and cultural appropriateness. These are the domains which the author of the SAM regards as the most important in clients' evaluation.
We did not come across any other study in Nigeria that validated educational materials such as ours. Meanwhile, a systematic review on the effect of print educational materials on professional practice and health outcomes found only one study that validated print educational material in low- and middle-income countries [10]. In the developed countries, many studies are available, and one of them was used as a guide for our study. That study was conducted in Brazil and combined the development and validation of educational material on nutrition in pregnancy in one study [7]. Our study, however, covers only the validation process, not the development of the educational materials. The authors of the nutrition in pregnancy booklet used SAM ratings and found a validity index > 0.8. Other studies that used the SAM and reported findings consistent with ours include a study in Portugal on adolescents [8] and one in Washington [32]; both also achieved an excellent validity index (> 0.8). The process of content validation in this study is consistent with other studies [7,8], but differs from two other studies: one used a four-point Likert scale rather than the SAM tool [9], and the other used the Delphi technique among experts or judges [25]. These two studies used the binomial test and kappa to determine the agreement in the experts' ratings. In all these studies, experts, judges or professionals were engaged in the validation process because they understand the phenomenon of focus [2]. In our own study, we engaged professionals who are stakeholders in maternal depression education: child health, health promotion, mental health, public health in mental health, maternal-child health, and health communication professionals.
Table 5 Inter-rater agreement among the professionals on the rating of English and Yoruba poster and leaflet, and Yoruba version of song using intra-class coefficient
The agreement among these experts is the main thrust of content validation [2]. Our study measured inter-rater agreement with the Scale-level Content Validity Index/Universal Agreement (S-CVI/UA). The outcome is > 0.8 for each material, which signifies excellent agreement. This process is consistent with [7] in Brazil on the nutrition in pregnancy booklet. The Brazilian study likewise used the S-CVI/UA and found excellent agreement (> 0.8). We also assessed the inter-rater reliability of the professionals' ratings using the intra-class coefficient (ICC), which shows that their agreement on the adequacy of the poster is weak (ICC < 0.6). The ICC for the leaflet and song is excellent; the agreement of ratings among the professionals is strong (ICC > 0.8). The same inter-rater reliability measure was used in [33] in Brazil on educational material for the prevention of metabolic syndrome among adolescents. That study considered inter-rater agreement among the professionals, as ours did, and achieved excellent agreement (> 0.8). It is not enough to use the S-CVI/UA alone; inter-rater reliability shows clearly which professionals agree with which in their ratings. The evaluation of materials by the end users is as important as that of the professionals. If the target users find the materials inadequate during face validation, the process of expert validation could be wasted [9,25], because the materials are made for the users. In our study, the professionals and the clients used the same instrument to rate the educational materials, but their scores show that clients rated the materials higher than the professionals did. The proportion of ratings among the users was considered as face validation. However, the possibility of agreement between the ratings of the professionals and the clients was also examined. Six of the fifty clients were randomly selected for each material for comparison with the six professionals for that material.
Examining the agreement between professionals' and clients' ratings is rare in the field of educational material development, but our study attempted it. On the t test, the comparison of the professionals' and clients' ratings shows a statistically significant difference for the poster and leaflet, but the difference for the song is not statistically significant. On inter-rater reliability, the intra-class coefficient does not show any agreement between the two groups of raters for any of the materials. Pearson correlation analysis shows a negative linear relationship only for the ratings of the leaflet; it shows no relationship for the ratings of the poster and song. This may imply that ratings made with the translated SAM tool can differ when two groups of raters with different sociodemographic backgrounds use it to rate educational materials.
Similarly, a lack of agreement between target users' and professionals' opinions was found in [25]. The two groups had different opinions regarding educational materials for orthognathic surgery, and the authors complied with the users' opinion, although they noted that oversimplification may have occurred in the clients' ratings. Such oversimplification could also explain the much higher ratings by clients than by professionals in our own study, but our study relies on the judgement of the professionals (the more restrictive rating of the materials) because of the sensitivity of the condition, maternal depression. However, another study used ELAN, a German tool for rating children's vocabulary. A teacher group and a parent group with similar educational backgrounds rated children's vocabulary, and agreement was found in their ratings on inter-rater reliability [31]. A high Pearson correlation was also found between their ratings. The ELAN study sheds light on the possibility that sociodemographic characteristics (educational background) play a major role in agreement or disagreement between the ratings of two groups.
The highest qualification among the professionals in our study was FWACP (Fellow of the West African College of Physicians) and the lowest was Registered Nurse. All the professionals had tertiary education, and they have technical know-how in their fields of practice. There is less disparity in the reliability of their ratings across the three materials. These findings agree with studies that used SAM ratings with nurses with a minimum of 15 years' experience in women's health or a related field as professional raters [7], and with another study whose professional raters had higher qualifications, such as professors and PhD holders [8]. Their tertiary educational background and expertise in the field of interest contributed to the similarities in their ratings. The minimum educational qualification of the client raters in our study is 9th grade, which could be the reason for the disparity between the two groups. However, the clients found the materials acceptable.

Limitation and strength of this study
Our study has some limitations. All print educational materials are expected to meet a fifth-grade readability level [34][35][36]. The majority (76%) of our clients had post-secondary education (post ninth grade), because more women tend to achieve schooling beyond 5th grade in the study area. However, the Yoruba-translated versions of the materials cater for people with low literacy, and the development process of the materials, which is not reported in this study, involved maternal-child health clients with fifth- to seventh-grade education. A strength of this study is that it showcases the process of validating educational materials available in a local language. It also provides evidence that comparing the ratings of people with different educational backgrounds (clients and professionals) may be of little value, because their ratings will most likely not agree.

Conclusion
Our study found that the poster, leaflet, and song educational materials on maternal depression have excellent content validity among professionals and are acceptable to clients. These materials can therefore be recommended for use in maternal child health clinics for educating clients. Researchers intending to develop educational materials in Africa can leverage this validation process for developing locally relevant materials.

Abbreviations
SAM: Suitability Assessment of Materials; I-CVI: Item level-Content Validity Index; S-CVI: Scale level-Content Validity Index; S-CVI/Ave: Scale level-Content Validity Index/Average; S-CVI/UA: Scale level-Content Validity Index/Universal Agreement.