We applied the CRC method to three cross-sectional data sources covering the years 2008 to 2013: the national birth registrations, the Ministry of Public Health (MOPH) standard health databases, and hospital-based survey data. The study was approved after full review by the Committee on Human Rights Related to Research Involving Human Subjects of the Faculty of Medicine Ramathibodi Hospital (ID 12–55-01) and the Department of Health, the Ministry of Public Health (ID 027). All data owners officially granted access to their databases. Pregnant women were included in the study if they were aged 15 to 19 years at delivery. The outcomes of interest were live births and non-live births. A live birth was defined as the complete expulsion or extraction of a product of conception from the mother after 22 weeks of gestation with evidence of life or breathing. Non-live births included miscarriage, induced abortion, stillbirth, and other abnormal pregnancies, defined as follows. Abortion, which covered both induced abortion and miscarriage, was defined as any delivery that occurred before 22 completed weeks of gestation. Stillbirth was defined as fetal death after 22 completed weeks of gestation. Abnormal pregnancy included ectopic pregnancy, molar pregnancy, and others.
Data sources
Three data sources were used to estimate the adolescent pregnancy rate. The first, the National Birth Registration (Source1), is operated by the Bureau of Registration Administration (BRA), the Ministry of Interior. Birth registration is compulsory for all live newborns who are Thai citizens and born in Thailand. The second data source was the MOPH Standard Health Databases (Source2), which included hospital-based data from the hospitals under the Thailand Universal Healthcare Coverage Scheme. A limitation of this data source is that it covered only about 80% of all hospitals across the country. To overcome the shortcomings of Source1 and Source2, we performed a nationwide cross-sectional hospital-based survey (Source3) as the third data source. Pregnancy data from 1321 hospitals providing obstetrics and gynecology services between January 1, 2008 and December 31, 2013 were retrieved. The sample size for the hospital-based survey was calculated for the estimation of a prevalence, which yielded an estimated sample size of 29,213 cases. Stratified cluster random sampling without replacement was applied to select sample hospitals across the country, with region as the stratum and province as the cluster. All data collection processes were managed by the Data Management Unit (DMU) at the Section of Clinical Epidemiology & Biostatistics, Faculty of Medicine Ramathibodi Hospital, Mahidol University.
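The sampling step can be illustrated with a minimal sketch, assuming that a fixed number of provinces (clusters) is drawn without replacement within each region (stratum). The region and province labels, the number of clusters per stratum, and the function name `sample_clusters` are hypothetical. The text does not report these details, so this is not the survey's actual sampling frame.

```python
import random


def sample_clusters(strata, n_per_stratum, seed=0):
    """Draw clusters (provinces) without replacement within each stratum (region).

    strata: dict mapping a region label to a list of province labels.
    n_per_stratum: number of provinces to draw per region (assumed, not from the study).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    selected = {}
    for region, provinces in strata.items():
        k = min(n_per_stratum, len(provinces))
        # random.sample draws k distinct items, i.e. without replacement
        selected[region] = rng.sample(provinces, k)
    return selected


# Hypothetical sampling frame, for illustration only
strata = {
    "Region A": ["Province A1", "Province A2", "Province A3", "Province A4"],
    "Region B": ["Province B1", "Province B2", "Province B3"],
}
chosen = sample_clusters(strata, n_per_stratum=2, seed=42)
```

Hospitals in the selected provinces would then form the survey sample. How hospitals were chosen within a province is not described in the text and is left out of the sketch.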
Data management
Data were checked for year of delivery and age at delivery. Observations were excluded if they were duplicated records of the same person and pregnancy episode, defined as pregnancies of the same person that occurred less than 24 weeks after the previous gestation. To comply with data privacy regulations, personally identifiable data in all three data sources were de-identified by encryption with the message-digest algorithm 5 (MD5). The encrypted Citizen Identification Number (CID) combined with the date of delivery was used as a unique key for merging the three databases.
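The de-duplication and linkage steps can be sketched as follows. This is a minimal illustration under stated assumptions: records are simplified to (CID, delivery date) pairs, the 24-week rule is applied to the gap between delivery dates, and the key is the MD5 hex digest of the CID joined to the ISO date. The function names and the key layout are hypothetical, and MD5 is used here as a one-way hash from Python's standard `hashlib` module.

```python
import hashlib
from datetime import date, timedelta


def drop_repeat_episodes(records, min_gap_weeks=24):
    """Keep one record per pregnancy episode.

    A record is dropped when the same person already has a kept record
    less than min_gap_weeks earlier (assumption: gap measured on delivery dates).
    """
    kept, last_kept = [], {}
    for cid, delivery in sorted(records):
        prev = last_kept.get(cid)
        if prev is None or delivery - prev >= timedelta(weeks=min_gap_weeks):
            kept.append((cid, delivery))
            last_kept[cid] = delivery
    return kept


def linkage_key(cid, delivery):
    """De-identified key: MD5 digest of the CID plus the delivery date."""
    hashed_cid = hashlib.md5(cid.encode("utf-8")).hexdigest()
    return f"{hashed_cid}|{delivery.isoformat()}"


def capture_histories(source1, source2, source3):
    """Map each key to a (in Source1, in Source2, in Source3) tuple of 0/1 flags."""
    key_sets = [{linkage_key(c, d) for c, d in src} for src in (source1, source2, source3)]
    all_keys = set().union(*key_sets)
    return {k: tuple(int(k in s) for s in key_sets) for k in all_keys}


# Hypothetical records, for illustration only
d1, d2, d3 = date(2010, 1, 1), date(2010, 6, 1), date(2011, 2, 1)
histories = capture_histories(
    [("A", d1)],
    [("A", d1), ("B", d2)],
    [("B", d2), ("C", d3)],
)
```

Counting how many keys share each capture history gives the cells of the contingency table used in the CRC analysis.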
Statistical analysis
The numbers of pregnant women were described by data source and year of delivery. A proportional Venn diagram of the three data sources and a contingency table by data source and year of delivery were constructed. For the CRC analysis, only data from public hospitals under the Office of Permanent Secretary (OPS) were selected from Source1, Source2, and Source3, based on the probability of pregnant women being identified in each data source. Pregnancy records were then stratified into live birth and non-live birth groups according to pregnancy outcome. Women with multiple gestations were counted once per pregnancy episode. In cases of multiple gestations with mixed birth outcomes (live birth plus stillbirth), the woman was assigned only to the non-live birth group to avoid double counting.
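The counting rule for multiple gestations can be expressed as a short sketch. It assumes each pregnancy episode carries a list of per-fetus outcomes. The outcome labels and the function name `classify_episode` are hypothetical.

```python
from collections import Counter


def classify_episode(fetal_outcomes):
    """Assign one pregnancy episode to exactly one group.

    The episode counts as 'live' only if every fetus was a live birth;
    any mixed or non-live outcome places it in the 'non-live' group.
    """
    if not fetal_outcomes:
        raise ValueError("an episode needs at least one recorded outcome")
    if all(outcome == "live birth" for outcome in fetal_outcomes):
        return "live"
    return "non-live"


# Hypothetical episodes: a singleton, twins with mixed outcomes, and an abortion
episodes = [["live birth"], ["live birth", "stillbirth"], ["abortion"]]
group_counts = Counter(classify_episode(e) for e in episodes)
```

Each episode contributes one count to one group, so twins with one live birth and one stillbirth add a single non-live birth pregnancy and no live birth pregnancy.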
For the live birth group, the CRC analysis was performed using all three data sources. The data were prepared as aggregated counts of pregnancies in a 2×2×2×6 contingency table. The first three variables indicated presence in Source1 (Yes/No), Source2 (Yes/No), and Source3 (Yes/No), and the last variable indicated the year, from 2008 to 2013. The CRC analysis used Poisson regression with a log link function. The regression models were built from combinations of main effects and two-way interactions between the data sources. Year of delivery and the interactions between year of delivery and data sources were also included in the models. The performance of each model was assessed and compared using the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The most parsimonious model was then used to predict the number of pregnant women who were not identified by any of Source1, Source2, and Source3. The total number of pregnant women was then calculated by adding the predicted number to the total observed number of pregnancies.
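To make the prediction of the unobserved cell concrete, the sketch below uses one special case rather than the fitted Poisson models themselves: a single year, with all three two-way source interactions and no three-way interaction. Under that model the fitted counts equal the observed counts in the seven observed cells, and the missing cell has the closed form n000 = (n111 · n100 · n010 · n001) / (n110 · n101 · n011). The counts and the function name are hypothetical. Year effects and the AIC/BIC model selection described in the text are not reproduced.

```python
def missing_cell_three_sources(counts):
    """Estimate the number of women missed by all three sources.

    counts: dict keyed by (in Source1, in Source2, in Source3) with observed counts.
    Assumes a log-linear model with all two-way interactions and no three-way term.
    """
    numerator = (counts[(1, 1, 1)] * counts[(1, 0, 0)]
                 * counts[(0, 1, 0)] * counts[(0, 0, 1)])
    denominator = counts[(1, 1, 0)] * counts[(1, 0, 1)] * counts[(0, 1, 1)]
    if denominator == 0:
        raise ValueError("a two-source overlap cell is zero; estimate undefined")
    return numerator / denominator


# Hypothetical counts for one year, for illustration only
observed = {
    (1, 1, 1): 60, (1, 1, 0): 30, (1, 0, 1): 20, (0, 1, 1): 10,
    (1, 0, 0): 40, (0, 1, 0): 25, (0, 0, 1): 15,
}
n_missing = missing_cell_three_sources(observed)   # 150.0
n_total = sum(observed.values()) + n_missing       # 200 observed + 150 missed = 350.0
```

Simpler models that drop one or more interactions generally need iterative fitting, which is what a Poisson regression routine provides.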
For the non-live birth group, only data from Source2 and Source3 were used, because non-live births could not appear in Source1, which registers live births only. A two-source CRC was therefore performed to estimate the number of missing cases and hence the total number of non-live birth pregnancies.
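The text does not name the two-source estimator. As a minimal sketch, assuming the two sources capture pregnancies independently, the classical Lincoln-Petersen estimate gives the number missed by both sources as n00 = n10 · n01 / n11. The counts and the function name below are hypothetical.

```python
def missing_cell_two_sources(both, only_source2, only_source3):
    """Lincoln-Petersen estimate of cases missed by both sources.

    Assumes independence between the two sources (an assumption of this sketch).
    """
    if both == 0:
        raise ValueError("no overlap between sources; estimate undefined")
    return only_source2 * only_source3 / both


# Hypothetical counts, for illustration only
both, only_s2, only_s3 = 50, 30, 20
nonlive_missing = missing_cell_two_sources(both, only_s2, only_s3)  # 12.0
nonlive_total = both + only_s2 + only_s3 + nonlive_missing          # 112.0
```

With only two sources, dependence between them cannot be estimated from the data, so this estimate rests entirely on the independence assumption.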
The adolescent pregnancy rate was estimated by dividing the combined estimated total number of pregnant women in the live birth and non-live birth groups by the midyear population of women aged 15–19 years, which is reported annually by the BPS in the Thailand public health statistics [20]. All statistical analyses were performed using STATA version 14.0 [21].
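The rate calculation can be sketched as below. The scaling to a rate per 1,000 women is an assumption of this sketch, since the text gives only the ratio. All input numbers and the function name are hypothetical.

```python
def adolescent_pregnancy_rate(live_total, nonlive_total, midyear_women, per=1000):
    """Estimated pregnancies per `per` women aged 15-19 (scaling is assumed)."""
    if midyear_women <= 0:
        raise ValueError("midyear population must be positive")
    return (live_total + nonlive_total) / midyear_women * per


# Hypothetical inputs, for illustration only
rate = adolescent_pregnancy_rate(live_total=350, nonlive_total=112, midyear_women=10_000)
```

In the hypothetical example, 462 estimated pregnancies among 10,000 women correspond to about 46.2 pregnancies per 1,000 women.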