We examined the unique effect of individual preparation followed by collaborative learning, compared with collaborative learning alone and with individual learning. To this end, we assigned participants to one of three conditions based on learning activity: individual preparation for collaborative learning (IP), collaborative learning alone (C), and individual learning (I). The dependent measure was the score on the final test, which consisted of comprehension and transfer questions.
Methods
Participants
Before the experiment, we conducted a pilot test with 24 participants, which yielded an effect size of ηp2 = 0.13. Based on this estimate, we performed an a priori power analysis in G*Power with an effect size of ηp2 = 0.13, an alpha of 0.05, and a power of 0.90. The analysis recommended 87 participants (29 per condition), so we aimed for at least 29 participants per condition. We recruited 88 undergraduate students at a university in Seoul, Korea, who participated in exchange for course credit. The participants (50 men, 38 women) were of Asian descent, specifically Korean, because the learning materials were written in Korean. They were recruited through the online recruiting system of the Department of Psychology. The participants provided informed consent using a procedure approved by the Institutional Review Board (IRB No.1804/003–005) of the university. Five participants with prior knowledge were excluded from the analysis. Therefore, data from 83 participants (mean age = 20.30 years; SD = 2.05) were analyzed.
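G*Power's ANOVA procedures take Cohen's f as the effect size, so a pilot estimate expressed as ηp2 is typically converted first. The sketch below shows that standard conversion and the per-condition allocation, assuming the usual one-way fixed-effects setup. The function name is illustrative, and the code does not recompute the required total of 87, which depends on G*Power's noncentral F calculation.

```python
import math

def eta_p2_to_cohens_f(eta_p2: float) -> float:
    """Convert partial eta squared to Cohen's f: f = sqrt(eta / (1 - eta))."""
    if not 0.0 <= eta_p2 < 1.0:
        raise ValueError("partial eta squared must be in [0, 1)")
    return math.sqrt(eta_p2 / (1.0 - eta_p2))

f = eta_p2_to_cohens_f(0.13)      # pilot estimate reported above
n_total, n_conditions = 87, 3     # total N as reported from G*Power
per_condition = math.ceil(n_total / n_conditions)

print(f"Cohen's f = {f:.3f}")               # Cohen's f = 0.387
print(f"per condition = {per_condition}")   # per condition = 29
```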
Each participant was randomly assigned to one of the three conditions: IP (n = 29), C (n = 30), and I (n = 24). Participants under the IP and C conditions were randomly assigned to groups of three or four. In contrast, participants under the I condition worked individually.
Learning material
Participants studied the topic “criminal procedure code and the accusation and charge” using seven pages of written material used in prior research [21]. We expected this topic to minimize the influence of prior knowledge on the final test, because the university that the participants attended does not offer law courses for undergraduate students. Nevertheless, some participants could have been familiar with the topic, which might influence the scores [22, 23]. We therefore administered a background knowledge survey before the experiment, using a 7-point Likert scale ranging from 1 (having no knowledge) to 7 (having expert knowledge). Five participants who reported a score higher than 4, indicating that they had previously studied the topic, were excluded from the final analysis.
Learning activities
The participants under all three conditions drew a concept map that included a summary of the learning material and three or more questions about the content. The procedure for each condition was as follows. Under the IP condition, participants first worked on a concept map individually for 9 min. Groups of three or four participants then collaborated for 11 min to draw a joint concept map based on each member’s individual map. Under the C condition, groups of three or four participants collaborated for the full 20 min to create a concept map from the beginning. Lastly, under the I condition, participants independently drew a concept map for 20 min.
Test questions
We used a total of 10 questions from Lim and Park [21], consisting of six comprehension questions and four transfer questions directly related to the learning material. The total score was 40 points, with 22 points allocated to the six comprehension questions and 18 points to the four transfer questions. Appendix 1 provides sample questions of each type.
The following example illustrates the predetermined partial points and scoring manual for comprehension questions. One comprehension question from Appendix 1 asked participants to explain who is entitled to file a complaint. This question was worth 3 points, with partial points assigned as follows: (1 point) “no one to make the accusation,” (1 point) “prosecutors shall designate the person with the right to file a complaint,” and (1 point) “within 10 days upon request by stakeholders.” Raters awarded each predetermined partial point whenever the response included the corresponding keywords.
In contrast, the four transfer questions required participants to engage in deeper thinking about the learning material. These questions were also open-ended, and scoring considered whether each response included the essential components aligned with the learning objectives, according to specific criteria predetermined as part of the scoring process.
The following example illustrates the predetermined partial points and scoring manual for transfer questions. One transfer question from Appendix 1 was worth 5 points, with partial points assigned as follows: (1 point) whether participants considered whether the victim (V) was underage, (1 point) whether participants understood that sexual crimes are not offenses subject to complaint, (1 point) whether participants recognized that accusation by the victim is not required for prosecution, (1 point) whether participants recognized that accusation by the legal representative is not required for prosecution, and (1 point) whether the conclusion “the court of appeals will reject D’s claim” was derived. Raters referred to this manual to determine whether each element was present in the responses and assigned scores accordingly.
To ensure scoring consistency and reliability, raters were also trained before scoring the participants’ responses. During training, raters reviewed manuals similar to those described above and practiced scoring using five sample responses. They discussed discrepancies to ensure a shared understanding of the criteria and alignment in their evaluations.
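The keyword-based partial-credit rule described above for comprehension questions can be illustrated with a short sketch. The rubric phrases are taken from the example question; the function name, the shortened keyword phrases, and the sample answer are hypothetical. The actual responses were written in Korean and scored by trained human raters, so plain substring matching is only a rough approximation of that judgment.

```python
# Hypothetical rubric: (keyword phrase, points) pairs based on the example
# comprehension question. Phrases are shortened for matching; this is an
# illustration, not the raters' actual scoring manual.
RUBRIC = [
    ("no one to make the accusation", 1),
    ("prosecutors shall designate", 1),
    ("within 10 days", 1),
]

def score_response(response: str, rubric=RUBRIC) -> int:
    """Sum the partial points whose keyword phrase appears in the response."""
    text = response.lower()
    return sum(points for phrase, points in rubric if phrase in text)

answer = ("If there is no one to make the accusation, prosecutors shall "
          "designate a person who may file the complaint.")
print(score_response(answer))  # 2 of 3 points: the 10-day limit is missing
```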
Procedure
Experiment 1 was conducted in a laboratory. Participants first completed the background knowledge survey. Afterward, they individually studied the learning material for 10 min. They then spent 20 min summarizing the content and creating at least three questions related to the subject on a concept map, according to their assigned condition. The experimenter provided no guidance on the collaboration process in order to minimize potential influence. Participants were allowed to review the learning material while creating concept maps. Finally, they took the final test for 15 min without access to the learning material. Figure 1 depicts the procedure of Experiment 1.

Figure 1. Detailed procedures of learning activities for each condition in Experiment 1
Analysis
We conducted an ANOVA followed by Tukey’s HSD test to determine the learning effect of each condition. The scores used for analysis were rated by a single rater, the first author, who specializes in educational psychology. To ensure scoring reliability, a second rater, a practicing lawyer in the Republic of Korea, independently evaluated over 50% of the total responses (42 responses). We then computed the intraclass correlation coefficient (ICC) between the scores of the two raters. The ICC was 0.93, indicating that the grading of the first rater was sufficiently reliable. Differences were considered statistically significant at p < 0.05. Effect sizes for the ANOVAs were reported as partial eta squared (ηp2).
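The inter-rater agreement check can be sketched as follows. The text does not state which ICC form was used, so this example assumes ICC(2,1) (two-way random effects, absolute agreement, single rater) purely for illustration; the function name and the sample scores are hypothetical and are not the study's data.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a list of rows; each row holds one response's scores
    from every rater (two-way random effects, absolute agreement)."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA decomposition: responses (rows), raters (columns), error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical scores (out of 40) from two raters on five responses.
scores = [[31, 30], [22, 24], [35, 35], [18, 17], [27, 29]]
print(round(icc_2_1(scores), 2))
```

Other ICC forms (for example, consistency rather than absolute agreement) use the same mean squares with a different denominator, so the reported 0.93 cannot be attributed to this particular form without further information.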
Results and discussion
Table 1 displays the means and standard deviations of the scores. The ANOVA revealed a statistically significant difference among the three conditions (F(2, 80) = 8.98, p < 0.001, ηp2 = 0.18). Tukey’s HSD showed significant differences between the IP and I conditions (p < 0.001) and between the IP and C conditions (p = 0.033), but no significant difference between the C and I conditions (p = 0.180). Therefore, participants in the IP condition outperformed those in the C and I conditions.
To determine the source of the difference in the total score, we conducted separate ANOVAs on the comprehension and transfer scores. The results indicated significant differences among the three conditions for both measures (comprehension: F(2, 80) = 7.03, p = 0.001, ηp2 = 0.15; transfer: F(2, 80) = 3.35, p = 0.040, ηp2 = 0.08). Therefore, we confirmed that individual preparation for collaborative learning leads to better learning outcomes than either collaborative or individual learning.
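As a consistency check, the reported effect sizes can be recovered from the F statistics and their degrees of freedom using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the three reported tests; the function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F values reported above, each with df = (2, 80).
reported = {"total": 8.98, "comprehension": 7.03, "transfer": 3.35}
for name, f_value in reported.items():
    print(name, round(partial_eta_squared(f_value, 2, 80), 2))
# total 0.18
# comprehension 0.15
# transfer 0.08
```

All three values match the reported ηp2, and the error degrees of freedom (80) equal the analyzed sample of 83 minus the three conditions.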
Since our results, like those of previous studies, indicate that individual preparation for collaborative learning is an effective learning method, this approach could potentially be applied to education for health professionals. However, because Experiment 1 used only learning material on a law topic with students from various disciplines, an experiment specifically targeting medical and dental students with material relevant to medical education is needed. Therefore, Experiment 2 aimed to replicate the results with medical and dental students, using learning material and a task drawn from health professions education.