
Researching Data Modeling with Young Learners


When you think of teaching data analysis and data science, what might come to mind is a group of high school or college students working on statistical analysis problems. However, at the New York Hall of Science (NYSCI), teaching data science looks very different: 5- to 8-year-olds and their caregivers exploring the museum through the lens of exhibit designers, asking questions, and collecting, organizing, analyzing, and interpreting data.

Data science and statistical reasoning are now considered essential at the middle school, high school, and college levels, so much so that the Next Generation Science Standards explicitly list “analyzing and interpreting data” as one of the Science and Engineering Practices (NGSS Lead States, 2013). Historically, research on students’ understanding of data science concepts has focused mainly on children in middle school and above (Makar, 2016). In recent years, this research has expanded to investigate young children’s understanding of data science and data modeling (English, 2012). However, this research is limited in that it focuses mainly on how young children could be introduced to data science concepts in schools and classrooms, and the educational strategies developed to date have been used mainly in formal settings.

Project Overview and Research Questions

In a departure from these traditional settings and investigations, NYSCI is employing an ambitious new approach in an initiative entitled Big Data for Little Kids: Data Modeling with Young Learners and their Families, supported by the National Science Foundation (DRL1614663). We have created a data modeling program and curriculum for children ages 5 to 8 and their caregivers at the museum and studied its impact on children’s learning and family interactions. The concepts we focused on are derived from previous research on data modeling with elementary grade students (Lehrer and Schauble, 2002; Lehrer, Kim, and Schauble, 2007). These studies examined how young children engage with key ideas in data science, such as: (1) you can define your data, (2) you can choose how to represent your data, and (3) you can find answers to your questions by looking at the data.

Centered on these modeling concepts, the research questions driving the project are:

How does the curriculum need to be focused and structured in order to engage children ages 5 to 8 and their caregivers with the target concepts and practices as they work together, over a sustained period of time, in an informal learning setting?
When family groups participate in the six-week workshop series, is there evidence of sustained engagement with the targeted data modeling concepts and practices among children and their caregivers?
When family groups participate in the six-week workshop series, is there evidence that participating children increase their use of positive approaches to learning, such as taking initiative, taking responsibility for solving problems, and actively drawing on available resources to address a need?

Digging Into the Research Methods

Wrestling with such an enormous undertaking, we faced two immediate challenges: (1) How do we integrate data modeling concepts into a museum workshop program? and (2) How do we capture relevant evidence to address our research questions?

In terms of the workshop program, the key ingredient to integrating data modeling concepts was leveraging NYSCI’s Design, Make, Play approach. We encouraged families to engage in co-learning experiences by using the museum setting as an invigorating backbone for data exploration and conversation. For more details on how we worked through this challenge, see this blog posting.

Once we figured out the workshop program, we focused on the latter challenge: developing methods for observing how families learned together in the workshop. We began by exploring the relevant literature to understand how other researchers had captured children’s engagement with data modeling and families’ approaches to learning. We decided to adapt three observation protocols from previous research that had been used to assess parent-child interactions, children’s learning in formal settings, and qualities of classroom settings: (1) the Mother-Child Interaction protocol (MCI), (2) the Child Observation Record (COR), and (3) the Classroom Assessment Scoring System (CLASS).

Then came the real test: using the protocols during the actual workshops. Researchers recorded observations using these protocols live during the workshop, and we also recorded audio and video of each family for further analysis. We quickly learned that attempting to use three different observation protocols in an active, inquiry-based workshop program was an impossible task for a number of reasons. First, we found it challenging for two researchers to observe all of the families participating in the workshop. Second, one of the protocols was developed for younger children and didn’t work as well with 5- to 8-year-olds in our informal setting. Lastly, when it came to observing the whole classroom, it became clear to us that the informal museum workshop setting did not fit the formal classroom criteria the protocols were built to assess.

Rethinking Our Data Collection Activities

Keeping all of these things in mind, we restructured our data collection methods for the second iteration of the program. We discontinued use of two of the protocols (COR and CLASS) and revised the protocol focused on how families interact (Parent-Child Interaction). Rather than attempting to do the observations live during the workshops, we decided to code the recorded video data after the workshops.

During the actual workshops, we focused on having each researcher follow a single family unit to create more of a case-study analysis. We also placed a greater emphasis on talking to the families after each day of the workshop to gather more information about their reflections on the data concepts and activities they worked on that day, as well as feedback on the programmatic elements.

Finally, we developed coding schemes to analyze the conversations between children and their caregivers using the audio and video recordings of the workshop. The data collection methods used in the second iteration proved to be a much more manageable endeavor, and allowed researchers to be more actively involved in talking with families throughout the workshop.

Reflecting on Our Research Process

So what have we learned from all of this?

More isn’t necessarily better.
Having three protocols to look at the data through different lenses seemed like a good idea initially, but once we got a better sense of the workshop dynamics, we realized we could still build a robust evidence base with a more focused and intense use of one main protocol.
Don’t rule out the power of being in an informal setting.
Trying to fit the mold of protocols that were designed for formal classrooms did not work well. We needed to be more flexible and mindful in recognizing that the museum itself is an asset and part of why the program was so engaging.
Getting to know your participants doesn’t mean getting in the way of the research.
Some of the most powerful information came from the interactions researchers and participants had during the workshop as we built case studies and during the reflection conversations; being in an informal setting and focusing on one main protocol are part of what allowed for these connections to grow organically.

We hope that our Big Data for Little Kids journey can help illuminate the successes and challenges of conducting research in an informal museum setting. Currently, we are still in the process of analyzing our preliminary findings. Stay tuned for more about what we find out!