Is devoting resources to rigorous evaluation worthwhile?

How do you know that your agency’s policies and practices are effectively keeping children safe and reducing the use of foster care? Generating evidence — that is, information about whether a policy or practice is effective, for whom, and in what context1 — can inform your decisions about policies and practices meant to improve safety and permanency outcomes (for example, preventing removals, lowering repeat maltreatment, and achieving timely permanence). Only a small number of practices in child and family services, however, are considered well-supported by research to improve outcomes. According to the California Evidence-Based Clearinghouse, 31 out of 423 programs listed are rated well-supported by research, and only two of these are rated highly relevant to child welfare.2 Without additional evidence, child welfare agencies are serving families with policies and practices of unknown effectiveness for improving safety and permanency outcomes.

While much is involved in designing and conducting rigorous evaluations that generate evidence about the effectiveness of your policies and practices, this brief focuses on the essential elements for doing so. Rigorous evaluations that randomly assign families to a treatment or control group, often called randomized controlled trials (RCTs), provide strong confidence in the findings about a program or practice’s effectiveness. That is why an RCT is often considered the “gold standard” for research evidence. But as we discuss below, there are challenges involved in conducting RCTs in child welfare settings.

Why is it important to generate evidence about effectiveness?

Determining the effectiveness of your policies, programs, or practices can inform decisions about how best to use the limited resources available to provide safe and permanent families for children. Knowing for whom and in what context a policy or practice is not effective is just as powerful as knowing that it is. While multiple methods are available for assessing your policies and practices, an RCT isolates the effects of the policy or practice on the outcomes of interest from other influences. For example, if the RCT finds a decrease in removals or repeat maltreatment, or an increase in timely permanence for children, you can have more certainty that this is because of the practice evaluated and not some other factor, such as differences between the families you serve. But even RCT findings that document that a policy or intervention is “effective” with a good effect size are not sufficient for determining whether an intervention will be effective in your jurisdiction. Implementation sites often vary greatly, and we need to know why that variation occurs and how agencies can minimize it and achieve consistent positive performance across most implementation sites.3
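To illustrate why random assignment supports that kind of certainty, here is a minimal Python sketch of the two core steps: assigning families to groups by chance, then comparing the removal rate between groups. It is not drawn from any particular study. The function names, the seed, and the family identifiers are all hypothetical, and a real evaluation would also need statistical testing, consent procedures, and guidance from your research partner.

```python
import random


def assign_groups(family_ids, seed=0):
    """Randomly split families into treatment and control groups.

    Hypothetical helper: a fixed seed makes the assignment reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(family_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


def removal_rate(group, removed):
    """Share of families in `group` whose ID appears in the `removed` set."""
    return sum(1 for family in group if family in removed) / len(group)


def estimated_effect(groups, removed):
    """Treatment removal rate minus control removal rate.

    A negative value means fewer removals in the treatment group.
    """
    return removal_rate(groups["treatment"], removed) - removal_rate(
        groups["control"], removed
    )
```

Because chance alone decides which group each family joins, the two groups should look similar on average before the practice is delivered, so a difference in removal rates is more plausibly due to the practice itself. As an illustration with made-up IDs, if one of four treatment families and two of four control families experienced a removal, `estimated_effect` returns -0.25.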

The following elements will help you generate more evidence about policies and practices that reduce the use of foster care, which in turn will provide your agency and the child welfare field with more information about what works, for whom, and in what context. These elements include building a team with the skillset to conduct a rigorous study, determining whether the policy and practice you want to study is ready for evaluation, ensuring necessary data sharing agreements are in place with other agencies (if applicable), and building on the initial study with more rigorous evaluation (if applicable) and continuous quality improvement (CQI) activities to maintain and improve positive outcomes.

Essential elements of conducting rigorous evaluation

The first element is to assemble a team, whether internal staff, a research contractor, or a research partner, that collectively has the skillset to conduct a rigorous evaluation. This may include staff within your agency; however, if your staff do not have all the necessary knowledge and skills to conduct the study, develop a research-practice partnership with an organization that does. Consider tapping into your existing Title IV-E training program relationship with a university, if applicable, or include other partners with the experience and requisite skillset. Developing a research-practice partnership is more than contracting for a specific evaluation; it involves “long-term, mutualistic collaborations between practitioners and researchers that are intentionally organized to investigate problems of practice and solutions for improving outcomes.”4 A rigorous evaluation might also necessitate a request for proposals (RFP) to help select an evaluator who has the abilities and capacity you need. Long-term collaboration with your research partners allows for the development of a research agenda that informs your agency’s work and fits the interests of the researchers.

Once you have assembled a team with the necessary skillset, the second element is to determine whether the policy and practice is ready for rigorous evaluation. Readiness requires clear, written descriptions of staff behaviors that can be observed and measured. It also requires clear processes for referring families to the practice and for training and supervising staff. You should also have clear indications from data that the program is effective with the intended clients. If your policy and practice is ready for rigorous evaluation, plan to conduct the type of study appropriate for your practice: formative evaluation or summative evaluation. The PII Approach: Building Implementation and Evaluation Capacity in Child Welfare (Rev. ed.)5 and A Framework To Design, Test, Spread, and Sustain Effective Practice In Child Welfare6 are examples of phased approaches to rigorous evaluation that involve formative and summative evaluation. The first phase, formative evaluation, determines whether the policy and practice works as planned by examining whether the outcomes of interest are trending in the right direction. The second phase, summative evaluation, is intended to reach a judgment about the effectiveness of the policy and practice based on its impact on long-term outcomes.

Regardless of the type of study (formative or summative evaluation), there may be a need to access additional data collected by partner agencies (e.g., in education or behavioral health) for analysis. This is the third element of evidence generation. A written agreement, such as a memorandum of understanding (MOU), can provide a structure for data sharing.

Lastly, depending on the type of evaluation completed, additional rigorous study may be recommended.5,6 For example, if your organization conducted a formative evaluation that has found outcomes trending in the correct direction, you may consider conducting a summative evaluation to see if the program ultimately achieved its goals. If you conducted a summative evaluation and it found positive outcomes, you should continue implementing the policy and practice and use CQI activities to maintain the outcomes and strengthen implementation. As a reminder, if the formative or summative evaluation found that outcomes were not improved or trending in the right direction, that is also valuable information and should inform decisions about continuing a policy and practice.

Key Components of an MOU

  • Parties to the MOU, with a short description of each organization
  • Purpose of the MOU (i.e., project, objective, and goals)
  • Timeframe
  • Roles and responsibilities of each organization as part of the MOU
  • Resources (e.g., monetary, in-kind, staff time) that each organization commits
  • Products to be produced, with a timeline for delivery of draft and final products
  • Rights to own, use, or share the instruments, reports, and other products from the collaboration (commonly found in the intellectual property rights section)
  • Conditions for modifying or terminating the MOU
  • Signature of executive from each organization who has authority to commit the organization

Key questions to ask

What evidence do you have that your jurisdiction’s policies and practices are effectively contributing to improved safety and permanency outcomes?

Do you have data available suggesting you have a practice, policy or program ready for an RCT?

How can your jurisdiction’s CQI processes contribute to the evidence about the effectiveness of your policies and practices?

What assurances are necessary within your jurisdiction to increase the comfort level of policymakers, the courts, and child welfare staff with rigorous evaluation methods, such as random assignment, and reduce the likelihood that evidence-building efforts will be compromised?


1 Lester, P. (2016). Defining evidence down. Stanford Social Innovation Review 14(3). Retrieved from
2 The Evidence-Based Clearinghouse for Child Welfare. (n.d.). Retrieved from
As of December 6, 2017, 31 out of 423 programs listed are rated well-supported by research.
3 LeMahieu, P. (2017). Networked Improvement Communities—What? Why? And A Little How. Presentation to Casey Family Programs, November 8, 2017.
4 Tseng, V., Easton, J., & Supplee, L. (2017). Research-practice partnerships: Building two-way streets of engagement. Society for Research in Child Development, Social Policy Report 30(4). Retrieved from
5 Permanency Innovations Initiative Training and Technical Assistance Project & Permanency Innovations Initiative Evaluation Team. (2013). The PII Approach: Building implementation and evaluation capacity in child welfare (Rev. ed.). Washington, DC: U.S. Department of Health and Human Services, Administration for Children and Families, Children’s Bureau, and Office of Planning, Research and Evaluation. Retrieved from
6 Framework Workgroup. (2014). A framework to design, test, spread, and sustain effective practice in child welfare. Washington, DC: U.S. Department of Health and Human Services, Administration for Children and Families, Children’s Bureau. Retrieved from

