Painless Program Evaluation: Step by Step Guide to Measuring Outcomes

This guidebook provides an easy-to-follow approach to evaluating program outcomes. Authored by Christina J. Borbely, Ph.D., and produced by the Center for Applied Research Solutions for the California Department of Alcohol and Drug Programs, it is a valuable resource for organizations seeking to measure the success of their programs.

Presentation Transcript

1. PAINLESS PROGRAM EVALUATION Step-by-Step Guide to Measuring Outcomes Center for Applied Research Solutions, Inc. 771 Oak Avenue Parkway, Suite 3 Folsom, CA 95630 (916) 983-9506 TEL (916) 983-5738 FAX

2. PAINLESS PROGRAM EVALUATION Step-by-Step Guide to Measuring Outcomes Facilitators: Kerrilyn Scott, Christina Borbely Produced and Conducted by the Center for Applied Research Solutions, Inc. for the California Department of Alcohol and Drug Programs SDFSC Workshop-by-Request January 13, 2005 Authored by Christina J. Borbely, Ph.D. Safe and Drug Free Schools and Communities Technical Assistance Project

3. Objectives Facing Fears Program Evaluation What-ifs & What-to-dos Review Guidelines General & SDFSC Evaluation Guidelines Identifying Outcome Indicators Dealing with Design Choosing Instrumentation What Factors To Consider Types of Item & Response Formats Putting It All Together Compiling An Instrument Developing a Finished Product

4. Facing Fears Program Evaluation What-ifs Youth Service Providers Meet ambiguous requirements from a treetop Evaluate stuff hopping on your left foot

5. Program Evaluation What-ifs What if resources are limited? What if the program shows no positive impact on youth? What if we thought we could utilize the CHKS data for our county and cannot? What if we changed our program design along the way?

6. CYA Deal with likely culprits that affect program outcomes. 1. Programming or program implementation. 2. Program evaluation design and implementation.*

7. Guidelines to Observe SDFSC Program Evaluation Guidelines General Guidelines for Program Evaluation Also GPRA (federal) CalOMS/PPGs (California)

8. DOE Recommends: SDFSC Evaluation Guidelines Impact. Performance measures must include quantitative assessment of progress related to reduced violence or drug use. Frequency. Periodic evaluation using methods appropriate and feasible to measure success of a particular intervention. Application. Results applied to improve the program; to refine performance measures; disseminate to the public. *These guidelines are taken directly from the USDoE Guidelines for SDFSCA.

9. General Guidelines for Program Evaluation Logic-model-based: Research-based measured outcomes are a direct extension of the mission and are achieved through the program's activities. Outcome-based: Measure the degree to which services create meaningful change. Participatory: Be an informed participant in the evaluation process.

10. More general guidelines Valid & Reliable: Instruments measure what they purport to measure & do so dependably. Utilization-focused: Generate findings that are practical for real people in the real world to help improve or develop services for underserved youth. Rigor: Incorporate a reasonable level of rigor into the evaluation (e.g. measure change over time).

11. Federal-level Requirements GPRA Government Performance and Results Act (GPRA) indicators are used for reporting the success levels of programs. A number of existing instruments include these indicators. The Center for Substance Abuse Prevention provides instruments designed for adults and youth. GPRAtool.pdf

12. CA State-level Requirements CalOMS/PPGs The California Outcomes Measurement System (CalOMS) is a statewide client-based data collection and outcomes measurement system. Performance Partnership Grant (PPG) requirements cover prevention outcome measures.

13. Identifying Outcome Indicators Risk & Protective Factors as Indicators Individual vs. Community Level Indicators Indicators with Impact

14. Indicators Are Your Guide: Follow them Forward Never work backwards! Select instruments based on your indicators NOT indicators based on your instruments. Indicators can be categorized as risk and protective factors.

15. A Risk & Protective Factors Framework Resiliency: the processes operating in the presence of risk/vulnerability to produce outcomes equal to or better than those achieved in no-risk contexts. Protective factors may act as buffers against risks. Protective factors may enhance resilience (Cowan et al., 1996).

16. Risk & Protective Factors as Indicators Risk and protective factors associated with ATOD use and violence* Aggressive and disruptive classroom behavior predicts substance use, especially for boys. Positive parent-child relationships (i.e. bonding) are associated with less substance use. Adolescents with higher levels of social support are more likely to abstain from or only experiment with alcohol than to be consistent users. School bonding protects against substance use and other problem behaviors. Ready access to ATOD increases the likelihood that youth will use substances. Policy analysis indicates that the most effective ways to reduce adolescent drinking include, among other things, zero tolerance policies. Employee drug use is linked with job estrangement and alienation. * CSAP Science-based Prevention Programs and Principles

17. Risk & Protective Factors Models Gibson, D. B. (2003) CSAP 1999

18. OUTCOME DOMAINS: You say tomato Many outcome domains and multiple phrases that refer to a common domain. Frequent use of certain terms within the field. Risk and protective factors fall into different outcome domains.

19. Protective Factors Similar/Same Terms Life skills Social competency Personal competency Attitudes Individual/interpersonal functioning Sample Indicator Score on prosocial communication scale

20. Risk Factors Similar/Same Terms Delinquency Behavior problems Violence Sample Indicator # of fights reported on school record last year

21. Individual versus Community Level Indicators The more diffuse the strategy, the more difficult to see an impact at the individual level Assess individual outcomes when services are directly delivered to individuals Assess community outcomes when services are delivered in the community

22. Community Level Indicators 1st: Define community as narrowly and specifically as possible. Community can be: stores in a given radius; policies in a local town; residents in a specific sector. 2nd: Define short- to intermediate-term indicators. Community level indicators can be: # of letters written to legislators; # of AOD related crimes, deaths, or injuries.

23. Identifying Your Indicators Research informs links between services and outcomes. Use existing research to assess what outcomes might be expected. See Resources section Develop short term, intermediate, and long term indicators

24. Countdown to impact? Measure an impact that can be expected based on your services Teaching conflict resolution? Measure conflict resolution ability, not general social skills. Providing information on effects of alcohol use? Measure knowledge of alcohol effects, not heroin use.

25. Use no change in ATOD use/Violence as indicator of impact Indicator: The incidence of participating youths' physical fights will not increase over time. Use comparison of ATOD use/Violence rates to national trends as indicator of program impact Indicator: Compared to the national trend of increasing rates of ATOD use with age, rates among participating youth will not increase.
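
The two "no change" indicators on this slide can be checked with simple arithmetic. The sketch below is only an illustration: the function names and every number in it are hypothetical, not taken from the guidebook or from any real data set.

```python
def no_increase(baseline, follow_up):
    """True when the follow-up value is no higher than the baseline value."""
    return follow_up <= baseline


def holds_against_trend(program_rates, national_rates):
    """Check the 'compared to the national trend' indicator.

    Each argument is a (earlier_rate, later_rate) pair. The indicator is met
    when the national rate rises but the program rate does not.
    """
    program_change = program_rates[1] - program_rates[0]
    national_change = national_rates[1] - national_rates[0]
    return national_change > 0 and program_change <= 0


# Hypothetical counts of physical fights among participants, pre and post.
fights_pre, fights_post = 14, 12
print("Fights did not increase:", no_increase(fights_pre, fights_post))

# Hypothetical ATOD use rates (proportions) at two ages.
program = (0.12, 0.12)
national = (0.12, 0.18)
print("Held steady against rising trend:", holds_against_trend(program, national))
```

A check like this only says whether the indicator was met; it does not show that the program caused the result.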

26. What the future holds Indicator Targets & Thresholds Identifying levels of predicted outcomes

27. Guide: Step 1 Kids today! Review of Evaluation Logic Models Introducing Program A Listing Your Outcome Indicators

29. Program A Primary Substance Use Prevention Targets adolescents and parents of adolescents Afterschool (youth); Evening/week (adult) CBO Site location: local schools Staff: majority are school staff: aides/teachers

31. Your Programs Indicator List

32. Program A YOUTH Indicator List

33. Optimizing Evaluation Design Assigning Priority Increasing Evaluation Rigor

34. Assigning Priority to Evaluation Components More evaluation resources for program components with more service intensity pre-post test designs Fewer evaluation resources for program components with fewer services record attendance rate at community seminar

35. Design Options to Increase Rigor Incorporate experimental design (if possible) OR Control groups (requires some planning) Comparison groups (easier than you think!) A multiple assessment schedule with follow-up data points, such as a 6 month follow-up, increases evaluation rigor.
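
As a rough illustration of why a comparison group adds rigor, the sketch below computes the average pre-to-post change for a program group and a comparison group and then subtracts one from the other. The scores are invented for the example, and the helper name `mean_change` is an assumption rather than anything the guidebook prescribes.

```python
from statistics import mean


def mean_change(pre_scores, post_scores):
    """Average post-minus-pre change for respondents matched by position."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post lists must be the same length")
    return mean([post - pre for pre, post in zip(pre_scores, post_scores)])


# Hypothetical scale scores; each position is the same young person at pre and post.
program_pre, program_post = [10, 12, 9, 11], [14, 15, 12, 13]
comparison_pre, comparison_post = [10, 11, 10, 12], [11, 11, 11, 12]

program_change = mean_change(program_pre, program_post)
comparison_change = mean_change(comparison_pre, comparison_post)

# Change beyond what the comparison group showed over the same period.
net_change = program_change - comparison_change
print(program_change, comparison_change, net_change)
```

A 6 month follow-up would simply add a third list of scores per group and a second call to `mean_change`.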

36. Choosing Instrumentation: Abstract Concepts to Concrete Practices

37. Factors to Consider for Evaluation Tools Key Concepts for Measurement Reliability Validity Standardized vs. Locally-developed Items Item and Response Formats

38. Resources that report reliability & validity PAR Psychological Assessment Resources NSF Online Evaluation Resource Library More resources listed on pages 155-156 of Planning For Results OR See the PPE Resources section.

39. IS THAT INSTRUMENT RELIABLE & VALID (AND WHO CARES IF IT IS)? Reliability: A reliable measure provides consistent results across multiple (pilot) administrations. Validity: The extent to which an instrument measures what it is intended to measure, and not something else.

40. Who Cares If It Is Reliable & Valid? You Do! You want to be certain that the outcomes are not a fluke Reliable and valid instruments are evidence of a rigorous program evaluation and inspire confidence in the evaluation findings

41. Is It Reliable? The number that represents reliability, officially referred to as Cronbach's alpha (α), will fall between .00 and 1.0. Rule of thumb: a reliable instrument has a coefficient of .70 or above (Leary, 1995). Think of a reliability coefficient as corresponding with an academic grading scale: 90-100 A excellent; 80-90 B above average; 70-80 C average/sufficient; 70 and below D less than average.
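
For readers who want to see where a reliability coefficient comes from, here is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The pilot responses are made up for illustration; a statistics package would normally do this calculation.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*responses))          # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical pilot data: 5 respondents answering 4 items on a 1-4 scale.
pilot = [
    [4, 4, 3, 4],
    [3, 3, 3, 3],
    [2, 2, 1, 2],
    [4, 3, 4, 4],
    [1, 2, 1, 1],
]

alpha = cronbach_alpha(pilot)
verdict = "meets" if alpha >= 0.70 else "falls below"
print(f"alpha = {alpha:.2f} ({verdict} the .70 rule of thumb)")
```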

42. Is it Valid? Using CONSTRUCT VALIDITY involves testing the strength of the relationship between an instrument and measures it should be associated with (convergent validity) AND measures it should not be associated with (discriminant validity). Trends are reported as correlation coefficients (r), ranging from (+/-) .00 to 1.0. For reference, to validate a depression instrument it is compared to measures of sadness & happiness: Positive correlation (r=.83) indicates that the two independent scores increase or decrease with each other; as depression scores increase, sadness scores increase. Negative correlation (r=-.67) indicates that the two independent scores change in opposite directions; as depression scores increase, happiness scores decrease.
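
The depression example can be reproduced in miniature with the Pearson correlation coefficient. The scores below are invented, so the resulting r values will not match the .83 and -.67 quoted on the slide; the point is only to show the sign and range of r.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical scores for the same six respondents on three measures.
depression = [10, 14, 8, 20, 16, 12]
sadness = [11, 15, 9, 18, 17, 10]
happiness = [18, 12, 20, 7, 11, 15]

r_sadness = pearson_r(depression, sadness)      # expected to be positive
r_happiness = pearson_r(depression, happiness)  # expected to be negative
print(f"depression vs sadness:   r = {r_sadness:+.2f}")
print(f"depression vs happiness: r = {r_happiness:+.2f}")
```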

43. TRICKY TRICKY! Reliability & Validity Can Be Sticky! Instruments can be highly reliable but not valid. Reliability AND Validity are context-specific!

44. Target Practice Not reliable or valid Reliable, not valid Valid, but not reliable RELIABLE AND VALID

45. Looking It Up Find the name of measure (include version, volume, etc.) __________________________ Record the details of the reference (author, title, source, publication date) __________________________ Seek other potential references cited in the text or bibliography __________________________ Identify details about the population tested (sample) # of people (sample size) _____________________ ethnicities _____________________ languages _____________________ socio-economic status (SES) _________________ other details _____________________ Locate statistics on the measure's reliability Overall reliability _____________ Any subscales __________ Report information on the measure's validity (e.g. type of validity tested, results from validity tests) _____________________

46. Measure: Attitudes Toward Drug Use Description: Seven questions from the Student Survey of Risk and Protective Factors/Respondent and Perceived Family Attitudes Toward Drug Use. Target Population: General population of students in grades 6, 8, 10, and 12 Construct(s): Attitude Toward Use Respondent: Self Mode of Administration: Pencil and paper self-report Number of Items: 7 Burden Estimate (hours): Nominal Available languages: English and Spanish Reliability: 0.88 Validity: High concurrent validity with drug and alcohol use and delinquency. Source: Social Development Research Group University of Washington 9725 3rd Ave. NE, Suite 401 Seattle, WA 98115-2024 206-685-3858

47. Types of Instruments Standardized vs. Locally-Developed Formats Response Options Subscales

48. TO USE STANDARDIZED OR LOCALLY DEVELOPED INSTRUMENTS? (THAT IS THE QUESTION. ) Consider pros and cons Also an option: Combining standardized measures or scales with a few locally developed items into one instrument.

49. Standardized Instruments PROS: Already constructed! Lots of content choices! Psychometrics have already been established (valid & reliable). Easy to compare results across projects, to national scores, etc. CONS: May not tap into novel/unique aspects specific to your program. May not have been tested/normed with your project's population (e.g. age or racial group).

50. Locally Developed Instruments PROS: No cost. Able to measure unique program features. CONS: Time consuming to develop (i.e. pilot testing for reliability & validity, etc.). Difficult to compare to other programs, similar curriculums, national standards, etc. May be redundant with already existing measures.

51. 32 Flavors and then some Instruments come in many formats, such as: Questionnaires, surveys, checklists Interviews Focus groups Observations Response options run the gamut Yes/no Continuum Open-ended

52. Package Deal: Instruments That Come With Curricula Tend to measure knowledge (not necessarily behaviors or attitudes). Consider the extent to which the curriculum developer's measure aligns with indicators you have identified as outcome goals.

53. Buffet Style Instrumentation: Something for Everyone! Use subscales Combine standardized measures with a few locally-developed items Use scales from different standardized measures Do a survey & an interview Assess the youth & the parent

54. Guide: Step 2 Identify Criteria Existing Instruments CHKS CSAP

55. What Works for You Identify your criteria for a measure Consider: Required elements of evaluation Is it appropriate for your population (age, ethnicity, language, education level, etc) Cost Research based? Psychometrics available? Time required for completion Scoring

56. Program A Instrument Criteria

57. Existing Instruments CHKS CSAP Core Measures Index See Resources section for more!

58. California Healthy Kids Survey Module A: Demographics & Core Areas Module B: Resilience and Youth Development Module C: AOD, Safety (including violence & suicide) Module D: Tobacco Module E: Physical Health Module F: Sexual Behavior (including pregnancy and HIV/AIDS risk)


60. All Together Now Instrument design pointers Administering your instrument

61. HARD HAT ZONE: Compiling a Complete Measure Keep track of the origin of all the individual components (measures, scales, items). Record each component's source, whether you came up with the question yourself or it's a scale from a broader instrument. This is useful for the program evaluation report or if you need to replicate or explain your methodology.
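
One lightweight way to keep track of where each component came from is a small source log. The sketch below is an assumption about how such a log might look, not a format the guidebook specifies; the first entry reuses the sample measure described earlier in the deck, and the second is a hypothetical locally developed item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One piece of a compiled instrument and where it came from."""
    label: str
    kind: str                      # "measure", "scale", or "item"
    source: str
    locally_developed: bool = False


def source_log(components):
    """Return one readable line per component for the evaluation report."""
    lines = []
    for c in components:
        origin = "locally developed" if c.locally_developed else "standardized"
        lines.append(f"{c.label} [{c.kind}, {origin}]: {c.source}")
    return lines


instrument = [
    Component(
        "Attitudes Toward Drug Use",
        "scale",
        "Student Survey of Risk and Protective Factors, "
        "Social Development Research Group, University of Washington",
    ),
    # Hypothetical item written by program staff.
    Component("How I spend my free time", "item", "Program staff",
              locally_developed=True),
]

for line in source_log(instrument):
    print(line)
```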

62. Word To The Wise: Subscales In order to maintain the integrity of your instrument, you must preserve the reliability and validity of each component. Don't change wording in items or response options. You might really really want to. But don't. Don't subtract items from subscales. Resist the temptation. It really does matter. Do use relevant subscales. These are predetermined clusters of items, e.g. subscales of an aggression instrument are aggression towards people and aggression towards property. Pick and choose subscales if the complete measure exceeds your needs. Make sure the scale is appropriate for your population!

63. Simplify & Streamline Don't duplicate items! (unless you mean to) Recording date of birth, gender, and race in the program registration log? Don't include these items in your survey. Don't over-measure! Using a conflict resolution AND a problem-solving scale? Be sure that they are differentiated enough to add unique information on your program impact, or else select the ONE scale that best targets your construct of interest.

64. Organizing items Start off with simple (non-threatening) questions, like age, grade, gender, etc. Break it up. Avoid grouping all the sensitive items (e.g. ATOD use) at the beginning or end of the instrument. End on a positive (or at least neutral) tone. Consider ending with items on hopes for the future or how I spend my free time. Item-to-item fluidity is important for the respondent's ease and accuracy. Also, make sure changes in response option format are easy to follow.

65. Lookin' good Anything you can do to make the instrument look appealing will go a long way. This is not a test! Interesting font? Colored paper? Funny icons? A comic strip between sections?

66. Tell 'em What To Do: Instructions Use common everyday language to say what you mean. Customize to your target population. Include information about participation being voluntary & confidential. Indicate why completing the measure is valuable.

67. Writing Items Be precise (not vague) What do you think about drugs? What do you think about underage consumption of alcohol? Be unbiased (not biased) Do you think hitting another person is mean and horrible? In your opinion, is it okay to hit another person?

68. Ask ONE question at a time Do you smoke and drink? Yes/No Have you ever smoked cigarettes? Yes/No Make hard questions easier to answer How many alcoholic beverages (6oz servings) do you drink each week? ____ Which of the following best describes how many alcoholic beverages (6oz servings) you drink each week? (check one) __None __1-2 __3-5 __More than 5 Avoid confusing negative phrases If a classmate hits you, should you not tell the teacher? Yes/No If a classmate hits you, would you tell the teacher? Yes/No

69. Maximize Potential Findings Create/Use a sensitive instrument Make room for nuance in response Do you yell at your child(ren)? Circle one: Yes/No OR Do you yell at your child(ren)? Circle one: Never/Rarely/Sometimes/Often Watch for reverse-coded items I like school. Strongly agree/Agree/Disagree/Strongly disagree My classroom is nice. Strongly agree/Agree/Disagree/Strongly disagree My teacher is mean. Strongly agree/Agree/Disagree/Strongly disagree
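
To illustrate what "watch for reverse-coded items" means at scoring time, the sketch below scores the three school items from this slide so that a higher number always means a more positive view of school. The 1-4 point values and the choice to flip "My teacher is mean." are assumptions made for the example.

```python
# Assumed point values for the four response options.
POINTS = {"Strongly agree": 4, "Agree": 3, "Disagree": 2, "Strongly disagree": 1}

# Negatively worded items whose points must be flipped before summing.
REVERSE_CODED = {"My teacher is mean."}


def score_item(item, answer):
    """Points for one answer, flipped for reverse-coded items (4<->1, 3<->2)."""
    points = POINTS[answer]
    if item in REVERSE_CODED:
        points = (max(POINTS.values()) + min(POINTS.values())) - points
    return points


def score_survey(answers):
    """Total score for one respondent's answers, given as {item: answer}."""
    return sum(score_item(item, answer) for item, answer in answers.items())


# Hypothetical respondent who feels positively about school.
respondent = {
    "I like school.": "Strongly agree",
    "My classroom is nice.": "Agree",
    "My teacher is mean.": "Strongly disagree",
}
total = score_survey(respondent)
print("Total school attitude score:", total)
```

Without the flip, this respondent's strong disagreement that the teacher is mean would count as 1 point and drag the total down, which is the mistake the slide warns about.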

70. Collecting Data Once or Twice? How to Phrase It.

71. Try Your Hand

72. Guide: Step 3 Choosing an Instrument

73. Choosing An Instrument Checklist


76. Developing A Finished Product Anticipating Next Steps Administration Issues

77. Anticipating Next Steps Make response forms easy on the eye. Keep in mind that someone will have to review response sheets in order to analyze results. Consider a trial run (i.e., pilot test) for the final instrument. Grab a few young people or parents (not participants) who can help you out. Changing the instrument after (pre-test) administration is not too cool.

78. Administration: Rules of the game Collecting data from minors IRB Approval Confidentiality Proctoring

79. DETAILS DETAILS: Administration Do you have the resources necessary to administer the instrument? Paper and pencils? Interviewers? Appropriate setting? Are the administration instructions clear (to the participant and the administrator)? What level of proctoring is appropriate?

80. Guide: Step 4 Survey Administration

81. Survey Administration Checklist Identify youth participants eligible for data collection. Criteria for eligibility? When will data be collected? pre:_________________ post:_________________ Who will administer the instrument? pre:_________________ post:_________________ Who has the materials necessary for instrument administration(s) (enough copies of measures, pens, pencils, etc)? pre:_________________ post:_________________ Are copies of the instruments available in appropriate languages (e.g. English, Spanish, etc)? How long will it take for survey to be completed by participants? ________________ Who is responsible for gathering materials and completed instruments after administration? pre:_________________ post:_________________

82. Finally You now know how to: Identify appropriate outcome indicators for your program Evaluate instruments based on your measurement criteria Assess reliability & validity of measures Construct an optimal instrument Conduct data collection with your instrument.

83. The End. (woo hoo!)