They say a photo is worth a thousand words, and this spectacular one, taken on the SAS campus following our first-ever North Carolina DataDive in April 2017, captures the energy and enthusiasm of the more than 100 local data scientists, computer engineers and technologists who donated their time and talent that weekend. Working with our participating nonprofit partners and with generous support from SAS, they analyzed massive datasets, built statistical models, and created visualizations and algorithms to help each organization advance its work on a wide range of issues, from community development to job readiness to fighting illegal tobacco sales to combatting hate crimes.
Uncovering hate crime trends to inform public policy
A screenshot of the Anti-Defamation League’s work-in-progress tool showing hate crimes by group at a national level.
You know how powerful findings from a DataDive can be when the event gets highlighted in testimony before the Senate Judiciary Committee only a few days later.
Founded in 1913, the Anti-Defamation League (ADL) is a premier civil rights/human relations organization that, in its own words, seeks to “ensure justice and fair treatment for all.” The organization is constructing a map of hate crimes throughout the United States based on hate crime data from the FBI in order to surface trends and provide analysis to use as a tool to inform policy makers and the public.
Using FBI data on hate crimes reported from 2004 to 2015, Data Ambassadors Chris Hemedinger and Lucia Gjeltema led a team to construct additional features for the map. They also expanded the tool by complementing the FBI’s hate crime data with news articles about incidents, state-level data on hate crime laws and policy, and other demographic and contextual data from open sources.
The team uncovered a number of interesting findings. For example, when a state has a hate crime data collection statute, hate crimes are reported more frequently. Crimes also appear to follow cyclical trends by day of the week and season, with most committed on Fridays and Saturdays and in the spring and summer. ADL’s CEO Jonathan Greenblatt highlighted the work that led to these findings in his testimony at the May 2017 Senate Judiciary Committee hearing, “Responses to the Increase in Religious Hate Crimes.”
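For readers curious how a day-of-week and seasonal tally like this might be computed, here is a minimal sketch in Python. It is not the team's actual code: the incident dates are made up for illustration, and the month-based season boundaries are an assumption on our part.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def season(d):
    """Map a date to a season using calendar months (an assumed convention)."""
    if d.month in (3, 4, 5):
        return "spring"
    if d.month in (6, 7, 8):
        return "summer"
    if d.month in (9, 10, 11):
        return "fall"
    return "winter"

def count_by_weekday(dates):
    """Count incidents per weekday name; date.weekday() returns 0 for Monday."""
    return Counter(WEEKDAYS[d.weekday()] for d in dates)

def count_by_season(dates):
    """Count incidents per season."""
    return Counter(season(d) for d in dates)

# Hypothetical incident dates, for illustration only (not FBI data).
incident_dates = [
    date(2015, 5, 1),   # a Friday
    date(2015, 5, 2),   # a Saturday
    date(2015, 5, 9),   # a Saturday
    date(2015, 12, 7),  # a Monday
]

print(count_by_weekday(incident_dates).most_common())
print(count_by_season(incident_dates).most_common())
```

With real incident records, the same two counters would show whether Fridays, Saturdays, spring and summer stand out.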
Understanding relationships between tobacco and health using spatial modeling
A screenshot of identified “triple offender” retail outlets illegally selling tobacco to minors in Indiana with locations plotted on Google Earth.
Research has shown that tobacco use and misuse remain the leading cause of preventable deaths in the United States. Counter Tools is a five-year-old nonprofit startup out of the University of North Carolina Gillings School of Public Health with a mission to advance place-based public health, that is, improving public health at a local level based on the needs of each specific community. A primary focus of the organization is combatting the negative health impacts of tobacco marketing in brick-and-mortar retail stores. To do so, Counter Tools equips everyday people with tools and instructions to act as “citizen scientists” who collect data from retail stores in their communities.
Led by Data Ambassadors Louis Potok and Brian Spiering, the team dug into Counter Tools’ growing database of more than 40,000 store assessments, conducted by citizen scientists, that provides information on in-store tobacco product availability, pricing, placement, promotion and advertising.
They developed a tool to identify retail outlets that are (1) illegally selling tobacco to minors, (2) in violation of Assurance of Voluntary Compliance (AVC) agreements signed by their corporate parent, and (3) located within 1,000 feet of a school. The team also tied individual retail outlets to their corporate parents to better identify which parent companies are frequently in violation.
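To make the three criteria concrete, here is a minimal Python sketch of how such a filter could work. It is an illustration under our own assumptions, not the tool the team built: the `Store` fields, the function names, and the use of the haversine great-circle formula for distance are all hypothetical choices.

```python
import math
from collections import Counter
from dataclasses import dataclass

# Mean Earth radius (about 6,371 km) converted to feet.
EARTH_RADIUS_FEET = 6_371_000 / 0.3048

@dataclass
class Store:
    # Hypothetical record layout, not Counter Tools' actual schema.
    name: str
    parent: str
    lat: float
    lon: float
    sold_to_minors: bool
    avc_violation: bool

def distance_feet(lat1, lon1, lat2, lon2):
    """Great-circle distance in feet between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_FEET * math.asin(math.sqrt(a))

def triple_offenders(stores, schools, max_feet=1000.0):
    """Return stores that meet all three criteria.

    `schools` is a list of (lat, lon) pairs.
    """
    return [
        s for s in stores
        if s.sold_to_minors
        and s.avc_violation
        and any(distance_feet(s.lat, s.lon, lat, lon) <= max_feet
                for lat, lon in schools)
    ]

def offenders_by_parent(offenders):
    """Count flagged stores per corporate parent."""
    return Counter(s.parent for s in offenders)
```

A brute-force comparison of every store against every school is fine for a sketch; a dataset of tens of thousands of assessments would likely call for a spatial index instead.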
These and other tools developed by the volunteer team will help Counter Tools in its efforts to understand the relationship between tobacco availability or exposure to tobacco marketing and negative health outcomes so it can take action to keep communities healthy.
Habitat for Humanity International
Using affiliate data to understand network impact
The red shaded region represents a group of often overlooked albeit successful Habitat for Humanity International affiliates.
Data Ambassadors Mustafa Kabul, Jordan Meyer and Julia Kuznetsova led a team to help Habitat for Humanity International better understand the impact its affiliate network of over 1,300 nonprofits is having on local communities. While Habitat for Humanity International is best known for its construction projects, its network also completes 12,000 community projects annually, focused on everything from building gardens to financial literacy programs. These projects are typically undertaken in partnership with community groups, police stations, or local faith groups. Habitat for Humanity wanted to better understand the impact these interventions have on communities and develop a data-driven framework to understand the role and impact of its affiliate network as a whole.
Using an internal database of U.S. affiliates’ construction and non-construction interventions, the team was able to help Habitat for Humanity see a new way of understanding affiliate success. They identified groups of very successful affiliates, focused on non-construction work, that are often overlooked.
The team recommended Habitat for Humanity consider non-construction programs in its measure of success, especially for locations such as Alaska, where construction costs are higher and weather makes construction projects harder to carry out. All this, plus the team’s analysis of donors and social media, will help inform Habitat for Humanity’s business planning and resource allocation decisions going forward.
Optimizing the delivery of employment training and life skills workshops
The top chart displays the most important features predictive of job readiness, while the bottom chart shows the associated correlation coefficients used by the predictive model.
StepUp Durham’s mission is to help adults and children transform their lives through access to employment and life skills training, and the organization aims to be the premier resource in Durham County for vulnerable communities seeking to develop stable careers. It wanted to understand what factors influence participants to register, attend a workshop, complete a workshop, find a job, and continue engagement through participation in its Life Skills program.
Using over seven years of participant and donor data, a team led by Data Ambassadors Natalia Summerville, Jinxin Yi, and Zeydy Ortiz was able to identify several important features that help predict the job readiness of program participants. As shown in the chart above, these included a participant’s criminal record, access to transportation, history of substance abuse, the class sizes of workshops attended, history of homelessness, history of financial and food assistance, and age. For example, the team found that having access to transportation and participating in small classes led to better outcomes. Interestingly, having a prior criminal record was correlated with more successful outcomes. The team speculated, but did not prove, that those with a criminal past may come to StepUp more motivated to succeed than the general group.
The team created a model with 85% accuracy to predict positive outcomes from the various programs and isolate variables for success so that all individuals seeking full-time work can obtain the skills needed to attain gainful employment.
Thank You SAS and Project Partners!
As mentioned, this was our first DataDive in North Carolina, and we are thankful to SAS for making it possible. Not only did their campus provide the perfect location for amazing photos like the one above, but the DataDive itself was largely staffed by SAS volunteers who donated their time and talent to help these organizations. We also want to give a special shout-out to our four fantastic project partners for their time and dedication to making this event such a success. We’re pleased to be continuing work with some of these organizations, so stay tuned for more updates to come!
Source: DataKind – A North Carolina DataDive with SAS