Mapping Lawful Permanent Residents with the Immigrant Legal Resource Center

Guest blog by Ozzie Liu, DataKind DataCorps Volunteer & Senior Analyst at Casper 

I recently volunteered with DataKind and had the chance to lead a team of fellow volunteer data scientists on a six-month pro bono project for the Immigrant Legal Resource Center’s (ILRC) New Americans Campaign. We analyzed and visualized the demographics and geography of lawful permanent residents eligible for naturalization so that Campaign partners could be more effective in their outreach and naturalization work.

I’m always interested in using my data science skills to make a difference in the community around me. So when I heard about DataKind’s mission to harness the power of data with impactful projects to solve social and humanitarian problems, I knew I wanted to be part of it! In the fall of 2016, I had the opportunity to work on a project with the ILRC to help immigrant communities pursuing U.S. citizenship receive legal assistance to aid in the process. As a second-generation immigrant, this topic is near and dear to my heart. I have family members who have been longtime green card holders but never took the step of applying for citizenship, which means they are missing out on possible civic engagement and additional benefits. Moreover, the new administration’s stance against immigrant groups has made this work even more urgent.

The Partners: ILRC and New Americans Campaign

The Immigrant Legal Resource Center (ILRC) works with immigrants, community organizations, legal professionals, law enforcement, and policy makers to build a democratic society that values diversity and the rights of all people. The ILRC’s mission is to protect and defend the fundamental rights of immigrant families and communities. Led by the ILRC, the New Americans Campaign is a diverse nonpartisan national network of respected immigration organizations, legal services providers, faith-based organizations, immigrant rights groups, foundations and community leaders. The Campaign transforms the way aspiring citizens navigate the path to becoming new Americans. It is committed to connecting lawful permanent residents (LPRs) to trusted legal assistance and critical information that simplifies the naturalization process.

The Goal

The ILRC wanted to be able to visualize the geographic and demographic makeup of LPRs that are eligible for naturalization so that New Americans Campaign partners could identify those that might require the Campaign’s assistance.

The Data

There is a good amount of data around immigrants and LPRs in the U.S. including sources such as:

There are also several research organizations who work extensively with this immigrant population, and their work was tremendously helpful to our project. These partners include:

Although much of the data around immigrants and LPRs is public and available to the ILRC and its partners, it is not easy to navigate the many disparate websites or download large spreadsheets of numbers. We wanted to help make this information more accessible to the organizations that are serving this immigrant community.

Our team leveraged the excellent research that has already been done by CMS and CSII, which estimates the demographics and characteristics of immigrant populations at a detailed geographic level using Public Use Microdata Areas (PUMAs). We then compared this with the New Americans Campaign’s internal data to gauge the effectiveness of past outreach and to find new opportunities for local partners.

Visualizing LPRs

After cleaning and wrangling the data, we started the visualization process. We wanted to develop an easy-to-use map that would show the relevant characteristics of lawful permanent residents in each corresponding PUMA. Once we combined our data with the PUMA shapefiles and converted the result to GeoJSON, we were able to create a visualization using D3.js. The resulting map, shown above at the start of this post, depicts the number of lawful permanent residents (LPRs) by PUMA. Brighter colors represent more LPRs.
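As a rough illustration of the data-to-GeoJSON step, here is a minimal sketch in Python (the map itself was built with D3.js). The property name `GEOID10`, the `lpr_count` field, and the sample values are all hypothetical. The actual schema is not described in this post.

```python
import json

def attach_counts(geojson, counts, key="GEOID10"):
    """Copy per-PUMA counts into each feature's properties.

    `key` is an assumed property holding the PUMA identifier.
    Features with no matching count get lpr_count = None.
    """
    for feature in geojson["features"]:
        puma_id = feature["properties"].get(key)
        feature["properties"]["lpr_count"] = counts.get(puma_id)
    return geojson

# Hypothetical two-feature collection standing in for converted shapefiles.
pumas = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"GEOID10": "3603701"}, "geometry": None},
        {"type": "Feature", "properties": {"GEOID10": "3603702"}, "geometry": None},
    ],
}
lpr_counts = {"3603701": 12500}  # made-up number for illustration

merged = attach_counts(pumas, lpr_counts)
geojson_text = json.dumps(merged)  # text a web map library could load
```

Keeping the count inside each feature's properties means the mapping code only has to read one file to color each area.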

Next, we attempted to add functionality and usability to the map using an off-the-shelf platform such as CARTO:

Interactive map of Spanish speakers and education levels using CARTO

We extensively tested tools such as CARTO, Mapbox, and Tableau that generated great looking visualizations, but we were concerned about the maintenance cost to the partner and the limitation of free tier levels that require opening up the Campaign’s internal data to the public. With help from DataKind’s in-house team of data scientists, we were able to develop a fully functional web app that uses Leaflet.js to serve as an interactive map that looks great and is flexible.

Interactive map and tool showing LPRs in the NYC metro area with a bachelor’s degree or higher

Above is the final version of the app that we provided to the ILRC and New Americans Campaign partners. The left-hand side is an interactive map that shows the raw number of LPRs with a selected characteristic in each PUMA. On the right is a detailed view of each characteristic for the selected areas.

Wrapping up

After the project was completed, I had the chance to join Campaign partners in Chicago for the national New Americans Campaign Conference, where I led three workshops to unveil and demo the tool we had developed to all the attendees. I encouraged everyone to get out their laptops and smartphones, and actually play with the map as I presented the features. We then discussed the ways this tool could potentially help Campaign partners be more effective.

Me leading a live demo of the tool at the 2017 New Americans Campaign Conference in Chicago

Potential Impact

The feedback from the partners after the workshops was overwhelmingly positive (I even got some hugs). For some, this was the first time that the data was presented in a way that they could both understand and immediately use.

Together we brainstormed some of the ways that this tool could help the partners:

  • Now that the demographics and characteristics of each area, once unclear, were apparent, partners could plan more targeted and effective outreach. One partner in Florida had planned to make a 6-hour roundtrip drive, twice a week, to reach a specific immigrant group in the area. After using the map at the workshop, he discovered that a population with the same demographics existed just 30 minutes away from his office.
  • Partners could use this tool to better prepare for offsite events or office visits, target outreach and plan necessary resources. For example:
    • Partners would be able to know in advance if they are serving an area where there is not a high level of fluency in English, so that they could have translators on site.
    • Partners could identify LPRs that are younger and more computer savvy, and introduce them to Citizenshipworks, a TurboTax-like program that can expedite the naturalization application form-filling process.
    • Partners could find LPRs that have lower income, as they may qualify for a fee waiver.
    • Partners could develop more strategic planning to expand to new areas and seek funding.

It’s been a wonderful experience for our DataCorps team to work with the ILRC and I am very excited to see how the New Americans Campaign partners will use our tool to advance their work!

Source: DataKind – Mapping Lawful Permanent Residents with the Immigrant Legal Resource Center

An Open Source Tool for Disaster Relief


During their March DataDive, DataKind DC and Catholic Charities USA (CCUSA) partnered to create a map to support the organization’s disaster assistance to individuals and families before, during, and after a tragedy hits.

CCUSA provides the holistic and compassionate care needed to help individuals impacted by disaster recover and move forward. They wanted to create a map to better target mitigation, preparedness, relief and recovery projects in order to best serve communities that are at greatest risk for disasters, most overlooked, and/or ineligible for FEMA assistance.

DataKind DC used the Centers for Disease Control and Prevention’s (CDC) Social Vulnerability Index and a proprietary natural disaster dataset (generously donated by ATTOM Data Solutions) to develop this map using Mapbox, R, and D3.js. Version 1.0 was released this past August, just in time to help understand vulnerable populations in the wake of Hurricane Harvey.

Understanding Vulnerable Populations and Hurricane Harvey

The map showed that Houston, along with many of the counties in Hurricane Harvey’s path, is socially vulnerable, in particular the eight counties along the coastline where Texas issued mandatory evacuations. Every one of these counties was found to be in at least the 80th percentile for Crowding as well as for Speak English Less than Well, meaning that these counties are more vulnerable than at least 80 percent of other counties across the United States with respect to each of those two categories.
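To make the percentile reading concrete, here is a minimal Python sketch that ranks one county against the others. It assumes the simple definition used in the paragraph above (the share of other counties with a lower value). The CDC's index may compute its percentile ranks differently, and the crowding rates below are made up.

```python
def percentile_rank(values, target):
    """Percent of the *other* values that are strictly below `target`."""
    others = list(values)
    others.remove(target)  # drop one occurrence of the target itself
    below = sum(1 for v in others if v < target)
    return 100.0 * below / len(others)

# Made-up crowding rates for six hypothetical counties.
crowding = [1.2, 2.5, 3.1, 4.0, 6.8, 9.4]
rank = percentile_rank(crowding, 6.8)  # 4 of the 5 other counties are lower
```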

Most of these counties are also in at least the 80th percentile for Minorities, Single Parent Households, and Aged 17 and Under. In addition, Aransas County is in the 96th percentile for populations Aged 65 and Over and the 93rd percentile for Disability. Understanding these percentiles helps local relief agencies provide the assistance needed for their particular community. For example, if the community is in a high percentile for Speak English Less than Well, it is critical to identify the primary languages and find volunteers who are fluent in those languages for any community outreach.

Houston shows similar vulnerable areas, but many of the census tracts within Harris County are particularly vulnerable with respect to Socioeconomic Status and Minority Status/Language. Thirty-four census tracts are in the 90th percentile for at least 9 of the 15 different categories of vulnerability. The most vulnerable census tracts are in extremely high percentiles for the following categories: Aged 17 and Under, Below Poverty*, No High School Diploma, Minority, and Crowding.
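A count like "tracts in the 90th percentile for at least 9 of 15 categories" can be sketched as below. This is an illustration only: the tract IDs, category names, and percentiles are hypothetical, and it assumes "in the 90th percentile" means at or above 90.

```python
def highly_vulnerable(tracts, cutoff=90.0, min_categories=9):
    """Return IDs of tracts at or above `cutoff` in at least `min_categories` categories."""
    flagged = []
    for tract_id, percentiles in tracts.items():
        high = sum(1 for p in percentiles.values() if p >= cutoff)
        if high >= min_categories:
            flagged.append(tract_id)
    return flagged

# Hypothetical tracts, each with 15 category percentiles.
categories = [f"cat_{i}" for i in range(15)]
tracts = {
    # high in all 15 categories
    "tract_A": {c: 95.0 for c in categories},
    # high in exactly 9 categories
    "tract_B": {c: (92.0 if i < 9 else 40.0) for i, c in enumerate(categories)},
    # high in only 8 categories, so not flagged
    "tract_C": {c: (92.0 if i < 8 else 40.0) for i, c in enumerate(categories)},
}
flagged = highly_vulnerable(tracts)
```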

While the coastal counties seem to be especially vulnerable when it comes to transportation and communication concerns, Houston has many people that will be susceptible to post-hurricane issues, such as receiving aid for rebuilding their homes and returning to their lives.

*Below Poverty is the percentile of persons below the poverty line. A higher percentile indicates a greater number of people below the poverty line.

Our Partners

Working with CCUSA has been a wonderful and inspirational experience. The map has already been used to help local relief agencies better identify and understand social vulnerability in their communities to inform planning around disaster response. With the unforgiving weather and devastating events the world has experienced this year alone, from the earthquakes in Mexico to the hurricanes in the Atlantic and the wildfires in the West, we’re hoping that this open source tool will be able to help CCUSA and other organizations reach and support even more individuals and communities in need.

In the aftermath of Hurricane Maria, local relief agencies wanted to use this map to help efforts in Puerto Rico as well. Thanks to a quick response from the team and our partners at Mapbox, we were able to update the map in a few hours to include coverage of Puerto Rico. It is projects like this, with the potential to save lives and improve welfare, that really drive and motivate our volunteers to be their best.

If you’d like to support the communities impacted by Hurricanes Irma, Harvey and Maria, donate here. A hundred percent of the funds raised will go directly towards disaster relief efforts.


Source: DataKind – An Open Source Tool for Disaster Relief

DataKind Singapore's DataDive on Philanthropic Giving and Volunteerism

Hot on the heels of our April DataDive, DataKind Singapore hosted its third DataDive in late August for National Volunteer & Philanthropy Centre (NVPC). We were a cozy group of around 30 volunteers who worked together to help NVPC gather actionable insights for its platform.

About National Volunteer & Philanthropy Centre (NVPC)

NVPC promotes a giving culture in Singapore by inspiring more volunteerism and philanthropy. NVPC hosts a one-stop platform for local nonprofits and those looking to give back. On the platform, organizations create campaigns to raise funds or recruit volunteers, while donors can find ways to contribute to causes they care about and volunteers can find meaningful ways to donate their time and skills.

Choy Yee Mun, Assistant Director of NVPC says, “the group of DataKind volunteers have been one of the most passionate and professional group of volunteers I have met. They have provided us with numerous useful and actionable insights and recommendations, which we will either be implementing or continuing in further projects with DataKind.”

Analyzing Browsing History

Previously, NVPC had created donor profiles and volunteer profiles based on user accounts. In the DataDive, we hoped to use Google Analytics data to develop personas based on browsing history for website visitors that have not created an account to volunteer or donate.

We looked at the browsing history of visitors with user accounts to see if the causes they browsed on matched those they had declared in their user profiles, but we didn’t find much overlap. One idea for future analysis post-DataDive would be to infer cause preference based on the number of cause pages clicked and the amount of time spent on those pages. With that information, we could derive which segments are interested in which causes. 

We also found a few thousand cases where users who had initially created an account and volunteered or donated would then revisit the platform within three months, but forget their password and stop using the platform entirely. One actionable insight for NVPC to consider is re-engaging these affected users.

By analyzing visitors’ browsing paths between pages, we found that most visitors tend to follow the same paths from the homepage to the volunteer page. An onboarding process could be introduced to guide new users step by step and propose other possible user journeys on the website. We also found that ~75% of users log in again after 16 days, which could inform the timing of electronic direct mail marketing.
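For readers curious how a browsing-path analysis like this might look, here is a minimal Python sketch that counts identical page sequences across sessions. The session data and page names are hypothetical, and this is not necessarily how the DataDive team processed the Google Analytics export.

```python
from collections import Counter

def top_paths(sessions, n=3):
    """Count identical page sequences across sessions and return the n most common."""
    counts = Counter(tuple(pages) for pages in sessions)
    return counts.most_common(n)

# Hypothetical sessions, each an ordered list of pages visited.
sessions = [
    ["/", "/volunteer"],
    ["/", "/volunteer"],
    ["/", "/donate"],
    ["/", "/causes", "/volunteer"],
    ["/", "/volunteer"],
]
common = top_paths(sessions, n=1)
```

Here the direct homepage-to-volunteer path appears in three of the five sessions, which is the kind of dominant journey an onboarding flow could build on.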

Finally, we have also made recommendations on future Google Analytics data to extract for analysis.

Analyzing Campaign Effectiveness

NVPC had also wanted to look into factors associated with campaign successes so we reviewed all the data available on the various platforms. We tried to focus our efforts on three areas – organization reputation, campaign descriptions, and individual users’ browsing history from Google Analytics. 

We found interesting relationships between certain campaign variables and their eventual success. In campaigns where a personal story was shared, we found that stories related to hospices, elderly care and disabilities (topics 6 and 12 in the chart above) tend to receive larger donations.

In addition, campaigns with customized impact messages tend to draw a higher median donation. This may be because custom messages give donors a more specific and transparent view of how exactly their donation will be used.
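A comparison of median donations can be sketched in a few lines of Python. The donation amounts and the boolean "custom message" flag below are invented for illustration and do not reflect NVPC's data.

```python
from statistics import median

def median_by_group(donations):
    """Group (has_custom_message, amount) pairs and return the median amount per group."""
    groups = {}
    for has_custom_message, amount in donations:
        groups.setdefault(has_custom_message, []).append(amount)
    return {flag: median(amounts) for flag, amounts in groups.items()}

# Made-up donations: (campaign has a custom impact message?, amount)
donations = [
    (True, 50), (True, 80), (True, 120),
    (False, 20), (False, 50), (False, 60),
]
medians = median_by_group(donations)
```

The median is used rather than the mean so that a few very large gifts do not dominate the comparison.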

We also looked at the effect of the campaign duration, but found some inconsistencies in the available data. An idea for future work with NVPC could be to build models that could predict the success of a campaign prior to its launch so campaign creators could build and enhance it to achieve their desired funding goal.


It’s A Wrap! Get Involved with DataKind Singapore

A huge thanks to everyone who participated in the DataDive and helped NVPC better engage potential donors and volunteers. If you’re local, we’d love to see you at the next DataDive or Meetup. Sign up to get involved!

Source: DataKind – DataKind Singapore’s DataDive on Philanthropic Giving and Volunteerism

DataKind Bangalore’s Third DataDive: Charting Success in Social Good

DataKind Bangalore’s third DataDive was indeed special in many ways. First, it brought together four mission-driven nonprofits with a diverse set of interests—from human rights to healthcare to education. Second, data visualization was the predominant theme of the event. And, most importantly, this was the biggest DataDive ever in Bangalore with more than 120 participants. Read on to learn more about the most anticipated event of the year at DataKind Bangalore.

Advancing Maternal Health with Antara Foundation

A prototype of the map of health centers in the block of Kishanganj, Rajasthan, grouped by the health indicators of pregnant women and infants.

Started in 2013, Antara Foundation promotes maternal and infant health. They built a network of healthcare workers in the state of Rajasthan to deliver timely medical assistance to new and expectant mothers in rural corners. Powered by three groups of frontline workers, Antara accelerates the reach of maternal care in several districts.

To better utilize all the data collected by its frontline workers and track changes in periodically updated data, Antara partnered with DataKind Bangalore. The goal of this collaboration was to create meaningful dashboards to group rural health centers by the health indicators of pregnant women and infants, analyze gaps in health centers, identify locations that need immediate medical attention, and track changes in key indicators.

Team Antara: The eight-member team sliced, diced, and trimmed the data set for use with interactive dashboards and maps.

Our team of volunteers started with a sample data set and data dictionary. At the DataDive, we were able to identify data discrepancies, trim the data set, and define the boundary of each village. We are now on our way to building the first prototype of dashboards based on Node.js and GeoJSON, and we look forward to continuing work with Antara to help them predict and prepare for high-risk pregnancies and improve immunization for children.

“I got an outsider’s perspective on how all the data can be used.”

Harsh Vardhan Sahni, Antara Foundation


Helping CHRI Monitor the Spread of Legal Aid

A prototype of the interactive map that will help CHRI track the spread of legal aid in India

Since 1987, Commonwealth Human Rights Initiative (CHRI) has been championing the cause of right to information and justice. Headquartered in New Delhi, India, this nonprofit sought DataKind Bangalore’s help to create an enhanced, data-driven reporting tool to monitor the availability of legal aid across Indian states.

One of CHRI’s prime objectives is to ensure free access to legal aid for all Indian citizens, as mandated by the state. DataKind Bangalore’s expertise in data visualization helped in building a prototype of an interactive map with CHRI’s data that displays the availability of different types of legal resources in each state—with an option to perform comparative analysis.

Team CHRI—a group of expert mentors and passionate newbies—is set to mark another milestone in DataKind Bangalore’s growing expertise in data visualization. 

DataKind Bangalore’s volunteers are excited about all the progress made at the DataDive, including a thematic regrouping of the data and wireframes of the interactive map, and look forward to continued collaboration with CHRI.

“I have understood the whole process of transforming raw data into a visual map.”
— Raja Bagga, CHRI



Building a Scorecard for Assessing the Quality of Education

Prototype of dashboards showing performance of schools in different districts.

Karnataka Learning Partnership (KLP) is dedicated to developing a public platform for all the stakeholders involved in the cause of providing access to education in the state of Karnataka. A long-time partner of DataKind Bangalore, KLP wanted to assess the quality of public primary education by analyzing the data collected from three different sources—feedback from the community, result of contests held at rural towns, and district-level data available on the DISE portal.

Our volunteers took on the challenge of exploring the data and determining how each school, village, and district can be compared. At the DataDive, the initial phase of data exploration started with the help of Superset—an intuitive data exploration and visualization tool.

 Team KLP members are all smiles after a productive weekend of data exploration.

Our volunteers are now busy building comprehensible dashboards that will give KLP the complete flexibility to compare performance indicators across multiple schools in Karnataka.

“This year’s DataDive presented a great opportunity to learn Python and Superset.”
—  Arpita Panda, DataKind volunteer for KLP


Helping Pollinate Energy Identify Urban Poor Communities

Sample map of Bangalore highlighting urban poor communities

Pollinate Energy is passionate about improving the quality of life in urban poor communities with the help of innovative, affordable products like Sunking Home (an environmentally friendly solar-powered lamp) and Envirofit Cookstove (an efficient cooking appliance that runs on clean fuels).

Pollinate Energy partnered with DataKind Bangalore to address the challenge of detecting various urban poor communities in Bangalore via satellite images and other data proxies.

Pollinate Energy was using Google Maps to identify urban poor settlements, but this was a manual, time-consuming process that was challenging due to duplicate data and the lack of a validation mechanism.

At the DataDive, our volunteers identified two different approaches to addressing the challenges:

  • Scanning satellite images for markers that indicate the presence of urban poor communities
  • Using geocoding to perform an exploratory data analysis

Geocoding and transfer learning with a bit of computer vision: Team Pollinate shakes it all up.

With satellite images, the team used computer vision techniques to convert the images to the HSV color space, and then applied thresholds on the resulting images to highlight the areas that contain specific markers of urban poor communities.
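The HSV thresholding idea can be sketched with Python's standard library alone. This is not the team's code: the tiny 2x2 "image" is made up, and the blue hue band and the saturation and brightness cutoffs are arbitrary placeholders rather than the markers the team actually looked for. A real pipeline would more likely use an image library.

```python
import colorsys

def hsv_mask(pixels, h_range, s_min, v_min):
    """Return a boolean mask marking pixels whose HSV values fall inside the thresholds.

    `pixels` is a 2D list of (r, g, b) tuples with channels in 0-255.
    `h_range` is (low, high) on colorsys's 0-1 hue scale.
    """
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            inside = h_range[0] <= h <= h_range[1] and s >= s_min and v >= v_min
            mask_row.append(inside)
        mask.append(mask_row)
    return mask

# Made-up 2x2 image: two bluish pixels, one grey, one dark green.
image = [
    [(0, 0, 255), (200, 200, 200)],
    [(30, 60, 220), (20, 120, 40)],
]
mask = hsv_mask(image, h_range=(0.55, 0.75), s_min=0.5, v_min=0.5)
```

Working in HSV lets hue be thresholded separately from brightness, which tends to be more robust to lighting differences than thresholding raw RGB values.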

For the exploratory data analysis, the team used the undersampling technique to weed out affluent neighborhoods. The findings from this analysis will help Pollinate Energy expand its operation in Bangalore. 
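The post does not say which undersampling variant the team used. As one common interpretation, the sketch below shows random undersampling in Python: rows from the larger class are dropped at random until every class is the same size. The labels and row counts are hypothetical.

```python
import random

def undersample(rows, label_key, seed=0):
    """Randomly drop rows from larger classes until every class matches the smallest one."""
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced

# Hypothetical data: 8 "affluent" rows and 2 "urban_poor" rows.
rows = [{"id": i, "label": "affluent"} for i in range(8)]
rows += [{"id": i, "label": "urban_poor"} for i in range(8, 10)]
balanced = undersample(rows, "label")
```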


“It was amazing to explore state-of-the-art architectures in deep learning at this DataDive.”
—  Ashwin Vasan, DataKind volunteer for Pollinate Energy


Thank You

This year’s DataDive was made possible by our enthusiastic volunteers, mission-driven nonprofit partners, and ThoughtWorks—our venue sponsor. Here is a note of thanks to all of you!

Join us

We would love to see you at our events! If you’re local, join our Meetup to get involved and follow us on Facebook and Twitter for updates and announcements.

Source: DataKind – DataKind Bangalore’s Third DataDive: Charting Success in Social Good

October Webinar: Fighting Corruption in the Extractives Industry

 Photo credit: JB Bodane

What if we could use satellite imagery to automatically detect illegal mining from afar or use anomaly detection to flag contracts that differ from standard ones and may include potentially unfair or questionable language? Until now, the dominant use of data in the extractives industry has been reserved for descriptive and summary analytics, but new opportunities to leverage more advanced techniques like machine learning and artificial intelligence are just emerging thanks to the groundwork laid by Omidyar Network investees and others.

During our quarterly Learning Out Loud webinars, we’ll share insights uncovered on projects currently underway with Omidyar Network investees to leverage data science in the fight against corruption in the extractives industry.

Learning Out Loud Webinar #1

Get caught up on our first webinar above, then join us for our second in October where we’ll be sharing our learnings exploring datasets and prototyping data science solutions aimed at improving transparency in the extractives industry.

Tuesday, October 24, 2017
8am Pacific / 11am Eastern
Register >

All are welcome – please share with anyone you think may be interested!

Source: DataKind – October Webinar: Fighting Corruption in the Extractives Industry

Leveraging Tableau in DataKind Projects

Data visualization is a key step in many of our projects. As our founder and executive director Jake Porway says in this SSIR piece on data storytelling, “the true power of data comes from conveying the ‘so what’ behind the numbers, inspiring people to probe new questions, and using it for rigorous statistical inquiry.”

In addition to its general support of DataKind, the Tableau Foundation also generously grants DataKind a certain number of licenses to use Tableau software on our projects, and our volunteers around the world have used it for everything from helping organizations reach more children in need to understanding homelessness and even fighting crime.

Delving into Child Poverty Data

DataKind UK created an interactive visualization of child poverty as pictured above for the North East Child Poverty Commission in the North East UK. The Commission used this visualization for education, advocacy, and to inform policy as they work to improve the lives of poor children. In an article highlighting Tableau’s role in the project, Dr. Deborah Harrison, Coordinator of the North East Child Poverty Commission, explained,

“DataKind and Tableau have taken a complex piece of data and turned it into something more user-friendly and meaningful. The solution makes it easier to see child poverty ‘hotspots’, for example where child poverty levels are particularly high or where they have increased over time. Our goal is for local authorities to use this tool to enhance their existing knowledge of local child poverty levels, helping them to target their responses quickly and accurately.”

Improving Access to Education by Supporting Tutors

Using Tableau, DataKind UK mapped the engagement and milestones of The Access Project’s volunteer tutors to help the organization understand what makes for a successful pairing between tutor and student. The Access Project wants to adapt their processes to improve the volunteer experience and more effectively help motivated state school students access top universities.

Finding Children In Need

Shooting Star Chase, a UK-based children’s hospice, wanted to understand which geographic areas were most in need of their children’s hospice services. DataKind UK used Tableau to map public data to locate children suffering from life-limiting diseases, streamlining referral paths and ultimately saving up to £90,000 for children’s hospices around the country.

Understanding Patterns in Homelessness

DataKind San Francisco joined forces with the Community Technology Alliance to explore data from the Monterey County homelessness assistance program. With visualizations built in Tableau, DataKind San Francisco explored the factors that correlate with a homeless family’s successful transition from supported to stable housing. These insights will help Monterey County better allocate resources to support more families in overcoming homelessness.

Mapping Data to Fight Crime

DataKind Bangalore used Tableau to generate heat maps showing crime locations and plot trends from Bangalore Police Department data. Insights from this exploratory analysis, such as the finding that most crime hotspots in the city have empty space adjoining them, could guide future deployment of policemen and thus reduce crime in the city.

Share Your Examples

What great data visualizations have you seen being leveraged for social good? Share your links in the comments. If you’re in the New York area, learn more and do good by attending the upcoming Tableau software user group meeting on September 19th that will generate donations for the Community Foodbank of New Jersey, or find a DataKind meetup near you!

Source: DataKind – Leveraging Tableau in DataKind Projects

Happy Birthday, Chapters!

Today’s solar eclipse isn’t the only bit of global news going on – it’s also the day five of our six Chapters turn three! We are just as awestruck at all they’ve achieved. In August 2014, we launched five new Chapters in DC, San Francisco, Bangalore, Singapore and Dublin to join DataKind UK that had launched the year prior. Since then, they have recruited thousands of community members and supported organizations large and small, furthering DataKind’s mission around the world.  

Let’s take a quick look back:

Year One – Around The World in Six DataDives (Aug 2014 – Aug 2015)

Within six months of launching, all six Chapters hit the ground running by each hosting a DataDive in their local community. From DC to Dublin, from Leeds to Bangalore, from San Francisco to Singapore, hundreds of you around the world lent your expertise in six DataDives worldwide to fight global warming, stamp out corruption in Nigeria, address inequalities in Leeds and San Francisco, treat malnourished children in South Sudan, support civic engagement in Bangalore and much, much more.

Year Two – Building & Sharing Practice (Aug 2015 – Aug 2016)

After building such strong communities, Chapters began taking on new challenges. DataKind Bangalore went beyond weekend DataDives to take on long-term DataCorps projects, DataKind DC began a long-term partnership with American Red Cross to predict and prevent home fires across the U.S. and DataKind Singapore shared its work onstage at the first ever Strata + Hadoop World Singapore.

Year Three – Innovation and Recognition (Aug 2016-Present)

In their third year, Chapters show no signs of slowing. DataKind UK has been developing a framework to help nonprofits understand their organization’s level of data maturity so they can do even more. DataKind San Francisco, with more than 2,000 Meetup members, engages Core Volunteers to execute its DataDives. True to their trailblazing ways, DataKind Dublin’s most recent DataDive caught the eye of the newly appointed Taoiseach (the Irish Prime Minister) Leo Varadkar and Minister for Data Protection Dara Murphy, both of whom attended to learn more about the potential for data science to be used for good.

In short, our Chapters are amazing and we can’t wait to see what year four will bring (or year five for DataKind UK!).  And it’s all thanks to the amazing people who take the time to make it happen by lending their best talents to their communities.  Chapter Leaders, this blog’s for all of you! Thank you for donating your time to building DataKind’s global community and expanding our work around the world.


Get Involved

 DataKind staff and Chapter Leaders at the 2016 Chapter Summit, testing out fancy eyewear ahead of the eclipse.

If you’re located in a DataKind Chapter city, be sure to sign up for your local Meetup group to get involved!

Source: DataKind – Happy Birthday, Chapters!

#GivingTuesday DataDive Findings Report

If philanthropic giving helps fuel social change, how can we increase it? This was one of many questions we dove into alongside the 92nd Street Y, Bill & Melinda Gates Foundation and over 100 data science volunteers at a New York DataDive held at Facebook this past March.

Check out the findings in our recently released report >

Launched in 2012 as a reaction to Black Friday – a day in the U.S. of intense shopping and spending after Thanksgiving – #GivingTuesday is a global movement that reaches millions of people in nearly 100 countries worldwide. While #GivingTuesday’s reach has grown significantly over the past five years, philanthropic giving in the U.S. still has not risen relative to GDP. If the philanthropic community could increase it by even 1%, the impact would be massive.

Very fitting for a weekend focused on philanthropic giving, data philanthropy – the donation of private datasets – fueled the analyses. Four volunteer teams dug into datasets brought together for the first time from an unprecedented data collaboration of 36 corporate and nonprofit data providers.

The findings outlined in this report show the impact that machine learning and predictive technology can have in boosting the nonprofit sector. Learn more in this blog from the 92nd Street Y and download the report for the full story.

Join Us In Seattle for Our Next DataDive

But wait, there’s more! We’re thrilled to be hosting another DataDive to delve further into #GivingTuesday and philanthropic giving. This time we’re heading west to Seattle, August 4-6. We are looking for data pros of all backgrounds to roll up their sleeves and work side by side with experts from #GivingTuesday, USA for UNHCR and Bill & Melinda Gates Foundation to explore these questions and find ways to further fuel social change.

Join us at the next DataDive >

Source: DataKind – #GivingTuesday DataDive Findings Report

How to Become a Data Driven Charity

By Emma Prest of DataKind UK and Lauren Bernard of NCVO
This post originally appeared on the KnowHow Nonprofit blog.

Most of us agree that data is important in any organisation. We need to collect and analyse data to estimate the demand for our services, understand who our users are, find out which services are working for which people, and much more. In fact, there are few areas of work where the smarter use of data doesn’t make us more effective. And the future holds out even more opportunity – the ability to predict need and effectiveness, and to use data to design and innovate new services.  But first you need to know where you stand at the moment, and how to move forward in a way that suits your organisation and its existing capabilities.
Research into data maturity shows that there are seven key areas that you need to consider when improving your organisation’s use of data. And perhaps unsurprisingly, the most important are people and culture! This guide looks at those key areas, and suggests the signs that you should look out for to determine if you are on the right track.

1-Get buy-in from leadership

The attitude of leadership is one of the most essential ingredients to becoming data driven. If your leadership team sees data as a vital resource and is able to incorporate past, present and forward-looking data into business planning and decision making, then you’re on the right path. If the leadership team tends to make decisions based on individual experience, anecdote or gut feeling, then there’s work to do!

A willingness to invest in increasing the organisation’s capacity to work with data is a first step. In order to do this, a charity will need a broad range of people with data expertise and understanding, from admin roles through to board level.

2-Nurture an evidence based culture

Becoming more data driven is ultimately about changing a culture and inspiring your colleagues to be interested in data. In our research some respondents saw data as the responsibility of ‘someone else’, whereas in more data savvy organisations data was seen as a team effort and a critical asset for every part of the organisation.

An organisation that is hungry for feedback and strives for continuous improvement is more likely to embrace data. A subtle pointer is the way that questions about data are asked. If staff ask (positive!) questions that challenge practices and preconceived notions, as opposed to just looking for data to support and confirm existing beliefs, then you are culturally well placed to become more data driven.

Being able to share data and results across the organisation is another essential element. But be aware that both internal and external data sharing requires strong data protection and security practices. There must be regular training, and trustees and senior management should be aware of current legislation and best practice.

3-Build skills in-house

To collect and manipulate data you need the right skills in house (or at the other end of an email). Not only do you need staff who can conduct the right analysis, but your colleagues need a certain level of data literacy to understand the results produced.

In organisations that are more data mature there is often a dedicated person or team in charge of data, with skilled data people across other teams and departments. With the right in-house data chops you become the experts in your sector that other organisations turn to and use as a resource, building your credibility and influence.

4-Invest in tools

Ensuring you have the right tools is hard. Software requires updating; technology changes and better products come on the market; migrating databases and training staff in a new tool is always a headache. And yet, analytical infrastructure is a priority if you want to do more with data. Ongoing investment in tools, systems and infrastructure is key.

Charities that regularly and easily join up different data sets or store data in a single, accessible database are ahead of the pack.

Some organisations even make data accessible to all staff, enabling them to explore the data themselves (this also means they no longer rely on the data guy or gal to run reports for them). Dashboards are a common way of democratising data.

5-Get to know the data you hold

The starting point here is to know what data you hold across the charity. Once you’ve done a data audit and have a data inventory, it is worth reviewing how meaningful, relevant and useful that data is. Do you really need all of it? What are you missing?

Understanding the quality of your data sets and what kind of analysis can (and cannot) be done comes next. Charities that are more data mature monitor their data to check it is complete, accurate and valid. They have tools and systems for cleaning and maintaining it. They are able to join the data up to conduct analysis across teams. Their staff and volunteers are trained in data collection and understand why it matters. Where possible, data collection is automated.
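As a rough illustration of what monitoring data for completeness and validity can look like in practice, here is a minimal Python sketch. The records, field names and the plausible age range are all hypothetical placeholders, not taken from any real charity's data; a real check would be tailored to the data you actually hold.

```python
from datetime import date

# Hypothetical records; field names and values are placeholders for illustration only.
records = [
    {"id": 1, "postcode": "AB1 2CD", "signup": date(2016, 5, 1), "age": 34},
    {"id": 2, "postcode": "", "signup": date(2016, 6, 12), "age": 51},
    {"id": 3, "postcode": "EF3 4GH", "signup": None, "age": 230},
]

REQUIRED_FIELDS = ("postcode", "signup", "age")


def completeness(rows, fields=REQUIRED_FIELDS):
    """Return the share of rows in which each field is filled in."""
    total = len(rows)
    return {
        field: sum(1 for row in rows if row.get(field) not in (None, "")) / total
        for field in fields
    }


def invalid_ages(rows, low=0, high=120):
    """Return the ids of rows whose age falls outside an assumed plausible range."""
    return [
        row["id"]
        for row in rows
        if row.get("age") is not None and not low <= row["age"] <= high
    ]


report = completeness(records)   # share of filled-in values per field
flagged = invalid_ages(records)  # ids of rows that need a second look
```

Running a check like this on a schedule, and tracking the numbers over time, is one simple way to see whether data quality is improving.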

Charities increasingly compare their data with other organisations’ data to benchmark their performance and they look to open data to enrich their internal data sources.

6-Be clear on what you want your data to achieve

Common uses of data in the voluntary sector include measuring outcomes and impact; monitoring the success of campaigns; reporting on staff and volunteer performance; demonstrating the need for your work; making the case to funders for new services, products or campaigns; and running financial models and analysing donor retention. Data analysis is also often part of influencing policy makers, and developing robust evidence to build credibility and influence.

Less commonly, charities run analyses to understand how to make services more efficient; to differentiate between approaches (what’s working and what’s not); to test assumptions and understand client behaviours; to analyse user groups to better understand their needs; and to target and optimise services, products or campaigns to suit those needs. As a sector we are moving towards a world where charities can predict needs, behaviour and outcomes, maximise income and provide more targeted solutions.

7-Think about how best to conduct your data analysis

Charities tend to run quarterly reports, which often involve trend analysis of activities and finances. But increasingly charities are not just looking backwards. They are forecasting and predicting to plan for the future.

Some charities use advanced analytics such as clustering, root cause analysis, A/B testing, network analysis or text analytics. Data is brought together in automated ways to provide organisation wide analyses.

Moreover, these charities don’t just run analyses every few months. They are able to do it in near real time. They also think about how best to communicate findings to different audiences, whether through simple reports or whizzy data visualisations.

Further information

Like acquiring any new skill, using data better involves phases of progression, starting with the building blocks and moving up to more advanced stages. If you would like to take your organisation on a data journey check out the Data Maturity Model produced by DataKind UK and Data Orchard showing the seven key themes for being data driven, across five stages of maturity.

Source: DataKind – How to Become a Data Driven Charity

A North Carolina DataDive with SAS

They say a photo is worth a thousand words and this spectacular one taken on the SAS campus following our first ever North Carolina DataDive in April 2017 captures the energy and enthusiasm of the over 100 local data scientists, computer engineers and technologists that donated their time and talent that weekend. In partnership with our participating nonprofit partners and with generous support from SAS, they analyzed massive datasets, built statistical models, and created visualizations and algorithms to help each organization advance their work in a wide range of issues from community development to job readiness to fighting illegal tobacco sales to combatting hate crimes.

Anti-Defamation League
Uncovering hate crime trends to inform public policy

A screenshot of Anti-Defamation League’s work-in-progress tool showing hate crimes by group at a national level.

You know how powerful findings from a DataDive can be when the event gets highlighted in testimony before the Senate Judiciary Committee only a few days later.

Founded in 1913, the Anti-Defamation League (ADL) is a premier civil rights/human relations organization that, in its own words, seeks to “ensure justice and fair treatment for all.” The organization is constructing a map of hate crimes throughout the United States based on hate crime data from the FBI in order to surface trends and provide analysis to use as a tool to inform policy makers and the public.

Using FBI Hate Crime data on reported hate crimes from 2004-2015, Data Ambassadors Chris Hemedinger and Lucia Gjeltema led a team to construct additional features for the map. They also expanded the tool by complementing the FBI’s hate crime data with incident news articles, state-level data on hate crime laws and policy as well as other demographic and contextual data from open sources.

The team uncovered a number of interesting findings. For example, when a state has a hate crime data collection statute, hate crimes are reported more frequently. It also appears crimes follow cyclical trends by days of the week and season, with most crimes being committed on Fridays and Saturdays and in the spring and summer. ADL’s CEO Jonathan Greenblatt highlighted the work that led to these findings in his testimony at the May 2017 Senate Judiciary Committee hearing, “Responses to the Increase in Religious Hate Crimes.”
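For readers curious how a day-of-week or seasonal pattern like this might be tallied, here is a minimal Python sketch. It is not the team's actual analysis: the dates below are placeholders rather than real incident records, and the season mapping assumes Northern Hemisphere meteorological seasons.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Placeholder dates for illustration only; these are not real incident records.
incident_dates = [
    date(2015, 5, 1),   # a Friday
    date(2015, 5, 2),   # a Saturday
    date(2015, 5, 8),   # a Friday
    date(2015, 11, 3),  # a Tuesday
]


def season(d):
    """Meteorological season, Northern Hemisphere (an assumed convention)."""
    return ["winter", "spring", "summer", "autumn"][(d.month % 12) // 3]


# date.weekday() returns 0 for Monday through 6 for Sunday.
by_weekday = Counter(WEEKDAYS[d.weekday()] for d in incident_dates)
by_season = Counter(season(d) for d in incident_dates)

busiest_day = by_weekday.most_common(1)[0][0]
```

With real data, counts like these would normally be compared across many years before reading anything into the pattern.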

Counter Tools
Understanding relationships between tobacco and health using spatial modeling

A screenshot of identified “triple offender” retail outlets illegally selling tobacco to minors in Indiana with locations plotted on Google Earth.

Research has shown that tobacco use remains the leading cause of preventable death in the United States. Counter Tools is a five-year-old nonprofit startup out of the University of North Carolina Gillings School of Public Health with a mission to advance place-based public health, that is, improving public health at a local level based on the needs of each specific community. A primary focus of the organization is combatting the negative health impacts of tobacco marketing in brick and mortar retail stores. To do so, Counter Tools equips everyday people with tools and instructions to act as “citizen scientists” who collect data from retail stores in their communities.

Led by Data Ambassadors Louis Potok and Brian Spiering, the team dug into Counter Tools’ growing database of more than 40,000 store assessments, conducted by citizen scientists, that provides information on in-store tobacco product availability, pricing, placement, promotion and advertising.

They developed a tool to identify retail outlets that are (1) in violation for selling tobacco to minors, (2) in violation of Assurance of Voluntary Compliance (AVC) agreements signed by their corporate parent, and (3) located within 1,000 feet of a school. The team also tied individual retail outlets to their corporate parents to better identify which parent companies are frequently in violation.
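The three criteria lend themselves to a simple filter. The Python sketch below is only an illustration of the idea, not the volunteers' tool: the store records, field names and coordinates are hypothetical, and the distance is a straight-line (haversine) estimate rather than whatever method the team used.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_FEET = 20_902_231  # mean Earth radius (about 6,371 km) in feet


def distance_feet(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in feet."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_FEET * asin(sqrt(a))


def is_triple_offender(store, schools, max_feet=1000):
    """True if the store meets all three criteria described above."""
    near_school = any(
        distance_feet(store["lat"], store["lon"], lat, lon) <= max_feet
        for lat, lon in schools
    )
    return bool(store["sold_to_minor"] and store["avc_violation"] and near_school)


# Hypothetical data: one school and three stores (names and coordinates made up).
schools = [(39.7700, -86.1500)]
stores = [
    {"name": "A", "lat": 39.7710, "lon": -86.1500,
     "sold_to_minor": True, "avc_violation": True},   # roughly 365 feet away
    {"name": "B", "lat": 39.7800, "lon": -86.1500,
     "sold_to_minor": True, "avc_violation": True},   # well over 1,000 feet away
    {"name": "C", "lat": 39.7705, "lon": -86.1500,
     "sold_to_minor": True, "avc_violation": False},  # close, but no AVC violation
]

triple_offenders = [s["name"] for s in stores if is_triple_offender(s, schools)]
```

Grouping the flagged stores by a corporate-parent field would then show which parent companies appear most often.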

These and other tools developed by the volunteer team will help Counter Tools in its efforts to understand the relationship between tobacco availability or exposure to tobacco marketing and negative health outcomes so it can take action to keep communities healthy.

Habitat for Humanity International
Using affiliate data to understand network impact

The red shaded region represents a group of often overlooked albeit successful Habitat for Humanity International affiliates.

Data Ambassadors Mustafa Kabul, Jordan Meyer and Julia Kuznetsova led a team to help Habitat for Humanity International better understand the impact its affiliate network of over 1,300 nonprofits is having on local communities. While Habitat for Humanity International is best known for its construction projects, its network also completes 12,000 community projects annually. These focus on everything from building gardens to financial literacy programs and are typically undertaken in partnership with community groups, police stations, or local faith groups. Habitat for Humanity wanted to better understand the impact these interventions have on communities and develop a data-driven framework to understand the role and impact of its affiliate network as a whole.

Using an internal database of U.S. affiliates’ construction and non-construction interventions, the team was able to help Habitat for Humanity see a new way of understanding affiliate success. They identified groups of very successful non-construction based affiliates that are often overlooked.

The team recommended Habitat for Humanity consider non-construction programs in the measure of success, especially for locations such as Alaska where construction costs are higher and weather makes construction projects more difficult. All this plus the team’s analysis of donors and social media will help inform Habitat for Humanity’s business planning and resource allocation decisions going forward.

StepUp Durham
Optimizing the delivery of employment training and life skills workshops

The top chart displays the most important features predictive of job readiness, while the bottom chart shows the associated correlation coefficients used by the predictive model.

StepUp Durham’s mission is to help adults and children transform their lives through access to employment and life skills training, and the organization aims to be the premier resource in Durham County for vulnerable communities seeking to develop stable careers. They wanted to understand what factors influence participants to register, attend a workshop, complete a workshop, find a job, and continue engagement through participation in their Life Skills program.

Using over seven years of participant and donor data, a team led by Data Ambassadors Natalia Summerville, Jinxin Yi, and Zeydy Ortiz was able to identify several important features that help to predict the job readiness of program participants. As shown in the chart above, these included things like a participant’s criminal record, access to transportation, history of substance abuse, the class sizes of workshops attended, history of homelessness, financial and food assistance history, and age. For example, the team found that having access to transportation and participating in small class sizes led to better outcomes. Interestingly, having a prior criminal record was correlated with more successful outcomes. The team speculated but did not prove that those with a criminal past may come to StepUp more motivated to succeed when compared to the general group.

The team created a model with 85% accuracy to predict positive outcomes from the various programs and isolate variables for success so that all individuals seeking full-time work can obtain the skills needed to attain gainful employment.
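For context on what an accuracy figure means, the short Python sketch below computes accuracy as the share of predictions that match the actual outcome, alongside a simple majority-class baseline for comparison. The labels are made-up placeholders and have nothing to do with StepUp Durham's data or the team's model.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Share of predictions that match the actual outcome."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must be the same length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def majority_baseline(actual):
    """Accuracy obtained by always predicting the most common outcome."""
    most_common_count = Counter(actual).most_common(1)[0][1]
    return most_common_count / len(actual)


# Placeholder labels for illustration: 1 = positive outcome, 0 = otherwise.
actual = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
predicted = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

model_accuracy = accuracy(actual, predicted)
baseline = majority_baseline(actual)
```

Comparing a model's accuracy with a baseline like this helps show how much the model adds beyond simply guessing the most common outcome.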

Thank You SAS and Project Partners!

As mentioned, this was our first DataDive in North Carolina and we are thankful to SAS for making it possible. Not only did their campus provide the perfect location for amazing photos like the one above, but the DataDive itself was largely staffed by SAS volunteers that donated their time and talent to help these organizations. We also want to give a special shout out to our four fantastic project partners for their time and dedication to making this event such a success. We’re pleased to be continuing work with some of these organizations so stay tuned for more updates to come!

Source: DataKind – A North Carolina DataDive with SAS