DataKind Bangalore’s Third DataDive: Charting Success in Social Good

DataKind Bangalore’s third DataDive was indeed special in many ways. First, it brought together four mission-driven nonprofits with a diverse set of interests—from human rights to healthcare to education. Second, data visualization was the predominant theme of the event. And, most importantly, this was the biggest DataDive ever in Bangalore with more than 120 participants. Read on to learn more about the most anticipated event of the year at DataKind Bangalore.

Advancing Maternal Health with Antara Foundation

A prototype of the map of health centers in the block of Kishanganj, Rajasthan, grouped by the health indicators of pregnant women and infants.

Started in 2013, Antara Foundation promotes maternal and infant health. They built a network of healthcare workers in the state of Rajasthan to deliver timely medical assistance to new and expectant mothers in rural corners. Powered by three groups of frontline workers, Antara accelerates the reach of maternal care in several districts.

To better utilize all the data collected by its frontline workers and track changes in periodically updated data, Antara partnered with DataKind Bangalore. The goal of this collaboration was to create meaningful dashboards to group rural health centers by the health indicators of pregnant women and infants, analyze gaps in health centers, identify locations that need immediate medical attention, and track changes in key indicators.

Team Antara: The eight-member team sliced, diced, and trimmed the data set for use with interactive dashboards and maps.

Our team of volunteers started with a sample data set and data dictionary. At the DataDive, we were able to identify data discrepancies, trim the data set, and define the boundary of each village. We are now on our way to building the first prototype of dashboards based on Node.js and GeoJSON, and we look forward to continuing work with Antara to help them predict and prepare for high-risk pregnancies and improve immunization for children.

“I got an outsider’s perspective on how all the data can be used.”

Harsh Vardhan Sahni, Antara Foundation


Helping CHRI Monitor the Spread of Legal Aid

A prototype of the interactive map that will help CHRI track the spread of legal aid in India

Since 1987, Commonwealth Human Rights Initiative (CHRI) has been championing the cause of right to information and justice. Headquartered in New Delhi, India, this nonprofit sought DataKind Bangalore’s help to create an enhanced, data-driven reporting tool to monitor the availability of legal aid across Indian states.

One of CHRI’s prime objectives is to ensure free access to legal aid for all Indian citizens, as mandated by the state. DataKind Bangalore’s expertise in data visualization helped in building a prototype of an interactive map with CHRI’s data that displays the availability of different types of legal resources in each state—with an option to perform comparative analysis.

Team CHRI—a group of expert mentors and passionate newbies—is set to mark another milestone in DataKind Bangalore’s growing expertise in data visualization. 

DataKind Bangalore’s volunteers are excited about all the progress made at the DataDive, including a thematic regrouping of the data and wireframes of the interactive map, and look forward to continued collaboration with CHRI.

“I have understood the whole process of transforming raw data into a visual map.”
— Raja Bagga, CHRI



Building a Scorecard for Assessing the Quality of Education

Prototype of dashboards showing performance of schools in different districts.

Karnataka Learning Partnership (KLP) is dedicated to developing a public platform for all the stakeholders involved in the cause of providing access to education in the state of Karnataka. A long-time partner of DataKind Bangalore, KLP wanted to assess the quality of public primary education by analyzing the data collected from three different sources: feedback from the community, results of contests held in rural towns, and district-level data available on the DISE portal.

Our volunteers took on the challenge of exploring the data and determining how each school, village, and district can be compared. At the DataDive, the initial phase of data exploration started with the help of Superset—an intuitive data exploration and visualization tool.

 Team KLP members are all smiles after a productive weekend of data exploration.

Our volunteers are now busy building comprehensible dashboards that will give KLP complete flexibility to compare performance indicators across multiple schools in Karnataka.

“This year’s DataDive presented a great opportunity to learn Python and Superset.”
—  Arpita Panda, DataKind volunteer for KLP


Helping Pollinate Energy Identify Urban Poor Communities

Sample map of Bangalore highlighting urban poor communities

Pollinate Energy is passionate about improving the quality of life in urban poor communities with the help of innovative, affordable products like Sunking Home (an environmentally friendly solar-powered lamp) and Envirofit Cookstove (an efficient cooking appliance that runs on clean fuels).

Pollinate Energy partnered with DataKind Bangalore to address the challenge of detecting various urban poor communities in Bangalore via satellite images and other data proxies.

Pollinate Energy was using Google Maps to identify urban poor settlements, but this was a manual, time-consuming process made harder by duplicate data and the lack of a validation mechanism.

At the DataDive, our volunteers identified two different approaches to addressing the challenges:

  • Scanning satellite images for markers that indicate the presence of urban poor communities
  • Using geocoding to perform an exploratory data analysis

Geocoding and transfer learning with a bit of computer vision: Team Pollinate shakes it all up.

With satellite images, the team used computer vision techniques to convert the images to the HSV (hue, saturation, value) color space, and then applied thresholds to the resulting images to highlight the areas that contain specific markers of urban poor communities.
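
As a rough illustration of HSV thresholding, here is a minimal sketch using only the Python standard library. The threshold values, the tiny sample tile, and the function names are all hypothetical; the post does not say which markers or values the team actually used.

```python
import colorsys

# Hypothetical thresholds on a 0-1 scale (illustrative only, not the team's values).
HUE_RANGE = (0.55, 0.70)   # roughly the blue part of the hue wheel
MIN_SATURATION = 0.4       # ignore washed-out pixels
MIN_VALUE = 0.3            # ignore very dark pixels


def rgb_to_hsv_image(rgb_image):
    """Convert rows of (r, g, b) pixels in 0-255 to (h, s, v) floats in 0-1."""
    return [
        [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for (r, g, b) in row]
        for row in rgb_image
    ]


def threshold_mask(hsv_image, hue_range=HUE_RANGE,
                   min_s=MIN_SATURATION, min_v=MIN_VALUE):
    """Return a binary mask: 1 where a pixel falls inside all thresholds."""
    low, high = hue_range
    return [
        [1 if (low <= h <= high and s >= min_s and v >= min_v) else 0
         for (h, s, v) in row]
        for row in hsv_image
    ]


# A made-up 2x2 "satellite tile": blue, grey, green, blue.
tile = [
    [(30, 60, 200), (120, 120, 120)],
    [(40, 160, 60), (20, 80, 220)],
]
mask = threshold_mask(rgb_to_hsv_image(tile))
print(mask)
```

In practice a library such as OpenCV would handle full-size images far more efficiently; the nested lists here only keep the sketch self-contained.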

For the exploratory data analysis, the team used the undersampling technique to weed out affluent neighborhoods. The findings from this analysis will help Pollinate Energy expand its operation in Bangalore. 
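
The post does not describe how the undersampling was done. One common reading is random undersampling, where the over-represented class is randomly reduced until the classes are balanced. The sketch below assumes that reading; the record fields, labels, and sample data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def undersample(records, label_key, seed=0):
    """Randomly shrink every class to the size of the smallest class."""
    groups = defaultdict(list)
    for record in records:
        groups[record[label_key]].append(record)

    target = min(len(group) for group in groups.values())
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    balanced = []
    for label in sorted(groups):
        balanced.extend(rng.sample(groups[label], target))
    return balanced


# Made-up geocoded records: many "affluent" rows, few "candidate" rows.
records = (
    [{"area_id": i, "label": "affluent"} for i in range(8)]
    + [{"area_id": 100 + i, "label": "candidate"} for i in range(3)]
)
balanced = undersample(records, "label")
counts = Counter(record["label"] for record in balanced)
print(counts)
```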


“It was amazing to explore state-of-the-art architectures in deep learning at this DataDive.”
—  Ashwin Vasan, DataKind volunteer for Pollinate Energy


Thank You

This year’s DataDive was made possible by our enthusiastic volunteers, mission-driven nonprofit partners, and ThoughtWorks—our venue sponsor. Here is a note of thanks to all of you!

Join us

We would love to see you at our events! If you’re local, join our Meetup to get involved and follow us on Facebook and Twitter for updates and announcements.

Source: DataKind – DataKind Bangalore’s Third DataDive: Charting Success in Social Good

October Webinar: Fighting Corruption in the Extractives Industry

 Photo credit: JB Bodane

What if we could use satellite imagery to automatically detect illegal mining from afar or use anomaly detection to flag contracts that differ from standard ones and may include potentially unfair or questionable language? Until now, the dominant use of data in the extractives industry has been reserved for descriptive and summary analytics, but new opportunities to leverage more advanced techniques like machine learning and artificial intelligence are just emerging thanks to the groundwork laid by Omidyar Network investees and others.

During our quarterly Learning Out Loud webinars, we’ll share insights uncovered on projects currently underway with Omidyar Network investees to leverage data science in the fight against corruption in the extractives industry.

Learning Out Loud Webinar #1

Get caught up on our first webinar above, then join us for our second in October where we’ll be sharing our learnings exploring datasets and prototyping data science solutions aimed at improving transparency in the extractives industry.

Tuesday, October 24, 2017
8am Pacific / 11am Eastern
Register >

All are welcome – please share with anyone you think may be interested!

Source: DataKind – October Webinar: Fighting Corruption in the Extractives Industry

Leveraging Tableau in DataKind Projects

Data visualization is a key step in many of our projects. As our founder and executive director Jake Porway says in this SSIR piece on data storytelling, “the true power of data comes from conveying the ‘so what’ behind the numbers, inspiring people to probe new questions, and using it for rigorous statistical inquiry.”

In addition to its general support of DataKind, the Tableau Foundation also generously grants DataKind a number of licenses to use Tableau software on its projects. Our volunteers around the world have used it for everything from helping organizations reach more children in need to understanding homelessness and even fighting crime.

Delving into Child Poverty Data

DataKind UK created an interactive visualization of child poverty as pictured above for the North East Child Poverty Commission in the North East UK. The Commission used this visualization for education, advocacy, and to inform policy as they work to improve the lives of poor children. In an article highlighting Tableau’s role in the project, Dr. Deborah Harrison, Coordinator of the North East Child Poverty Commission, explained,

“DataKind and Tableau have taken a complex piece of data and turned it into something more user-friendly and meaningful. The solution makes it easier to see child poverty ‘hotspots’, for example where child poverty levels are particularly high or where they have increased over time. Our goal is for local authorities to use this tool to enhance their existing knowledge of local child poverty levels, helping them to target their responses quickly and accurately.”

Improving Access to Education by Supporting Tutors

Using Tableau, DataKind UK mapped the engagement and milestones of their volunteer tutors to help The Access Project understand what makes for a successful pairing between tutor and student. The Access Project wants to adapt their processes to improve the volunteer experience and more effectively help motivated state school students access top universities.

Finding Children In Need

Shooting Star Chase, a UK-based children’s hospice, wanted to understand which geographic areas were most in need of their children’s hospice services. DataKind UK used Tableau to map public data to locate children suffering from life-limiting diseases, streamlining referral paths and ultimately saving up to £90,000 for children’s hospices around the country.

Understanding Patterns in Homelessness

DataKind San Francisco joined forces with the Community Technology Alliance to explore data from the Monterey County homelessness assistance program. With visualizations built in Tableau, DataKind San Francisco explored the factors that correlate with a homeless family’s successful transition from supported to stable housing. These insights will help Monterey County better allocate resources to support more families in overcoming homelessness.

Mapping Data to Fight Crime

DataKind Bangalore used Tableau to generate heat maps showing crime locations and to plot trends from Bangalore Police Department data. Insights from this exploratory analysis, such as the finding that most crime hotspots in the city have empty space adjoining them, could guide future deployment of policemen and thus help reduce crime in the city.

Share Your Examples

What great data visualizations have you seen being leveraged for social good? Share your links in the comments. If you’re in the New York area, learn more and do good by attending the upcoming Tableau software user group meeting September 19th that will generate donations for the Community Foodbank of New Jersey or find a DataKind meetup near you!

Source: DataKind – Leveraging Tableau in DataKind Projects

Happy Birthday, Chapters!

Today’s solar eclipse isn’t the only bit of global news going on – it’s also the day five of our six Chapters turn three! We are just as awestruck at all they’ve achieved. In August 2014, we launched five new Chapters in DC, San Francisco, Bangalore, Singapore and Dublin to join DataKind UK that had launched the year prior. Since then, they have recruited thousands of community members and supported organizations large and small, furthering DataKind’s mission around the world.  

Let’s take a quick look back:

Year One – Around The World in Six DataDives (Aug 2014 – Aug 2015)

Within six months of launching, all six Chapters hit the ground running by each hosting a DataDive in their local community. From DC to Dublin, from Leeds to Bangalore, from San Francisco to Singapore, hundreds of you around the world lent your expertise in six DataDives worldwide to fight global warming, stamp out corruption in Nigeria, address inequalities in Leeds and San Francisco, treat malnourished children in South Sudan, support civic engagement in Bangalore and much much more.

Year Two – Building & Sharing Practice (Aug 2015 – Aug 2016)

After building such strong communities, Chapters began taking on new challenges. DataKind Bangalore went beyond weekend DataDives to take on long-term DataCorps projects, DataKind DC began a long-term partnership with American Red Cross to predict and prevent home fires across the U.S. and DataKind Singapore shared its work onstage at the first ever Strata + Hadoop World Singapore.

Year Three – Innovation and Recognition (Aug 2016-Present)

In their third year, Chapters show no signs of slowing. DataKind UK has been developing a framework to help nonprofits understand their organization’s level of data maturity so they can do even more. DataKind San Francisco, with more than 2,000 Meetup members, engages Core Volunteers to execute its DataDives. True to its trailblazing ways, DataKind Dublin’s most recent DataDive caught the eye of the newly appointed Taoiseach (the Irish Prime Minister) Leo Varadkar and Minister for Data Protection Dara Murphy, both of whom attended to learn more about the potential for data science to be used for good.

In short, our Chapters are amazing and we can’t wait to see what year four will bring (or year five for DataKind UK!).  And it’s all thanks to the amazing people who take the time to make it happen by lending their best talents to their communities.  Chapter Leaders, this blog’s for all of you! Thank you for donating your time to building DataKind’s global community and expanding our work around the world.


Get Involved

 DataKind staff and Chapter Leaders at the 2016 Chapter Summit, testing out fancy eyewear ahead of the eclipse.

If you’re located near a DataKind Chapter, be sure to sign up for your local Meetup group to get involved!

Source: DataKind – Happy Birthday, Chapters!

#GivingTuesday DataDive Findings Report

If philanthropic giving helps fuel social change, how can we increase it? This was one of many questions we dove into alongside the 92nd Street Y, Bill & Melinda Gates Foundation and over 100 data science volunteers at a New York DataDive held at Facebook this past March.

Check out the findings in our recently released report >

Launched in 2012 as a reaction to Black Friday – a day in the U.S. of intense shopping and spending after Thanksgiving – #GivingTuesday is a global movement that reaches millions of people in nearly 100 countries worldwide. While #GivingTuesday’s reach has grown significantly over the past five years, philanthropic giving in the U.S. still has not risen relative to GDP. If the philanthropic community could increase it by even 1%, the impact would be massive.

Very fitting for a weekend focused on philanthropic giving, data philanthropy – the donation of private datasets – fueled the analyses. Four volunteer teams dug into datasets brought together for the first time from an unprecedented data collaboration of 36 corporate and nonprofit data providers.

The findings outlined in this report show the impact that machine learning and predictive technology can have in boosting the nonprofit sector. Learn more in this blog from the 92nd Street Y and download the report for the full story.

Join Us In Seattle for Our Next DataDive

But wait, there’s more! We’re thrilled to be hosting another DataDive to delve further into #GivingTuesday and philanthropic giving. This time we’re heading west to Seattle, August 4-6. We are looking for data pros of all backgrounds to roll up their sleeves and work side by side with experts from #GivingTuesday, USA for UNHCR and Bill & Melinda Gates Foundation to explore these questions and find ways to further fuel social change.

Join us at the next DataDive >

Source: DataKind – #GivingTuesday DataDive Findings Report

How to Become a Data Driven Charity

By Emma Prest of DataKind UK and Lauren Bernard of NCVO
This post originally appeared on the KnowHow Nonprofit blog.

Most of us agree that data is important in any organisation. We need to collect and analyse data to estimate the demand for our services, understand who our users are, find out which services are working for which people, and much more. In fact, there are few areas of work where the smarter use of data doesn’t make us more effective. And the future holds out even more opportunity – the ability to predict need and effectiveness, and to use data to design and innovate new services.  But first you need to know where you stand at the moment, and how to move forward in a way that suits your organisation and its existing capabilities.
Research into data maturity shows that there are seven key areas that you need to consider when improving your organisation’s use of data. And perhaps unsurprisingly, the most important are people and culture! This guide looks at those key areas, and suggests the signs that you should look out for to determine if you are on the right track.

1-Get buy-in from leadership

The attitude of leadership is one of the most essential ingredients to becoming data driven. If your leadership team sees data as a vital resource and is able to incorporate past, present and forward-looking data into business planning and decision making, then you’re on the right path. If the leadership team tends to make decisions based on individual experience, anecdote or gut feeling, then there’s work to do!

A willingness to invest in increasing the organisation’s capacity to work with data is a first step. In order to do this, a charity will need a broad range of people with data expertise and understanding, from admin roles through to board level.

2-Nurture an evidence based culture

Becoming more data driven is ultimately about changing a culture and inspiring your colleagues to be interested in data. In our research some respondents saw data as the responsibility of ‘someone else’, whereas in more data savvy organisations data was seen as a team effort and a critical asset for every part of the organisation.

An organisation that is hungry for feedback and strives for continuous improvement is more likely to embrace data. A subtle pointer is the way that questions about data are asked. If staff ask (positive!) questions that challenge practices and preconceived notions, as opposed to just looking for data to support and confirm existing beliefs, then you are culturally well placed to become more data driven.

Being able to share data and results across the organisation is another essential element. But be aware that both internal and external data sharing requires strong data protection and security practices. There must be regular training, and trustees and senior management should be aware of current legislation and best practice.

3-Build skills in-house

To collect and manipulate data you need the right skills in house (or at the other end of an email). Not only do you need staff who can conduct the right analysis, but your colleagues need a certain level of data literacy to understand the results produced.

In organisations that are more data mature there is often a dedicated person or team in charge of data, with skilled data people across other teams and departments. With the right in-house data chops you become the experts in your sector that other organisations turn to and use as a resource, building your credibility and influence.

4-Invest in tools

Ensuring you have the right tools is hard. Software requires updating; technology changes and better products come on the market; migrating databases and training staff in a new tool is always a headache. And yet, analytical infrastructure is a priority if you want to do more with data. Ongoing investment in tools, systems and infrastructure is key.

Charities that regularly and easily join up different data sets or store data in a single accessible database are ahead of the pack.

Some organisations even make data accessible to all staff enabling them to explore the data themselves (this also means they no longer rely on the data guy or gal to run reports for them). Dashboards are a common way of democratizing data.

5-Get to know the data you hold

The starting point here is to know what data you hold across the charity. Once you’ve done a data audit and have a data inventory, it is worth reviewing how meaningful, relevant and useful that data is. Do you really need all of it? What are you missing?

Understanding the quality of your data sets and what kind of analysis can (and cannot) be done comes next. Charities that are more data mature monitor their data to check it is complete, accurate and valid. They have tools and systems for cleaning and maintaining it. They are able to join the data up to conduct analysis across teams. Their staff and volunteers are trained in data collection and understand why it matters. Where possible, data collection is automated.

Charities increasingly compare their data with other organisations’ data to benchmark their performance and they look to open data to enrich their internal data sources.

6-Be clear on what you want your data to achieve

Common uses of data in the voluntary sector include measuring outcomes and impact; monitoring the success of campaigns; reporting on staff and volunteer performance; demonstrating the need for your work; making the case to funders for new services/ products/ campaigns; and running financial models and donor retention. Data analysis is also often part of influencing policy makers, and developing robust evidence to build credibility and influence.

Less commonly, charities run analyses to understand how to make services more efficient; differentiate between approaches (what’s working and what’s not); test assumptions and understand client behaviours; analyse user groups to better understand their needs; and target and optimise services/ products/ campaigns to suit those needs. As a sector we are moving towards a world where charities can predict needs, behaviour and outcomes, maximise income and provide more targeted solutions.

7-Think about how best to conduct your data analysis

Charities tend to run quarterly reports, which often involve trend analysis of activities and finances. But increasingly charities are not just looking backwards. They are forecasting and predicting to plan for the future.

Some charities use advanced analytics such as clustering, root cause analysis, A/B testing, network analysis or text analytics. Data is brought together in automated ways to provide organisation wide analyses.

Moreover these charities don’t just run analyses every few months. They are able to do it in near real time. They also think about how best to communicate findings to different audiences, whether through simple reports or whizzy data visualisations. 

Further information

Like acquiring any new skill, using data better involves phases of progression, starting with the building blocks and moving up to more advanced stages. If you would like to take your organisation on a data journey check out the Data Maturity Model produced by DataKind UK and Data Orchard showing the seven key themes for being data driven, across five stages of maturity.

Source: DataKind – How to Become a Data Driven Charity

A North Carolina DataDive with SAS

They say a photo is worth a thousand words and this spectacular one taken on the SAS campus following our first ever North Carolina DataDive in April 2017 captures the energy and enthusiasm of the over 100 local data scientists, computer engineers and technologists that donated their time and talent that weekend. In partnership with our participating nonprofit partners and with generous support from SAS, they analyzed massive datasets, built statistical models, and created visualizations and algorithms to help each organization advance their work in a wide range of issues from community development to job readiness to fighting illegal tobacco sales to combatting hate crimes.

Anti-Defamation League
Uncovering hate crime trends to inform public policy

A screenshot of Anti-Defamation League’s work-in-progress tool showing hate crimes by group at a national level.

You know how powerful findings from a DataDive can be when the event gets highlighted in testimony before the Senate Judiciary Committee only a few days later.

Founded in 1913, the Anti-Defamation League (ADL) is a premier civil rights/human relations organization that, in its own words, seeks to “ensure justice and fair treatment for all.” The organization is constructing a map of hate crimes throughout the United States based on hate crime data from the FBI in order to surface trends and provide analysis to use as a tool to inform policy makers and the public.

Using FBI Hate Crime data on reported hate crimes from 2004-2015, Data Ambassadors Chris Hemedinger and Lucia Gjeltema led a team to construct additional features for the map. They also expanded the tool by complementing the FBI’s hate crime data with incident news articles, state-level data on hate crime laws and policy as well as other demographic and contextual data from open sources.

The team uncovered a number of interesting findings. For example, when a state has a hate crime data collection statute, hate crimes are reported more frequently. It also appears crimes follow cyclical trends by days of the week and season, with most crimes being committed on Fridays and Saturdays and in the spring and summer. ADL’s CEO Jonathan Greenblatt highlighted the work that led to these findings in his testimony at the May 2017 Senate Judiciary Committee hearing, “Responses to the Increase in Religious Hate Crimes.”
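
To illustrate how day-of-week and seasonal patterns like these can be tallied, here is a minimal sketch in Python. The incident dates are made up for the example and are not drawn from the FBI data, and the Northern Hemisphere meteorological season mapping is an assumption rather than the team's definition.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def season_of(day):
    """Map a date to a meteorological season (Northern Hemisphere; an assumption)."""
    if day.month in (3, 4, 5):
        return "spring"
    if day.month in (6, 7, 8):
        return "summer"
    if day.month in (9, 10, 11):
        return "autumn"
    return "winter"


def count_by_weekday(dates):
    """Count incidents per day of the week."""
    return Counter(WEEKDAYS[day.weekday()] for day in dates)


def count_by_season(dates):
    """Count incidents per season."""
    return Counter(season_of(day) for day in dates)


# Hypothetical incident dates, for illustration only.
incident_dates = [
    date(2015, 1, 2),   # Friday
    date(2015, 1, 3),   # Saturday
    date(2015, 1, 5),   # Monday
    date(2015, 1, 9),   # Friday
    date(2015, 6, 13),  # Saturday
]
by_day = count_by_weekday(incident_dates)
by_season = count_by_season(incident_dates)
print(by_day.most_common(2))
print(by_season)
```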

Counter Tools
Understanding relationships between tobacco and health using spatial modeling

A screenshot of identified “triple offender” retail outlets illegally selling tobacco to minors in Indiana with locations plotted on Google Earth.

Research has shown that tobacco use and misuse remains the leading cause of preventable death in the United States. Counter Tools is a five-year-old nonprofit startup out of the University of North Carolina Gillings School of Public Health with a mission to advance place-based public health, that is, improving public health at a local level based on the needs of each specific community. A primary focus of the organization is combatting the negative health impacts of tobacco marketing in brick-and-mortar retail stores. To do so, Counter Tools equips everyday people with tools and instructions to act as “citizen scientists” who collect data from retail stores in their communities.

Led by Data Ambassadors Louis Potok and Brian Spiering, the team dug into Counter Tools’ growing database of more than 40,000 store assessments, conducted by citizen scientists, that provides information on in-store tobacco product availability, pricing, placement, promotion and advertising.

They developed a tool to identify retail outlets that are (1) in violation of laws prohibiting sales to minors, (2) in violation of Assurances and Voluntary Compliance (AVC) agreements signed by their corporate parent, and (3) located within 1,000 feet of a school. The team tied individual retail outlets to their corporate parents to better identify which parent companies are frequently in violation.
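
The three-part check described above can be sketched as follows. This is not the team's tool: the `Outlet` fields, the sample coordinates, and the use of a haversine great-circle distance for the 1,000-foot test are all assumptions made for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass

EARTH_RADIUS_FEET = 6_371_000 / 0.3048  # mean Earth radius, metres converted to feet


@dataclass
class Outlet:
    """Hypothetical record for one retail outlet."""
    name: str
    parent: str
    lat: float
    lon: float
    sold_to_minors: bool
    avc_violation: bool


def distance_feet(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in feet."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_FEET * math.asin(math.sqrt(a))


def is_triple_offender(outlet, schools, radius_feet=1000):
    """True when all three conditions hold for the outlet."""
    near_school = any(
        distance_feet(outlet.lat, outlet.lon, s_lat, s_lon) <= radius_feet
        for (s_lat, s_lon) in schools
    )
    return outlet.sold_to_minors and outlet.avc_violation and near_school


# Made-up data: one school and three outlets.
schools = [(35.0, -79.0)]
outlets = [
    Outlet("Store A", "Parent X", 35.001, -79.0, True, True),   # roughly 365 ft away
    Outlet("Store B", "Parent X", 35.010, -79.0, True, True),   # roughly 3,650 ft away
    Outlet("Store C", "Parent Y", 35.001, -79.0, True, False),  # no AVC violation
]
flagged = [o.name for o in outlets if is_triple_offender(o, schools)]
by_parent = Counter(o.parent for o in outlets if is_triple_offender(o, schools))
print(flagged, by_parent)
```

Counting flagged outlets per corporate parent, as in `by_parent`, mirrors the idea of tying outlets back to parent companies to see which are frequently in violation.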

These and other tools developed by the volunteer team will help Counter Tools in its efforts to understand the relationship between tobacco availability or exposure to tobacco marketing and negative health outcomes so it can take action to keep communities healthy.

Habitat for Humanity International
Using affiliate data to understand network impact

The red shaded region represents a group of often overlooked albeit successful Habitat for Humanity International affiliates.

Data Ambassadors Mustafa Kabul, Jordan Meyer and Julia Kuznetsova led a team to help Habitat for Humanity International better understand the impact its affiliate network of over 1,300 nonprofits is having on local communities. While Habitat for Humanity International is best known for its construction projects, its network also completes 12,000 community projects annually, focused on everything from building gardens to financial literacy programs and typically undertaken in partnership with community groups, police stations, or local faith groups. Habitat for Humanity wanted to better understand the impact these interventions have on communities and develop a data-driven framework to understand the role and impact of its affiliate network as a whole.

Using an internal database of U.S. affiliates’ construction and non-construction interventions, the team was able to help Habitat for Humanity see a new way of understanding affiliate success. They identified groups of very successful non-construction based affiliates that are often overlooked.

The team recommended Habitat for Humanity consider non-construction programs in the measure of success, especially for locations such as Alaska where construction costs are higher and weather makes construction projects more difficult. All this plus the team’s analysis on donors and social media will help inform Habitat for Humanity’s business planning and resource allocation decisions going forward.

StepUp Durham
Optimizing the delivery of employment training and life skills workshops

The top chart displays the most important features predictive of job readiness, while the bottom chart shows the associated correlation coefficients used by the predictive model.

StepUp Durham’s mission is to help adults and children transform their lives through access to employment and life skills training and aims to be the premier resource in Durham County for vulnerable communities seeking to develop stable careers. They wanted to understand what factors influence participants to register, attend a workshop, complete a workshop, find a job, and continue engagement through participation in their Life Skills program.

Using over seven years of participant and donor data, a team led by Data Ambassadors Natalia Summerville, Jinxin Yi, and Zeydy Ortiz was able to identify several important features that help predict the job readiness of program participants. As shown in the chart above, these included a participant’s criminal record, access to transportation, history of substance abuse, the class sizes of workshops attended, history of homelessness, financial and food assistance history, and age. For example, the team found that having access to transportation and participating in small class sizes were associated with better outcomes. Interestingly, having a prior criminal record was correlated with more successful outcomes. The team speculated but did not prove that those with a criminal past may come to StepUp more motivated to succeed when compared to the general group.

The team created a model with 85% accuracy to predict positive outcomes from the various programs and isolate variables for success so that all individuals seeking full-time work can obtain the skills needed to attain gainful employment.

Thank You SAS and Project Partners!

As mentioned, this was our first DataDive in North Carolina, and we are thankful to SAS for making it possible. Not only did their campus provide the perfect location for amazing photos like the one above, but the DataDive itself was largely staffed by SAS volunteers who donated their time and talent to help these organizations. We also want to give a special shout-out to our four fantastic project partners for their time and dedication to making this event such a success. We’re pleased to be continuing work with some of these organizations, so stay tuned for more updates to come!

Source: DataKind – A North Carolina DataDive with SAS

Our New Video on Vision Zero

According to the National Safety Council, traffic collisions cause more than 40,000 deaths and injure thousands of people every year across the United States. The Vision Zero movement started in cities in Sweden in the 1990s, and many U.S. cities are now joining the effort as part of the Vision Zero Network, pledging to reduce these preventable traffic fatalities and injuries to zero in their communities.

We recently completed our first-ever Labs project, in partnership with Microsoft, to support the Vision Zero movement by using data science to improve traffic safety in three cities nationwide.

But how exactly are cities now applying data science in their work to make streets safer? Check out our new video case study to hear directly from New York and Seattle representatives on the impact of the work.


Learn More

Interested in replicating this work in your own city or learning more about the nuts and bolts of how these collaborations happen? Check out the written case study for more resources, including our new Labs Blueprint.

Source: DataKind – Our New Video on Vision Zero

Report Back from DataKind DC’s Sixth DataDive

DataKind DC hosted its sixth DataDive this past March, partnering with four organizations to leverage data science to advance their missions: Kiva USA, Global Financial Integrity (GFI), Freedom House, and Catholic Charities USA. Over the course of the weekend, teams of volunteers identified valuable insights from the data. Volunteers have even continued working with Freedom House and Catholic Charities USA beyond the DataDive to develop valuable tools for each organization. Check out highlights from the weekend and learn about how you can get involved with DataKind DC below!


Kiva USA 

Kiva works to alleviate poverty by allowing donors to lend money through interest-free loans to low-income small business owners. The team, led by Data Ambassadors Jonathan Joa, Rachel Wells, Rajit Kavindran and Kyle Ogilvie, analyzed Kiva’s data to explore what types of loans and borrowers are prone to partial loan repayments. They found delinquency rates are highest in the two to four payment period, and that the first 90 days are sufficient to identify defaulting accounts (as shown below).

The team also sought to predict which loans are at risk of low repayment and to optimize the loan amount offered in order to maximize repayment likelihood and social impact. For example, they found that businesses with eBay accounts were more likely to default.

All insights were shared in a final report with the Kiva USA partner representative. The findings reinforced many of the trends Kiva USA was already seeing and helped them think differently about how they might intervene with borrowers and ultimately support even more small businesses.


Global Financial Integrity

Global Financial Integrity (GFI) is working to curtail illicit financial flows by providing better advice to and advocacy for affected countries. They wanted to collaborate with DataKind DC volunteers in analyzing price anomalies in trade transactions (the most common way to siphon illicit capital out of an economy). The challenge during the DataDive was to identify when goods entering and leaving developing countries were mis-invoiced. While GFI had developed tools for customs officials to determine if a given shipment was an outlier, they had not yet mined the data to look for specific trends. Led by Data Ambassadors Andrew Brooks, Margaret Furr, Minh Mai, and William Ratcliff, the DataKind DC divers assembled a list of the 20 most anomalous commodity transactions. To help GFI support customs departments in collecting more revenue and alleviating poverty, the future goal is to use more sophisticated techniques for outlier detection and to expand the set of countries covered. Below, we show some examples of the outliers we found.

Finding #1: Identifying Outliers in Ethiopian Coffee Exports


In this figure, we show the monthly variation in the price of Ethiopian coffee. In red, we show exports from Ethiopia and in blue, we show imports. In green, we highlight outliers.  
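
The post does not describe how outliers were flagged. One simple approach, shown here only as a sketch, is a modified z-score based on the median and the median absolute deviation (MAD), which is less distorted by extreme prices than a mean and standard deviation. The function name, the 3.5 cutoff (a common convention for this score), and the prices are assumptions for illustration, not GFI data or the team's method.

```python
from statistics import median


def flag_price_outliers(unit_prices, threshold=3.5):
    """Return indices of prices whose modified z-score exceeds the threshold."""
    prices = list(unit_prices)
    if not prices:
        return []
    med = median(prices)
    # Median absolute deviation: a robust measure of spread.
    mad = median([abs(p - med) for p in prices])
    if mad == 0:
        # More than half the values are identical, so the score is undefined;
        # this sketch reports nothing rather than dividing by zero.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    # for normally distributed data.
    return [
        i for i, p in enumerate(prices)
        if 0.6745 * abs(p - med) / mad > threshold
    ]


# Hypothetical monthly unit prices (USD per kg); invented values.
monthly_unit_prices = [4.1, 4.3, 3.9, 4.0, 4.2, 9.8, 4.1, 4.0]

print(flag_price_outliers(monthly_unit_prices))  # index 5 (the 9.8 entry) is flagged
```

A flagged month is only a candidate for review. Deciding whether it reflects mis-invoicing would still require comparing the export record with the matching import record, as the figure does.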

Top Outliers


  • Polyether Alcohols
  • Article of Vulcanised Rubber
  • Uncoated Paper and Paperboard
  • Pallets and Pallet Collars
  • Paper and Paperboard Used for Writing
  • Heat Pumps
  • Agricultural or Horticultural Watering Appliances
  • Glazed Flags and Paving, Hearth or Wall Tiles and Mosaic Cubes
  • Beer made from Malt
  • Phosphates of Calcium
  • Rafts, Tanks, Coffer-Dams, Landing Stages, Buoys, Beacons and Other Floating Structures
  • Float Glass and Surface Ground and Polished Glass
  • Electric Conductors for a Voltage
  • Flat Rolled Products of Stainless Steel
  • “Sheets of Iron or Non-Alloy Steel, Cold-Formed or Cold Finished, Profiled, ‘Ribbed’”
  • Lamp Holders


  • Building Bricks
  • Fresh or Chilled Boneless Cuts of Fowls of the Species Gallus Domesticus
  • Tower Cranes
  • “Digital Versatile Discs ‘DVD’”
  • Test Benches for Motors, Generators, Pumps etc
  • Waste and Scrap of Stainless Steel
  • Salt, Denatured or for Other Industrial Uses, Incl. Refining
  • Zinc Waste and Scrap
  • Groundnut, Cotton-Seed, Soya-Bean or Sunflower-Seed Oil and their fractions
  • Parts and Accessories for Apparatus and Equipment for Photographic or Cinematographic
  • Waste and Scrap of Copper Alloys
  • Soya-based Beverages
  • Molluscs, fit for Human Consumption
  • Lactose in Solid Form and Lactose Syrup
  • Portland Cement
  • Slaked Lime   

Finding #2: High Variance in Unit Price of Coffee By Importing Country

In the top panel, we show how the unit price of coffee exported from Ethiopia varies depending on the importing country. We present similar results in the middle and bottom panels for exports of coffee from Ghana and Kenya respectively.


Freedom House

An interactive tool built by DataKind DC volunteers that enables Freedom House to select multiple variables to be displayed simultaneously, allowing better understanding of data on civil liberties and political rights around the world.

Freedom House (FH) produces the annual Freedom in the World (FIW) index, a survey and report that measures the degree of civil liberties and political rights around the world. This work is a valuable source for activists and organizations that defend human rights and promote democratic change. Each country’s overall score is available on the Freedom House website, but the 30 sub-indicators underlying each country’s total score in the FIW index are not as easily accessible or usable. At the DataDive, FH worked with DataKind to organize and visualize the FIW index and sub-indicator data, making the historical data easier to access, review, and compare, in order to inform strategic planning for its programs and to help human rights organizations make better use of this data.

Led by Alex Spancake, Chloe Gordon, and Arati Krishnamoorthy, volunteers used Bokeh, a Python interactive visualization library, to produce the interactive visualization shown above. It incorporates the sub-indicator data provided by Freedom House as well as data from five additional sources, such as the World Bank. Users can select from an array of economic, demographic, and freedom variables to analyze relationships across countries, regions, and over time. The tool will be used to compare countries’ scores and indicators in order to put pressure on governments to improve governance, rule of law, and human rights.

Since the DataDive, the team has continued to work with Freedom House to post the tool on Freedom House’s website, increase interactivity of their current D3 map, assess the predictive power of different variables on freedom scores, and automate the data preparation and collection process.


Catholic Charities USA

Catholic Charities USA (CCUSA) provides disaster assistance to individuals and families before, during, and after a tragedy hits. They provide the holistic, compassionate care that helps individuals recover and move forward. CCUSA wanted to create a map to better target mitigation, preparedness, relief, and recovery projects in order to best serve communities that are both at greatest risk for disasters and most overlooked by, or outright excluded from, federal assistance during disasters. The team, led by Rich Carder and Jake Snyder, used the CDC’s Social Vulnerability Index and a proprietary natural disaster dataset (generously donated by ATTOM) to develop this map using Mapbox, R, and D3.js. Volunteers created the alpha version of the Disaster Operations map during the DataDive, and it continues to develop and improve as an open source tool.


Working version of the CCUSA Disaster Operations Map.

During the DataDive, we were lucky to have a small but incredibly talented team of volunteers, some of whom have remained on board in the months since. In addition, the project benefitted from volunteers at Code for DC hack nights, making this a truly collaborative project with input from our amazing DC volunteer community!

CCUSA volunteers working at the DataDive: Lukas Martinelli, Adil Yalcin, Rich Carder, Jake Snyder, Alex Wasson 

The team is using GitHub to collaborate on further developments. If you are interested in contributing or learning more, the source code is located at this GitHub page.


Thank Yous 

Thank you to our DataDive host, Social Tables, for continuing to host us in their beautiful space. A huge thank you to our Data Ambassadors and volunteer teams that donated their time and talent to help four project partners use data to improve the world. 

Join Us!

We would love to see you at the next DataDive or Meetup! Join our Meetup group or follow us on Twitter for the latest news on how you can get involved.

Source: DataKind – Report Back from DataKind DC’s Sixth DataDive

Introducing Our New Labs Blueprint

Thanks to support from the Rita Allen Foundation, we are pleased to share our Labs Blueprint, a new resource documenting our learnings and approach from our first Labs project using data science to improve traffic safety in three U.S. cities.

Labs projects differ from other DataKind projects in that they are designed to address sector-wide challenges instead of a specific organization’s. As these projects look to move the needle on sector-wide issues, our hope is that this document can help others learn from our work and launch similar projects to drive change.

Download the Labs Blueprint >

How To Use This Guide

Just as the data science process isn’t linear, neither is this guide. This is actually a series of learning modules that you can move between depending on your interests.

Modules one to three provide background about DataKind Labs, data science and our Vision Zero Labs project, while modules four to six are designed for project managers and intermediary organizations and offer in-depth information on how to replicate DataKind’s Labs process and more details on our approach.

This resource is intended to support those interested in replicating or building upon our work to help advance Vision Zero efforts in communities or those looking to create similar scalable data-driven projects to help move the needle on sector-wide social issues. For example:

  • Data scientists or civic tech advocates interested in seeing examples of how you can apply data science for sector-wide impact
  • Social change organizations or local governments interested in learning about data science and how it can move the needle on sector-wide issues
  • Intermediary organizations that, like DataKind, convene stakeholders on similar tech for good projects and are interested in an “under the hood” look at our work

At the end of every module is a feedback form where you can share your thoughts and questions. We’d love to hear from you. 

Learn More

Check out our full case study for more detail on our Vision Zero Labs project and for additional resources.

Source: DataKind – Introducing Our New Labs Blueprint