Data Science Approach

What is a data science approach?

Data science is a multidisciplinary field of study that focuses on creating, collecting, handling, and analyzing large amounts of data to extract actionable insights. Data scientists use scientific methods to develop algorithms that identify patterns in order to help predict future events, reduce risk, and improve outcomes.

Why use a data science approach to engage men?

Data science can reveal patterns and trends that help us decide which populations of men to target and how to engage and mobilize them most effectively.

What are the most promising ways to use a data science approach?

Data science helps us make evidence-informed decisions. Compared to research that focuses on individuals’ perspectives alone, large amounts of data can be particularly useful for answering complex questions.

Data science can also be well suited to identifying inequity and promoting social justice. That’s because large data sets are often more diverse, and thus more representative of the true population, than the smaller, homogeneous samples that other research relies on.

As a first step, we need more quality data. At present, the data available in the field of engaging and mobilizing men is limited. Non-profit organizations working to advance the field need to collect, store, and use more data.

What is an example of putting a data science approach into practice?

One study in Brazil utilized data science methods to understand the most important factors to consider when predicting a mother’s risk of experiencing physical intimate partner violence (PIPV) during pregnancy and post-partum.

Researchers randomly selected mothers with children under the age of 5 months who were in primary health care waiting rooms. In total, 811 mothers were interviewed about a) their experience of PIPV during pregnancy and/or post-partum, b) characteristics of their children (e.g., child’s age, sex, gestational age), c) their own characteristics (e.g., age, education, race), d) their own and their partners’ lifestyle (e.g., tobacco and alcohol use), and e) socio-economic status (e.g., amount of household goods, occupation of family’s main income earner).

Researchers then used this data to calculate which factors significantly predicted PIPV. From there, they could develop a range of characteristics we would expect to see for mothers at various points on the continuum from low risk to high risk, which could help care providers better identify warning signs.
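To make the idea of a factor that "significantly predicts" an outcome more concrete, here is a minimal sketch in Python. It is not the method used in the Brazilian study, which is not described here. It assumes one simple, common approach: comparing the odds of the outcome between people with and without a single factor, using an odds ratio and a 95% confidence interval. The counts are invented for illustration and are not taken from the study.

```python
import math

def odds_ratio_with_ci(exposed_cases, exposed_noncases,
                       unexposed_cases, unexposed_noncases, z=1.96):
    """Return (odds ratio, CI lower bound, CI upper bound) for a 2x2 table.

    z=1.96 gives an approximate 95% confidence interval.
    """
    odds_ratio = (exposed_cases * unexposed_noncases) / (
        exposed_noncases * unexposed_cases)
    # Standard error of the log odds ratio (Woolf's method).
    se_log_or = math.sqrt(1 / exposed_cases + 1 / exposed_noncases
                          + 1 / unexposed_cases + 1 / unexposed_noncases)
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log_or),
            math.exp(log_or + z * se_log_or))

# HYPOTHETICAL counts, invented for illustration only.
# Rows: factor present / factor absent. Columns: outcome reported / not reported.
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(
    exposed_cases=30, exposed_noncases=70,
    unexposed_cases=30, unexposed_noncases=270)

# If the interval excludes 1, the factor is conventionally called
# a statistically significant predictor at the 5% level.
factor_is_significant = ci_low > 1 or ci_high < 1

print(f"odds ratio = {odds_ratio:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print("significant:", factor_is_significant)
```

In practice, researchers usually examine many factors at once (for example, with a regression model) so that the effect of each factor is estimated while accounting for the others.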

In this way, we can use data science to make evidence-informed decisions about our interventions and our work engaging men in violence prevention.

What else should I know before implementing data science approaches?

When the limitations of the data sets and the analyses are not understood, data science may be used incorrectly, perpetuate biases, present false information, and cause irreversible harm.

It’s important to know that the algorithms we design are only as good as the people who design them and the data we use. For example, if a data set contains more male than female participants, analyses based on it could perpetuate bias. That’s why algorithms should be designed using equitable data. Unfortunately, this type of data does not exist, since we live in a world rampant with disparities. Data scientists can instead build synthetic (“fake”) data sets to help design algorithms that don’t accidentally perpetuate bias.
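As a rough illustration of the imbalance problem, the Python sketch below checks how each group is represented in a data set and then builds a balanced version. It uses random oversampling (repeating records from the smaller group), which is swapped in here as a simple stand-in for the more sophisticated synthetic data methods data scientists may use. The records, field names, and function names are all hypothetical.

```python
import random
from collections import Counter

def group_shares(records, key):
    """Return each group's share of the records, e.g. {'male': 0.8, ...}."""
    counts = Counter(record[key] for record in records)
    return {group: n / len(records) for group, n in counts.items()}

def oversample_to_balance(records, key, seed=0):
    """Return a new list in which every group has as many records as the largest.

    Smaller groups are topped up by randomly repeating their own records.
    This balances group sizes; it does not create new information or
    correct bias in how the original data was collected.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    by_group = {}
    for record in records:
        by_group.setdefault(record[key], []).append(record)
    target = max(len(rows) for rows in by_group.values())
    balanced = []
    for rows in by_group.values():
        balanced.extend(rows)
        balanced.extend(rng.choices(rows, k=target - len(rows)))
    return balanced

# HYPOTHETICAL data set: 80 male and 20 female participants.
records = ([{"id": i, "sex": "male"} for i in range(80)]
           + [{"id": 80 + i, "sex": "female"} for i in range(20)])

original_shares = group_shares(records, "sex")
balanced = oversample_to_balance(records, "sex")
balanced_shares = group_shares(balanced, "sex")

print("before:", original_shares)
print("after: ", balanced_shares)
```

Checking group shares before any analysis is the more important habit here. Rebalancing is only one possible response, and it should be chosen with an understanding of why the imbalance exists in the first place.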

With large data sets, it is also easy to find results that are statistically significant but not actually meaningful, or to misread what a significant result means. For example, consider a hypothetical scenario in which a large data set contains reports of suspected child abuse during the COVID-19 pandemic. A researcher may find a statistically significant drop in how often referrals to child protective services were made in October 2020 and conclude that incidents of child abuse dropped. However, the drop in referrals actually reflects the fact that children were not in schools, which is where referrals are commonly made. Without that understanding, the researcher could interpret the data incorrectly.
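The effect of data set size on statistical significance can be shown with a short Python sketch. It assumes a standard two-proportion z-test and uses invented numbers: the same difference of 0.3 percentage points between two groups is tested once with 1,000 people per group and once with 1,000,000 people per group.

```python
import math

def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value for the difference between two proportions (z-test)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# HYPOTHETICAL counts: 50.0% vs 50.3% in both scenarios.
p_small = two_proportion_p_value(500, 1_000, 503, 1_000)
p_large = two_proportion_p_value(500_000, 1_000_000, 503_000, 1_000_000)

print(f"1,000 per group:     p = {p_small:.3f}")
print(f"1,000,000 per group: p = {p_large:.6f}")
```

The difference between the groups is identical in both cases. Only the larger data set crosses the conventional 0.05 threshold, which is why the size of an effect, and the context behind it, should be considered alongside the p-value.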

Finally, privacy concerns present a major challenge when utilizing large data sets. Informed and ongoing consent will need to be considered in developing and implementing any data science approach.

Learn more about the responsible use of data with the Data Equity Framework created by We All Count.

Read more about data science approaches:

In addition to the Shift research reports listed earlier, the following resources offer further information on data science approaches: