AdvanceGender
 
Classification tree analysis
Sex/gender-sensitive identification of intersectional subgroups with classification trees
What is the goal? 

The primary aim is to use classification tree analyses, based on different sex/gender-theoretical approaches, to identify intersectional subgroups that have a particularly high or low prevalence of health outcomes.

What are classification trees?
How can the goal be achieved?

An algorithm-driven classification process identifies subgroups that have a particularly high or low prevalence of a health outcome. Various central sex/gender-theoretical concepts can be operationalised through different uses of the (binary) sex/gender variable and further solution-linked sex/gender variables. In addition, including a variety of social categories makes it possible to capture intersectionality in relation to sex/gender. For example, the results of a model that includes the binary sex/gender variable together with further social categories ("biological sex" model) can be compared with the results of a model that includes solution-linked sex/gender variables instead of the binary sex/gender variable ("gender equality" model). A third model, which includes the binary sex/gender variable, the solution-linked sex/gender variables and the social categories ("gender equity" model), can then be added to the comparison.

The exploratory results of different combinations of a binary sex/gender variable, social categories and solution-linked sex/gender variables can then be compared with each other. This comparison can help to identify whether and, if so, how the results differ from those of a sex/gender-stratified analysis, which is often carried out for a sex/gender-comparative presentation. Alternatively, a specific theoretical reference can be established before the statistical analyses begin by embedding the entire analysis process in a specific sex/gender theory or perspective. In this way, the need to embed epidemiological research more strongly in sex/gender-theoretical concepts can be met.
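The comparison of the three model specifications can be sketched in code. The following is a minimal illustration only, not the AdvanceGender implementation: it grows a small Gini-based classification tree in plain Python and prints the resulting subgroups with their outcome prevalence for each variable set. The variable names (`sex`, `education`, `care_work`), the invented counts, and the stopping rules (`max_depth`, `min_leaf`) are all hypothetical assumptions chosen for demonstration.

```python
def gini(rows):
    """Gini impurity of the binary outcome in a list of records."""
    if not rows:
        return 0.0
    p = sum(r["outcome"] for r in rows) / len(rows)
    return 2 * p * (1 - p)

def best_split(rows, variables, min_leaf):
    """Return the (variable, value) split with the largest impurity reduction, or None."""
    best, best_gain = None, 1e-12
    for var in variables:
        for value in sorted({r[var] for r in rows}):
            left = [r for r in rows if r[var] == value]
            right = [r for r in rows if r[var] != value]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            gain = gini(rows) - weighted
            if gain > best_gain:
                best, best_gain = (var, value), gain
    return best

def subgroups(rows, variables, max_depth=2, min_leaf=5, path=()):
    """Grow the tree recursively and return its leaves as (path, n, prevalence)."""
    split = best_split(rows, variables, min_leaf) if max_depth > 0 else None
    if split is None:
        prevalence = sum(r["outcome"] for r in rows) / len(rows)
        return [(path, len(rows), prevalence)]
    var, value = split
    left = [r for r in rows if r[var] == value]
    right = [r for r in rows if r[var] != value]
    return (subgroups(left, variables, max_depth - 1, min_leaf, path + (f"{var}={value}",))
            + subgroups(right, variables, max_depth - 1, min_leaf, path + (f"{var}!={value}",)))

def make_rows(sex, education, care_work, n, n_cases):
    """Create n invented records, the first n_cases of which have the outcome."""
    return [{"sex": sex, "education": education, "care_work": care_work,
             "outcome": int(i < n_cases)} for i in range(n)]

# Invented data: `care_work` stands in for a solution-linked sex/gender variable.
data = (make_rows("f", "low", "high", 10, 8) + make_rows("f", "low", "low", 10, 5)
        + make_rows("f", "high", "high", 10, 5) + make_rows("f", "high", "low", 10, 2)
        + make_rows("m", "low", "high", 10, 7) + make_rows("m", "low", "low", 10, 4)
        + make_rows("m", "high", "high", 10, 4) + make_rows("m", "high", "low", 10, 1))

# Hypothetical variable sets mirroring the three model specifications.
models = {
    "biological sex": ["sex", "education"],
    "gender equality": ["care_work", "education"],
    "gender equity": ["sex", "care_work", "education"],
}

for name, variables in models.items():
    print(name)
    for path, n, prevalence in subgroups(data, variables):
        print("  ", " & ".join(path) or "all", f"n={n}", f"prevalence={prevalence:.2f}")
```

Each printed line describes one subgroup by the splits that lead to it, so the subgroups found under the three variable sets can be read side by side. In a real analysis an established tree implementation and survey data would take the place of this toy code.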

What are the advantages?

An intersectionality-informed and sex/gender-sensitive analysis strategy strengthens the transfer of theory into statistical analysis: depending on how the sex/gender-theoretical concept is operationalised through specific variables, the results can be interpreted with reference to central sex/gender-theoretical concepts.

Classification tree analysis is particularly suitable for exploratory descriptive analyses in which a large number of social categories and solution-linked sex/gender variables are to be considered and no prior assumptions can be made.

Missing values and outliers in the data do not have a substantial impact on the classification process. Furthermore, a variety of social categories and solution-linked sex/gender variables can be integrated into the analysis, even if they are highly correlated with each other, since the procedure makes no assumptions about their probability distribution.

The results are generally easy to interpret and well suited to being communicated to decision-makers, for instance when discussing the design of interventions and resource allocation.

What are the challenges?

Because classification tree analysis is a statistical technique from the field of machine learning, there is a risk of over-fitting to the dataset used. Techniques such as pruning, which are applied after the classification process, can counteract possible over-fitting.
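Pruning after the classification process can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the procedure used in AdvanceGender: the tree is built by hand, the counts and split labels are invented, and the rule shown (keep the subtree that minimises training misclassifications plus `alpha` times the number of leaves) is one simple cost-complexity-style criterion among several possible ones.

```python
def leaf(n_cases, n_noncases):
    """A terminal subgroup with its counts of cases and non-cases."""
    return {"cases": n_cases, "noncases": n_noncases}

def node(label, left, right):
    """An internal split with two child subtrees."""
    return {"split": label, "left": left, "right": right}

def counts(tree):
    """Total (cases, noncases) in all subgroups below this point of the tree."""
    if "split" not in tree:
        return tree["cases"], tree["noncases"]
    left_cases, left_non = counts(tree["left"])
    right_cases, right_non = counts(tree["right"])
    return left_cases + right_cases, left_non + right_non

def prune(tree, alpha):
    """Return (pruned_tree, errors, n_leaves) minimising errors + alpha * n_leaves."""
    cases, noncases = counts(tree)
    errors_as_leaf = min(cases, noncases)  # errors under a majority-class prediction
    if "split" not in tree:
        return tree, errors_as_leaf, 1
    left, left_err, left_leaves = prune(tree["left"], alpha)
    right, right_err, right_leaves = prune(tree["right"], alpha)
    sub_err, sub_leaves = left_err + right_err, left_leaves + right_leaves
    # Collapse the split when a single leaf is at least as good; ties favour the simpler tree.
    if errors_as_leaf + alpha <= sub_err + alpha * sub_leaves:
        return leaf(cases, noncases), errors_as_leaf, 1
    return node(tree["split"], left, right), sub_err, sub_leaves

# Hand-built example tree with invented counts (hypothetical split labels).
full_tree = node(
    "education=low",
    node("sex=f", leaf(16, 4), leaf(9, 11)),
    node("care_work=high", leaf(5, 15), leaf(4, 16)),
)

for alpha in (0, 2, 10):
    pruned, errors, n_leaves = prune(full_tree, alpha)
    print(f"alpha={alpha}: leaves={n_leaves}, training errors={errors}")
```

Larger values of `alpha` penalise additional leaves more heavily and therefore yield smaller trees. In practice the penalty is usually chosen by cross-validation rather than fixed in advance.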

The interpretability of large classification trees can be challenging.

Example from the AdvanceGender project

Further Resources:

  • Hammarström A, Johansson K, Annandale E, Ahlgren C, Aléx L, Christianson M, et al. Central gender theoretical concepts in health research: the state of the art. J Epidemiol Community Health. 2014;68(2):185. https://doi.org/10.1136/jech-2013-202572

This document was retrieved from the AdvanceGender website (www.advancegender.info).  

Authors:

Emily Mena, Gabriele Bolte (University of Bremen, Institute for Public Health and Nursing Research, Department of Social Epidemiology) on behalf of the joint project AdvanceGender

Suggested citation: Mena E, Bolte G. Gender-sensitive identification of intersectional subgroups with classification trees. In: AdvanceGender Study Group (ed.). Options for gender-sensitive and intersectionality-informed research and health reporting; 2022. (www.advancegender.info).

Contact persons: Gabriele Bolte (gabriele.bolte@uni-bremen.de)

Version: 1.0 (Date: 04.01.2022)