In this post I will try to explain how to solve a problem that we often run into when using supervised machine learning algorithms. Classification problems in supervised learning usually fall into one of three types. The first is binary classification, where there are only two classes that an instance can belong to. For example: is the gender male or female, is the person sick or not, is the email spam or not.
The second type is multi-class classification. Here there are more than two classes, and each instance belongs to exactly one of them (this is common in deep learning problems, such as recognizing which object appears in a photo). Finally, there are multi-label classification problems, which come up very often in real-world machine learning work.
In a multi-label problem, each instance may belong to more than one class at the same time, so the model has to be able to assign several classes to a single instance. However, most supervised learning algorithms are designed to predict a single label per instance, as in binary or multi-class classification. Therefore, different approaches have been developed to model multi-label classification problems, and most of them use problem transformation: the multi-label problem is rewritten as one or more single-label problems. For instance, let’s say we need to build a multi-label classification model from a dataset of a large number of patients with different diseases. Every patient has at least one disease, but many of them have more than one (diabetes, high blood pressure, cardiovascular disease, cancer, migraine, organ failure, etc.).
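To make the patient example a bit more concrete, here is a small sketch of how a set of diseases per patient can be rewritten as one 0/1 column per disease. The patient records and the helper name to_indicator_matrix are made up for illustration; this is just one simple way to do it, using only plain Python.

```python
# Hypothetical toy data: the set of diagnosed diseases for each patient.
patient_diseases = [
    {"diabetes"},
    {"diabetes", "migraine"},
    {"cancer"},
    {"cancer", "diabetes", "migraine"},
]


def to_indicator_matrix(label_sets):
    """Return (sorted class names, one 0/1 row per instance)."""
    # Collect every label that appears anywhere, in a stable order.
    classes = sorted(set().union(*label_sets))
    # For each instance, mark 1 where it has the class and 0 where it does not.
    rows = [[1 if c in labels else 0 for c in classes] for labels in label_sets]
    return classes, rows


classes, Y = to_indicator_matrix(patient_diseases)
print(classes)  # ['cancer', 'diabetes', 'migraine']
for row in Y:
    print(row)
# [0, 1, 0]
# [0, 1, 1]
# [1, 0, 0]
# [1, 1, 1]
```

Each column of Y is now an ordinary binary target, which is what makes the transformation approach possible.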
One of the most practical and easiest ways to handle a multi-label classification problem is to turn it into a separate binary classification problem for each class. For each class (disease), we train a model on the training data set to answer a single yes/no question: based on the patient information, does this patient have this disease or not? Once we have a yes/no answer for every class, a patient (instance) with more than one disease (label) is simply assigned every class whose model answered yes. We call this method converting the multi-label classification problem into multiple binary classification problems.
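Here is a minimal sketch of this idea (often called binary relevance in the literature). Everything in it is an assumption made for illustration: the two features (glucose, systolic blood pressure), the made-up numbers, and the tiny nearest-centroid classifier, which is only a stand-in for whatever binary classifier you would really use.

```python
class NearestCentroidBinary:
    """Stand-in binary classifier: predicts 1 when a point is closer to the
    mean of the positive training examples than to the mean of the negatives.
    Assumes the label has at least one positive and one negative example."""

    def fit(self, X, y):
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        self.pos_centroid = [sum(col) / len(pos) for col in zip(*pos)]
        self.neg_centroid = [sum(col) / len(neg) for col in zip(*neg)]
        return self

    @staticmethod
    def _sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    def predict(self, X):
        return [
            1 if self._sq_dist(x, self.pos_centroid) < self._sq_dist(x, self.neg_centroid) else 0
            for x in X
        ]


def fit_binary_relevance(X, label_sets, classes):
    """Train one independent binary model per class."""
    models = {}
    for c in classes:
        # Binary target for this class only: does the instance have label c?
        y = [1 if c in labels else 0 for labels in label_sets]
        models[c] = NearestCentroidBinary().fit(X, y)
    return models


def predict_label_sets(models, X):
    """Assign each instance every class whose model answered yes."""
    predictions = [set() for _ in X]
    for c, model in models.items():
        for i, flag in enumerate(model.predict(X)):
            if flag:
                predictions[i].add(c)
    return predictions


# Made-up training data: [glucose, systolic blood pressure] per patient.
X_train = [[160, 118], [170, 122], [92, 160], [88, 165], [165, 158], [175, 162]]
y_train = [
    {"diabetes"},
    {"diabetes"},
    {"high blood pressure"},
    {"high blood pressure"},
    {"diabetes", "high blood pressure"},
    {"diabetes", "high blood pressure"},
]

models = fit_binary_relevance(X_train, y_train, ["diabetes", "high blood pressure"])
predicted = predict_label_sets(models, [[168, 160], [95, 162], [162, 119]])
for labels in predicted:
    print(sorted(labels))
# ['diabetes', 'high blood pressure']
# ['high blood pressure']
# ['diabetes']
```

Note that the models are trained independently, so this approach ignores any relationship between labels (for example, two diseases that tend to occur together). If you use scikit-learn, OneVsRestClassifier and MultiOutputClassifier can wrap a regular classifier in a similar one-model-per-label way.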
I hope this was useful.