
2.7. The defuzzification

Fig. 9. The fuzzy c-means is wrapped into a black box from which an estimated error is obtained.
Informatics in Medicine Unlocked 14 (2019) 23–33

After feature selection, the fuzzy c-means algorithm does not tell us what information the clusters contain or how that information should be used for classification. It does, however, define how data points are assigned membership to the different clusters, and this fuzzy membership is used to predict the class of a data point [69]. A number of defuzzification methods exist [70,71]. In this paper, however, each cluster has a fuzzy membership (0–1) of all ADH-1 in the image. Each training data point is assigned to its nearest cluster. The percentage of training data of each class belonging to cluster A gives the cluster's membership to the different classes, cluster A = [i, j], where i is the containment in cluster A and j in the other cluster. The intensity measure is added to the membership function for each cluster using a fuzzy clustering defuzzification algorithm. Fuzzy c-means allows data points in the dataset to belong to all of the clusters, with memberships in the interval (0–1), as shown in Equation (6).

$$m_{ik} = \frac{1}{\sum_{j} \left( \dfrac{d_{ik}}{d_{jk}} \right)^{2/(q-1)}} \qquad (6)$$

where $m_{ik}$ is the membership of data point $k$ in the cluster with center $i$, $d_{jk}$ is the distance from cluster center $j$ to data point $k$, and $q \in [1, \infty)$ is an exponent that decides how strong the memberships should be. The FCM was implemented using the Fuzzy Logic Toolbox in MATLAB.
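As a minimal sketch of the standard fuzzy c-means membership formula, the following Python function computes the memberships of a single data point from its distances to the cluster centers. It is not the MATLAB implementation used in this work; the function name `fcm_memberships`, the Euclidean distance, and the toy cluster centers are assumptions made for illustration only.

```python
import math


def fcm_memberships(point, centers, q=2.0):
    """Memberships of one data point to every cluster center.

    Uses m_i = 1 / sum_j (d_i / d_j) ** (2 / (q - 1)), with Euclidean
    distances (an assumption; the text does not name the metric).
    """
    if q <= 1.0:
        raise ValueError("q must be greater than 1 for the formula to be defined")
    dists = [math.dist(point, center) for center in centers]
    # A point that coincides with one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(dists))]
    power = 2.0 / (q - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


# Hypothetical toy data: two cluster centers on a line and one data point.
centers = [(0.0, 0.0), (4.0, 0.0)]
memberships = fcm_memberships((1.0, 0.0), centers, q=2.0)
print(memberships)  # approximately [0.9, 0.1]
```

The memberships of a point always sum to 1, and values of q closer to 1 push them towards a hard (0 or 1) assignment.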

A popular approach for defuzzification of the fuzzy partition is the maximum membership degree principle, where data point $k$ is assigned to cluster $i$ if and only if its membership degree $m_{ik}$ to that cluster is the largest. Genther et al. [72] proposed a defuzzification method that uses a fuzzy cluster partition in the membership degree computation. Chuang et al. [73] proposed adjusting the membership status of every data point using the membership status of its neighbors. In the proposed approach, a defuzzification method based on Bayesian probability was used to generate a probabilistic model of the membership function for each data point, and the model was applied to the image to produce the classification information. The probabilistic model [74] is calculated as follows:

1. Convert the possibility distributions in the partition matrix (clusters) into probability distributions.
2. Construct a probabilistic model of the data distributions as in Ref. [74].
3. Apply the model to produce the classification information for every data point using Equation (7).

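$$P(A_i \mid B_j) = \frac{P(B_j \mid A_i)\, P(A_i)}{\sum_{k=0}^{c} P(B_j \mid A_k)\, P(A_k)} \qquad (7)$$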

where $P(A_i)$, $i = 0, \ldots, c$, is the prior probability of $A_i$, which can be computed using the method in Refs. [74,75]; the prior probability is always proportional to the mass of each class.
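The three steps above can be sketched with plain Bayes' rule. This is an illustrative reading, not the model of Ref. [74]: the helper names, the use of sample counts as the "mass" of each class, and all numeric values below are assumptions.

```python
def to_probabilities(memberships):
    """Step 1: rescale a possibility distribution so that it sums to 1."""
    total = sum(memberships)
    if total <= 0:
        raise ValueError("memberships must have a positive sum")
    return [m / total for m in memberships]


def priors_from_mass(class_counts):
    """Priors proportional to the mass of each class (assumed: sample counts)."""
    total = sum(class_counts)
    return [count / total for count in class_counts]


def posterior(priors, likelihoods):
    """Bayes' rule: P(A_i | B) from P(A_i) and P(B | A_i) for one observation B."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("observation has zero probability under every class")
    return [x / evidence for x in joint]


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
probs = to_probabilities([0.9, 0.6, 0.5])   # approximately [0.45, 0.30, 0.25]
priors = priors_from_mass([30, 10])         # [0.75, 0.25]
post = posterior(priors, [0.2, 0.6])        # approximately [0.5, 0.5]
print(probs, priors, post)
```

In this toy case the larger prior of the first class is exactly offset by the higher likelihood of the second, so the posterior is split evenly.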

The number of clusters to use was then determined, so that the built model describes the data as well as possible. If too many clusters are chosen, there is a risk of overfitting the noise in the data; if too few are chosen, a poor classifier may result. Therefore, the number of clusters was analyzed against the cross-validation test error. An optimal number of 25 clusters was obtained, and overtraining occurred above this number of clusters. Table 2 shows the results of the fuzziness exponent using different configurations.
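The selection procedure described above can be sketched as a search over candidate cluster counts that keeps the one with the lowest k-fold cross-validation error. The sketch below is an assumption-laden illustration: `toy_evaluate` is a hypothetical stand-in for training the fuzzy c-means classifier on the training fold and measuring its error on the test fold, the folds are contiguous, and the reruns with reshuffling used in this work are omitted.

```python
def kfold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_val_error(evaluate, n_samples, k, n_clusters):
    """Average the test error returned by `evaluate` over the k folds."""
    errors = [evaluate(train, test, n_clusters)
              for train, test in kfold_indices(n_samples, k)]
    return sum(errors) / len(errors)


def select_cluster_count(candidates, evaluate, n_samples, k=10):
    """Return the candidate with the lowest error (ties: fewer clusters)."""
    scored = {c: cross_val_error(evaluate, n_samples, k, c) for c in candidates}
    best = min(scored, key=lambda c: (scored[c], c))
    return best, scored


def toy_evaluate(train, test, n_clusters):
    """Hypothetical error curve with a minimum at 4 clusters; not real data."""
    return abs(n_clusters - 4) / 10.0 + 0.05


best, scored = select_cluster_count([2, 4, 8], toy_evaluate, n_samples=20, k=10)
print(best)  # 4 for this toy error curve
```

Breaking ties in favour of fewer clusters is a design choice made here to prefer the simpler model; the source does not state how ties were handled.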

The smallest defuzzification fuzziness exponent (1.0930) in Table 2 was used to calculate the fitness error for feature selection. The errors were calculated using different cluster configurations, as shown in Table 3.

Table 2
Defuzzification fuzziness exponent calculation configurations.

2-fold cross-validation with 60 reruns | 10-fold cross-validation with 60 reruns
Configuration | Fuzziness exponent | Configuration | Fuzziness exponent

Table 3
Defuzzification fitness error calculation configurations.

2-fold cross-validation with 60 reruns | 10-fold cross-validation with 60 reruns
Configuration | Fuzziness exponent | Configuration | Classification error

The cluster configuration with the least error (as shown in Table 3) was used to select the features for classification. A total of 18 of the 29 features were selected for construction of the classifier. The selected features were: nucleus area (the actual number of pixels in the nucleus; a pixel's area is 0.201 μm²); nucleus gray level (the average perceived brightness of the nucleus); nucleus shortest diameter (the shortest diameter a circle can have when the circle is totally encircled around the nucleus); nucleus longest diameter (the longest diameter a circle can have when the circle is totally encircled around the nucleus); nucleus perimeter (the length of the perimeter around the nucleus); maxima in nucleus (the maximum value of the number of pixels inside a three-pixel radius of the nucleus); minima in nucleus (the minimum value of the number of pixels inside a three-pixel radius of the nucleus); cytoplasm area (the actual number of pixels inside the cytoplasm); cytoplasm gray level (the average perceived brightness of the cytoplasm); cytoplasm perimeter (the length of the perimeter around the cytoplasm); nucleus to cytoplasm ratio (the relative size of the nucleus to the cytoplasm); nucleus eccentricity (the eccentricity of the ellipse that has the same second moments as the nucleus region); nucleus standard deviation (the standard deviation of the gray values of the nucleus region); nucleus variance (the variance of the gray values inside the nucleus region); nucleus entropy (the entropy of the gray values of the nucleus region); nucleus relative position (a measure of how well the nucleus is centered in the cytoplasm); nucleus mean (the mean of the gray values of the nucleus region); and nucleus energy (the energy of the gray values of the nucleus region).
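Several of the gray-level features listed above can be illustrated with a short Python sketch. The formulas are the common first-order histogram definitions (population variance, Shannon entropy in bits, energy as the sum of squared gray-level probabilities) and are assumptions, since the text does not give the exact definitions used; the function names and the toy pixel values are likewise hypothetical.

```python
import math
from collections import Counter

PIXEL_AREA_UM2 = 0.201  # area of one pixel in square micrometres, from the text


def gray_level_features(pixels):
    """First-order statistics of the gray values of one segmented region."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n  # population variance
    probs = [count / n for count in Counter(pixels).values()]
    return {
        "area_pixels": n,
        "area_um2": n * PIXEL_AREA_UM2,
        "mean": mean,
        "variance": variance,
        "std": math.sqrt(variance),
        "entropy": -sum(p * math.log2(p) for p in probs),
        "energy": sum(p * p for p in probs),
    }


def nucleus_to_cytoplasm_ratio(nucleus_pixels, cytoplasm_pixels):
    """Relative size of the nucleus to the cytoplasm, as a ratio of pixel counts."""
    return len(nucleus_pixels) / len(cytoplasm_pixels)


# Hypothetical gray values for a tiny nucleus and cytoplasm region.
nucleus = [10, 10, 20, 20]
cytoplasm = [200] * 16
features = gray_level_features(nucleus)
ratio = nucleus_to_cytoplasm_ratio(nucleus, cytoplasm)
print(features, ratio)
```

For the toy nucleus, two equally frequent gray values give an entropy of 1 bit and an energy of 0.5, and the four pixels correspond to an area of 0.804 μm².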