Implementation of the Simple Additive Weighting Method in Determining Centroids in the Process of Clustering the Poor in Kakatpenjalin Village, Lamongan Regency

Arifan

Clustering is an algorithm in decision support systems that organizes objects into groups of data. The clustering process requires a cluster centre for each desired data group, but it has a known problem: related research states that K-Means clustering results are influenced by the selection of the cluster centre points (centroids). Random selection of the centre points can produce different clustering results on the same data group. Not only K-Means but also K-Medoids suffers from the same problem, so producing a good cluster must start with choosing the right centroids. To solve this problem, the Simple Additive Weighting (SAW) method is used to select the cluster centre points. SAW selects the centres by weighted addition over the dataset: each criterion is given a weight, each criterion has its alternative rating values, and the weighted sum yields a final value for every object. From these SAW sums, the objects with the highest and lowest values are taken to serve as the cluster centres. The research results show that determining the centroids from the SAW ranking produces better clusters than conventional clustering. Across five test runs, the cluster results were consistent, with no change in cluster membership. The locations of the poor and well-off clusters are also easy to identify from the centroids used: because the dataset is ranked, it is known which data serves as the centroid of the poor cluster and which as the centroid of the well-off cluster.


Introduction
Clustering is a process of organizing objects into groups: a dataset is partitioned into data clusters. Members of the same cluster have a high level of data similarity, while members of different clusters have a low level of similarity. The difference between clustering and classification is the absence of a target variable in the clustering process (Pramesti et al., 2017). Among the various clustering methods, K-Means is the one most frequently found and used. The K-Means method works by separating or grouping data based on the distance between objects or points. K-Means is a grouping algorithm developed by MacQueen in 1967 (Saputra, 2020).
However, the K-Means method has a problem in the clustering process. The literature (Anggodo et al., 2017; Arai & Ridho Barakbah, 2007; Jumadi, 2013; Pratama & Harjoko, 2015; Saputra, 2020) states that the clustering results of the K-Means method are sensitive to the selection of the cluster centre points: whenever the centre of a cluster (centroid) is selected at random, K-Means can produce a different clustering on the same data.
Not only K-Means but also K-Medoids, which uses representative objects as cluster centres, is affected in the same way. This follows from the fact that K-Medoids belongs to the same type of clustering as K-Means, namely the partition approach, often referred to as partition-based clustering.
Given these problems in the use of the K-Means and K-Medoids algorithms for clustering, a principled reference is needed for taking data as the cluster centre points. For this, Simple Additive Weighting (SAW) is used to select the cluster centre points (centroids). The SAW method can additionally rank objects; by ranking a dataset, the order of the data from the lowest to the highest value becomes visible. From the ranking value of each object, the objects with the highest and lowest ranking values are then taken to be used as the centroids.
The basic concept of the Simple Additive Weighting method is the weighted addition of criteria together with the alternative ratings on those criteria. The SAW method gives a weight to each criterion and a rating to each alternative on it, so that the weighted sum is the final result (Frieyadie, 2016). From this sum, a ranking value is obtained for every data item. The ranking values of the objects in the dataset can then be sorted from the highest to the lowest rank, or vice versa. An object with a greater value can be concluded to have a higher priority.

Material and Method
The clustering process requires cluster centre points matching the number of clusters that has been determined. This research combines the SAW method with clustering to create dynamic and accurate centroids. Broadly speaking, combining the SAW method with clustering has three stages: first, rank the dataset so that the data can be ordered from the highest to the lowest ranking value; second, take the cluster centres from the highest- and lowest-ranked data; third, initialize the clusters formed and compute the cluster members using K-Means or K-Medoids. Figure 1 shows an overview of the research method used.

Simple Additive Weighting (SAW)
This research uses Simple Additive Weighting to rank the dataset. The resulting data ranking is then used as the reference for taking the cluster centres.
Simple Additive Weighting, often referred to as SAW, is a weighted-addition method. For Multiple Attribute Decision Making problems, SAW is the method most often found and the most widely used (Utomo, 2015). The summation concept in the SAW method is to give a weight to each criterion and a rating to every alternative on each criterion. The SAW method then normalizes the decision matrix (X) into a comparable range of alternative values. The steps for ranking the dataset with the SAW method are:

1. Determine the parameters or criteria to be used in the weighted addition (Ci).

2. Determine the weight of each criterion (W).

3. Assign the alternative rating values for each parameter.

4. Determine the nature of each criterion and normalize the matrix with equation (1):

$$r_{ij} = \begin{cases} \dfrac{x_{ij}}{\max_i x_{ij}}, & \text{if criterion } j \text{ is a benefit} \\[2mm] \dfrac{\min_i x_{ij}}{x_{ij}}, & \text{if criterion } j \text{ is a cost} \end{cases} \tag{1}$$

Description: $r_{ij}$ = the rating value of each alternative; $x_{ij}$ = the attribute value in row $i$ and column $j$ of the matrix; $\max_i x_{ij}$ = the largest value of each parameter; $\min_i x_{ij}$ = the smallest value of each parameter; Benefit = the greater the value, the better; Cost = the smaller the value, the better.

5. Compute the final ranking value of each alternative as the weighted sum in equation (2):

$$V_i = \sum_{j} w_j \, r_{ij} \tag{2}$$

where $w_j$ = the weight of each criterion and $r_{ij}$ = the normalized matrix.
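As an illustration, the SAW normalization and weighted sum described above can be sketched in Python. The decision matrix, weights and criterion types below are hypothetical toy values, not the actual criteria or weights of this study:

```python
import numpy as np

# Hypothetical toy decision matrix: rows are alternatives (households),
# columns are criteria. All values here are invented for illustration.
X = np.array([
    [3.0, 2.0, 1.0],
    [1.0, 4.0, 2.0],
    [2.0, 3.0, 3.0],
])
weights = np.array([0.5, 0.3, 0.2])          # W: weight of each criterion
is_benefit = np.array([True, True, False])   # benefit vs cost criterion

def saw_scores(X, weights, is_benefit):
    """Normalize the matrix and return the weighted SAW score of each row."""
    R = np.empty_like(X)
    # Benefit: r_ij = x_ij / max_i(x_ij); Cost: r_ij = min_i(x_ij) / x_ij
    R[:, is_benefit] = X[:, is_benefit] / X[:, is_benefit].max(axis=0)
    R[:, ~is_benefit] = X[:, ~is_benefit].min(axis=0) / X[:, ~is_benefit]
    return R @ weights   # V_i = sum_j w_j * r_ij

scores = saw_scores(X, weights, is_benefit)
# Centroid candidates: the rows with the highest and lowest SAW score.
c_high, c_low = X[np.argmax(scores)], X[np.argmin(scores)]
```

The two extreme rows, `c_high` and `c_low`, are exactly the objects that the paper takes as the initial cluster centres.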

Clustering
Clustering is a process of observing or organizing objects into classes whose members are similar to one another (Sindi et al., 2020). The clustering algorithm differs from classification in that there is no target variable in the clustering process. Clustering is also often used as an initial step of the data mining process when conducting an analysis. The concept of clustering is to group data based on the degree of similarity between objects; clustering is therefore included among the unsupervised learning methods (Anggreini, 2019). The logic of the basic concept of clustering can be seen in Figure 2.

K-Means
K-Means clustering is a grouping method that aims to partition objects or data collections into groups. The K-Means algorithm classifies data into clusters: data with a high level of similarity ends up in the same cluster, while data with a high level of dissimilarity ends up in different clusters (Rahayu et al., 2019). The similarity of an object to a cluster centre point is measured using the Euclidean Distance. The Euclidean distance is used based on a comparative study of distance measures for k-means, which showed that the optimal distance measure for grouping music by mood is the Euclidean Distance (Harsemadi, 2018).
By combining the SAW method with K-Means, the determination of the cluster centres (centroids) changes: the centres are taken from the objects with the largest and smallest SAW ranking values. The stages of K-Means clustering are:

1. Determine the clusters to be formed. This study forms 2 clusters.

2. Select the centroid points according to the clusters to be created. Because the SAW method is used to select the centroids, they are taken from the objects with the largest and smallest SAW ranking values.

3. Group each object into the specified cluster. The similarity between objects is calculated with the Euclidean Distance in equation (3); an object is grouped into the cluster with the smallest Euclidean Distance value (Anggodo et al., 2017):

$$d(x, c) = \sqrt{\sum_{k=1}^{p} (x_k - c_k)^2} \tag{3}$$

4. Recalculate each cluster centre as the mean of the members of the resulting cluster.

5. Repeat steps 3-4 until the latest and the previous clustering results are the same. The clustering process stops when there are no more changes in the cluster results; the result of the last iteration is taken as the clustering result (Rahayu et al., 2019).
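The steps above can be sketched as a minimal Python implementation; the only change from textbook K-Means is that the initial centres are passed in explicitly (e.g. the highest- and lowest-ranked SAW rows) instead of being drawn at random. The toy data below are invented:

```python
import numpy as np

def kmeans(X, init_centroids, max_iter=100):
    """K-Means where the initial centres are supplied by the caller,
    e.g. the rows with the highest and lowest SAW ranking values."""
    centroids = init_centroids.astype(float).copy()
    for _ in range(max_iter):
        # Assign each object to the nearest centre (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centre as the mean of its members
        # (this sketch assumes no cluster becomes empty).
        new = np.array([X[labels == k].mean(axis=0)
                        for k in range(len(centroids))])
        # Stop when the centres no longer change.
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Toy usage: two obvious groups, centres taken from the data extremes.
X = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.0]])
labels, centres = kmeans(X, X[[0, 3]])   # labels -> [0, 0, 1, 1]
```

Because the initial centres are fixed, repeated runs on the same data always produce the same grouping, which is the consistency property the paper exploits.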

K-Medoids
K-Medoids is a classical (partitioning) clustering technique that groups a dataset of n objects into k clusters (Anggreini, 2019). K-Medoids uses actual objects as the centroid representatives of each cluster.
By combining the SAW method with K-Medoids, the determination of the centroids changes: the data used as cluster centres are taken from the objects with the largest and smallest ranking values. The steps of K-Medoids clustering are:

1. Initialize as many cluster centres (centroids) as the number of clusters (k) to be created; the cluster centres are taken from the highest- and lowest-ranked results of the ranking.

2. Measure the similarity of each object to the centroids with the Euclidean Distance in equation (4):

$$d(x, c) = \sqrt{\sum_{k=1}^{p} (x_k - c_k)^2} \tag{4}$$

3. Pick an object in each cluster as a new centroid candidate.

4. Calculate the Euclidean distance of every object to each cluster.

5. Calculate the deviation S = (new total distance − old total distance). If S < 0, swap the objects to obtain a new set of k objects as medoids (Sindi et al., 2020).

6. Repeat steps 3 to 5 until the medoids no longer change. The value of k for K-Medoids clustering can then be selected based on the smallest Davies-Bouldin Index (DBI) value (Sindi et al., 2020).
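The medoid-swap loop above can be sketched as follows. This is a simplified PAM-style implementation, a sketch under the assumptions of Euclidean distance and caller-supplied initial medoids; the toy data are invented:

```python
import numpy as np

def kmedoids(X, medoid_idx, max_iter=100):
    """K-Medoids where the initial medoids are supplied by the caller,
    e.g. the indices of the highest- and lowest-ranked SAW rows."""
    medoid_idx = list(medoid_idx)

    def total_cost(idx):
        # Distance of every object to its nearest medoid, summed.
        d = np.linalg.norm(X[:, None, :] - X[idx][None, :, :], axis=2)
        return d, d.min(axis=1).sum()

    d, cost = total_cost(medoid_idx)
    for _ in range(max_iter):
        improved = False
        # Try swapping each medoid with each non-medoid object and keep
        # the swap if the total distance decreases (deviation S < 0).
        for k in range(len(medoid_idx)):
            for i in range(len(X)):
                if i in medoid_idx:
                    continue
                trial = medoid_idx.copy()
                trial[k] = i
                d_trial, cost_trial = total_cost(trial)
                if cost_trial < cost:
                    medoid_idx, d, cost = trial, d_trial, cost_trial
                    improved = True
        if not improved:   # medoids are stable: stop
            break
    return d.argmin(axis=1), medoid_idx

# Toy usage with the same two obvious groups as before.
X = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.0]])
labels, medoids = kmedoids(X, [0, 3])   # labels -> [0, 0, 1, 1]
```

As with K-Means, fixing the initial medoids to the SAW-rank extremes makes the result of repeated runs deterministic.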

Silhouette Coefficient
The silhouette coefficient is a method used to assess the quality and strength of a clustering result by measuring the similarity of objects within a cluster. It combines the separation and cohesion measures, each with its own function: cohesion measures how close, or how similar, the objects in one cluster are, while separation measures the distance of the objects in a cluster from the other clusters (Pramesti et al., 2017). The steps for computing the silhouette coefficient of a clustering are as follows. The first step determines the average distance between one object and the other objects in the same cluster, using equation (5).
$$a(i) = \frac{1}{|A| - 1} \sum_{j \in A,\, j \neq i} d(i, j) \tag{5}$$

Description: $j$ = the other data in the cluster; $|A|$ = the number of data in the cluster; $d(i, j)$ = the distance between data $i$ and $j$ in the same cluster.

The next step determines the average distance between an object and the objects outside its cluster, i.e. in a different cluster, with equation (6), and then takes the minimum of these averages with equation (7):

$$d(i, C) = \frac{1}{|C|} \sum_{j \in C} d(i, j) \tag{6}$$

Description: $d(i, C)$ = the average distance of the $i$-th data to all objects in another cluster $C$, where $A \neq C$.

$$b(i) = \min_{C \neq A} d(i, C) \tag{7}$$

Finally, calculate the silhouette coefficient value of each object with equation (8):

$$s(i) = \frac{b(i) - a(i)}{\max\big(a(i), b(i)\big)} \tag{8}$$

Description: $a(i)$ = the average distance of the $i$-th data within its own cluster; $b(i)$ = the minimum of the average distances of the $i$-th data to the data in the other clusters; $\max(b(i), a(i))$ = the maximum of $a(i)$ and $b(i)$. The silhouette coefficient has a value range of −1 to 1: the closer the value is to 1, the better the quality of the clustering result; conversely, the closer it is to −1, the worse the clustering result.
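The equations (5)-(8) translate directly into code; a compact Python version, with small invented data for the usage example, might look like:

```python
import numpy as np

def silhouette(X, labels):
    """Mean silhouette coefficient of a clustering, following eqs. (5)-(8)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    s = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        same[i] = False
        # a(i): average distance to the other objects in the same cluster (eq. 5).
        a = D[i, same].mean() if same.any() else 0.0
        # b(i): smallest average distance to the objects of any other cluster (eqs. 6-7).
        b = min(D[i, labels == c].mean() for c in set(labels) if c != labels[i])
        # s(i): silhouette value of object i (eq. 8).
        s[i] = (b - a) / max(a, b)
    return s.mean()

# Toy usage: two tight, well-separated clusters give a value close to 1.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
val = silhouette(X, np.array([0, 0, 1, 1]))
```

Note that this sketch scores a singleton cluster as 1 (its a(i) is taken as 0); other conventions exist for that edge case.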

Results and Discussion
The data used in this study are data on the poor in Kakatpenjalin village, obtained through village deliberations aimed at identifying the disadvantaged groups of the community. To determine the underprivileged community, variables aligned with the SIKS-NG application are used; the variables in this study are Residential Status, Floor Condition, Wall Condition, Vehicles, Dependents, Income and Education. SIKS-NG is a management application for improving and proposing new Integrated Database (BDT) data, which also contains a module for repairing and proposing non-PKH Food Social Assistance (BSP) data (Kementerian Sosial Republik Indonesia, 2019). This chapter presents the results and discussion of the SAW ranking process and of the clustering process that takes the centroids from the SAW ranking values, comparing it with clustering using conventional centroid selection. The tests use 50 data records.

Simple Additive Weighting (SAW)
The steps of the ranking calculation with the SAW method are:

1. Determining Criteria. Determine the parameters or criteria (Ci) used as the reference for distinguishing the poor from the well-off, shown in Table 1.

2. Providing Weight Values. Determine the weight, or level of importance (W), of each criterion. The weights of these parameters (criteria), which will be used in the summation process, are shown in Table 2.

3. Determining Alternative Ratings for Each Criterion. The third step determines the rating of each alternative on the existing criteria, shown in Table 3.

4. Determining the Nature of the Criteria. The fourth step determines the nature of each criterion used in the SAW calculation, shown in Table 4.

5. SAW Rank Determination. Weighted addition and multiplication is the final step of the SAW ranking process: the normalized matrix is multiplied by the weight of each criterion and summed, giving the final ranking values shown in Table 5.

Clustering
In the clustering process, 2 centroid-selection scenarios are carried out and 5 trials are conducted. The first scenario takes the centroids from the SAW ranking, and the second scenario takes them conventionally. In this study the data are grouped into 2 clusters, namely the poor community cluster and the well-off community cluster.
In the centroid selection based on the SAW ranking, the data taken as the centroid of the poor cluster is the data with the largest ranking value, and the centroid of the well-off cluster is the data with the smallest ranking value. The centroids taken from the SAW ranking are shown in Table 6.

K-Means
In the data-grouping process with the K-Means method, the cluster centres are taken from objects only once, at the start of the first iteration; afterwards the centroids are obtained from the average value of each cluster. The results of the 2 centroid-determination scenarios with the K-Means method are shown in Table 7, and the comparison graph can be seen in Figure 3.

K-Medoids
K-Medoids differs from K-Means in how the cluster centre is taken: K-Medoids always uses a representative object as the cluster centre (centroid) in every iteration. In this study, taking the centroids from the SAW ranking values applies to the first iteration. The results of the 2 centroid-determination scenarios with the K-Medoids method are shown in Table 8, and the comparison graph can be seen in Figure 4.

System Testing
System testing is carried out on the 2 cluster-centre selection scenarios using the K-Means and K-Medoids methods, with 5 runs each, in order to compare the Silhouette Coefficient values of the clustering results of the two scenarios: centroids taken from the SAW ranking values and centroids taken at random (conventionally).
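The repeated-run comparison can be sketched as a small experiment. The synthetic 2-D data below stand in for the 50 village records and are not the study's data; the point is only that fixed initial centres make every run identical, while random centres need not:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the 50 records: two loose groups in 2-D.
X = np.vstack([rng.normal(0.0, 1.0, (25, 2)), rng.normal(6.0, 1.0, (25, 2))])

def assign(X, centroids):
    """Nearest-centre labels under Euclidean distance."""
    return np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2).argmin(axis=1)

def run_kmeans(X, centroids, iters=50):
    centroids = centroids.astype(float)
    for _ in range(iters):
        labels = assign(X, centroids)
        # Keep the old centre if a cluster happens to empty out.
        centroids = np.array([X[labels == k].mean(axis=0) if (labels == k).any()
                              else centroids[k] for k in (0, 1)])
    return tuple(assign(X, centroids))

# Scenario 1: fixed centres (the SAW-rank extremes) -> the same membership
# in every one of the 5 runs.
fixed_init = np.array([X[0], X[-1]])
fixed_runs = [run_kmeans(X, fixed_init) for _ in range(5)]

# Scenario 2: random centres -> memberships may differ between runs.
random_runs = [run_kmeans(X, X[rng.choice(len(X), 2, replace=False)])
               for _ in range(5)]
```

Scenario 1 is deterministic by construction, mirroring the fixed data groups the paper reports for the SAW-based scenarios.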

K-Means Test Results
The test results for the K-Means clustering method are shown in Table 9; the best average Silhouette Coefficient value obtained is 0.425. Across the 5 trials, the SAW + K-Means combination produced the same (fixed) data groups every time, while plain K-Means showed changes in the grouping results.

K-Medoids Test Results
The test results for the K-Medoids clustering method are shown in Table 10; the highest average Silhouette Coefficient value is 0.412. Across the 5 trials, K-Medoids + SAW produced the same (fixed) data groups, whereas plain K-Medoids showed changes in the grouping results. From the tests carried out, it was found that combining the SAW method with clustering produces consistent data groups. The resulting groups also have a definite class position: the grouping always follows where the cluster centre is placed. In this study the centre of the poor cluster is placed at C1, so it is certain that C1 is the group of poor people and C2 the group of well-off people. Conventional clustering, in contrast, can produce varying data groups, and C1 and C2 must additionally be analysed to find out which group is classified as poor and which as well-off.

Table 8. K-Medoids Clustering Results

Test   K-Medoids+SAW C1   K-Medoids+SAW C2   K-Medoids C1   K-Medoids C2
1      30                 20                 20             30
2      30                 20                 21             29
3      30                 20                 11             39
4      30                 20                 11             39
5      30                 20                 27             23

Figure 4. Comparison of K-medoids Results
In terms of the quality and accuracy of the clustering results, taking the cluster centre points from the SAW ranking yields a higher average Silhouette Coefficient than random centroid selection: the average Silhouette Coefficient of K-Means+SAW is 0.425 versus 0.338 for K-Means, and of K-Medoids+SAW 0.412 versus 0.394 for K-Medoids.

Conclusion
Based on the testing and evaluation of random centroid determination against centroid determination based on the SAW ranking values, the following conclusions are obtained: (1) The test results of the two clustering methods, K-Means and K-Medoids, show that centroid determination based on the SAW ranking yields a higher Silhouette Coefficient than random centroid selection, with average Silhouette Coefficient values of 0.425 for K-Means+SAW versus 0.338 for K-Means, and 0.412 for K-Medoids+SAW versus 0.394 for K-Medoids. The Silhouette Coefficient is used to assess the quality or strength of a cluster and to see how well an object is placed in its cluster by measuring the distances between objects in a cluster (Anggara et al., 2016). From these Silhouette Coefficient values, it can be concluded that determining the centroids with the SAW method produces better cluster quality in the clustering process than determining the centroids randomly. (2) Centroid determination based on the SAW ranking enables the K-Means and K-Medoids methods to produce consistent data groups, in contrast to random centroid determination. Across 5 trials, selecting the centroids from the SAW ranking produced the same (fixed) data groups each time, while random centroid selection produced varying data groups. (3) Applying Simple Additive Weighting to select the centroids is only suitable for clustering with 2 clusters, because only the highest and lowest SAW ranking values are chosen as centroids.

Suggestion
For reference in the development of further research, the author puts forward the following suggestions: (1) The use of the SAW method in the clustering process has a shortcoming in the centroid search: clustering is slower because the dataset must first be ranked to find the cluster centre points (centroids). It is therefore necessary to separate the ranking process from the clustering process and to store the ranking result of each data item, so that no ranking is needed during clustering and the clustering process becomes faster.
(2) The selection of centroids with SAW ranking values is only suitable for clustering processes with 2 clusters, so it is necessary to add other methods that are suitable for use in clustering more than 2 clusters.