Implementation of Adaptive Neuro-Fuzzy Inference System and Image Processing for the Design of a Paper Age Prediction Application

Technology today is widely misused by people who intend to forge the paper of documents and books. One way to verify the authenticity of a paper is to determine its age. The age of paper can be determined in several ways: carbon dating, uranium dating, and potassium-argon dating. However, these methods have weaknesses: they require sophisticated equipment at a high cost, take a long time to produce results, and offer limited access. To address this problem, the researchers built an application that identifies the age range of a sheet of paper with a faster process and at low cost, and that does not have to be operated by laboratory staff alone. The Paper Age Prediction Application is desktop-based and written in the MATLAB programming language, using the Sugeno (TSK) ANFIS method with Gaussian membership functions. Image processing takes the average values of C, M, Y, and K from 70 images, which are used as a database and trained with ANFIS. Data were collected through interviews, observations, and literature studies, and the application was developed with the prototype method. The test results showed a success rate of 100% in identifying the 60 data that had been trained and 42.5% for the 40 data that had not been trained.


INTRODUCTION
Paper is a thin, flat material produced by compressing fibers derived from pulp. Paper is of high value if it contains valuable data and information, such as the paper in books and documents [1]. Over time, paper ages, becoming brittle and easily destroyed. The age difference between papers can be seen from the color change [2].
Research [3] states that there are several ways of determining the age of a paper or manuscript, namely internal evidence and external evidence. Internal evidence is an attempt to determine the manuscript's age from its own data, such as the years recorded, events narrated, and characters mentioned [4]. External evidence, in contrast, is an effort to determine the manuscript's age from factors outside the data in the manuscript, one of which is the paper material [5]. Carbon dating methods can be used for manuscripts or papers that have organic content [6].
Research [7] states that paper aging is characterized by a yellowish-brown change in paper color around the edges. As technology develops, computers can read an image with image processing methods to obtain the identity of the image. Therefore, a computer can also read a paper image to obtain the identity of that image [8].
Neuro-fuzzy logic is a hybrid logic combining fuzzy logic with artificial neural networks. Neuro-fuzzy logic conducts training using neural networks, but the network structure is interpreted with fuzzy rules. The Fuzzy Neural Network, introduced by Ishibuchi, is a neural network learning method that utilizes expert knowledge represented in the form of IF-THEN rules [9].
ANFIS (Adaptive Neuro-Fuzzy Inference System) is an inference system that combines Neural Networks with Fuzzy Inference Systems [10] [11]. ANFIS uses Sugeno or Takagi-Sugeno inference models [12]. In ANFIS, the rules are set by applying a learning algorithm to a set of data, which also allows the rules to adapt [13]. Because ANFIS merges Fuzzy Logic and Artificial Neural Networks, its advantage is that the two complement each other's opposite characteristics in learning ability and in the ability to explain the reasoning process [14] [15]. Its drawback is that the system's success is determined by the data used as the source of learning. To get optimal results, data with a high level of accuracy is needed [16].
In a previous study, Nurul Hikmah (Department of Electrical Engineering, University of Indonesia, 2008) researched ANFIS and image processing on the retina of the human eye. [17] also conducted research on ANFIS and image processing of the human iris. These studies found that ANFIS and image processing methods can be used to solve image identification problems [18].
With an application that can predict the age of paper, we do not need to buy expensive, sophisticated equipment. The process becomes simpler and faster, and everyone can use the application, not just people with special expertise such as archaeologists, geophysicists, or other experts [19]. Therefore, in this study, ANFIS and image processing methods are implemented in an application to predict the age of paper [20]. This application can help both experts and ordinary people predict the age of paper in books, important documents, and other papers.

METHOD

Data Collection Techniques
a. Observation
This study made observations at the Main Library of UIN Syarif Hidayatullah Jakarta to examine the differences in color and texture seen on the sheets of book paper in the library. Based on the observations, the oldest books on HVS paper in the library were published in 1969 [21]- [26].
b. Literature Study
Data were also collected through a literature study, namely by looking for references relevant to the object of study. References were searched in libraries, journals on application design, and scientific papers on application research.

System Development Engineering
This research develops the system with the Prototype development method. The Prototype method has four stages in its development cycle: Needs Collection and Analysis, Design, Building a Prototype, and Evaluation and Testing [27]- [29].

Design
The Data Flow Diagram was designed with Power Designer 6 and the flowchart with Microsoft Office Visio 2007. The database was created with Notepad, and the application interface was built with the MATLAB GUI (Graphical User Interface) tool.

Building a Prototype
In this stage, the application designed in the rapid design stage is coded. Coding is done in the MATLAB programming language.

Evaluation and Testing
This stage is carried out in three processes: application testing, documentation, and test results analysis. Application testing aims to see the results of the created application, whether it runs well or not.
The tests carried out are black box tests. Black box tests are run to observe whether the program successfully receives input, processes it, and produces the appropriate output, without looking at the application's source code.

Figure 1 Data Flow Diagram Paper Age Prediction Application
The workflow starts with entering the scanned image of the paper to be tested, after which the system reads and processes the image. The system automatically crops the image to 100x350 pixels, and the cropped image is called the Region of Interest (ROI). The ROI is the area chosen for calculating the CMYK value. The system then marks (blocks) the ROI on the image to read its RGB values. The RGB values are converted into CMYK values, and the average CMYK value is obtained. This value then goes through a matching process against the Fuzzy Inference System built from 60 data trained with ANFIS.

Database Creation
Artificial category paper is deliberately aged paper, made by soaking white HVS paper in tea or coffee and then drying it in the sun. The young category paper comes from books published from 1999 to 2013. The medium category paper comes from books published from 1984 to 1998. The old category paper comes from books published from 1969 to 1983. The training data, in the form of scanned images of sheets of paper, are shown in Figure 1 to Figure 4. The database used in this study contains the average values of C, M, Y, K, and the weighting from the training data images. As training data, the author used 60 scanned images of sheets of paper from books, 15 images per category. The images must be acquired with a scanner so that each paper has the same lighting. The Paper Image Training Data results per category are shown in Table 1 to Table 4.
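The year ranges above can be written down as a small lookup. This is a minimal sketch in Python rather than the application's MATLAB code; the function name and the "unknown" fallback are illustrative assumptions. The artificial category is left out because it is defined by treatment, not by publication year.

```python
# Year ranges are taken from the text; the names below and the
# "unknown" fallback are illustrative assumptions.
CATEGORY_RANGES = {
    "old": (1969, 1983),
    "medium": (1984, 1998),
    "young": (1999, 2013),
}


def category_for_year(year):
    """Return the age category for a publication year, or 'unknown'."""
    for name, (first, last) in CATEGORY_RANGES.items():
        if first <= year <= last:
            return name
    return "unknown"


print(category_for_year(1975))
```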

Building a Prototype
At this stage, the pre-designed application is created using the MATLAB programming language, together with a database in the form of a .txt file created with Notepad. The User Interface built with the MATLAB GUI is shown in Figure 5.

Process Crop Region of Interest Block
This process cuts a rectangle of 100x350 pixels out of the image. The cropped ROI is then marked (blocked) by the system so that it becomes the testing area of the image.
In each image, one crop is taken at the upper left edge of the paper because that part shows the visible difference in paper color between categories. The system then uses the blocked Region of Interest to calculate the average values of C, M, Y, and K. The crop isolates the part of the paper from which the CMYK identity is obtained, which is later entered into the main database as training data. The database is then trained and tested with the ANFIS method. Code on the application:
C=imcrop(I,[1 1 100 350]); % rect = [xmin ymin width height]
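The cropping step can be illustrated outside MATLAB as well. The following is a minimal Python sketch, not the application's code: the image is assumed to be a list of pixel rows, and the function name, its parameters, and the test page size are hypothetical.

```python
def crop_roi(image, left=0, top=0, width=100, height=350):
    """Return the width x height block whose top-left corner is (left, top).

    `image` is a list of rows; each row is a list of pixels.
    """
    if not image or top + height > len(image) or left + width > len(image[0]):
        raise ValueError("image is smaller than the requested ROI")
    return [row[left:left + width] for row in image[top:top + height]]


# Hypothetical 400-pixel-wide, 600-pixel-high page of white RGB pixels.
page = [[(255, 255, 255)] * 400 for _ in range(600)]
roi = crop_roi(page)
print(len(roi), len(roi[0]))
```

Taking the block from the top-left corner mirrors the text's choice of the upper left edge of the paper as the region where the color difference between categories is visible.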

CMYK Color Feature Extraction
The color extraction process starts by taking the red, green, and blue values of the original image. These values are then converted into cyan, magenta, yellow, and black values, which are later used as input parameters. To describe the image with single values, the C, M, Y, and K values are each summed over the ROI and divided by the number of pixels, giving the average of each channel. The image is thus characterized by its average C, M, Y, and K values, which are used as the feature parameters for ANFIS; in that stage, a matching process takes place using the Gaussian membership function. Code on the application:
c=mean2(C); % average value of C
m=mean2(M); % average value of M
y=mean2(Y); % average value of Y
k=mean2(K); % average value of K
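The conversion and averaging can be sketched as follows. The text does not give the conversion formula used by the application, so this Python sketch assumes the common profile-free RGB-to-CMYK formula; the function names are illustrative, and a profile-based conversion would give different numbers.

```python
def rgb_to_cmyk(r, g, b):
    """Convert 8-bit RGB to CMYK fractions in [0, 1].

    Assumption: naive profile-free formula, not confirmed by the source.
    """
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    k = 1.0 - max(r, g, b)
    if k == 1.0:  # pure black: avoid dividing by zero
        return 0.0, 0.0, 0.0, 1.0
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return c, m, y, k


def mean_cmyk(pixels):
    """Average C, M, Y, K over an iterable of (r, g, b) pixels."""
    totals = [0.0, 0.0, 0.0, 0.0]
    count = 0
    for r, g, b in pixels:
        for i, value in enumerate(rgb_to_cmyk(r, g, b)):
            totals[i] += value
        count += 1
    if count == 0:
        raise ValueError("no pixels to average")
    return tuple(total / count for total in totals)


# Hypothetical ROI of two pixels: one white, one yellowish.
features = mean_cmyk([(255, 255, 255), (240, 220, 170)])
print(features)
```

The four averages play the role of the c, m, y, and k values computed with mean2 in the application and would form the four-element input vector for ANFIS.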

Matching Process
The matching process is the recognition of the input image by the FIS, which has previously been tested against the training data in the ANFIS editor in MATLAB, the tool used for the ANFIS design. The results of FIS Training and Testing Against Training Data are shown in Figure 6. In the test result image, the o sign marks the training data input, while the red * sign marks the output of the training and testing process with ANFIS. From this process, an average error of 0.14008 was obtained. The smaller the average error value, the better the input data recognition process.

ANFIS Training and Testing Results
From the ANFIS training and testing process against the training data, the ANFIS architecture was formed. The architecture has five layers, with four inputs, three membership functions per input, a fuzzy rule base of 81 rules (3 membership functions raised to the power of 4 inputs, 3^4 = 81), 81 output membership functions, and one output. The resulting ANFIS architecture is shown in Figure 7.
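The rule count and the way such a rule base produces one output can be sketched in Python. This is an illustration under stated assumptions, not the trained system: it assumes first-order Sugeno consequents and the product operator for rule firing strength, and all membership and consequent parameters below are placeholders rather than values learned by ANFIS.

```python
import itertools
import math

N_INPUTS = 4  # average C, M, Y, K
N_MFS = 3     # membership functions per input

# Each rule picks one membership function index per input: 3**4 = 81 rules.
RULES = list(itertools.product(range(N_MFS), repeat=N_INPUTS))


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_output(x, mf_params, consequents):
    """Weighted-average output of a first-order Sugeno rule base.

    mf_params[i][j] = (center, sigma) of membership function j of input i.
    consequents[r] = (p1, p2, p3, p4, bias) for rule r.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for rule, coeffs in zip(RULES, consequents):
        strength = 1.0  # product of the memberships named by the rule
        for i, j in enumerate(rule):
            center, sigma = mf_params[i][j]
            strength *= gaussian_mf(x[i], center, sigma)
        rule_output = sum(p * xi for p, xi in zip(coeffs, x)) + coeffs[-1]
        weighted_sum += strength * rule_output
        weight_total += strength
    return weighted_sum / weight_total


# Placeholder parameters (assumptions, not trained values).
mf_params = [[(0.0, 0.25), (0.5, 0.25), (1.0, 0.25)] for _ in range(N_INPUTS)]
consequents = [(0.0, 0.0, 0.0, 0.0, 2.0)] * len(RULES)

print(len(RULES))
print(sugeno_output([0.1, 0.2, 0.3, 0.05], mf_params, consequents))
```

With every rule given the same constant consequent, the weighted average returns that constant, which is a convenient sanity check before substituting trained parameters.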

Output Design
After the training data has gone through the training and testing process, the output can be known by: The result of running the application is shown in Figure 8.

Application Testing
The application is tested with black box testing and by measuring its success rate. The black box tests selected are functional testing and user acceptance testing. The functional testing scenario is shown in Table 4.

DISCUSSION
The application correctly identified 77 images and incorrectly identified 23 images out of a total of 100 input data. The overall success percentage was 77%, to which the artificial category contributed 15%, the young category 19%, the medium category 21%, and the old category 22%. According to [30], based on the research and testing carried out there, the ANFIS model is very suitable as an artificial intelligence inference model in systems based on automatic inspection, especially for testing the quality of PCB chips, because the ANFIS model with the hybrid trapezoid mf model was shown to have a very small error rate, namely 4.0186e-007, and an accuracy on test data of 99%.
Of the 60 data that were trained, all images were correctly identified, so the success rate in identifying data that has been trained is 100%. Meanwhile, of the 40 data that had not been trained, that is, had not gone through the ANFIS training process, 17 were correctly identified and 23 were incorrectly identified, so the success rate in identifying data that has not been trained is 42.5%. In [31], where the input consisted of 60 samples, applying the ANFIS method before image processing of the thermogram input gave an error value of 0.6395 at an influence range of 0.5, and the error value fell to 0.4199 when thermogram image processing was performed on the input data before ANFIS classification with the same influence range.
On data that has not been trained, the old paper category has an accuracy of 70%, the medium paper category 60%, and the young paper category 40%. For the artificial category test data, the application cannot predict correctly. This is because the FIS output value of the paper image under test falls outside the range of the artificial category training data. According to the research in [32], the accuracy value is quite high. ANFIS classification is a fuzzy inference technique that builds a model from input and output data pairs. The error made during training, or the difference between the FIS output and the training data, is 0.10475, with a recognition ability or accuracy of 67.5%.
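The reported percentages can be cross-checked with a few lines of arithmetic. The Python sketch below uses the counts stated in the paper (60 trained images with 15 per category, 40 untrained images with 17 identified correctly); the split of the untrained images into 10 per category is an assumption inferred from 40 images over four categories, not a figure given in the text.

```python
TRAINED_PER_CATEGORY = 15      # from the text: 60 images, 15 per category
UNTRAINED_PER_CATEGORY = 10    # assumption: 40 untrained images split evenly

# Accuracy on untrained data per category, in percent, as reported.
untrained_accuracy_pct = {"artificial": 0, "young": 40, "medium": 60, "old": 70}

untrained_correct = {
    name: pct * UNTRAINED_PER_CATEGORY // 100
    for name, pct in untrained_accuracy_pct.items()
}
# All trained images were identified correctly.
total_correct = {
    name: TRAINED_PER_CATEGORY + hits for name, hits in untrained_correct.items()
}

untrained_total = UNTRAINED_PER_CATEGORY * len(untrained_accuracy_pct)
all_total = untrained_total + TRAINED_PER_CATEGORY * len(untrained_accuracy_pct)

untrained_rate = 100.0 * sum(untrained_correct.values()) / untrained_total
overall_rate = 100.0 * sum(total_correct.values()) / all_total

print(total_correct)
print(untrained_rate, overall_rate)
```

Under that assumption, the per-category totals reproduce the 15, 19, 21, and 22 figures, which suggests those values are each category's share of the 100 inputs rather than per-category accuracies.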

CONCLUSION
Based on the research conducted, the author concludes that the identity of each image can be read through image processing by converting RGB colors into CMYK. The Adaptive Neuro-Fuzzy Inference System method can be used as a matching tool for predicting the age of paper against paper image data that has been trained. In testing on the input data, the application achieved an overall success rate of 77%. For data that has been trained, the application reaches 100% accuracy, while for data that has not been trained, the accuracy is only 42.5%. Training on more sample data is expected to raise the accuracy of the identification results. With this application, the age of paper can easily be predicted, as a range of publication years, without laboratory tests.