THE DEVELOPMENT OF DATA WAREHOUSE FROM PAYMENT POINT SERVICES OF IT BUSINESS SOLUTION PROVIDER

The goals of the research were to develop a data warehouse loaded from operational database of payment point service in IT business solution provider. The development scope was data analysis using Online Analytical Processing (OLAP) tools for identifying the trends of transaction and agent registration, and created reports and dashboards. The data warehouse was developed with Kimball method as known as the nine-step design methodology. The data and requirement were collected by observation and interview with Chief Technology Officer (CTO). The data warehouse was analyzed by OLAP tool provided by Pentaho Business Analytics software with additional plugin Pivot4J. The results show that the trends of transaction and agent registration between 2014 and 2015. It can be concluded that developed data warehouse can be used as an analysis tool to know the trends information.


I. INTRODUCTION
The research object is an Information Technology (IT) business solution provider company in Jakarta.It is engaged in IT and outsourcing services such as software development, hardware development, and system integration.Its target markets are the financial sector, telecommunications, and government sectors.It has a customer base that includes most of the banking and financial institutions in Indonesia.The businesses focus on payment point services, procurement and the development of an Electronic Data Capture (EDC), Card Management System (CMS), PIN-pad, Public Key Infrastructure (PKI), and a kiosk machine.
The growth of digital and mobile phone era brings changes to the online and real-time payment of bills and purchases.From this growth, the company provides a payment point services that can run multiplatform (PC, Android devices, and EDC) including payment services and purchase services.The provided payment services are bill payment of telephone (Telkom), electricity (PLN), water (PDAM), subscription TV (Indovision and Aora TV), finance installment (Adira Finance), and payment of postpaid balance (Smartfren and Kartu Halo).Meanwhile, the provided purchase services are electric voucher (IM3, Mentari, SimPati, Kartu AS, XL, Axis, Smartfren, Bolt, and Esia), electricity token (PLN), and Orange TV voucher.
IT business solution provider applies the principle of partnership with business entities and individuals.One partner also is called an agent that can have more than one outlet that spreading in different cities.The agent must register first before using the provided service.The number of transaction on payment point services increases in line with the number of the agent.Due to the increasing number of transactions, IT business solution provider requires an update of the historical data that can be viewed from different dimensions especially the trends in transaction and agent registration.However, the data on all activities occurring in the payment point service is stored in the operational database and still can be changed.Thus, the operational database cannot store historical data and cannot provide the information accurately.The information obtained from operational databases cannot be used by IT business solution provider for analysis to know the occurred trend information of transaction and agent registration.Therefore, there is a need for the data warehouse that can store historical data and provide the necessary information for decision making regarding appropriate promotional strategies based on the trends.
The data warehouse stores the historical data from operational database for decision making purposes (Pardillo, Mazón, & Trujillo, 2010).The data warehouse has distinguished characteristics from the operational database.The data warehouse contains a collection of data that is subject oriented, integrated, time variant, and non-volatile (Abai, Yahaya, & Deraman, 2013).The data warehouse using data models based on multidimensional data model is known as a data cube.The data cube allows the data to be modeled and viewed in multiple dimensions.The data warehouse schema has two types of elements, namely facts and dimensions (Mohammed, 2014).The fact and dimension have the table called as the fact table and dimension table.The fact table contains data about a specific topic that is usually associated with a numerical value (measurable) which is known as a measure.Meanwhile, the dimension table contains data about an entity's perspective.
The tool of the data warehouse is Online Analytical Processing (OLAP).It can be used to analyze the data and provide information that is built on the multidimensional data model (Paskarina & Ayub, 2010).OLAP can be used to analyze the data with many operations, such as roll up, drill down, slice, dice, and pivot.Roll up operation is used to view the data as a whole through data clustering.Drill down operation is to describe the data in more detail so that the obtained information becomes more detailed.Then, slice operation is for dividing data to a certain dimension so that it can focus on the desired perspective.Dice operation is to divide data into two or more dimensions so can focus on the perspective in three-dimensional form.Last, pivot operation rotates the data so that it can provide an alternative data presentation (Paskarina & Ayub, 2010).
Moreover, the implementation of the data warehouse can be done in various fields and has the benefit for the company.For example, the data warehouse developed at the Department of Health in PT Ateja Multi Industri (Paskarina & Ayub, 2010) can produce the needed health information for decision making in the improvement of employee health services, employee health insurance, quota monitoring, and monitoring the health and performance of employees within a department.Then, with the establishment of a data warehouse at Cemerlang Skin Care, it makes the scattered data in various branches integrated and in a more compact form.Thus, the executive can analyze and make decisions more quickly (Darudiato, 2010).According to Oktavia (2011), the data warehouse designed in PT Atlas Transindo Raya helps the executive in the analysis of the report generated by various dimensions and is used as a tool to analyze the trends or tendencies that are currently underway for the company's service delivery through the dashboard.
Furthermore, the data warehouse designed for Farhrenheit Manufacturing Systems by Widianty (2015) can be used by the company in analyzing production cost.It is by comparing the result, production deviation, most resource, allocation, and monitoring stock of raw materials, packing materials, and finished goods.
The data warehouse does not only bring the benefit for the company, but also for the educational and government area.For example, Prihatin (2013) agreed that the data warehouses in Politeknik Negeri Lhokseumawe could be used by the management to look at the trends in the admission of new students and serve as a means of decision making.Then, the data warehouse developed in Faculty of Information Technology of Budi Luhur University by Mulyati, Amini, and Juliasari (2014) can be used by the management as a decision making tool for university and evaluation for lecturers to improve their future performance.Moreover, the data warehouse designed in STMIK Kadiri helps the management to use the academic data and into the information which is required to fill accreditation forms based on various required dimensions (Fadilah, Winarno, & Amborowati, 2016).
In the government area, the data warehouse in China's police intelligence decision system helps the policeman to explore the dynamic crime trends and characteristics from the integrated heterogeneous data sources.It is to help them making the correct judgments (Shen, Liu, Chen, & Wang, 2012).Otherwise, the modelling of data warehouse on food distribution center and reserves in the ministry of agriculture of Indonesia can facilitate monitoring food price in consumers and producers.It is based on different view by simplifying and tracking the flow of distribution between regions.Moreover, it can analyze the fluctuations of food price using historical data (Putra, Fifilia, Christian, & Sudarma, 2015).
From the research that has been mentioned, the implementation of the data warehouse is capable of storing historical data and bringing benefits.It displays the necessary information for making business decisions and can be used to view the trends.The development of the data warehouse is expected to help IT business solution provider in analyzing occurred trends of transaction and agent registration from different dimensions.Thus, it can make appropriate decisions regarding promotional strategies for the payment point services.

II. METHODS
The basis of this research on payment point services of IT business solution provider is originated from the problems formulated into a solution of the existing problems.The formulation of this problem is supported by data collection and the observation and interview with Chief Technology Officer (CTO).Once the data are collected, IT business solution provider finds that it is difficult to know the trends of the transaction and the agent registration based on the transaction history of the payment point services.Therefore, IT business solution provider needs a data warehouse that can store the historical data and provide the necessary information for decision-making in promotional strategies for payment point services.
The data warehouse development is done by using a nine-step design methodology (Kimball & Ross, 2015) and Pentaho software.Furthermore, OLAP is used to analyze the data according to the needs of IT business solution provider and create the reports and dashboards to help IT business solution provider in viewing the trends and making an appropriate decision regarding promotional strategies.The research framework can be seen in Figure 1.

III. RESULTS AND DISCUSSIONS
Data warehouse is developed with Kimball's method or nine-step design methodology.First is choosing the process.The selected process refers to the subject matter of the business process in payment point services of IT business solution provider.There are two business processes that will be used in the development of data warehouse.The first processes are agent registration.The agent registration is carried out before the partner can use the provided services in payment point.IT business solution provider requires agent registration information such as the trend including the number of agent registration based on the location and specific time.Then, the second processes are the transaction.After the registration has been processed, agents can use payment point services.In here, IT business solution provider requires transaction information such as the trend including the number of transactions based on the biller type, product, location, platform, transaction hours, and specific time.
Second is choosing the grain.The selection of grain means determining what is represented on a record in the fact table.Grains are divided into two business processes.
The first processes are one line per agent registration that stores the history of time and agent location.Then, the second processes are one line per transaction that stores the history of time, transaction hours, biller type, product, location, and platform.
Third is identifying and confirming the dimensions.The dimensions are selected according to the needs of the data warehouse for payment point service.Dimensions are required for each business process.The first business process is agent registration consisting of date (dim_date Fourth is to choose the facts.It requires fact tables for agent registration and transaction processes.The first processes are called as fact_agent_registration.It has three attributes, namely id_dim_date, id_dim_agent, and qty (measure).The second processes are called as fact_ transaction.It has five attributes.Those are id_dim_date, id_dim_time, id_dim_biller, id_dim_product, id_dim_ terminal, and qty (measure).
Fifth is to store pre-calculations in the fact table.However, there is no need to store pre-calculations in the fact table.Sixth is to round out the dimension tables.The description of the attributes of the dimension tables is added at this stage.It is shown in Table 1.
Seventh is choosing the duration of the database.The selection of the duration of the database matches the needs of required historical information by IT business solution provider.The duration of the selected database is the previous two years (2014)(2015).The database is adjusted to the operational database (PostgreSQL).
Eighthis to determine the need to track slowly changing dimensions.The need to track changes in the dimensions also called as slowly changing dimension should be determined for the changes of dimension attribute that can happen over the time.The handlings of slowly changing for each dimension are divided into two types.Type 1 is the first dimension.Last is deciding the physical design.At this stage, administrative, backup, and security of the data warehouse are not discussed.It is because the scope of this research is limited to the development of the data warehouse that can be used to help IT business solution provider in knowing the trend.
Based on the results of the data warehouse designed with nine-step design methodology, it obtains two-star schemas.Those are illustrated in Figure 2 and Figure 3.This can be combined to be the fact constellation schema in Figure 4.Moreover, the data warehouse integrates the data from multiple sources of diverse information and transforms it into a multidimensional representation for decision support applications (Gahlot & Yadav, 2014).Therefore, it is necessary to do Extract, Transform, and Load (ETL) in building a data warehouse (El-Sappagh, Hendawi, & El Bastawissy, 2011).The first step in building a data warehouse is to extract the data from multiple data sources to be used in the data warehouse.After the data extraction process is done, the next step is transforming the data.This process involves the transformation of data in the data cleansing process to obtain the accurate, correct, complete, consistent, and unambiguous data.According to Prihatin (2013), the data transformation process is done by combining the data coming from the different source.The final step of the ETL process is loading the data that has been extracted and transformed into the dimension tables and fact tables in the data warehouse.
ETL processes aim to load the necessary data from the operational database into the data warehouse.The operational database and data warehouse use PostgreSQL.The ETL process is performed using Pentaho Data Integration (PDI) software because it can be used to perform ETL processes in various types of databases.One of those is PostgreSQL.
Figure 5 shows all ETL processes.The processes are done to load the necessary data into each of the dimension tables.After all the necessary data are loaded into the dimension tables, the data containing measure and foreign key connected to the primary key in the related dimension tables are loaded into the fact table.Then, Figure 6 to Figure 11 show the ETL processes for each dimension tables.Meanwhile, Figure 12 and Figure 13 show the ETL processes for each fact tables.After the ETL processes are performed as described previously, the historical data are stored in the data warehouse.It is used for the analysis to know the trends information of agent registration and transaction in the payment point service.The analysis is performed using OLAP tool.It is provided by Pentaho Business Analytics software with additional plugin Pivot4J.
Before the data warehouse can be used for analysis using OLAP tool, it needs to create the data warehouse schema that describes the relationship between the dimension tables and the fact table and the used measure.The schemas are designed using Pentaho Schema Workbench (PSW).Those can be seen in Figure 14 and Figure 15.   Figure 21 shows that the number of transactions increased in 2015, and they go up and down for each quarter.
Meanwhile, Figure 22   Figure 24 shows the trend of transaction based on platform.It shows that user most highly interested do any transactions using PC platform in the first quarter (Q1) of 2014, while using Android platform became more interested from the second quarter (Q2) of 2014.
Figure 25 shows the trend transaction based on location.It shows that Jakarta is the most common location of transaction in 2014, and followed by Bandung, Yogyakarta, and Surabaya.In 2015, Jakarta is still the most common location of transaction.Then, it is followed by Balikpapan, Tangerang, Bogor, and Bekasi.Moreover, Figure 26 shows the trend of transaction based on time.It can be seen for overall, a few of transactions happen from 00:00 to 04:59.The number of transactions increases from 05:00.The time span at 09:00 to 09:59 is when transaction most happens.The number of transactions decreases from 10:00 to 13:59, 16:00 to 18:59 and 19:00 to 23:59.
The reports are created using Pentaho Report Designer (PRD) and presented in the PDF format.The agent registration report is created to provide a monthly report and shows the trend based on the location.Figure 27 shows the example of agent registration report.Meanwhile, the transaction report is created to give a monthly report containing the number of transactions and the trend.Figure 28 shows the example of transaction report.
Then, the dashboards are created using Pentaho Community Dashboard Editor.The creation of the dashboards aims to help IT business solution provider in making promotional strategies decisions based on the actual trends.It is in the context of the number of agent registration and transaction.
Moreover, the agent registration dashboard provides a dynamic visual representation of the ratio of the number of agent registration for each quarter in the selected year and top ten locations.Figure 29 shows the example of the agent registration.It shows the selected year is 2015 and the second quartal (Q2).The bar chart in the top ten locations (cities) panel can show the information about top ten most common locations of agent registration in the second quartal of 2015.
Meanwhile, the transaction dashboard is a dynamic visual representation of the several ratios.Those are the ratio of the number of transaction for each quarter in the selected year, a number of transactions based on the platform used, top ten locations, top five billers, top ten products, and the transaction time flow.Figure 30 shows the example of transaction dashboard when the selected year is 2015 in the third quartal (Q3).Then, the dashboard can dynamically update the number of transactions for each panel in this dashboard based on the data in third quartal in 2015.

Figure 1
Figure 1 Research Framework

Figure 4
Figure 4 Fact Constellation Schema

Figure 14
Figure 14 Agent Registration Schema

Figure 16
Figure 16 Agent Registration Analysis with OLAP

Figure 18
Figure 18 Agent Registration Trend Based on Location shows the trend of transaction based on biller type.It can be concluded that Telkom as the most common biller type in 2014, and E-Voucher as the most common biller type in 2015.Then, Figure 23 shows the trend of transaction based on product for E-Voucher biller.It can be concluded that SimPati and XL are two top brands for E-Voucher biller in 2014 and 2015.

Figure 24
Figure 24 Transaction Trend Based on Platform table) and agent location dimension (dim_agent table).The second is the transaction.It consists of date (dim_date table), transaction hours (dim_time table), biller type (dim_ biller table), product (dim_product table), and location and platform dimension (dim_terminal table).
It will overwrite the changed dimension attributes.Three dimensions use this type, namely date dimension (dim_date table), transaction hours dimension (dim_time table), and biller type dimension (dim_biller table).Meanwhile, type 2 is the second dimension.It will create a new record in the dimension table if there is a change in dimension attributes and history data will be stored with a different primary key.Agent location dimension (dim_agent table), product dimension (dim_product table), location and platform dimension (dim_terminal table) use this type.

Table 1
Attributes of Dimension Tables