Improving Scalability of Java Archive Search Engine through Recursion Conversion And Multithreading

Authors

  • Oscar Karnalim, Maranatha Christian University

DOI:

https://doi.org/10.21512/commit.v10i1.1653

Keywords:

scalability, recursion conversion, multithreading, java archive search engine, multiprocess

Abstract

Because bytecode is always present in a Java archive, a bytecode-based Java archive search engine has been developed [1, 2]. Although this system is quite effective, it still lacks scalability, since many modules rely on recursive calls and the system uses only one core (a single thread). In this research, the architecture of the Java archive search engine is redesigned to improve its scalability. All recursions are converted to iterative forms, even though most of these modules are logically recursive and quite difficult to convert (e.g., Tarjan’s strongly connected component algorithm). Recursion conversion is conducted by following the pattern of each recursion. Each recursion is broken down into four parts (the before and after actions of the current call and of its children) and converted to iteration with the help of a caller reference. This conversion mechanism improves scalability by avoiding the stack overflow errors caused by deeply nested method calls. System scalability is also improved by applying multithreading, which successfully reduces processing time. Shorter processing time may enable the system to handle larger data. Multithreading is applied to the major parts: the indexer, the vector space model (VSM) retriever, the low-rank vector space model (LRVSM) retriever, and the semantic relatedness calculator (which also involves multiple processes). The correctness of both the recursion conversion and the multithreaded design is shown by the fact that all implementations yield similar results.
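
The four-part conversion described in the abstract can be illustrated with Tarjan’s strongly connected component algorithm. The following is only a minimal sketch, not the paper's implementation: the class name `IterativeTarjan`, the adjacency-list input, and the use of an explicit `callStack` whose top element plays the role of the caller reference are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration: Tarjan's SCC algorithm without recursive calls.
class IterativeTarjan {
    /** Returns the strongly connected components of a graph given as adjacency lists. */
    static List<List<Integer>> scc(List<List<Integer>> graph) {
        int n = graph.size();
        int[] index = new int[n];           // discovery order, -1 = unvisited
        int[] low = new int[n];             // lowest index reachable from the node
        int[] nextChild = new int[n];       // resume position in each adjacency list
        boolean[] onStack = new boolean[n];
        Arrays.fill(index, -1);
        Deque<Integer> sccStack = new ArrayDeque<>();
        Deque<Integer> callStack = new ArrayDeque<>(); // replaces the JVM call stack
        List<List<Integer>> result = new ArrayList<>();
        int counter = 0;

        for (int root = 0; root < n; root++) {
            if (index[root] != -1) continue;
            callStack.push(root);
            while (!callStack.isEmpty()) {
                int v = callStack.peek();
                if (index[v] == -1) {
                    // Part 1: "before" action of the current node.
                    index[v] = low[v] = counter++;
                    sccStack.push(v);
                    onStack[v] = true;
                }
                boolean descended = false;
                while (nextChild[v] < graph.get(v).size()) {
                    int w = graph.get(v).get(nextChild[v]++);
                    if (index[w] == -1) {
                        // Part 2: "before" action of a child, i.e. the simulated call.
                        callStack.push(w);
                        descended = true;
                        break;
                    } else if (onStack[w]) {
                        low[v] = Math.min(low[v], index[w]);
                    }
                }
                if (descended) continue;
                // Part 3: "after" action of the current node.
                if (low[v] == index[v]) {
                    List<Integer> component = new ArrayList<>();
                    int w;
                    do {
                        w = sccStack.pop();
                        onStack[w] = false;
                        component.add(w);
                    } while (w != v);
                    result.add(component);
                }
                callStack.pop();
                // Part 4: "after" action of a child, applied to its caller.
                if (!callStack.isEmpty()) {
                    int caller = callStack.peek();
                    low[caller] = Math.min(low[caller], low[v]);
                }
            }
        }
        return result;
    }
}
```

Because the pending work lives in heap-allocated stacks rather than in JVM stack frames, a long chain of nodes does not trigger a `StackOverflowError` in this sketch; the depth that can be handled is bounded by available heap memory instead.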

Author Biography

Oscar Karnalim, Maranatha Christian University

Information Technology, Faculty of Information Technology

References

O. Karnalim and R. Mandala, “Java Archives Search Engine Using Byte Code as Information Source,” in International Conference on Data and Software Engineering (ICODSE), Bandung, 2014.

O. Karnalim, “Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine,” Jurnal Teknik Informatika dan Sistem Informasi (JuTISI), vol. 1, no. 2, pp. 111-122, 2015.

B. Croft, D. Metzler and T. Strohman, Search Engines: Information Retrieval in Practice, Boston: Pearson Education Inc.,

D. Grune, K. van Reeuwijk, H. E. Bal, C. J. H. Jacobs and K. G. Langendoen, Modern Compiler Design, 2nd edition, Springer, 2012.

T. Lindholm, F. Yellin, G. Bracha and A. Buckley, “The Java Virtual Machine Specification, Java SE 8 Edition,” Oracle, California, 2015.

R. Tarjan, “Depth-First Search and Linear Graph Algorithms,” SIAM Journal on Computing, vol. 1, no. 2, pp. 146-160, 1972.

F. M. Carrano and T. Henry, Data Abstraction & Problem Solving with C++: Walls and Mirrors (6th Edition), Prentice Hall, 2012.

A. S. Tanenbaum, Modern Operating Systems 4th edition, Prentice Hall, 2014.

W. Liu and T. Wang, “Index-based Online Text Classification for SMS Spam Filtering,” Journal of Computers, vol. 5, no. 6, 2010.

W. Premchaiswadi and A. Tungkatsathan, “On-line Content-Based Image Retrieval System using Joint Querying and Relevance Feedback Scheme,” WSEAS Trans. on Computers, vol. 9, no. 5, pp. 465-474, 2010.

C. Bonacic, C. Garcia, M. Marin, M. Prieto, F. Tirado and C. Vicente, “Improving search engines performance on multithreading processors,” in High Performance Computing for Computational Science - VECPAR 2008, Toulouse, 2008.

C. Bonacic and M. Marin, “Simulation Study of Multi-threading in Web Search Engine Processors,” in 20th International Symposium, SPIRE 2013, Jerusalem, 2013.

V. Sklyarov, I. Skliarova and B. Pimentel, “FPGA-based Implementation and Comparison of Recursive and Iterative Algorithms,” in International Conference on Field Programmable Logic and Applications, 2005.

J. Miecznikowski, “Decompiling Java using staged encapsulation,” in Eighth Working Conference on Reverse Engineering, Stuttgart, 2001.

M. Naftalin and P. Wadler, Java Generics and Collections, United States of America: O’Reilly Media, 2007.

C. Kustanto and I. Liem, “Automatic Source Code Plagiarism Detection,” in ACIS International Conference on Software Engineering, Artificial Intelligences, Networking and Parallel/Distributed Computing, Daegu, 2009.

C. D. Manning, P. Raghavan and H. Schutze, Introduction to Information Retrieval, Cambridge: Cambridge University Press, 2009.

H. Shima, “WS4J: WordNet Similarity for Java,” [Online]. Available: https://code.google.com/p/ws4j/. [Accessed 24 11 2015].

D. Lin, “An information-theoretic definition of similarity,” in Proceedings of the 15th ICML, Madison, 1998.

Published

2016-05-31