Improving Scalability of Java Archive Search Engine through Recursion Conversion And Multithreading
DOI:
https://doi.org/10.21512/commit.v10i1.1653
Keywords:
scalability, recursion conversion, multithreading, Java archive search engine, multiprocess
Abstract
Because bytecode is always present in a Java archive, a bytecode-based Java archive search engine has been developed [1, 2]. Although this system is quite effective, it still lacks scalability: many modules rely on recursive calls, and the system uses only one core (a single thread). In this research, the architecture of the Java archive search engine is redesigned to improve its scalability. All recursions are converted to iterative form, even though most of these modules are logically recursive and quite difficult to convert (e.g. Tarjan's strongly connected component algorithm). Each conversion follows the pattern of the recursion in question: the recursion is broken down into four parts (the before and after actions of the current element and of its children) and converted to iteration with the help of a caller reference. This conversion mechanism improves scalability by avoiding the stack overflow errors caused by deeply nested method calls. Scalability is also improved by applying multithreading, which reduces processing time; a shorter processing time may enable the system to handle larger data. Multithreading is applied to the major parts of the system: the indexer, the vector space model (VSM) retriever, the low-rank vector space model (LRVSM) retriever, and the semantic relatedness calculator (which also involves multiprocessing). The correctness of both the recursion conversion and the multithreaded design is supported by the fact that all implementations yield similar results.
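The four-part breakdown described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` and `Frame` classes, the `traverse` method, and the logged action names are all hypothetical, and a plain tree traversal stands in for the engine's actual modules. Each simulated call is a `Frame` that keeps a reference to its caller, so a "return" simply follows that reference instead of unwinding the JVM call stack.

```java
import java.util.ArrayList;
import java.util.List;

class IterativeTraversal {
    /** Hypothetical tree node used only for illustration. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
        Node add(Node child) { children.add(child); return this; }
    }

    /** One simulated call: its node, the next child to visit, and the caller's frame. */
    static final class Frame {
        final Node node;
        final Frame caller;   // caller reference, replaces the return address
        int nextChild = 0;
        Frame(Node node, Frame caller) { this.node = node; this.caller = caller; }
    }

    /** Iterative equivalent of a recursive depth-first traversal. */
    static List<String> traverse(Node root) {
        List<String> log = new ArrayList<>();
        Frame current = new Frame(root, null);
        log.add("enter " + root.name);                       // before-action of current
        while (current != null) {
            if (current.nextChild < current.node.children.size()) {
                Node child = current.node.children.get(current.nextChild++);
                log.add("before child " + child.name);       // before-action of child
                current = new Frame(child, current);         // simulated call
                log.add("enter " + child.name);              // before-action of current
            } else {
                log.add("exit " + current.node.name);        // after-action of current
                Frame caller = current.caller;
                if (caller != null) {
                    log.add("after child " + current.node.name); // after-action of child
                }
                current = caller;                            // simulated return
            }
        }
        return log;
    }

    public static void main(String[] args) {
        Node root = new Node("a").add(new Node("b").add(new Node("c"))).add(new Node("d"));
        traverse(root).forEach(System.out::println);
    }
}
```

Because the frames live on the heap, the traversal depth is bounded by available memory rather than by the thread's stack size, which is the property the abstract attributes to the conversion.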
References
O. Karnalim and R. Mandala, “Java Archives Search Engine Using Byte Code as Information Source,” in International Conference on Data and Software Engineering (ICODSE), Bandung, 2014.
O. Karnalim, “Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine,” Jurnal Teknik Informatika dan Sistem Informasi (JuTISI), vol. 1, no. 2, pp. 111-122, 2015.
B. Croft, D. Metzler and T. Strohman, Search Engines: Information Retrieval in Practice, Boston: Pearson Education, Inc.
D. Grune, K. van Reeuwijk, H. E. Bal, C. J. H. Jacobs and K. G. Langendoen, Modern Compiler Design, 2nd edition, Springer, 2012.
T. Lindholm, F. Yellin, G. Bracha and A. Buckley, “The Java Virtual Machine Specification, Java SE 8 Edition,” Oracle, California, 2015.
R. Tarjan, “Depth-First Search and Linear Graph Algorithms,” SIAM Journal on Computing, vol. 1, no. 2, pp. 146-160, 1972.
F. M. Carrano and T. Henry, Data Abstraction & Problem Solving with C++: Walls and Mirrors, 6th edition, Prentice Hall, 2012.
A. S. Tanenbaum, Modern Operating Systems, 4th edition, Prentice Hall, 2014.
W. Liu and T. Wang, “Index-based Online Text Classification for SMS Spam Filtering,” Journal of Computers, vol. 5, no. 6, 2010.
W. Premchaiswadi and A. Tungkatsathan, “On-line Content-Based Image Retrieval System using Joint Querying and Relevance Feedback Scheme,” WSEAS Trans. on Computers, vol. 9, no. 5, pp. 465-474, 2010.
C. Bonacic, C. Garcia, M. Marin, M. Prieto, F. Tirado and C. Vicente, “Improving search engines performance on multithreading processors,” in High Performance Computing for Computational Science - VECPAR 2008, Toulouse, 2008.
C. Bonacic and M. Marin, “Simulation Study of Multi-threading in Web Search Engine Processors,” in 20th International Symposium, SPIRE 2013, Jerusalem, 2013.
V. Sklyarov, I. Skliarova and B. Pimentel, “FPGA-based Implementation and Comparison of Recursive and Iterative Algorithms,” in International Conference on Field Programmable Logic and Applications, 2005.
J. Miecznikowski, “Decompiling Java using staged encapsulation,” in Eighth Working Conference on Reverse Engineering, Stuttgart, 2001.
M. Naftalin and P. Wadler, Java Generics and Collections, United States of America: O’Reilly Media, 2007.
C. Kustanto and I. Liem, “Automatic Source Code Plagiarism Detection,” in ACIS International Conference on Software Engineering, Artificial Intelligences, Networking and Parallel/Distributed Computing, Daegu, 2009.
C. D. Manning, P. Raghavan and H. Schutze, Introduction to Information Retrieval, Cambridge: Cambridge University Press, 2009.
H. Shima, “WS4J: WordNet Similarity for Java,” [Online]. Available: https://code.google.com/p/ws4j/. [Accessed 24 November 2015].
D. Lin, “An information-theoretic definition of similarity,” in Proceedings of the 15th ICML, Madison, 1998.
License
Authors who publish with this journal agree to the following terms:
a. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License - Share Alike that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
b. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
c. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
USER RIGHTS
All articles published Open Access will be immediately and permanently free for everyone to read and download. We are continuously working with our author communities to select the best choice of license options, currently being defined for this journal as follows: Creative Commons Attribution-Share Alike (CC BY-SA)