IMPLEMENTATION AND EVALUATION OF RUNTIME DATA DECLUSTERING METHOD OVER SAN-CONNECTED PC CLUSTER

Authors

  • Masato Oguchi
  • Masaru Kitsuregawa

DOI:

https://doi.org/10.47839/ijc.1.2.123

Keywords:

Cluster Computing, Data Mining, Storage Area Network, Runtime Data Declustering

Abstract

In this paper, a PC cluster connected by a Storage Area Network (SAN) is built and evaluated. In a SAN-connected cluster, each node can access all shared disks directly, without going through the LAN; thus, SAN-connected clusters achieve better performance than LAN-connected clusters for disk access operations. However, if many nodes access the same shared disk simultaneously, application performance degrades because of the I/O bottleneck. To resolve this problem, a runtime data declustering method is proposed, in which data is declustered to several other disks dynamically during the execution of an application. Parallel data mining is implemented and evaluated on the SAN-connected PC cluster. This application requires iterative scans of a shared disk, which severely degrade execution performance because of the I/O bottleneck. The runtime data declustering method is applied to this case. According to the experimental results, the proposed method prevents the performance degradation caused by the shared-disk bottleneck in SAN-connected clusters.
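The idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' implementation: the round-robin placement policy, the function names, and the use of "readers per disk" as a stand-in for I/O contention are all assumptions made for illustration. It only shows why moving data off a single shared disk after the first scan reduces the number of nodes competing for any one disk in later scans.

```python
from collections import Counter

def assign_disks(num_nodes, disks):
    """Map each node to a target disk, round-robin (assumed policy)."""
    return {node: disks[node % len(disks)] for node in range(num_nodes)}

def max_readers_per_disk(placement):
    """Largest number of nodes reading from any single disk."""
    return max(Counter(placement.values()).values())

def simulate(num_nodes, disks, passes, decluster):
    """Return the worst-case number of readers per disk for each scan pass.

    Pass 0 always reads from the shared disk, disks[0]. If decluster is
    True, each node is assumed to copy its data to its assigned disk
    during pass 0 and to read from that disk in all later passes.
    """
    shared = {node: disks[0] for node in range(num_nodes)}
    spread = assign_disks(num_nodes, disks) if decluster else shared
    contention = []
    for p in range(passes):
        placement = shared if p == 0 else spread
        contention.append(max_readers_per_disk(placement))
    return contention
```

With 8 nodes and 4 disks, `simulate(8, ["d0", "d1", "d2", "d3"], 3, True)` returns `[8, 2, 2]`, while the same call with `decluster=False` returns `[8, 8, 8]`. These counts come from the toy model only and are not measurements from the paper.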

Published

2002-12-31

How to Cite

Oguchi, M., & Kitsuregawa, M. (2002). IMPLEMENTATION AND EVALUATION OF RUNTIME DATA DECLUSTERING METHOD OVER SAN-CONNECTED PC CLUSTER. International Journal of Computing, 1(2), 108-112. https://doi.org/10.47839/ijc.1.2.123

Section

Articles