An efficient stream-based join to process end user transactions in real-time data warehousing

Authors

Jamil, Noreen

Date

2014-06

Type

Journal Article

Keyword

real-time data warehousing
semi-stream processing
join operator
performance measurement
data processing
information resources management

Citation

Jamil, N. (2014). An Efficient Stream-based Join to Process End User Transactions in Real-Time Data Warehousing. Journal of Digital Information Management, 12, pp. 201-215.

Abstract

In real-time data warehousing, semi-stream processing has become an active area of research over the last decade. One important operation in semi-stream processing is joining stream data with slowly changing, disk-based master data. The join operator that implements this operation typically works under limited main memory, which is generally not large enough to hold the whole disk-based master data. Recently, a seminal join algorithm called MESHJOIN (Mesh Join) has been proposed in the literature to process semi-stream data. MESHJOIN is a candidate for a resource-aware system setup. However, MESHJOIN is not very selective. In particular, it does not consider the characteristics of the stream data, and its performance is suboptimal for skewed stream data. In this paper we propose a novel Semi-Stream Join (SSJ) that uses a new cache module. The algorithm is more appropriate for skewed distributions, and we present results for Zipfian distributions of the type that appear in many applications. We present a cost model for SSJ and validate it with experiments. Based on the cost model, we also tune the algorithm for maximum performance. We conduct a rigorous experimental study to test our algorithm. Our experiments show that SSJ significantly outperforms MESHJOIN.
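
The abstract does not spell out how SSJ works internally, so the following is only a minimal Python sketch of the general idea it describes: a small in-memory cache answers stream tuples with frequent keys, and cache misses are buffered and resolved in batches by scanning the master data partition by partition. The class name, parameters, and the frequency-based eviction rule are all hypothetical choices made for illustration, and a sorted in-memory list stands in for the disk-based master data. It should not be read as the paper's algorithm.

```python
from collections import Counter, deque


class CachedSemiStreamJoin:
    """Illustrative cache-fronted semi-stream inner join (not the paper's SSJ)."""

    def __init__(self, master, cache_size, queue_size, partition_size):
        # A sorted in-memory list stands in for the disk-based master data.
        self.master = sorted(master.items())
        self.cache_size = cache_size          # max master rows kept in memory
        self.queue_size = queue_size          # cache misses buffered per scan
        self.partition_size = partition_size  # master rows read per simulated I/O
        self.cache = {}
        self.freq = Counter()                 # stream frequency of each key
        self.pending = deque()
        self.cache_hits = 0
        self.disk_scans = 0

    def process(self, key, payload):
        """Handle one stream tuple and return the join output produced now."""
        self.freq[key] += 1
        if key in self.cache:
            self.cache_hits += 1
            return [(key, payload, self.cache[key])]
        self.pending.append((key, payload))
        if len(self.pending) >= self.queue_size:
            return self.flush()
        return []

    def flush(self):
        """Resolve all buffered misses with one pass over the master data."""
        if not self.pending:
            return []
        self.disk_scans += 1
        by_key = {}
        for key, payload in self.pending:
            by_key.setdefault(key, []).append(payload)
        self.pending.clear()
        out = []
        for start in range(0, len(self.master), self.partition_size):
            for key, row in self.master[start:start + self.partition_size]:
                if key in by_key:
                    out.extend((key, p, row) for p in by_key[key])
                    self._maybe_cache(key, row)
        return out  # stream keys absent from the master data are dropped

    def _maybe_cache(self, key, row):
        # Assumed policy: keep the keys seen most often in the stream so far.
        if key in self.cache or self.cache_size <= 0:
            return
        if len(self.cache) < self.cache_size:
            self.cache[key] = row
            return
        victim = min(self.cache, key=lambda k: self.freq[k])
        if self.freq[key] > self.freq[victim]:
            del self.cache[victim]
            self.cache[key] = row


master = {k: f"m{k}" for k in range(1, 11)}
join = CachedSemiStreamJoin(master, cache_size=2, queue_size=4, partition_size=3)
results = []
for i, key in enumerate([1, 1, 2, 1, 1, 3, 1, 2, 1, 4]):  # skewed toward key 1
    results.extend(join.process(key, i))
results.extend(join.flush())  # drain tuples still waiting for a scan
print(len(results), join.cache_hits, join.disk_scans)
```

In this toy run the frequent keys end up in the cache after the first scan, so later tuples carrying them are joined without touching the master data. That is the effect a cache module is meant to exploit on skewed (for example Zipfian) streams.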

Copyright notice

All rights reserved