Adaptive Chunk Sizing for Cloud Computing to Increase Efficiency
October 20, 2023, 07:26 UTC
Chalabi Baya and Slimani Yahya have developed an adaptive approach to sizing the data "chunks" in cloud computing storage systems that improves efficiency considerably. The approach tailors chunk sizes dynamically based on a set of real-time metrics. Tests have shown a 24% improvement in execution times compared with fixed chunk-sizing methods and a 96% improvement over random chunk-sizing methods.
Research in the International Journal of Grid and Utility Computing has shown how an adaptive approach to the size of data "chunks" in cloud computing storage systems can improve efficiency considerably. Chalabi Baya of the Ecole Nationale Supérieure d'Informatique in Algiers, Algeria, and Slimani Yahya of the Université de la Manouba in Tunisia have considered the way in which unstructured data is stored in the cloud as BLOBs (Binary Large Objects). They point out that most data management systems use a chunk size equal to that of a given BLOB, but this seemingly simple approach belies a problem: BLOB sizes are not all equal.
BLOBs are fundamental components in cloud computing, and the issue of size puts obstacles in the way of moving data, leading to inconsistent data access across systems and thus reduced efficiency. A reduction in efficiency means energy is wasted in shuttling and storing data. The team points out that there are always compromises to be made in attempting to improve the efficiency of computing systems. "As the chunk size affects the bandwidth, if the size of the chunk is small, then the network will be overloaded," the team explains. "On the other hand, if the chunk size is big and data are being accessed concurrently, the response time increases."
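To see that trade-off concretely, here is a small, purely illustrative Python calculation; the BLOB size, bandwidth, and candidate chunk sizes are assumptions for the example, not figures from the paper:

    BLOB = 1 * 1024**3          # 1 GiB BLOB to transfer (illustrative)
    BANDWIDTH = 100 * 1024**2   # 100 MiB/s shared network link (illustrative)

    for chunk in (1 * 1024**2, 64 * 1024**2, 1 * 1024**3):
        requests = BLOB // chunk            # requests needed to move the whole BLOB
        per_chunk_wait = chunk / BANDWIDTH  # seconds a concurrent reader waits for one chunk
        print(f"{chunk >> 20:5d} MiB chunks: {requests:5d} requests, "
              f"{per_chunk_wait:6.2f} s per chunk")

Tiny 1 MiB chunks turn one transfer into over a thousand network requests, while a single 1 GiB chunk makes every concurrent reader wait roughly ten seconds; the two costs pull in opposite directions, which is why no single fixed size suits every BLOB and workload.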
To help overcome these problems, the researchers have developed an adaptive approach that tailors the chunk size dynamically based on a set of real-time metrics. These metrics encompass factors such as available bandwidth, storage usage, BLOB size, and the frequency of data access.
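The paper details the authors' actual decision logic; the sketch below is only a minimal illustration of what a metric-driven sizing policy of this general shape could look like in Python. Every name, bound, and weighting here is an assumption made for the example, not the published algorithm.

    from dataclasses import dataclass

    # Illustrative bounds; a real deployment would take these from the
    # storage platform's configuration.
    MIN_CHUNK = 1 * 1024**2     # 1 MiB
    MAX_CHUNK = 256 * 1024**2   # 256 MiB

    @dataclass
    class Metrics:
        """Real-time metrics of the kind the paper describes (fields assumed)."""
        available_bandwidth: float  # bytes/second currently free on the network
        storage_usage: float        # fraction of storage capacity in use (0.0-1.0)
        blob_size: int              # size of the BLOB being striped, in bytes
        access_frequency: float     # recent accesses per second to this BLOB

    def adaptive_chunk_size(m: Metrics) -> int:
        """Choose a chunk size from current metrics (heuristic is an assumption)."""
        # Start from roughly what the network can move in one second.
        size = m.available_bandwidth
        # Hot BLOBs are likely read concurrently, so shrink chunks to keep
        # per-request response times down.
        size /= 1.0 + m.access_frequency
        # As storage fills up, favour larger chunks to cut per-chunk
        # metadata overhead.
        size *= 1.0 + m.storage_usage
        # Never use a chunk larger than the BLOB itself.
        size = min(size, m.blob_size)
        return int(max(MIN_CHUNK, min(MAX_CHUNK, size)))

    # Example: a 1 GiB BLOB, ~50 MB/s of spare bandwidth, moderately hot.
    print(adaptive_chunk_size(Metrics(
        available_bandwidth=50e6,
        storage_usage=0.6,
        blob_size=1 * 1024**3,
        access_frequency=3.0,
    )))

Because the metrics are sampled in real time, any such policy would be re-evaluated as conditions change, which is what makes the sizing adaptive rather than fixed.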
In tests, the new approach gave a 24% improvement in execution times over fixed chunk-size methods and a 96% improvement over random chunk-size methods. The researchers add that their data-striping technique might also be used with other data management systems. They plan to test their approach on real-world cloud computing platforms, such as BlobSeer and the Hadoop Distributed File System (HDFS).