
Heimdall: A Multi-Threaded Copy Program Built on the Tokio Runtime

Heimdall is a multithreaded data copying tool built on the Tokio runtime, designed to maximize I/O throughput. It uses multiple schedulers to group and distribute data transfers across threads, each with different strategies for ordering and batching workloads. The goal is to minimize overhead and improve performance in high-concurrency scenarios.
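The core idea above is distributing independent copy jobs across worker threads. A minimal sketch of that idea, using plain `std::thread` instead of Tokio tasks to stay dependency-free (the function name `copy_concurrently` is illustrative, not Heimdall's actual API):

```rust
use std::fs;
use std::path::PathBuf;
use std::thread;

// Spawn one worker per (src, dst) pair and wait for all of them to finish.
// Heimdall itself schedules groups of files onto Tokio tasks; std threads
// are used here only to keep the sketch self-contained.
fn copy_concurrently(jobs: Vec<(PathBuf, PathBuf)>) -> std::io::Result<u64> {
    let handles: Vec<_> = jobs
        .into_iter()
        .map(|(src, dst)| thread::spawn(move || fs::copy(&src, &dst)))
        .collect();
    let mut total = 0u64;
    for h in handles {
        total += h.join().expect("worker thread panicked")?;
    }
    Ok(total) // combined byte count across all copies
}
```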


Meet the schedulers 📋⚙️:

CFS: Completely Fair Scheduler - (Implemented)

This is the simplest scheduler and the program's default. It groups the files destined for a given target and issues each group to a thread as a separate job. The only customization available here is MAX_THREAD, the number of worker threads. In this most basic scheme, group sizes are chosen by building subsets whose combined size approaches k, where k is the total size of all files divided by the number of groups. This keeps the total load roughly identical across all workgroups.
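One common way to build such equal-load subsets is a greedy partition: sort files largest-first and always assign the next file to the currently lightest group. A sketch of that approach, under the assumption that CFS balances by total group size (the function name `partition_fair` is mine):

```rust
// Partition (name, size) entries into `max_threads` groups so each group's
// total size lands near total_size / max_threads. Assumes max_threads >= 1.
fn partition_fair(mut files: Vec<(String, u64)>, max_threads: usize) -> Vec<Vec<(String, u64)>> {
    files.sort_by(|a, b| b.1.cmp(&a.1)); // largest files first
    let mut groups: Vec<Vec<(String, u64)>> = vec![Vec::new(); max_threads];
    let mut loads = vec![0u64; max_threads];
    for f in files {
        // Place the next file into the group with the smallest load so far.
        let i = loads
            .iter()
            .enumerate()
            .min_by_key(|&(_, &l)| l)
            .map(|(i, _)| i)
            .unwrap();
        loads[i] += f.1;
        groups[i].push(f);
    }
    groups
}
```

With sizes 10, 9, 2, 1 and two threads, both groups end up carrying a load of 11.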

OS: Ordering Scheduler - (In-progress)

This scheduler is an extension of CFS. Files are grouped by size, with linear regression used to determine what file size is acceptable within reason; all the outliers are then mixed into the groups at random, so that the large files are submitted early in the copying process and the overall time stays balanced.
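One way the regression step could look: fit a least-squares line to the sorted sizes, excluding the top quarter so a few huge files cannot drag the line up, and flag every file that deviates from the prediction by more than a tolerance factor. The trimmed fit and the name `size_outliers` are my assumptions; the README only says "linear regression".

```rust
// Returns indices (into `sizes`) of files whose size is more than
// `tolerance` times the regression prediction away from the fitted line.
fn size_outliers(sizes: &[u64], tolerance: f64) -> Vec<usize> {
    let mut sorted: Vec<(usize, u64)> = sizes.iter().copied().enumerate().collect();
    sorted.sort_by_key(|&(_, s)| s);
    let fit_len = sorted.len() - sorted.len() / 4; // ignore top quarter in the fit
    let n = fit_len as f64;
    // Least-squares fit of size against sorted rank: size ~ slope * rank + intercept.
    let mean_x = (n - 1.0) / 2.0;
    let mean_y = sorted[..fit_len].iter().map(|&(_, s)| s as f64).sum::<f64>() / n;
    let (mut num, mut den) = (0.0f64, 0.0f64);
    for (rank, &(_, s)) in sorted[..fit_len].iter().enumerate() {
        let dx = rank as f64 - mean_x;
        num += dx * (s as f64 - mean_y);
        den += dx * dx;
    }
    let slope = if den == 0.0 { 0.0 } else { num / den };
    let intercept = mean_y - slope * mean_x;
    let mut outliers = Vec::new();
    for (rank, &(idx, s)) in sorted.iter().enumerate() {
        let predicted = (slope * rank as f64 + intercept).max(1.0);
        if (s as f64 - predicted).abs() > tolerance * predicted {
            outliers.push(idx); // index into the original `sizes` slice
        }
    }
    outliers
}
```

The flagged files would then be shuffled into the front of the groups so they start copying early.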

TS: Temporal Scheduler - (Not yet implemented)

This will be the most ambitious scheduler in this project, as it will try to use basic machine learning (heuristics) to figure out the best group allocations and group size. The idea is to perform a series of tests using the dd command to measure the read/write speed of the target disk, then build a schedule around the start and end times of each file's copying process. The initial start and end times will be calculated with the heuristic approach; as the program progresses and gathers additional data, it will adjust the groups based on learned experience.
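The throughput probe and time estimate described above can be sketched in pure Rust rather than by shelling out to dd: write a fixed-size buffer, sync it to disk, and time the whole operation. The function names and probe size are illustrative assumptions, not the planned implementation.

```rust
use std::io::Write;
use std::time::Instant;

// Rough stand-in for a dd probe: write `probe_bytes` to `path`, force it to
// disk, and time it to estimate write throughput in bytes per second.
fn probe_write_speed(path: &std::path::Path, probe_bytes: usize) -> std::io::Result<f64> {
    let data = vec![0u8; probe_bytes];
    let start = Instant::now();
    let mut f = std::fs::File::create(path)?;
    f.write_all(&data)?;
    f.sync_all()?; // without the sync, the OS page cache would skew the timing
    let secs = start.elapsed().as_secs_f64();
    std::fs::remove_file(path)?; // clean up the probe file
    Ok(probe_bytes as f64 / secs.max(1e-9))
}

// Estimated seconds to copy `total_bytes` at the probed speed; the scheduler
// would turn such estimates into per-file start and end times.
fn estimated_copy_secs(total_bytes: u64, bytes_per_sec: f64) -> f64 {
    total_bytes as f64 / bytes_per_sec
}
```

A real dd-style probe would bypass the page cache (e.g. dd's `oflag=direct`) for a more honest number; the sync-based timing here is a simplification.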