364: File and Data Structures¶
Advanced data organization and persistent storage techniques.
Topics¶
Sequential, indexed, and hashed file organization
B-trees and indexed sequential access method (ISAM)
External sorting algorithms
Record-level locking and concurrency basics
Projects¶
Merge Sort (364/merge.c)¶
Merges two pre-sorted input files into a single sorted output file, illustrating external merge-sort — the classical technique for sorting data too large to fit in memory.
fscanf(infile1, "%d", &i);
fscanf(infile2, "%d", &j);
while (!feof(infile1) && !feof(infile2)) {
if (i <= j) {
fprintf(outfile, "%d\n", i);
fscanf(infile1, "%d", &i);
} else {
fprintf(outfile, "%d\n", j);
fscanf(infile2, "%d", &j);
}
}
/* flush remaining elements from whichever file is not exhausted */
while (fscanf(infile1, "%d", &i) != EOF) fprintf(outfile, "%d\n", i);
while (fscanf(infile2, "%d", &j) != EOF) fprintf(outfile, "%d\n", j);
File Splitter (364/split.c)¶
Reads a stream of integers and splits it into separate files whenever the sequence decreases, producing pre-sorted runs suitable for external merge sort.
outfile = fopen(tempnam(OUTFILE_DIR, s), "w");
while (fscanf(infile, "%d", &j) != EOF) {
if (i >= j) {
/* sequence decreased — start a new sorted run */
fclose(outfile);
outfile = fopen(tempnam(OUTFILE_DIR, s), "w");
}
fprintf(outfile, "%d\n", j);
i = j;
}
Random Number Generator (364/rng.c)¶
Generates N pseudo-random integers to stdout, used to produce test data for the sort and merge programs.
int x = atoi(argv[1]);
for (i = 0; i < x; i++)
printf("%ld\n", random());
Temporary File Manager (364/tmpnam.c)¶
Creates and manages a series of temporary files, each holding a segment of monotonically increasing integers — demonstrating POSIX temporary file creation and file-based data staging.
printf("L_tmpnam = %d\n", L_tmpnam);
printf("TMP_MAX = %d\n", TMP_MAX);
for (i = 0; i < j; i++)
printf("%s\n", tmpnam(s)); /* generate a unique temp filename */
Buffer Formatter (364/buffer.c)¶
Processes binary format strings through a buffer, demonstrating low-level data formatting and buffer management techniques used in record-oriented file I/O.
#define BUF_SIZE 700
/* Dispatch on format specifier and write typed data into output buffer */
void do_string(int length, char ***buf, int fd, char out[BUF_SIZE],
int *data_idx, int *s_idx, int *which, int *data_which);
void do_int (int length, char ***buf, int fd, char out[BUF_SIZE],
int *data_idx, int *d_idx, int *which, int *data_which);
void do_float (int len1, int len2, char ***buf, int fd, char out[BUF_SIZE],
int *data_idx, int *f_idx, int *which, int *data_which);
Knockout Tournament (364/ko.c)¶
Implements a knockout (single-elimination) tournament bracket algorithm, exploring indexed record access and structured file organization.
struct node {
int winner, loser;
int index, lindex;
int winner_index, loser_index;
};
/* Allocate bracket array: max_row rounds × max_col*2 match slots */
temp_array = (int **)malloc((max_row + 1) * sizeof(int *));
for (i = 0; i < max_row + 1; i++)
temp_array[i] = (int *)malloc(max_col * 2 * sizeof(int));