9 minutes
Code Golfing in C
Lately, I’ve grown interested in something called “Code Golfing”. The rules of the game are simple: just like golf, your aim is to use the least amount of characters in your source code. Due to the way C handles data, we have lots of options for shortening our code. Normal variable names and whitespace will be kept to ensure readability because it’s the tricks that are the focus of this post. With that being said, let’s take a look at a program that I have golfed.
The files:
I wrote this program near the start of my CSC 111 course at UVic. The program takes a date as input and spits out the day of the week. We’ll walk through my code in small, bite-sized pieces.
The original program starts like this:
#include <stdio.h>
int main(){
int year;
int month;
int day;
int is_leap_year;
int day_of_week;
printf("Enter date (YYYY/MM/DD): ");
scanf("%d %d %d", &year, &month, &day);
Right off the bat we notice multiple things that can be cut. GCC is a very lenient compiler, and will make generous assumptions if we leave information out. Let’s take a look at my golfed code:
main(year, month, day, is_leap_year, day_of_week){
printf("Enter date (YYYY/MM/DD): ");
scanf("%d %d %d", &year, &month, &day);
Wow, that’s a lot shorter already! GCC allows us to use functions like printf and scanf without including stdio.h
, so we exclude that. We don’t need a return type on main, so we exclude that. The compiler is doing a lot behind the scenes, and although it throws warnings, it does a lot of the work for us.
You may be wondering, “Why are all of the variables parameters of main?” Here’s why: when we declare parameters, they are implicitly typed as int
if we neglect to type them. We can save on characters by leaving out the int
before the variable. How convenient!
Next up: checking if we have a leap year:
if (year % 4 != 0) {
is_leap_year = 0;
} else if (year % 100 != 0) {
is_leap_year = 1;
} else if (year % 400 != 0) {
is_leap_year = 0;
} else {
is_leap_year = 1;
}
Well that’s awfully long. We can trim that down using ternary operators like so:
is_leap_year = (year % 4) ? 0 : (year % 100) ? 1 : (year % 400) ? 0 : 1;
Much better! There’s not much to talk about here, so I’ll move along.
Next up, we check if the user input is something reasonable:
if (year < 1 || year > 10000) {
printf("Error: Invalid year");
return 0;
}
if (month < 1 || month > 12) {
printf("Error: Invalid month");
return 0;
}
if (day < 1) {
printf("Error: Invalid day");
return 0;
}
Again, there aren’t many clever things that we can do. Let’s wrap those printfs and returns into a function:
E(char* s){
printf("Error: Invalid %s\n",s);
exit(1);
}
We’ll put that before main. We haven’t done anything special, just prevented code-reuse.
if (year < 1 | year > 10000) {
E("year");
}
if (month < 1 | month > 12) {
E("month");
}
if (day < 1) {
E("day");
}
That’s much cleaner. You may have noticed that we haven’t checked the upper bounds of each month yet. That will take a lot of boring, repetitive code to do, but we have some clever ways of dealing with it.
if (month == 2) {
if (is_leap_year && day > 29) {
printf("Error: Invalid day");
return 0;
} else if (is_leap_year == 0 && day > 28) {
printf("Error: Invalid day");
return 0;
}
} else if ((month == 1 || month == 3 || month == 5 || month == 7 || month == 8 || month == 10 || month == 12) && day > 31) {
printf("Error: Invalid day");
return 0;
} else if((month == 4 || month == 6 || month == 9 || month == 11) && day > 30) {
printf("Error: Invalid day");
return 0;
}
Wow, that’s a lot of or operators. Let’s see how we can cut down on those…
if (month == 2
&& ((is_leap_year && day > 29)
|| (!is_leap_year && day > 28))) {
E("day");
} else if ("101010110101"[month-1]-'0' && day > 31) {
E("day");
} else if("000101001010"[month-1]-'0' && day > 30) {
E("day");
}
Near the top, we check if it’s a leap year, and check February’s day accordingly. But wait what’s this after? Is that seriously binary in my C program? No, not really. Let me explain.
If we scroll back up to the original code, we used to have a long series of or operators. I’ve now condensed this into an array, where the index is the month and 1’s and 0’s represent true or false. Essentially, the month (minus one) is used as the array index and each month in the year has a 1 or a 0 character in its place. Once the 1 or 0 is pulled out of the string, we subtract the 0 character from it and are left with a numerical 1 or 0, corresponding to true or false. This is an extremely compact way to store a table of true and false values.
Next we have the actual algorithm for calculating the day of the week.
day_of_week = (year % 100);
day_of_week /= 4;
day_of_week += day;
if (month == 1 || month == 10) {
day_of_week += 1;
} else if (month == 2 || month == 3 || month == 11) {
day_of_week += 4;
} else if (month == 5) {
day_of_week += 2;
} else if (month == 6) {
day_of_week += 5;
} else if (month == 8) {
day_of_week += 3;
} else if (month == 9 || month == 12) {
day_of_week += 6;
}
if (is_leap_year == 1) {
if (month == 1 || month == 2) {
day_of_week -= 1;
}
}
day_of_week -= ((year / 100) % 4) * 2;
day_of_week += (year % 100);
day_of_week += 12;
day_of_week = day_of_week % 7;
Let’s break it down. We have some of the algorithm, we have a block of if statements that acts like a switch statement (oh well), we have a conditional, and then the algorithm finishes up. Let’s see how we can shrink that huge block of if’s.
day_of_week = (
(year % 100)
/ 4
+ day
+ "144025036146"[month - 1] - '0'
- (is_leap_year && month < 3)
- ((year / 100) % 4) * 2
+ (year % 100)
+ 12
) % 7;
Look familiar? We used the same trick as last time, but with more than just 1 and 0! Again, that huge block of if statements is nothing more than a table of values, so we store it in a string yet again.
Our next piece of code will seem difficult to shorten, but we’ve got one more trick up our sleeve.
printf("%d/%d/%d: ", year, month, day);
if (day_of_week == 0) {
printf("Sunday");
} else if (day_of_week == 1) {
printf("Monday");
} else if (day_of_week == 2) {
printf("Tuesday");
} else if (day_of_week == 3) {
printf("Wednesday");
} else if (day_of_week == 4) {
printf("Thursday");
} else if (day_of_week == 5) {
printf("Friday");
} else if (day_of_week == 6) {
printf("Saturday");
}
printf("\n");
return 0;
}
Okay, so we’re not just setting values any more. We can’t store strings in a table, so what do we do? Or can we…
If we think about strings in C, they’re just pointers to arrays of chars. What if we could stuff that into a string? Lo and behold, we can! Check this out:
char* string = "!(/7AJQ6 %d/%d/%d: %s\n\0Sunday\0Monday\0Tuesday\0Wednesday\0Thursday\0Friday\0Saturday";
printf(string + 19, year, month, day, string + string[day_of_week]);
}
Okay, so I’ve got some explaining to do.
To understand what I’m doing here, let’s review how functions handle strings in C. The function is handed a pointer to char, and it keeps stepping forwards until it hits a null terminator. With that in mind, we can jump forwards in a string by offsetting the pointer, and we can terminate the string early with a null-terminator. With that in mind, let’s look at that printf statement.
Our first value that we pass to printf is our string, but we jump forward 19 characters and land on the %
sign. Printf keeps going until it hits a null-terminator, getting the string "%d/%d/%d: %s\n"
. That’s kind of weird, but we understand what that string does. Year, month, and day come next, as we’d expect. And last but not least, we have the point of this whole program: the day of the week.
This is where our table comes in handy. You see, when we added that 19 at the beginning of printf, we were skipping forwards by a pre-measured number of characters. The first seven characters on the array will do that for us. The first character of the array is this: !
, an exclamation point. It has a numerical value of 33, so we skip ahead 33 characters and land on "Sunday"
. (
does the same for "Monday"
, and so forth. Essentially, we have created a table of pointers at the beginning of our array. The numerical value of the day of the week determines which pointer we follow, and thus which string printf is fed.
And that’s that! As you can see, we can do some cheeky tricks with arrays and pointers to eke out the fewest possible amount of characters in our code. But be warned: this is not best practice, this is not practical, and this is not efficient. I only program like this for fun and to learn about the language, and I probably wouldn’t use any of these tricks on a serious project. Readability is miles more important than producing “clever” code.
But of course, doing things like this is a fun and interesting way to explore how C can be (and probably shouldn’t be) used. I learned a lot of the techniques in this article from this YouTube video by Bizqwit. I highly encourage you to watch some of his content, he has some interesting programming videos on his channel.
I had a ton of fun writing this, and I hope that you enjoyed reading it. Well, I have exams to study for, so I better wrap up this post here. Thanks for reading!
1738 Words
2019-12-14 00:00 +0000