Why Programming Competitions Could Be The Future (But Probably Won’t)

(Disclaimer: This post features somewhat stronger opinions that the usual so I have to be clear that while I manage the programming competitions in DevCon, the sentiments in this post do not represent the organization’s.)

Algorithm/Programming Competitions: the other type of “Hackathon”.

When you see the word “hackathon” nowadays, it refers to the one where a bunch of people are gathered into a single venue and form groups to work for a couple of days on ideas that they pitched at the beginning and end with a final pitch to see how good their prototype and idea is. Best pitch wins.

On the opposite end there’s the original OpenBSD type of hackathon. People are still gathered in a single venue, but there is no pitching – just coding. It can be a directed coding venture, like in the first hackathon where they worked on some parts of the OS. It can also be a creative, free-form one, like Global Game Jam where game developers make any game for 48 hours as long as it’s somehow related to the theme.

There can also be prizes in the latter, but it’s mostly a fun exercise for developers. And there’s no business stuff and pitching to worry about.

Programming Competitions have nothing to do with these two. It’s only because hackathons have become somewhat famous spectacles recently that people think that “get programmers in a single venue and make them code” automatically makes it a “hackathon”.

A programming competition is a race. First team to make programs that can solve the contest problems wins – it’s as simple as that. No pitching. Strict rules. And a lot shorter than most hackathons.

I believe that Programming Competitions have the most potential out of these three programming events…

Programming competitions are an objective test of programming skill. If your program doesn’t solve the given problem, you won’t get a point for it.

Compare this with college. As weird as it may sound, you can graduate with an IT-related degree without knowing how to program, whether through deception or through plain incompetence of the school. And this happens more than we’d like in the industry.

Let’s imagine programming competitions somehow become famous throughout the country in a few years time. Not basketball-level fame, but maybe soccer-level. Say there’s a persistent online Philippines-only HackerRank-like competition where anyone from the top-tier universities down to the remote colleges can test their skills. Assume that the anti-cheating and user-verification measures are good enough that graduates can put their “programmer level” on their resumes and that employers will honor it.

Imagine what would happen if students are confident of their programming skills that some of them even show them off. Imagine its impact in the industry and the academe.

It’s going to be a crazy world, won’t it?

Certainly a bigger impact than what the other hackathons can do.

…but given what we have right now, there’s very little chance that potential will be reached.

The opportunity is there, but there are hurdles that may be too much for anyone interested in taking it.

First problem: no one cares.

Just a quick background: I’m not a competitive programmer. I solved programming problems (Project Euler, topcoder archives) back in college just for fun. I never practiced for or competed in any live event other than the odd Google Code Jam where I never intended to pass multiple rounds.

But I understand the importance of testing students for their programming skill. After the DevCon Code Challenge Cup, I decided to code a quick online judge in Rails and in mid-2013 started to hold mini-programming competitions in various schools around the country. You can try out a demo here.

And after 2 years of doing this, I’ve had a couple of striking revelations.

One of those was how little people cared about programming. Sure, the people who invited us were really grateful to have the opportunity to test the programming skills of their students, but very few of the organizers were curious about the problems given to the contestants. I would understand the hands-off approach if they were school administrators, but many of those who invited us were IT students and teachers.

None of them reviewed the problems before I gave them to the contestants, checking if the problems were too easy or too hard or simply out of scope of their curriculum. Only a couple of them asked me about the problems after the contest even though that would have been the best time to ask for advice to improve their programming skills.

And there you go, the reason why we’re churning out weak IT graduates: the people in position don’t care about things that they should be passionate about.

The previous problem is compounded by the problem at the other end: a lack of understanding of the relative skill levels of programmers.

Programming isn’t a sport, it’s a skill and a craft. It’s not easy to gauge programmers and we are left to use vague terms like “beginner”, “skilled”, and “expert”.

And anyone who conducts a programming competition to non-competitive programmers will see that “beginner” is not adequate for describing college students. They’re all beginners, but the difference between their skill levels can be great.

Yes, that’s a euphemism for “college students suck at programming”.

Go to the demo link above and read the problems (if it’s still in Practice mode, wait until the Contest mode starts). Even though the problems can be answered by anyone who has just taken CS classes on Java programming and algorithms, in our experience only students from the big 3 universities were able to consistently answer more than 7 questions within the time frame. The next tier of universities can answer around 5, while the rest even struggle with 3.

If aren’t a programmer and don’t know how easy it is to answer just 3 in that list of problems, let’s just say anyone taking a good Programming 101 class should be able to do that in 4-5 lab sessions.

So yeah, most college programming classes are horrible. If you’re going to hold an open (non-invite only, non-ACM-ICPC) programming competition today, you ought to lower the difficulty of the questions. Otherwise, many of your contestants will get very low scores.

You don’t want that to happen. Put yourself in the shoes of someone who either was just curious or was pressured to join the contest. What will they think when they leave the event getting only one or two correct? Will they look forward to the next competition? Will they help promote future events?

Maybe, but probably not.

Even now, I still occasionally get cases where contestants get depressingly low scores (especially if I’m not familiar with the school). We’re lucky that the contests are confined within the schools so we don’t need to worry about negative publicity, but this brings up another point: unless you’re like me who can conduct more than one contest a month, most programming competition organizers do not have the experience of holding contests on a regular basis to learn from their mistakes.

Combine all these three points and you’ll have an idea what will happen when someone conducts an open programming competition in the same manner as the other hackathons.

Venue, logistics, sponsorships… no problem. Then they’ll find people who had past programming competition experience to be the judges and write the problems for them. But since it’s the first time for all of them, the judges will write difficult problems and the organizers will not look at the problems.

Fast-forward to the end of the event. Many teams joined, but only a few of them were competitive programmers. One of the latter won, and the rest were left with a score of 1 or 0.

There won’t be any negative PR (it’s not like it was a bait-and-switch), but there won’t be enough positive buzz either to make a bigger and better follow-up event. Instead of spreading the word and encouraging more people to join in the future, many will just go home and say “eh, at least I tried” and leave it at that.

On to the final point.

When you’re in my shoes as someone who has programmed over half of my life (a good chunk of that professionally) and as someone who recently has conducted programming competitions, you’ll come to hate programming competition-level problems.

I hold programming competitions in schools to show students that they need to understand their programming skill level and work on improving that to the point that they are comfortable with it even under pressure. This is the skill that I want to cultivate into students, and it will do well for them as they enter work.

But there comes a point where the problems are just too tricky that it’s no longer a test of programming skill, but a test of knowing trivia from Computer Science and Math that you will almost never get to use in the industry.

Simply put, high level programming competition skill does not equate to high level software development skill.

High level programming competition also relies on quick hacks – computationally efficient, but ultimately ugly kludges. Good Software Engineering requires you to avoid these hacks lest they become technical debt that will cripple your system and your team in the long run.

Another thing, the deeper you are into complicated problem solving, the more prone you are to The Complicator’s Gloves.

This pitfall of forgetting the original goal of having open programming competitions can be avoided as long as your organizers and judges are developers from the industry whose years of experience have removed any romantic feelings for CS trivia in favor of more pragmatic approaches to software engineering. But then again, there are far fewer CS/SE hybrids (i.e. well versed in both) in the local scene than Developer/Designer “unicorns” so good luck finding them.

Long story short, Programming Competitions are one of those things that look great on paper, but immediately reveals why no one does it the moment you try to do it on your own. (see also: local Developer Bootcamps)

Can anything be done to address the problems above and push these contests forward and take Philippine IT students’ skills to the next level?

Of course. But it’s going to take a much more dedicated and passionate group of people than what we have right now.

Unofficial Solutions to DevCon C-Cup 2012 Problems

Two weekends ago, DevCon held its Code Challenge Cup (C-Cup) at Xavier School Greenhills.

Now while I am the current VP for Technology of the said group and spent 6 hours painstakingly checking a good chunk of the submissions, I wasn’t involved in the making of the problems (I was the tech support guy, not a “judge”). I didn’t even look at the problems until it was linked to on the event page days after the event.

Anyway, being a sucker for computer science problems I’ve decided to solve all of them on my own. And given that nobody used Ruby at the event, I going to use Ruby to solve them.

Note that while I’ve confirmed that the solutions below pass the judges’ test cases, they are not (yet) endorsed by any of the judges in the event.

Oh, and yeah, to back up my claim that Quiwa’s Data Structures will allow you to solve most of the problems in programming competitions, I’ve included references to the said book in the solutions below.

Problem A: Best-Fit Lines

Problem Type: Statistics

My solution:

While this problem can be solved with some difficulty using brute-force, going through all possibilities of a and b in 0.001 steps, the formula for this problem (simple linear regression) is pretty straightforward.

A few things to note about the code above:

  • I use String#strip() in all of the problems just to be sure. I don’t really need to do it.
  • Truncate is preferred over rounding (even though both will get full points) but Ruby doesn’t have a built-in truncate function. And so I had to make one.
  • Unlike Python, Ruby doesn’t have a built-in sum() function. Using a full reduce() works fine but I don’t like the syntax so I chose a mapreduce form instead.

Problem B: Weird Writing

Problem Type: Coding

My solution:

This is what I’d call a “coding problem”. There are no Computer Science or Math related tricks here; the problem can be solved by anyone who knows the basics of programming e.g. basic data structures, string manipulation, and sorting.

The lines 20 – 26 above can be confusing for non-Ruby or non-Perl programmers due to the amount of small things done per line. Let’s translate it one by one:

Line 20 creates a translation table/hash which we’ll use to translate the words.

  • alphabet.split(//) – converts the string to a character array, as Ruby does not have this built in
  • .zip('A'..'Z') – “zips” the array with the real alphabet creating an array like [["Q", "A"], ["W", "B"]...]
  • Hash[*...flatten] – flattens the array so we could turn it into a Hash e.g. { "Q" => "A", "W" => "B", ...}

Line 21 – 24 creates a new array containing word pairs consisting of the input words and their equivalent words in the Ripley alphabet.

  • word_pairs = words.map do |word| – maps the entire array to a new array based on the return value of the block
  • word.split(//) – again, converts the string to a character array
  • .map { |c| translation_table[c] }.join – maps each character to the Ripley alphabet using the hash we made earlier. The join turns the array back to a string
  • [word, ...] – return value of the block is the word pair array e.g. ["SCOTT", "RVIEE"]

After line 24 you now have an array like [["SCOTT", "RVIEE"], ["WEAVER", "BCSWCD"]]. We then sort this list by the second element of the pair (i.e. .sort_by { |pair| pair[1] }) and put each word on the output file (i.e. .each { |pair| output.puts pair[0] })

Problem C: Congressional Progression

Problem Type: Graph Theory

My solution:

This is the first problem whose solution you could find in Data Structures. A major district in the problem is a vertex, once removed, produces a graph where there is no path between one or more pairs of vertices. This can easily be determined once you have a transitive closure for the graph, which we will construct using Warshall’s algorithm.

The code above a bit straightforward:

  • Line 24 simply removes all of the edges that contain the currently tested major district / vertex
  • Lines 25 – 28 looks fancy, but all it does is create a translation table for the name of the district with their respective positions e.g. { "Borgia" => 0, "Camelville" => 1 }
  • Lines 33 – 45 is a direct translation of the Warshall’s Algorithm from the book to Ruby
  • Line 46 determines if the district is major by checking for the existence of falses in the adjacency matrix.

Data Structures reference: page 331, Warshall’s Algorithm

Problem D: Islands

Problem Type: Graph Theory

My solution:

This problem can be solved in many ways, with the simplest being graph traversal algorithms. And the simplest of those is depth-first search which is used by the code above.

To summarize what the code above does:

  • Look for an island tip closest to the top.
  • Increment the island count.
  • Use DFS to delete the island piece by piece.
  • Repeat until there are no islands left.

Data Structures reference: page 253, Depth-First Search

Problem E: Basketball

Problem Type: Coding

My solution:

Another coding problem. Expect to fail many technical interviews if you can’t program this in less than 30 minutes.

The existence of Array#count() makes this problem too easy in Ruby.

Problem F: Utility Maximization

Problem Type: Mathematical Optimization / Microeconomics

My solution:

Like Problem A, this problem can be solved either through brute-force or having knowledge of microeconomics or mathematical optimization.

And it can be solved using blind luck.

To explain why it can be solved through luck, we must first look at a “real” solution using Lagrange multipliers:

Not fun, right?

However, if you set the function and constraint to the ones given in the problem and take time to solve the partial differential equations, you’ll end up with something like:

If you were lucky, you could’ve seen this pattern in the examples:

In short, the price x quantity of each good is proportional to their substitutability scale. From there, it should not be hard to derive:

i.e. line 18 above.

Problem G: Evolution of Industries

Problem Type: Graph Theory

My solution:

We need to find the path between two points in the graph that produces the highest probability. Calculating probability of independent events is simple (just multiply them all together) while finding the path is only a matter of choosing which pathfinding algorithm to choose. For the code above, I chose the most common algorithm: Dijkstra’s Algorithm.

Line 24 onwards is a direct implementation of the algorithm from the book.

Data Structures reference: page 312, Dijkstra’s Algorithm

Problem H: Relatively Prime

Problem Type: Number Theory

My solution:

The easiest problem of the set. All it takes is for you to know how to calculate the GCF of two numbers efficiently. If you’re not using Ruby or Python (which has built in GCD functions), you can code the Euclidian Algorithm in less than a minute.

Data Structures reference: page 5, Euclid’s Algorithm

Problem I: Jugs

Problem Type: Graph Theory

My solution:

This classic puzzle isn’t that hard when you use graph-traversal algorithms on it. Since we’ve already used DFS and Dijkstra’s algorithm above, I chose to use Breadth-First Search this time around. (Note that DFS is fine because there’s no “shortest path” requirement.)

Here’s how BFS can be used to solve the first example, 3 5 4:

We start with our first node, representing 0 gallons for jug A and 0 gallons for jug B.

We can only do two things, fill A and fill B. All other moves returns you to the same node.

On node (3,0), we can fill B and pour A B. Again, all else returns you to a previously visited node.

On node (0,5), we can only fill A and pour B A. Note that (3,5) has already been created in the previous step.

Node (3,5) is a dead end.

On node (0,3), we can fill A.

On node (3,2), we can empty A.

On node (3,3), we can pour A B.

On node (0,2), we can pour B A.

On node (1,5), we can empty B.

On node (2,0), we can fill B.

On node (1,0), we can pour A B.

On node (2,5), we can pour B A leaving 4 gallons on jug B.

Backtracking, we now have our solution.

The whole thing, animated:

My BFS implementation is kinda weird in that it doesn’t use a real queue (it uses a simple array) and we store the back edges along with the action in a hash of a hash.

The solution above also finds the solution for putting N gallons in jug B. According to the spec, you only need to put the water in either jug. We’ll just leave modifying the solution above to do this as your assignment.

Data Structures reference: page 265, Breadth-First Search

Problem J: Just the Facts

Problem Type: Coding

My solution:

Last and probably one of the least, we have this coding problem.

Your main problem here is how to calculate factorials – using normal integers, you’ll get to the integer limit very quickly. There are many clever ways of calculating the last non-zero digit of a factorial, but there’s really no need to do that. Using arbitrary-precision arithmetic (e.g. Java’s BigInteger) you can calculate 10000! in all of the given languages in less than a second using less than 1MB of memory.

Here’s what we did in the Ruby code above:

  • (1..n).reduce(:*) – calculate factorial using reduce(). Ruby automatically converts integers to Bignum when they reach a certain size.
  • .to_s.gsub("0", "") – convert factorial to string and replace all 0’s with blanks
  • [-1] – get the last character

And that’s that. To summarize, we got

  • 3 problems that can be solved with minimal CS instruction
  • 5 that can be solved using Quiwa’s Data Structures
  • 2 niche statistics and economics problems (A and F)

In other words, it’s a pretty good problem set for programmers of all levels. I might even go as to say that, given enough time, a second-year Computer Science student should be able to answer 8 of them armed only with a laptop, their textbooks, and a complete programming language API.

LobangClub / WebGeek Developer Challenge Analysis

LobangClub/WebGeek Developer Challenge

There were a bunch of tech events yesterday and among them was the awarding for the WebGeek/LobangClub Developer Challenge.

new iPad

obligatory mirror pic

This whole post will not be about my solution to the challenge, but a run through of the different approaches we participants used and presented during the meetup. The scores were close anyway, and it only happened that I chose a combination of techniques from the list below that yielded the highest score.

A word of warning, though: if you aren’t into computer science, linguistics, and math, it may be of your best interest not to continue on to the analysis below. :D

Continue reading “LobangClub / WebGeek Developer Challenge Analysis”

Quick Programming Challenges

As a guy with over a decade of programming experience and a good mathematical background, I occasionally want to throw all the other aspects of software engineering out of the window and just program stuff. Here are two sites I visit to get my quick programming fix.

Continue reading “Quick Programming Challenges”