26 June 2014

24–25 June 2014: Kinyarwanda and Computer Science, Part 1

Kinyarwanda and computer science, as subjects of study, appear to be at opposite ends of a spectrum. Knowing computer science is a pathway to a comfortable job in software development or I.T.; knowing Kinyarwanda might prepare one to be a missionary, or an aid worker. C.S. is the realm of the quantitatively minded, and languages are pretty far toward the mushy end of the humanities.

Of course, nothing is so simple, and none of the above statements are completely true. From three years of studying linguistics, I know that human language can in fact be described and justified in very precise, scientific ways, and that language education benefits from the same precision. And after two years of computer science, I have learned that even the most technical pursuits in programming and engineering are guided by the same qualitative design principles that I follow when I lay out a newspaper.¹

One of the most memorable and rewarding parts of the broad, interdisciplinary education I have gotten at Harvard has been the unexpected connections between seemingly disparate fields. Here, I’d like to share a few of the cases where my studies in Kinyarwanda and computer science have unexpectedly coincided. Hopefully I can teach you something about at least one of the two, and not completely confuse those with limited exposure to either!

Note: I originally intended for this to be a single post with three sections, each about one area in which these two subjects have influenced each other in my education. I have worked up a pretty sizable post with just the first section, though, so I suppose this will turn into a three-part series. More to come!

Type Systems

Programming languages are built around variables, which are names that hold values (e.g. x = 5, or y = "car"), and functions, which do things with them (e.g. addition, which takes two numbers and returns their sum). Variables can have a number of different types, like integer (e.g. 5), string (e.g. "car") or even function [e.g. add (x, y)].
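
To make this concrete, here is a minimal sketch in C (the names and values are my own, purely for illustration):

#include <stdio.h>

/* A function: takes two integers and gives back their sum. */
int add (int x, int y)
{
    return x + y;
}

int main (void)
{
    int x = 5;                          /* an integer */
    char y[] = "car";                   /* a string */
    int z = add (x, 2);                 /* the value a function returns */

    printf ("%d %s %d\n", x, y, z);     /* prints: 5 car 7 */
    return 0;
}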

In the same way, human languages are built around words of various types: nouns, verbs, etc. We could even draw a parallel between verbs and functions, as verbs take in other words and do things with them. More on that later.

Different programming languages vary by how explicitly they require the programmer to “declare” the types of their variables. For example, in C, to tell a program you want x to have a value of 5, you have to write

int x = 5 ;
where int classifies x as an integer. In Javascript, on the other hand, you can simply write
x = 5 ;
and its classification as an integer is implied. The reasons for those differences are varied; the latter case tends to be more convenient for the programmer, whereas the former makes it easier to catch mistakes in the code. There is no consensus on which way is “better,” which is really a matter of taste.
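
To illustrate the error-catching point with a tiny example of my own: the marked line below draws an immediate complaint from a C compiler, precisely because x was declared as an integer.

#include <stdio.h>

int main (void)
{
    int x = 5;
    x = "car";                  /* the compiler objects here: x is an integer, and "car" is text */
    printf ("%d\n", x);
    return 0;
}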

In English, now, you don’t really have to add anything special to say that a given word is a noun or a verb; sometimes you do, but usually not. In Kinyarwanda, every word (except for some loanwords) must have a prefix that declares its type. All verb infinitives start with ku-, and there are a number of “classes” of noun, each of which has its own designated prefix. (We might call this subtyping.) Adjectives all have prefixes that agree with the nouns they modify. No word can appear in a sentence without a prefix.

Another way programming languages vary is how they handle conversions between types. For example, consider the following code extract in C, which I will explain:

char x [30] = "I am about to turn " ;
int y = 21 ;
strcat (x, y) ;
The first line defines x as a string of text. The second defines y as a number. The third tries to tack y onto the end of x, with the intended result of x = "I am about to turn 21". (strcat is short for “string concatenate.”) But it doesn’t work: strcat expects you to give it two strings, but instead got a string and a number, so the compiler complains. Before using strcat, you would want to explicitly convert the number into a string of characters that looks like a number, using a separate function made just for that purpose.²
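
For the curious, here is roughly what the working version looks like in full, with the explicit conversion spelled out (a sketch, but it compiles and runs):

#include <stdio.h>
#include <string.h>

int main (void)
{
    char x[30] = "I am about to turn ";
    int y = 21;

    char y_as_text[12];                 /* enough room for the digits of y */
    sprintf (y_as_text, "%d", y);       /* convert the number 21 into the text "21" */
    strcat (x, y_as_text);              /* now both arguments are strings, so this works */

    printf ("%s\n", x);                 /* prints: I am about to turn 21 */
    return 0;
}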

In Javascript, on the other hand, it is much easier:

x = "I am about to turn " ;
y = 21 ;
z = x + y ;
Notice that Javascript doesn’t care that they are different types, and can even use the + operator, meant for adding two numbers, to concatenate them: z ends up holding "I am about to turn 21".

English is like Javascript. If you want to use a verb as a noun, you don’t have to do anything special to it. Think about the word love, which can be either a noun or a verb. (Of course, often you do add suffixes, like light to lighten.) In Kinyarwanda, however, the verb gukúunda “love” must be given a new prefix and suffix to become the noun urukúundo “love.” This is true of all conversions.

Briefly, in Kinyarwanda, the meaning encoded in the prefixes and suffixes is considered essential to the complete word. This allows for some pretty cool things: for example, starting with the noun umugabo “man,” you can change the prefix to cast it into a different noun class and change the meaning, e.g. ubugabo “manhood,” akagabo “little man,” urugabo/ikigabo “big, boorish man,” amagabo “worthless men” and even ingabo “male pig.”
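
Just to push the analogy, here is a toy sketch of my own in C (no substitute for a real grammar, of course): treat the class prefix as data, glue it onto the stem -gabo, and each choice of prefix yields a different complete word.

#include <stdio.h>

int main (void)
{
    /* Noun-class prefixes and the meanings they give the stem -gabo. */
    const char *prefixes[] = { "umu", "ubu", "aka", "in" };
    const char *meanings[] = { "man", "manhood", "little man", "male pig" };

    const char *stem = "gabo";
    char word[32];

    for (int i = 0; i < 4; i++)
    {
        snprintf (word, sizeof word, "%s%s", prefixes[i], stem);   /* prefix + stem = the whole word */
        printf ("%s means \"%s\"\n", word, meanings[i]);           /* e.g. umugabo means "man" */
    }
    return 0;
}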

In English, on the other hand, we figure you can mostly get that information from context. Lack of a mandatory prefix allows our words to get much shorter, though perhaps vaguer and less versatile. The English system also allows us to change things between classes at will in a way that would be difficult in other languages. Think of the following sentences: “I was sandwiched between two obese men” and “That is very 1970s of you!”


It’s funny: lately I have been speaking a lot of Kinyarwanda, a very strict language, while most of my coding has been in Javascript, a pretty loose one. In human and computer languages, the trade-off between strictness and looseness is about the same: the strict ones tend to be complex but very sensible, whereas the loose ones are simpler but often unpredictable.

So to master a very strict programming language like OCaml, you have to wrap your mind around a lot of difficult concepts, but you can be sure everything will work as long as you follow the rules. You can learn everything about the basic structure of a loosey-goosey language like Python, though, and still have to memorize a whole bunch of individual functions, all the while getting annoyed that they are not consistent with each other.

I can say it more relatably about human languages: English and Kinyarwanda are both devilishly hard to master, but for different reasons. English is difficult because nothing makes sense: the grammar is simple—you can learn everything about normal verbs and nouns in about 15 minutes—but almost everything is somehow irregular, and sometimes there is no discernible rule to follow at all. So the only way to really learn a new word is to hear it in context a lot.

Kinyarwanda is difficult, on the other hand, because everything makes sense. Once you figure out the structure, you can instantly conjugate a new verb or pluralize a new noun. Almost nothing is irregular, and almost every minute detail is grammatically specified. To be so precise, though, the supporting grammar has to be extraordinarily complex, requiring years of study before it clicks.

Which is better? I would argue neither. The range of meanings that the two can communicate is exactly identical. (The same holds for programming languages: the general-purpose ones are all Turing complete, so they can ultimately compute exactly the same things.) And we generally don’t have the luxury of choosing our native tongue, so it may be a moot point anyway. But looking down from our ivory tower, I would say—in a language whose precision falls somewhere between the two we have been discussing—“De gustibus non est disputandum”: there is no disputing taste.


¹ Ironically—or maybe not—almost all design-related classes available to undergraduates at Harvard are at the School of Engineering and Applied Sciences.

² It’s not actually that hard; there are functions that would let you skip the step. For example, the third line could be sprintf (x + strlen (x), "%d", y) ;, which writes the digits of y directly onto the end of x. I wanted parallelism, though, and didn’t want to confuse people more than I already had. And I don’t know of another programming language that would better illustrate the point!
