Getting Started in Programming

A question that is often asked in the various programming newsgroups is some variant of "What language should I learn?"

It is a good question. There are an awful lot of languages out there, and it is hard to get an overview of the possibilities without a lot of work.

After seeing the question three times in the same week, I decided to write this web page to try to give my answer to it. Since then I have occasionally added to and revised it.

I would like to emphasize that these are my opinions and nothing more. Others are bound to disagree.

I'll name a lot of different languages below. Exactly which languages I have mentioned in the various sections is somewhat random. If I left one out, it is probably because I haven't heard about it rather than because I have any negative opinion about it.

The hyperlinks from each language name are of variable quality. I have stolen most of them from this page.

First things first: English

The very first language you should learn is English. I write this because often people ask for good books in their own native language and are unwilling to accept that the answer very often is "There aren't any."

At least in Norwegian, that is. I don't know the situation for Spanish, Russian or German, but in the lesser used languages the market for good computer books simply isn't there.

Your English doesn't have to be excellent. It just has to be good enough that you can read an English computer book without too many problems.

Reading documentation

The most useful skill you can learn is the ability to read computer documentation. As a programmer you will need to know a lot more than you can easily remember. This doesn't matter if you can find the information you need, when you need it. And quickly.

The most useful part of any technical book is its index. If the book lacks an index, recycle it. It is worth more as recycled paper than as a book.

For online documentation this can either be an actual index (most useful) or some sort of search mechanism (not so useful, but usable).

When you have a specific question, turn to the index. The most interesting entries are often the last and the first ones for any given word, in that order. The last entry is often a summary and the first entry is often an introduction. (This rule has a lot of exceptions, of course.)

The table of contents can be useful too.

Other than that, you simply need experience. It takes some time to learn how to quickly pinpoint the exact information you are looking for. I have no advice beyond: Practice.

The same skills can also be used for reading computer programs. There will seldom be an index available, but you usually have some way of searching the program text.

Finding a good book

There are a lot of computer books out there. Some are good, some are bad. If all you have is a bad book, it can seriously slow you down. You can also miss important ideas that better books explain to you.

A few general rules to help you:

Unless you have a lot of money, start at the library. You might have to go through some books before you find a good one.
Visit academic book stores and libraries rather than general ones. The books there are usually much better.
Avoid books with titles containing the words "Idiot", "Dummies" or "in XX days". These books are for dummies. Seriously.
Judge a book by its index. Both because you will be using the index a lot and because a good index shows that this book is written by somebody who knows what they are doing.
As far as I know, O'Reilly and Associates have only published good books. (This was true when I first wrote this page... but things have changed. Their quality control seems to have slipped a bit)

Variation!

The question often takes the form "Which one language should I learn?". Or worse: "Should I learn X or Y?". This is starting at the wrong end. What you should learn isn't languages, but ideas and new ways to think.

If you learn just one language you will have blind spots and be limited in the ways you approach new problems.

By learning more languages you will be able to find new ways to solve your problems.

This means that knowing Prolog, Forth and Lisp will make you a better C programmer. This sounds weird, but it is true.

Never mind the details.

Some languages you will learn because you need them, to program something big in. These languages you will need to learn properly, with much detailed knowledge. However, most of that knowledge can be picked up as you go, by reading documentation.

Other languages you will learn because they look interesting, or because somebody has recommended them. For these languages, learning details is totally unimportant. The main thing is to learn the ideas of the language.

When you face a new language, try to find out what makes that language unique and what its strong points are. Never mind its weak points; you will find them soon enough if you need to. The strong points are easier to overlook. On the other hand, books are often eager to point them out to you.

I said variation!

A lot of languages are very similar. If you have learned BASIC, Pascal and FORTRAN, you haven't really learned a lot of variation. If you then learn C, C++ and Java, you will have learned, in total, about two languages' worth of ideas or so.

The six languages mentioned are very similar in their basic ideas, but this may not be obvious if they are the only ones you know. As I said: Blind spots. (Those languages look rather different from each other.)

So, which languages should you learn? Well, this is where the experts start arguing...

However, a better question is what ideas should you learn.

In general

There is an old saying that "A Real Programmer can program FORTRAN in any language." This means that you can use imperative techniques in almost any language. However, this is seldom wise. (FORTRAN is the archetypal imperative language.)

The same goes for most of the ideas mentioned below. Most of them are about the way you think while programming rather than the languages themselves. On the other hand, a language can be more or less suited for that way of thinking. It is often, but not always, a good idea to use a style of programming suited for the language used. (Or the other way around, of course.)

Imperative programming

The six languages mentioned earlier (BASIC, Pascal, FORTRAN, C, C++ and Java) are geared towards the idea of "imperative programming", where you give the computer commands and you have to decide almost everything. Programs look like "First do this. Then do that. Store the result of that over there." and so on.

These languages are the inspiration for proverbs like "The computer does exactly what you ask it to do. Unfortunately."
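To make the "First do this. Then do that." style concrete, here is a minimal sketch. It is written in Python only because Python is easy to read; the function name average is made up for this illustration.

```python
# Imperative style: a sequence of commands that change stored values step by step.
def average(numbers):
    total = 0                 # first do this: start with an empty sum
    count = 0
    for n in numbers:         # then do that: visit every number in turn
        total = total + n     # store the result of that over there
        count = count + 1
    # Given an empty list this divides by zero: the computer does exactly what it was asked.
    return total / count


print(average([2, 4, 9]))  # prints 5.0
```

Notice that every decision (where the sum is kept, in which order the numbers are visited) is spelled out by the programmer.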

Object orientation (OO)

What is "object orientation"? What does it mean? I don't really know... The words have been used by so many to mean so many different things that they have almost lost any meaning! (Kent Pitman writes on this.) Java and C++, as well as many others, claim to be OO. SIMULA and Smalltalk were the original object oriented languages and they still have their devoted followers. There are also many others who use this catchphrase; look around.

Functional programming

Another important idea is "functional programming" which means two things. The first is that you call a lot of functions to do things that you would do in other ways in other languages. However, this is not the main point. The main point is that you can use functions as values, i.e. storing them in variables and passing them around as function parameters and return values. (If that didn't make sense to you, it means you have something to learn here)

The first functional language was Lisp, and various Lisp variants still form the core of the family of functional languages. Look for some implementation of Common Lisp. Haskell, Erlang and ML are other popular functional languages.
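If "functions as values" sounds abstract, a small sketch may help. Python is used here purely for illustration (it is not one of the functional languages named above), and the names twice, add_three and add_six are invented for the example.

```python
def twice(f):
    """Take a function as a parameter and return a new function that applies it two times."""
    def applied_twice(x):
        return f(f(x))
    return applied_twice


def add_three(x):
    return x + 3


operation = add_three          # a function stored in a variable
add_six = twice(operation)     # a function passed as a parameter; another function comes back
print(add_six(10))                      # prints 16
print(list(map(add_six, [1, 2, 3])))    # prints [7, 8, 9]
```

The interesting part is that twice never needs to know what f does. It works just as well for any other one-argument function.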

Declarative programming

The remaining main trend in computer languages is "declarative programming". This means that you tell the computer things like "Harald is a parent of Haakon. Olav is a parent of Harald. A grandparent is the parent of a parent." The system can then answer questions like "Who is the grandparent of Haakon?". (The system does not understand plain English like this. You have to be much more careful in how you phrase things to a computer) Prolog is the main language in this family. This type of programming isn't used all that much, but is worth learning for its mindbendingness.
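The parent and grandparent example can be imitated in a few lines. This is only a sketch in Python, which is not a declarative language: in Prolog the system itself would search the facts for an answer, while here a set comprehension stands in for that search. The names parent_of and grandparents_of are made up for the illustration.

```python
# Facts: (parent, child) pairs, taken from the example in the text.
parent_of = {("Harald", "Haakon"), ("Olav", "Harald")}


def grandparents_of(person):
    """Rule: a grandparent is the parent of a parent."""
    return {
        grandparent
        for (grandparent, middle) in parent_of
        for (parent, child) in parent_of
        if middle == parent and child == person
    }


print(grandparents_of("Haakon"))  # prints {'Olav'}
```

The body of the rule describes which answers qualify rather than listing the steps for finding them, which is the flavour of the declarative way of thinking.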

Another way of seeing it

The difference between imperative, object oriented, functional and declarative programming can be best seen in how you divide the program into parts:

Imperative: Each part performs a task
Object oriented: Each part describes an object
Functional: Each part calculates a function
Declarative: Each part describes a relation

These alternatives do not exclude each other.

Programming by contract

Somewhat independent of the above, "programming by contract" is a way of designing programs. Each part of a program has a "contract" which specifies how it should cooperate with the rest of the program.

This contract is usually written only in a form that humans understand, something like "When given a positive real number, this function calculates its square root". Note that this contract contains two parts: the input specification (positive real number) and the output specification (calculates the square root).

The idea behind "programming by contract" is that one should make as much of this information available to the computer as possible. This means that the system itself can catch a lot of the simple errors programmers make. The main language using this concept is Eiffel.
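The square root contract can be sketched with ordinary assertions. This is a stand-in written in Python, not the way Eiffel does it (Eiffel builds contracts into the language itself), and the function name square_root is chosen just for this example.

```python
import math


def square_root(x):
    # Input specification: the caller must supply a positive real number.
    assert isinstance(x, (int, float)) and x > 0, "contract broken: x must be a positive real number"
    result = math.sqrt(x)
    # Output specification: the result, squared, must give back x (within rounding).
    assert math.isclose(result * result, x), "contract broken: result is not the square root of x"
    return result


print(square_root(9.0))  # prints 3.0
```

Calling square_root(-1.0) stops with an AssertionError that names the broken contract, instead of quietly handing a nonsensical value to the rest of the program. Note that Python skips assert statements when run with the -O option, so this is a teaching sketch rather than a robust check.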

Databases

Databases are also extremely common and are something you should know a bit about. Most databases these days understand a language called SQL, which you should have a look at. However, SQL is not a programming language as such and you need something more powerful to actually program a database system. Look for languages called "something SQL" or "SQL something".
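To give a taste of what SQL looks like when it is driven from a more general language, here is a small sketch using the sqlite3 module that ships with Python. The table and column names are invented for the example, and the data is the family from the declarative programming section.

```python
import sqlite3

connection = sqlite3.connect(":memory:")  # a throwaway database that lives only in memory
connection.execute("CREATE TABLE parent_of (parent TEXT, child TEXT)")
connection.executemany(
    "INSERT INTO parent_of (parent, child) VALUES (?, ?)",
    [("Harald", "Haakon"), ("Olav", "Harald")],
)

# The SQL query says which rows are wanted, not how the database should find them.
rows = connection.execute(
    "SELECT older.parent FROM parent_of AS older "
    "JOIN parent_of AS younger ON older.child = younger.parent "
    "WHERE younger.child = ?",
    ("Haakon",),
).fetchall()
print(rows)  # prints [('Olav',)]
connection.close()
```

The SQL does the asking, while the surrounding program decides when to ask and what to do with the answer.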

Event-driven programming

Another semi-important idea is "event-driven programming", which is mostly used in GUIs (Graphical User Interfaces). The basic idea is that instead of saying 'first we do this, then we do that' you say 'if the user clicks that button, this happens. If the user clicks the other button, that happens.' While programming, you don't know in which order these things will happen, so you have to carefully consider all possibilities. This type of programming can be done in almost any language, but Netscape's JavaScript and Microsoft's JScript are examples of languages explicitly made for this.
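The 'if the user clicks that button, this happens' idea can be sketched without any real GUI. Everything below is hypothetical: the event names, and the on and dispatch functions, are invented for this illustration and do not belong to any particular toolkit.

```python
# Which function runs for which event.
handlers = {}


def on(event_name, handler):
    """Register the function to call when the named event happens."""
    handlers[event_name] = handler


def dispatch(event_name):
    """Called by the (imaginary) user interface whenever something happens."""
    handler = handlers.get(event_name)
    if handler is None:
        return "ignored " + event_name
    return handler()


on("save_clicked", lambda: "document saved")
on("quit_clicked", lambda: "goodbye")

# The program cannot know the order in advance; here the user happens to quit first.
for event in ["quit_clicked", "save_clicked", "help_clicked"]:
    print(dispatch(event))
```

This prints "goodbye", "document saved" and "ignored help_clicked". The last line is the kind of possibility that is easy to forget: an event nobody planned for.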

Assembly language

And then there are the assembly languages. Some people say that you should know an assembly language or two to get a better feeling for what the machine can and cannot do. Some people say that this is totally unnecessary. Personally, I tend to agree with the first opinion, on the principle that more knowledge is always a good thing. However, learning assembly language is tiresome and might not be worth the effort; I don't know. If you decide to do this, keep "Never mind the details" firmly in mind.

A good way to start learning assembly is to ask a compiler for some high-level language to show you the assembly language it generates, rather than having it produce machine code directly. (Unfortunately, not all compilers can do this.) Do this for very small programs at first. This will not teach you to program in assembly, but will give you an idea of what assembly is and what the machine can do.

Algorithms and Data Structures

Most algorithms and data structures are fairly independent of which language you use. Their descriptions are often left out of language-specific books.

You should read a good book on Algorithms and Data Structures. Good books are often called just "Algorithms and Data Structures" or something similar. As always, the details don't matter much at first. Remember what algorithms and data structures the book shows you, and what their main strong and weak spots are. When you need details, look them up.
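As one example of "strong and weak spots", compare two ways of finding an item in a list. This is a sketch in Python; the function names are mine, while bisect is a module in Python's standard library.

```python
import bisect


def linear_search(items, target):
    """Strong spot: works on any list. Weak spot: may have to look at every element."""
    for position, item in enumerate(items):
        if item == target:
            return position
    return -1


def binary_search(sorted_items, target):
    """Strong spot: looks at only about log2(n) elements. Weak spot: the list must be sorted."""
    position = bisect.bisect_left(sorted_items, target)
    if position < len(sorted_items) and sorted_items[position] == target:
        return position
    return -1


names = ["Eiffel", "Forth", "Lisp", "Prolog"]  # already in sorted order
print(linear_search(names, "Lisp"))   # prints 2
print(binary_search(names, "Lisp"))   # prints 2
print(binary_search(names, "Ada"))    # prints -1
```

Knowing that both exist, and roughly when each one wins, is the part worth remembering. The exact code can always be looked up.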

Readable Programs

Learn to write readable programs. You may have no problem understanding your program today, but next month it can be almost gibberish if you aren't careful. If somebody else tries to understand your program it gets even harder. Important keywords here are: Layout, good names, comments and documentation.

Remember: Code is written once and read a thousand times. A bit of care while you write can save you a whole lot of grief down the road.

Good layout means that you should be able to see the structure of the program at a glance.

Good names for variables, functions and so on means that the name should give the reader a good idea of what the variable/function/whatever is used for.

Comments are for all the stuff that doesn't fit the above. Comments should not tell you what is obvious from the code, but what is not obvious.

Documentation is all of the above, but rewritten in a way that is easier for humans to read, rather than computers. There will be a lot of duplication between comments and documentation.
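A small before-and-after sketch shows how much layout, names and comments can do. Both functions below behave the same; the names and the book data are invented for this illustration.

```python
# Hard to read: the names say nothing and the layout hides the structure.
def f(a, b): return [x for x in a if x[1] > b]


# Easier to read: same behaviour, but names, layout and comments carry the meaning.
def books_with_long_index(books, minimum_index_pages):
    """Return the (title, index_pages) pairs whose index is longer than the minimum."""
    selected = []
    for title, index_pages in books:
        # Strictly greater: a book exactly at the minimum is still rejected.
        if index_pages > minimum_index_pages:
            selected.append((title, index_pages))
    return selected


books = [("Book A", 12), ("Book B", 2)]  # made-up sample data
print(books_with_long_index(books, 5))   # prints [('Book A', 12)]
```

The comment in the second version explains a decision that the code alone does not justify, and it avoids repeating what the code already says.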

Computer Science

Some people think that when they have learned to program in a language or two, they know 'Computer Science'. This is not so. CS is a lot, lot more than just programming.

CS is a branch of mathematics, and you need a solid mathematical background before learning it. Picking this up on your own is not easy. If you want to learn CS, get thee to a university.

Conclusions

The important thing isn't where you start, it is that you never stop learning.

Don't learn languages, learn ideas.