Creating Conventions using Meta Conventions

I’ve been working on Cactus JS for a while now, and parts of it are finally available to the public! Now that other people might become interested in contributing, decisions have to be made about pretty much everything imaginable.

Coding conventions are what I’m thinking the most about right now. I follow very strict conventions when I code myself, and whenever there isn’t an existing convention I make one up. Since JavaScript is such a dynamic language, there are a lot of ways to do things. If a convention is arbitrarily chosen, should it be chosen at all, or should each programmer decide for themselves? Should conventions be set on the micro level (such as whether there should be one or no space between if and ()?

One problem with specifying everything is that contributors have to learn all the conventions. They might think the rules are terrible and refuse to use them, and they will undoubtedly make mistakes. If you end up with something like Ada 95 Quality and Style, you probably went too far. Sure, you can force an employee to learn it all, but who has the energy for this when contributing in their spare time?

One advantage of strict conventions is that a unified code base looks great, and that at a higher level you instantly recognize what a piece of code is set up to do (for instance, learning every possible way to simulate packages in JavaScript is harder than learning one way and using it all the time).

So, before writing conventions, perhaps one should write meta conventions. That is, rules that help us decide on how to form conventions.

Here are my thoughts at the moment:

Don’t Make Assumptions Regarding Tool Use

You might think that a certain convention would be okay, even though it’s hard to remember or takes a lot of time, because you have a shiny editor that allows you to write macros that automatically make sure the formatting is valid. But will you always be using this editor? Will everyone else? Can you assume that users of other editors will write their own corresponding macros, or that they are able to in the first place? If not, someone might abandon the convention, and it will be rendered useless.

Carefully Consider Alternatives

Don’t just pick a convention the first time you think of the scenario. You have to be careful so that you don’t make a bad choice. Code I wrote earlier was riddled with things like this. “Oh well, I’ll just do it this way.” In the end you might not even be able to change the convention once a better alternative shows up at your front door.

Specify Best Practices

There’s a thin line between conventions and best practices at times. Should best practices be set as conventions? Best practices are probably more important than conventions, since they might prevent common problems completely if followed. There is little reason to think that the choice of if ( versus if( would make a big difference, so it can’t be considered “best practice”. I think it’s useful to highlight best practices as such, because they remain valid when users move to a project with different conventions, so they don’t have to be unlearned. If a best practice is indeed helpful, it’s unlikely that opposing conventions exist. Fortunately, the opposite would probably never be specified as a convention at all (by this I mean that hypothetical conventions such as “always use public members” are hard to justify).

Allow for Unambiguous Formatting

Formatting of code should always be unambiguous, otherwise different people will invent their own conventions for what isn’t specified, and we lose the uniform feel of the application. I feel pretty good when I open up a document and it looks just like I’m used to. Feels like home!

The most important parts to specify here are the constructs that are common (such as a function definition which appears in pretty much every source file you’ll work with). These might be the conventions you notice first, if you look at a piece of code.

Keep Maintenance at a Minimum

Conventions should be defined in a way that doesn’t force extra work when initially writing the code, or when later modifying it.

Here are a couple of conventions that I’ve used, more or less voluntarily:

Lining up variable declaration values, as in:

  var a            = foo;
  var foobarbazqux = b;

Or Drupal’s convention for string concatenation in PHP: 'foo'. $bar .'baz'. $bax

I used the first convention because I felt it made the code look better and more legible. I still think this is the case, but I find that the added maintenance cost outweighs the aesthetics. I assume that the Drupal team chose their convention for similar reasons (although I don’t agree that it’s pretty!)

The first example has another problem, namely that if you remove or rename the variable with the longest name, you have to reformat the entire group of declarations. If you add a variable name that turns out to be the longest, you also have to reformat. Odds are that somewhere along the way someone gets sick of the formatting and quits halfway, which probably means that everyone will think the code looks worse.
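To make the reformatting cost concrete, here is a minimal sketch in JavaScript. The alignDeclarations helper is hypothetical and exists only for this illustration; it pads every name to the width of the longest one, which is what you would otherwise do by hand.

```javascript
// Hypothetical helper: pads each variable name so that the "=" signs line up.
function alignDeclarations(declarations) {
  const width = Math.max(...declarations.map(([name]) => name.length));
  return declarations.map(
    ([name, value]) => "var " + name.padEnd(width) + " = " + value + ";"
  );
}

// The original group of declarations.
const before = alignDeclarations([["a", "foo"], ["foobarbazqux", "b"]]);

// The same group after renaming the longest variable to "qux".
const after = alignDeclarations([["a", "foo"], ["qux", "b"]]);

// The declaration of "a" was not edited, yet its line is different now.
const untouchedLineChanged = before[0] !== after[0];

console.log(before.join("\n"));
console.log(after.join("\n"));
console.log("Untouched line had to be reformatted:", untouchedLineChanged);
```

Only one variable was renamed, but every line in the group has to be rewritten, and all of those lines will show up in a diff as well.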

The second example is similar: if you have 'foo'.'bar' and want to replace the literals with, say, variables, you will need to reformat. In this case it won’t be time consuming, but such a detail is very easy to forget.

Another well known convention that violates this rule is the 80 character limit on line length. If a modification makes a line exceed 80 characters, the line needs to be split into several lines. If the opposite happens, you need to join the pieces back onto a single line. Can you accurately judge whether a split line would fit on just one line, or will you have to try it out and undo if it doesn’t? This uncertainty could be a real showstopper. Perhaps you would even need to write a comment about the line lengths so that you can remember which ones you checked. I certainly wouldn’t like that!
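The “try it out and undo” step can be sketched as a small check. This is only an illustration under stated assumptions: fitsOnOneLine and MAX_LINE_LENGTH are made-up names, and the function simply joins the trimmed pieces with single spaces, which is not how every statement should be rejoined.

```javascript
// Assumed limit, taken from the 80 character convention discussed above.
const MAX_LINE_LENGTH = 80;

// Hypothetical helper: joins the pieces of a split statement and reports
// whether the result would fit within the limit at the given indentation.
function fitsOnOneLine(lines, indent) {
  const joined = indent + lines.map((line) => line.trim()).join(" ");
  return { joined: joined, fits: joined.length <= MAX_LINE_LENGTH };
}

// A statement that was split earlier but has since become shorter.
const shortCall = fitsOnOneLine(
  ["var total = add(first,", "                second);"],
  "    "
);

// A statement that is still too long to rejoin.
const longCall = fitsOnOneLine(
  ["var message = " + "x".repeat(50) + " +", "y".repeat(50) + ";"],
  "    "
);

console.log(shortCall.fits, shortCall.joined.length);
console.log(longCall.fits, longCall.joined.length);
```

Having to run something like this, or do the equivalent counting by eye, after every edit is exactly the kind of extra maintenance this rule warns about.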

Generalize Conventions

If a convention is general, it will be easier to remember, since you will be using it in a lot of different places. It will also be easier to reason your way to what should be done with a given construct. Special cases sometimes show up as an “except …” hardly visible at the end of a convention’s text. If so, one can simply remove the “except …” line and be done with it. Other special cases are harder to spot, such as the two separate conventions “if statements should have a space as in if (” and “for statements should not have a space as in for(”. Unless you look at a higher level, you won’t find anything to generalize (on second thought, that is the meaning of generalization!) I would remove these conventions and replace them with something like “all control structures (if, else, for, while) should …”. If the programmer then writes a switch statement, he might consider that switch is also a control structure, and that the same rule applies. Contrast this with “I’m writing a switch. Now, was there a space or not? I’d better check, or am I so lazy that I’ll take my chances with guesswork?”
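A general rule is also easier to check than a pile of special cases. The sketch below assumes, purely for illustration, that the general rule is “a control structure keyword is followed by one space and then the parenthesis”; the text above deliberately leaves the actual rule open. The names CONTROL_KEYWORDS and findViolations are hypothetical, and the check is naive: it would also match keywords inside strings and comments.

```javascript
// Assumption for illustration: every control structure follows the same rule.
// Covering a new construct means adding a keyword here, not writing a new rule.
const CONTROL_KEYWORDS = ["if", "for", "while", "switch"];

// Matches a keyword immediately followed by "(", with no space in between.
const missingSpace = new RegExp("\\b(" + CONTROL_KEYWORDS.join("|") + ")\\(");

// Returns the 1-based line numbers and text of lines that break the rule.
function findViolations(source) {
  return source
    .split("\n")
    .map((text, index) => ({ line: index + 1, text: text }))
    .filter((entry) => missingSpace.test(entry.text));
}

const sample = [
  "if (ready) {",
  "  for(var i = 0; i < 3; i++) {}",
  "}",
  "switch(kind) {}"
].join("\n");

findViolations(sample).forEach((entry) => {
  console.log("line " + entry.line + ": " + entry.text.trim());
});
```

With separate per-keyword conventions, each keyword would need its own pattern, and a construct nobody wrote a rule for would pass unnoticed.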

Cactus has a few special case conventions. One that I intend to remove could be expressed as “All function calls should have a space between the function name and the opening parenthesis, except when no arguments are passed, then there should be no space.” It’s actually even more complicated than this, and I can imagine prospective contributors thinking I’m out of my mind for putting this in as a convention. Who can remember this stuff? I would probably never have started coding with this convention if I had asked someone else about it. This example also breaks the “Keep Maintenance at a Minimum” rule.
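For readers who have not seen this style, here is what the simplified wording of the rule looks like in practice. This is a sketch based only on the sentence quoted above, not code taken from Cactus, and the variable names are made up.

```javascript
// Arguments are passed, so the simplified rule asks for a space before "(".
var parts = "a,b,c".split (",");

// The last call passes no arguments, so it gets no space; join still does.
var upper = parts.join ("").toUpperCase();

// Maintenance cost: this call used to be parts.join ("-"). When the argument
// was removed, the space before "(" had to be removed as well.
var text = parts.join();

console.log(parts.length, upper, text);
```

Adding or removing the last argument of a call changes which branch of the rule applies, so a purely functional edit also becomes a formatting edit.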

Don’t Specify Obscurities and Rarities

Say your language has an obscure feature that’s only used once in your entire application. Should you specify its usage in a convention? One argument could be that it wouldn’t hurt. But I think it would. It would be another bullet point to remember, and if your convention document is riddled with these, it might be too much to handle. After all, it’s easier to remember a convention that you are forced to use at regular intervals.

Keep Conventions at a Minimum

This point might seem contradictory to some of the previous points, but I think you can keep a balance. A smaller conventions document means that there will be less to memorize, and that people will be less likely to avoid reading it. If you have to wade through page after page of what seems to be meaninglessness, odds are you’ll get tired soon after beginning and you’ll do something else instead.

I hope these meta conventions are helpful. Specifying conventions is not a task that you have to perform very often, but if you make bad choices it could impact generations (I can imagine old Fortran applications with some pretty crazy regulations).
