April 2006 - Posts

Programmers are Translators not Magicians

Jeffrey Palermo recently wrote about how easy (yes EASY) writing software is these days. While I disagree with his central theme, he's right about one thing: Custom software is a skilled art and needs good programmers to implement.

The key question, though, is why we *need* custom software.

People seem to lose sight of what software actually is: It's something to automate or manage things we would normally have to do ourselves. The software is like a little helper the user hires.  

I agree that writing bespoke software is for trained people only. But I really don't think we should be writing custom software for most solutions in the first place. Most of the time people just assume there are no tools to implement their applications and jump straight to the 'custom software' solution. Worse still, they start that custom software from scratch in C# or Java, using little more than the bundled frameworks.

The reality is users are not just hiring these little helpers, they're hiring little translators (programmers) that can tell inexperienced clueless little helpers (the computer) what to do.

If we had more domain-specific tools we would need less custom software. Domain-specific tools speak the right language and are experienced. With domain-specific tools you don't need the little translators.

Programmers are incredibly expensive for companies. I've been at companies that rely on programmers to almost magically produce solutions for their problems. Of course there's no magic, but it seems that way because the technical details of what the programmers are doing are so foreign to them.

The point is, programmers are not magicians - they're translators.

Jeffrey says: "The management has no way to know if a programmer is good or not." In the same way, if I hired a Russian translator I would have absolutely no idea whether they were any good either. The problem isn't with the management, it's with the fact that we need a translator in the first place.

If writing software really was so easy, then why is it so expensive? Why does it need a translator? Unlike other trades, like carpentry, software poses a unique opportunity to break down these barriers because of its dynamic nature.

In the future I think custom software will be a very rare thing in business. The next big thing in business software is going to be an explosion of domain specific languages and domain specific tools, tools that make the development of solutions little more than the concise articulation of requirements by business experts. The plumbing, the bits that make it look pretty, all the rest of the details of the software will be there already. They'll just be configuration.

In the distant future there will be no translators or magician programmers. These vertical solution developers won't be needed. Instead there'll be horizontal solution developers creating these domain specific tools and selling them to companies. And even those will have a shelf life.

Why I Don't Use MSN Search

So I hear that MSN Search's following is falling, dropping its share of all searches from 14% to 11% during the month of March, while Google increased a couple of points to almost 50%.

From my own experience, I think there are a few reasons why MSN Search isn't as widely used as Google.

1. It's uglier. The font used for the search results is larger and just isn't as pleasing on the eye as Google's. Neither is the combination of green, orange and blue. It just looks tackier if you ask me. The font distracts me from the results, and I find it more difficult to locate what I'm after on the page. Google's results page is just neater and better thought out.

2. Google is easy to type. Search.Msn.Com isn't. There are 4 unique letters in google. Search.msn has 8 unique letters. So it takes longer to type, so why bother?

3. The 'Next page' link is much bigger on Google's results. This isn't a biggy, but every time I want to go to the next page on msn's search results it takes a while to locate the page links. They're colorful, big and easy to locate on Google's.

4. Google shows more results in less space. A quick search for 'time' in Google presents 5 and a bit results on the screen. The same search on MSN gives me 4 results in the same space. Yes I could just scroll, but that's more work so, again, why should I?

5. MSN's results page is centered and doesn't expand properly with the width of the screen. I have a wide-screen laptop, but this doesn't give me anything with msn's search results because it seems to limit the width of the results to 800 pixels and centers it. Why?

6. I only get 5 page links in the footer of the first page of MSN's search results. Google gives me 10. While this might not seem like a biggy because I can just click on page 5 and then get to page 10 on MSN... it just makes Google APPEAR to have more results than MSN. The sheer size of Google's database, combined with its speed, makes it feel powerful. MSN has all that, it just doesn't present it as well.

Of course nobody of any importance will read this and take notice, but I thought I'd put it out there anyway.

Is OOP Bad for Business?

Data is a very valuable thing. The more data is exposed, the more potential information we can use to make informed decisions. There is rarely a case when it is a good thing not to know about data. Knowing how to use it is an entirely different matter, as is knowing how to correctly categorize, structure and group that data. But hiding data, for whatever reason, impedes our ability to build intelligent services. Full disclosure should be the top priority when building an enterprise application.

Now, these days many enterprise applications are written using object oriented programming (OOP) languages, such as Java or C#. There are probably many reasons why this is, not least because the majority of programmers are experienced in OOP languages and the best development environments and tools are for OOP languages. They're also typically general purpose languages so you can re-use your skills in a variety of different domains.

But the problem is that a lot of the features of OOP languages aren't really the type of language features you would look for in order to build the business logic of a business application.

For one, OOP focuses on creating agents that encapsulate data, rather than exposing and enhancing data accessibility.  When this data is business data, building objects around the data is akin to adding unnecessary bureaucratic red tape to an otherwise simple process. 

Object orientation was designed for creating abstractions, and it's especially useful for creating layers of abstractions. It's very good at it. These layers are typically built using a feature of object orientation called 'encapsulation' that allows us to hide the details of one layer in order to create a new layer with new functionality and a different interface. General purpose object oriented languages can be used for most anything (hence general purpose) and so of course the ability to create layers of abstractions would be important. General purpose languages are generic tools.

The people who know your business really aren't going to understand, or want to understand, how to develop abstractions. They won't care what a fragile base class is, what a singleton is, or why they need to instantiate a factory class to even begin querying their invoices. All they will care about is data and the calculations they can run on that data. Bridging the gap between those who know your business and the application itself is the key to the success of your application.

In most business applications there are really just two layers.

1. There is the domain (or model). This is the data you need to solve your problem and services to retrieve and interact with that data, and...

2. The services layer. This is the business logic that you develop on top of that domain. This represents the application-specific calculations on that data.

Building a business solution should not involve creating any new layers, just expanding your services layer.

And user interfaces don't count. If I receive an invoice, viewing that invoice involves at most picking up a pair of glasses and reading. Viewing data is not really a business function. If the data is there and fully exposed, taking a peek doesn't require any abstractions. Making the data look pretty is purely an exercise in knowing how to locate stuff and draw it, so I don't consider that a layer.

And then there are those who think that encapsulation is required in order to protect data. Protecting data for the sake of referential integrity and security, and against accidental change, is important, but it doesn't require encapsulation. Field level permissioning and model-level constraints are usually enough to protect your data, and these constraints should be controlled through business rules - logic written in the business domain, not database constraints that have no knowledge of the business. These business rules can 'watch' the transactions changing the data much like a supervisor. If they see you changing something in a way that doesn't make sense, or that you're not allowed to change, then they will be able to jump in and stop you.
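As a rough illustration of the 'supervisor' idea, here is a minimal Python sketch. Everything in it (the Invoice record, the two rules, the apply_change function) is hypothetical and invented for this example. It only shows one way that rules could inspect a proposed change to fully exposed data and veto it, without wrapping the data in an encapsulating object.

```python
from dataclasses import dataclass, replace


# Hypothetical business record: plain, fully exposed data with no hidden state.
@dataclass(frozen=True)
class Invoice:
    number: str
    amount: float
    status: str  # "draft", "issued" or "paid"


class RuleViolation(Exception):
    """Raised when a business rule vetoes a proposed change."""


# Each rule sees who is making the change plus the before and after values,
# and returns a complaint (or None if it has no objection).
def amount_must_be_positive(user, before, after):
    if after.amount <= 0:
        return "invoice amount must be positive"
    return None


def only_managers_change_paid_invoices(user, before, after):
    if before.status == "paid" and user != "manager":
        return "only a manager may change a paid invoice"
    return None


RULES = [amount_must_be_positive, only_managers_change_paid_invoices]


def apply_change(user, invoice, **changes):
    """Propose a change; the rules act as the supervisor and may stop it."""
    proposed = replace(invoice, **changes)
    for rule in RULES:
        complaint = rule(user, invoice, proposed)
        if complaint:
            raise RuleViolation(complaint)
    return proposed
```

The data stays readable by anyone; only the act of changing it passes through the rules, which are written in business terms rather than database terms.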

So I repeat, "building a business solution should not involve creating any new layers". 

This means that, for developing business logic, creating abstractions is not required. This means that encapsulation is not important. And this means that data hiding is not only not important, it's actually detrimental to your application.

Many of the features of OOP languages are just not needed in business applications once the domain model is in place. They are actually overkill for business applications. They give you too much power and in many ways encourage you to make data less accessible. So if someone argues you must use an object oriented language to develop business logic, there are plenty of reasons to disagree with them.

But if you're not going to use OOP languages for your business logic, what choice do you have? The real problem is the lack of business oriented languages, or at least the lack of modern ones. There were some, but they were developed in the 70s and do little to make business application development any easier.

I hope we will see the emergence of some new languages, domain specific languages, that focus on data and not on constructing abstractions. And no, COBOL is not the answer.

Spreadsheets are NOT Business Applications, Stored Procedures are NOT Business Logic

Or at least they really shouldn't be.

Spreadsheets are all about sheets, rows and columns. Every piece of your spreadsheet logic is going to be engrained in this model of cell references - at best by name, but mostly by row and column. Rows and columns are the domain model for spreadsheets.

I discussed in my last post how important it is to be programming in the right domain model, or your code will be substantially more complex and more difficult to maintain. I call programming in the wrong model 'mixing domains'.

The domain model for business applications must be specific to the field of the business. For a loan business, the business application should be talking about loans, customers, payments, and drawdowns. These entities make up the domain model for your business application.
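To make that concrete, here is a minimal Python sketch of what talking in loan terms might look like. The class names, the fields and the outstanding-balance calculation are all assumptions made up for this example. Interest is ignored, and the facility limit is assumed to cap the total amount ever drawn, purely to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Customer:
    name: str


@dataclass
class Loan:
    customer: Customer
    facility_limit: float  # assumed cap on the total amount drawn
    drawdowns: List[float] = field(default_factory=list)
    payments: List[float] = field(default_factory=list)

    def draw_down(self, amount: float) -> None:
        """Record a drawdown, refusing it if it would exceed the limit."""
        if self.drawn() + amount > self.facility_limit:
            raise ValueError("drawdown would exceed the facility limit")
        self.drawdowns.append(amount)

    def pay(self, amount: float) -> None:
        """Record a payment made by the customer."""
        self.payments.append(amount)

    def drawn(self) -> float:
        return sum(self.drawdowns)

    def outstanding(self) -> float:
        """Amount drawn so far minus payments received (no interest)."""
        return self.drawn() - sum(self.payments)
```

A business specialist can read loan.outstanding() and check it against the requirement. The spreadsheet equivalent would be a formula over cell ranges, which says nothing about loans at all.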

If you have business applications created as spreadsheets, then you're losing out on a lot. Firstly, I'm sure you will have noticed how unmaintainable they soon become. Secondly, you're creating a gap between the business specialists who understand the business and the convoluted spreadsheet logic that only technical people can understand. This gap is expensive for a business, as it directly impacts the total cost of ownership (TCO) and the speed at which business solutions are developed and extended.

The better solution is to develop the business application in domain-centric platforms, such as using a workflow product (or domain specific language), or constructing a domain in a general purpose language such as C# or VB. Unfortunately there are not many business application development platforms that are quite as 'RAD' as Excel may feel (although this feeling is somewhat deceptive given the mess you end up creating).

The same can also be said of Stored Procedures. The domain model here is tables, rows and fields -- database management. SQL is a language that was created to describe relational queries between tables; it was not meant for creating business logic. Doing so, again, is mixing domains and just ends up in more complex and unmaintainable code.

This really backs my argument that what we need are more domain specific languages: programming languages that can be developed or reviewed by business experts and that more closely resemble the actual requirements for the project.

Why Domain-Specific Is Important

A domain model is the language, behavior and tools we use to solve problems in a particular field.

An application consists of many different levels of domain models.

At the top level, for example, the domain model might be a business model - consisting of customers, invoices and a general ledger. At the next level the domain model might be a database, consisting of records and tables and queries. The next domain model might be file access consisting of device control, files, and buffers.

In an ideal world we would program each of these levels with a language and IDE that is specific, or at least optimized, for that domain. The next best solution is to have a library specific to the domain that we can use from a general purpose language.

In reality, however, that isn't always the case. In some cases we end up using the wrong domain model for a solution. For example we might program a business application in terms of rows and tables. Or we might not use a database and instead write the data straight to files.

This ends up with unnecessarily complex and unreliable code: The further away our code is from the appropriate domain model, the more complex the code is. Complex code means more testing, more maintenance and impedes our ability to enhance and extend the code. Applications suffer because of complexity.

On the other hand, code that is very close to the domain model is easier to read, especially by those familiar with the domain. It is usually more succinct, and is less prone to errors because of its clarity. You're also more likely to get it right the first time because the language of the requirements, if written, will be far easier to translate into code. For example, if the requirement is to collect a list of all the files on your hard drive, the simple DOS (incidentally a domain specific language for managing files) command "dir c:\ /s" can't really go far wrong.
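The same contrast can be sketched in a general purpose language. The Python below is a made-up example, not code from a real system: the requirement is "total the unpaid invoices for a customer", written once in terms of rows and column positions and once in terms of a hypothetical Invoice type.

```python
from dataclasses import dataclass


# Version 1: business logic written in the database domain.
# The reader has to know that column 1 is the customer, column 2 the
# amount and column 3 a paid flag (0 = unpaid).
def total_owed_rows(rows, customer):
    total = 0.0
    for row in rows:
        if row[1] == customer and row[3] == 0:
            total += row[2]
    return total


# Version 2: the same requirement written in the business domain.
@dataclass
class Invoice:
    number: str
    customer: str
    amount: float
    paid: bool


def total_owed(invoices, customer):
    return sum(
        invoice.amount
        for invoice in invoices
        if invoice.customer == customer and not invoice.paid
    )
```

Both functions compute the same number, but only the second can be read back against the requirement by someone who knows invoices and has never seen the table layout.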

So when you are developing an application, you should look at the domain of the problem you are trying to solve and make sure you are writing code specifically to that domain.

First see if there are any domain specific languages that could be leveraged to code it up. For example, if the domain relates to database management, look at SQL or perhaps XPath/XQuery. If the domain is text manipulation, look at RegEx or Perl. If the domain relates to GUI elements, look at HTML or WPF. Or, for business workflow, you could use a flow-chart workflow designer.

Failing a DSL, look for a domain specific library that you can use from your general purpose language. This will consist of a number of functions or classes that represent the different abstractions in the domain. For example a class for a customer record, or WPF for creating GUIs, or WWF for creating workflows. The goal is to have your code written in terms familiar to the domain appropriate to the problem you are solving.

Failing a domain specific library, you can proceed to code up the domain model yourself. General purpose languages like C# or Java are pretty good at creating abstractions, and this can be used to create domain models. Heck there's even a design pattern for creating a domain model using an object oriented GPL.  If you're particularly ambitious you could even try creating your own domain specific language using something like JetBrains' MPS.

What's important is that you're coding in a language that is appropriate for the domain, otherwise you'll likely end up with some messy, unmaintainable, unreadable and unreliable code.

Visual Studio IDE Tip: Switching Files With the Keyboard

Here's a tip for anyone who ends up with so many windows open in the Visual Studio IDE that they struggle each time they have to find a particular tab: you can easily find the one you need without even touching the mouse.

Just press Alt+W, W, type the first few characters of the name of the file you need, and press Enter.

I find this considerably faster than switching from the keyboard to the mouse and then scrolling through the tabs to find the one I need. This works in both VS2003 and VS2005.

Another alternative only available in VS2005 is to press Ctrl+Tab, keep Ctrl down, and click the window you need with the mouse - which is a little faster than Ctrl+Tabbing through each window, but does involve the mouse.

mscorsvw.exe and 100% CPU

If like me you've ever had mscorsvw.exe eating 100% (or a lot) of your CPU for hours or days on end, there are two options for getting rid of it.

One involves running ngen synchronously, at a higher priority one would suppose. To do this you go to the Windows\Microsoft.Net\Framework\[dotNetVersion] folder and run "ngen executeQueuedItems". For some people, including myself, this doesn't work and comes back with an error.

An alternative is to simply disable the CLR Optimization Service that is responsible for running mscorsvw.exe. The CLR Optimization Service is new in .Net 2.0. It ngens assemblies (generates native binary versions of your .Net code) in the background, rather than only on demand or just-in-time (as it was in 1.1). By disabling the service you're forcing it back to the old 1.1 behavior.

To disable the service, go to Control Panel / Administrative Tools / Services... and choose .Net Runtime Optimization Service, then right-click and select Stop. To stop it permanently, right-click, go to Properties, and under Startup Type select Disabled.

For more information on what the .Net Runtime Optimization Service does, please read this post:
http://blogs.msdn.com/davidnotario/archive/2005/04/27/412838.aspx

How Bill Gates Works

Not particularly technical but this is an interesting article that lets us peek into a day in the life of Bill Gates, written by the man himself:

http://biz.yahoo.com/hftn/060407/033006_gates_howiwork_fortune.html?.v=1

I once read that Mr Gates gets 4 million emails a day, yet he claims he gets 100 emails (after filtering) and that his assistant summarizes the rest. That's still quite a lot of reading. But it's interesting to know that there's a slight possibility that the richest man in the world may get your message if you send him something.

Also an interesting part:

"Outlook also has a little notification box that comes up in the lower right whenever a new e-mail comes in. We call it the toast. I'm very disciplined about ignoring that unless I see that it's a high-priority topic."

Glad your company's software is working for you.

The Case for Unit Testing

Leaving aside whether you're for, against or just don't get test driven development, it's worth noting that unit testing will always be necessary to write reliable code. It may seem obvious to some, but I'd like to point out the reasons why unit testing is required even for those who diss or dismiss test driven development.

For one, programmers like to re-use code. It doesn't matter whether that code is domain-specific or low level, re-use happens. It clearly helps productivity. However every time you share code with another module or class or method, you're opening your code up to changes made to that shared code: The code can certainly change behind your back.

When you invoke that shared code, you're just passing in parameters, you're not specifying what you expect to happen. The method name might give some hint, but other than that there's really no indication of what you're intending that method to do. The method might be called CalculateCost but for all you know it's planting daisies and you won't see any compilation errors. It's free to do whatever it likes, and that makes your code less reliable. High level, low-level, structured or spaghetti, any code that utilizes re-use is brittle.

Like it or not you're also going to be adding bits to code as time goes on, and anything you change has the potential to inadvertently break existing dependencies or existing behavior.

Not to mention that people make mistakes. Code may change accidentally, and I don't mean re-used code but the primary code itself. With an infant in the house I sometimes find random bursts of keystrokes in my code. Luckily this usually breaks compilation, but what if she just inserted a 0 inside a literal number? Or a semicolon in the wrong place? Source control helps here, but unless it's highly guarded and nothing can be checked in without review, you're bound to change code accidentally at some point.

So what can we do about it?

One solution is to never change the code. Once you know it works 100%, it gets set in stone. The code isn't allowed to change and neither are any of its dependencies. A bit like compiling to an executable and throwing away the source code.

Unfortunately this solution just doesn't work in the real world where requirements change and upgrades happen. At some point you will just have to change that code.

Another option is not to re-use any code, to write everything very clearly and have tons of peer reviews before code is checked in, and to pray every night. This also isn't realistic or infallible.

There really aren't any solutions other than unit testing.

Unit testing involves creating a set of independent executable tests that validate the behavior of every piece of code you write. The test becomes a kind of contract for the code's behavior. If you inadvertently or intentionally change some shared code and break existing behavior, well you'll know about it because - fingers crossed - the tests will fail. The tests can be written before or after the code is created, whichever works for the development process you adopt.
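Here is a minimal sketch of that contract idea using Python's built-in unittest module. The calculate_cost function and its discount parameter are hypothetical, standing in for any piece of shared code that other modules rely on.

```python
import unittest


# Hypothetical shared code that other modules call.
def calculate_cost(unit_price, quantity, discount_rate=0.0):
    """Return the cost of an order line after a fractional discount."""
    return unit_price * quantity * (1.0 - discount_rate)


class CalculateCostContract(unittest.TestCase):
    """States what callers expect calculate_cost to do."""

    def test_cost_is_price_times_quantity(self):
        self.assertAlmostEqual(calculate_cost(2.50, 4), 10.0)

    def test_discount_reduces_cost(self):
        self.assertAlmostEqual(calculate_cost(10.0, 3, discount_rate=0.1), 27.0)

    def test_zero_quantity_costs_nothing(self):
        self.assertEqual(calculate_cost(9.99, 0), 0.0)
```

Saved in a file, the tests can be run with "python -m unittest". If someone later changes calculate_cost so that it starts planting daisies instead, the compiler won't complain but these tests will.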

Maybe one day we'll find another solution that involves less work. Until then you should make friends with NUnit or MbUnit (or the unit test features of VS2005). It's really the only way to write reliable code.

What are Domain Specific Languages?

At some point in the past you may have heard of the term 'Domain Specific Language' or DSL. More so since Microsoft released a DSL toolkit that plugs into Visual Studio. But I'm sure many of you are left wondering just what a Domain Specific Language actually is... 

Please read my article to learn more about DSLs.