December 2004 - Posts

data layers, soa, and tootb

There is a very interesting article over at TheServerSide today by Rocky Lhotka entitled “The Fallacy of the Data Layer”. IMO, there aren't too many people who really “get it” more than Rocky.

Now, before you link over there, read the article, and start getting all worked up... I've followed enough of Rocky's blogs, articles and comments to go out on a limb and assume that he is not suggesting we go into work tomorrow morning and start replacing our ADO.NET data access code with web services. Instead, I think the article challenges us to keep questioning exactly what role web services play (if any) inside an application.* By thinking a little outside the box about data sources, and their similarity to other external interfaces, I think Rocky helps focus attention back on the fact that the application itself should indeed be the center of the architecture.

This is the very basic point that seems to get lost in most other articles that mention SOA. When it gets lost in an article, that's one thing, but when the point is missed in real-world architecture, it can be painfully expensive. I am speaking from firsthand experience.

By thinking of the Data Layer as we would any other external interface, we better align it with other similar responsibilities: things that give data, receive data, and otherwise interact with our application (i.e. event actions). But we also do one more thing that I think is really the kicker here: we eliminate one more reason why people feel overly compelled to use web services inside an application.

Because accessing a data source has additional security concerns, it is often the case that not all physical tiers of an application can physically connect to a given database. When this happens, and we are not looking at our architecture as Rocky suggests in the article, we often end up with an application architecture that gets pretty messy. We end up with an application that eventually gets split into two (if not more) pieces by a web service. We have part of the application (the part that can't see the database) on one side of the web service, and the other part (the part that can see the database) on the inside of the web service, and we somehow think that these two applications are really one. Business logic begins to get duplicated in both halves of the application. It becomes very tricky to determine which half is responsible for what. We often find that we need some behaviors from one side of the application on the other side. We duplicate security, we duplicate validation, up to the point that someday we realize that what we have now is two applications instead of one. Not only do we have two applications, but they are two very similar applications, often with copies of the same business code in both. The only real difference is that at the edge of one we interact with users and at the edge of the other we interact with a database. Both interact with one another. And often this split of the application with a web service is initially done because of data layer concerns. This is seen and heard in many architecture discussions today and is often described as using a web service between the front end and back end of an application.

You might think, what's the big deal in describing the split in the application as above versus saying that data access is a service; “don't I still have this problem of two applications divided by a service?” The answer is yes and no. In the previous description, we attempted to rationalize a single application split in two, but now we have two clearly defined application responsibilities, and therefore two applications. The application that is the Database service has clearly defined duties. As an application, its externally exposed service is simply sending and receiving data. Internally, this data is simply persisted to a database. Any business logic it performs is clearly confined to the responsibility of retrieving and persisting this data (i.e. data type, length, etc.), and truly only from the perspective of data persistence. The other application (which is our actual business application) is the single domain in which all business logic takes place. This includes business rules on data validation, code to perform special calculations, applying user settings, all role authorizations, etc. The stuff that makes up the core of what the application is about. We no longer have all of this duplicated between both (halves) of the application. Why? Because we now have two applications, not one application with two halves.
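To make that division of duties a little more concrete, here is a minimal sketch in Python. Nothing in it comes from Rocky's article; the class names (`DataService`, `PayrollApplication`), the 50-character limit, the "manager" role and the 20 percent cap are all made up for illustration, and a dictionary stands in for the database.

```python
class DataService:
    """Stands in for the data service: it only sends, receives, and persists data."""

    MAX_NAME_LENGTH = 50  # hypothetical column length

    def __init__(self):
        self._rows = {}  # in-memory stand-in for a database table

    def save_employee(self, employee_id, name, salary):
        # Persistence-level checks only: data type and length.
        if not isinstance(employee_id, int):
            raise TypeError("employee_id must be an int")
        if not isinstance(name, str) or len(name) > self.MAX_NAME_LENGTH:
            raise ValueError(f"name must be a string of at most {self.MAX_NAME_LENGTH} characters")
        self._rows[employee_id] = {"name": name, "salary": float(salary)}

    def load_employee(self, employee_id):
        # Return a copy so callers cannot mutate the stored row.
        return dict(self._rows[employee_id])


class PayrollApplication:
    """The business application: the single home for business rules and authorization."""

    def __init__(self, data_service):
        self._data = data_service

    def give_raise(self, user_role, employee_id, percent):
        # Role authorization lives here, not in the data service.
        if user_role != "manager":
            raise PermissionError("only managers may change salaries")
        # Business rule (hypothetical): raises are capped at 20 percent.
        if not 0 < percent <= 20:
            raise ValueError("raise must be between 0 and 20 percent")
        row = self._data.load_employee(employee_id)
        new_salary = round(row["salary"] * (1 + percent / 100), 2)
        self._data.save_employee(employee_id, row["name"], new_salary)
        return new_salary
```

The data service never learns what a "manager" or a "raise" is, and the business application never learns how rows are stored, so there is nothing to duplicate on either side.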

*By “inside an application”, I mean between the layers of an application.

Partial Class usage

My favorite use of partial classes so far...business object data access code.

In .NET 1.x, I often coded my data access layer directly within my business objects. While I was never 100% comfortable with this, it often outweighed my other choices when architecting a project. In my mind, part of a business object's behavior is persisting itself. However, there is always the fear of tightly coupling the BO to a given persistence medium, and then needing to switch persistence mediums. I often find this to be more of a good layering question in general, and think the number of times a business object's persistence medium is actually changed is pretty negligible. Keeping the BO and Data Layer as separate classes (and/or assemblies) seems like a better answer to most people who think about these things. However, the problem comes down to how the data is "squirted" into the BO. I say squirted, because if possible I would like the data to come right off a data reader onto my private member variables. If the DataLayer is a separate class (or assembly), then good OOP says that it shouldn't know about my BO's private parts. And if my data layer is just going to return a data reader, then there is really little need for it to be a separate class or assembly. I would rather see the reader explicitly opened and closed within the scope of a single method call and not handed off by a separate class. Also, if the data layer has to stuff the data into another container (say a separate structure, or <cringe>data set</cringe>), then I've just made a copy of my data and perhaps instantiated a few objects along the way. I only rambled on here because I often get negative feedback when I even comment on why I like my data access code to exist in my BO. But like I said at the beginning, I too have never felt 100% comfortable doing this (despite being convinced of its merits).

Well, partial classes to the rescue. I am now 100% * convinced, thanks to partial classes. Now I can create a business object assembly with a folder called BusinessObjects and a folder called DataLayer. Inside each folder I can create part of a partial class called Employee. The part of the Employee class in the BusinessObjects folder does not need to know anything about my persistence mechanism. It does not have a single System.Data.SqlClient.xxx statement. My other part of Employee in the DataLayer folder provides the Employee CRUD directly against the Employee's private parts (member variables)! This really is the best of both worlds. The data access code is now layered into separate .vb files, and if the rug gets pulled out and we need to switch from the “S” db to the “O” db, we do not impact any code files other than the CRUD-specific files written in the DataLayer folder. At the same time, we haven't exposed the Employee's private member variables to any other classes, and we can use the fastest means of getting this data into the BO, which is streaming right off of the reader without any additional copies. I like it.
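For readers who want to see the shape of this, here is a rough sketch. It is written in Python rather than VB, and Python has no partial classes, so I am swapping in the closest analogue I know of: two mixin classes (which would live in separate files, one per folder) combined into the single `Employee` class. The table layout, the method names and the use of SQLite are all assumptions made for the example, and note that Python's underscore-prefixed fields are private by convention only, so the sketch does not reproduce the access protection a real partial class gives you.

```python
import sqlite3


class EmployeeBusiness:
    """The 'BusinessObjects' part: behavior only, no knowledge of the database."""

    def __init__(self, employee_id=None, name="", salary=0.0):
        self._id = employee_id
        self._name = name
        self._salary = salary

    @property
    def name(self):
        return self._name

    @property
    def salary(self):
        return self._salary

    def give_raise(self, percent):
        self._salary = round(self._salary * (1 + percent / 100), 2)


class EmployeePersistence:
    """The 'DataLayer' part: CRUD written directly against the private fields."""

    @classmethod
    def fetch(cls, conn, employee_id):
        row = conn.execute(
            "SELECT id, name, salary FROM employee WHERE id = ?", (employee_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no employee with id {employee_id}")
        obj = cls()
        # Straight off the cursor onto the fields, no intermediate container.
        obj._id, obj._name, obj._salary = row
        return obj

    def save(self, conn):
        if self._id is None:
            cur = conn.execute(
                "INSERT INTO employee (name, salary) VALUES (?, ?)",
                (self._name, self._salary),
            )
            self._id = cur.lastrowid
        else:
            conn.execute(
                "UPDATE employee SET name = ?, salary = ? WHERE id = ?",
                (self._name, self._salary, self._id),
            )
        conn.commit()


class Employee(EmployeeBusiness, EmployeePersistence):
    """The single class callers see, assembled from the two parts."""
```

If the database changes, only `EmployeePersistence` should need to be rewritten; `EmployeeBusiness` contains no SQL and never touches a connection.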

*100% convinced = In my ever evolving quest for design zen, over the next 3 or 4 minutes, for the purpose of an academic exercise, this seems pretty cool.

Disclaimer – I am not actually suggesting that it is a best design practice for a BO to contain actual SQL statements and/or connection management; it all depends on what is being built and the context of the larger architecture. I am, however, suggesting that I prefer a BO to own its behaviors, and part of that behavior is generally persistence. The actual persistence code in the BO (and in the partial class for the purpose of this blog) might simply be responsible for calling upon the respective DAL, Data Adapter Layer, etc. I just much prefer:

            myBO.Save

over:

            myBOFactory.Save(myBO)
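A small sketch of what that delegation might look like, again in Python and again with made-up names (`InMemoryEmployeeDal`, `upsert`, `rename`). The BO exposes `save()` as its own behavior, but the storage details sit behind a DAL object handed to it:

```python
class InMemoryEmployeeDal:
    """Hypothetical DAL; a dict stands in for the real store."""

    def __init__(self):
        self.rows = {}

    def upsert(self, employee_id, name):
        self.rows[employee_id] = name


class Employee:
    def __init__(self, employee_id, name, dal):
        self._id = employee_id
        self._name = name
        self._dal = dal

    def rename(self, new_name):
        # Business rule (hypothetical): a name may not be blank.
        if not new_name.strip():
            raise ValueError("name must not be blank")
        self._name = new_name.strip()

    def save(self):
        # The BO owns the behavior; the DAL owns the storage details.
        self._dal.upsert(self._id, self._name)


dal = InMemoryEmployeeDal()
employee = Employee(7, "Ada", dal)
employee.rename(" Grace ")
employee.save()  # the caller never needs to know a DAL exists
```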

MS Across America

I attended the latest Microsoft Across America MSDN event to roll into my town yesterday; it was part of the fall program. Our presenter was Joe Stagner, and he did an excellent job. I really like the fact that he wasn't afraid to open up some code, then modify and compile it as he walked through some examples. He was an excellent presenter and a sharp person to ask a few questions - thanks Joe!

As part of the fall program we touched on:

  • some basics of OOP with VB.NET
  • some MapPoint Web Service usage examples (pretty cool stuff)
  • some ASP.NET 1.1 optimization tips, tricks and techniques
  • some more demos of the 2.0 membership and personalization

Oh yeah, and I got another shrink-wrapped shirt and an action-packed DVD (and the 2005 beta refresh installed just fine too!)