September 2006 - Posts
As a developer I never know if what I say is valid. With technology constantly changing, a hard and fast rule one day might soon be obsolete. With that in mind, I'm always careful about how much I claim to know something. As soon as I say "I'm good at x", there's someone who can point out something I'm not doing correctly, thereby voiding my "goodness".
For me personally, this means I'm never too sure of myself in large groups of developers. With diverse backgrounds and training, you never know if the idea you offer will be shot down quickly as something you should've known. For example, to this day I still see a lot of people who think the following is acceptable:
"SELECT FirstName, LastName FROM Users WHERE UserID = " + Request.QueryString["ID"]
(HINT: THINK SQL INJECTION)
If you've been developing that way for years without ever coming across that vulnerability, you might offer that solution the next time you needed to write a query. However, if you met a developer who knew better, your current frame of reference would be quickly shattered when shown how bad the above query is.
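To make the hint concrete, here is a minimal sketch of the difference between concatenating input into the query and passing it as a parameter. It uses Python's built-in sqlite3 module rather than the ASP.NET code above, and the table contents and function names are made up for illustration.

```python
import sqlite3

# In-memory database standing in for the real Users table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (UserID INTEGER, FirstName TEXT, LastName TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [(1, "Ada", "Lovelace"), (2, "Alan", "Turing")],
)


def find_user_unsafe(user_id):
    # Concatenation: the input becomes part of the SQL text itself.
    sql = "SELECT FirstName, LastName FROM Users WHERE UserID = " + user_id
    return conn.execute(sql).fetchall()


def find_user_safe(user_id):
    # Placeholder: the driver passes the input as data, never as SQL.
    sql = "SELECT FirstName, LastName FROM Users WHERE UserID = ?"
    return conn.execute(sql, (user_id,)).fetchall()


malicious = "1 OR 1=1"                      # what an attacker might put in ?ID=
unsafe_rows = find_user_unsafe(malicious)   # the OR clause matches every row
safe_rows = find_user_safe(malicious)       # treated as one odd value, matches nothing
```

The same idea applies in ADO.NET: a parameterized command keeps the query text fixed and sends the user's value separately.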
Yesterday a developer I really admire, who is very popular in the .NET community, took an idea I submitted to a newsgroup and commented on how much he liked it. Boy, that made me feel good. Someone far superior in skill and knowledge was acknowledging a good idea, from me no less. I was smiling for a while.
I ran into an interesting gotcha yesterday. To summarize, when you change a table in SQL Server 2000, if a view is looking at that table, you need to update the view.
Here is an excerpt from SQL Server Books Online:
If a view depends on a table (or view) that was dropped, SQL Server produces an error message if anyone tries to use the view. If a new table (or view) is created, and the table structure does not change from the previous base table, to replace the one dropped, the view again becomes usable. If the new table (or view) structure changes, then the view must be dropped and recreated.
What this means in practice: suppose you have a view that reads:
CREATE VIEW dbo.SampleView
AS
SELECT *
FROM dbo.Users
If you then add a column to the Users table, that column will not be shown when you query SampleView. You must first drop the view and then recreate it.
Hope this tip helps someone out there.
Yesterday in part one I discussed the honors project I did as a senior in college. I left you with the statement that I had recently had the opportunity to use that "only academic" knowledge in the real world.
The company I work for builds a content management system. I was recently asked to build a way to import content. Now that's simple enough, right? Right. What's the difficulty, and how does this fit in with Part 2 of Lexical Scanning and Parsing, you ask? The difficulty lies in being able to tell the CMS what the content is about. Because our company believes in the idea of the World Wide Web, we've built a way to interrelate pages based on what's in them. If you have a page about a certain topic, then we think you might want to know about other articles that relate to the one you're looking at. It's very similar to Amazon's 'Customers who searched for "care bears" also expressed interest in: ' feature. When I'm looking at buying the smash 1986 hit "Care Bears Movie II: New Generation" I want to know what other people like me also liked.
Yes, the difficulty: how can you tell what a given blob of content is about? You could read it and manually tell the system. But when the pages start numbering past the tens (try over 4,000), it becomes a tedious, not-so-fun task. So I set out to write a language by which we could figure out what to tell the system about the content blob.
First, we take a set of keywords about the content blob (luckily for me they're provided). Currently, though, I'm reading a dissertation by some smart guy in Tokyo with a crazy algorithm for figuring the keywords out automatically. That's for version dos. Back to version one. I take these keywords and scan them using regular expressions. I set up an expression and, if it evaluates to true, I tell our CMS what the article is about.
In plain English, I want to do the following:
If this article contains the words "care bears" then this article is about the Care Bears.
Using my knowledge of lexical scanning and parsing I set up a pseudo language that really just evaluates boolean expressions. It has support for AND, OR, XOR, NOT, and grouping parentheses. Precedence of the operators is also considered, just like in C#. Here is the Backus-Naur Form for how things are evaluated:
<b-expression> ::= <b-term> [<orop> <b-term>]*
<b-term> ::= <not-factor> [<andop> <not-factor>]*
<not-factor> ::= [<notop>] <b-factor>
<b-factor> ::= $(<b-expression>$) | <regular-expression>
<orop> ::= $| | $^
<andop> ::= $&
<notop> ::= $!
As you can hopefully see from the BNF, the operators and grouping symbols are defined as follows:
OR - $|
AND - $&
XOR - $^
NOT - $!
Begin Group - $(
End Group - $)
The reason the operators are so cryptic is that they separate regular expressions, so I needed combinations of characters that wouldn't be recognized as part of a regular expression. I couldn't use '|' as OR because | is alternation in regular expressions. '^' can represent the start of the line or character negation. And finally '(' and ')' are used for grouping. I added the '$' to the definition of each operator because $ represents the end of a line in a regular expression, so $[plus any character] should never appear in a single-line regular expression. If I've lost you I'm sorry, trust me on this.
Here is an example of a real life expression one of our guys wrote:
$(Cat00138 $& $!speech$) $| $(Cat00192 $& $!eyes$) $| KidsHealthCat20030 $| $($(ear $| nose $| nasal $| throat$) $& $!$(hearing $| speech $| piercing$)$)
Basically, it says:
If this article is in (Category00138 AND NOT speech) OR (Category00192 AND NOT eyes) OR Category20030 OR ((ear OR nose OR nasal OR throat) AND NOT (hearing OR speech OR piercing))
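For readers who want to see how a grammar like this turns into code, here is a minimal recursive-descent sketch in Python. It is not the original implementation: the class and function names are hypothetical, and it assumes each operand is matched with a case-insensitive search against the article's keyword text. Each method corresponds to one rule of the BNF above.

```python
import re

OPERATORS = {"$(", "$)", "$|", "$^", "$&", "$!"}


def tokenize(expression):
    # Split on the two-character operators, keeping them as tokens;
    # whatever sits between two operators is one regular expression.
    parts = re.split(r"(\$[()|^&!])", expression)
    return [p.strip() for p in parts if p.strip()]


class Evaluator:
    def __init__(self, expression, text):
        self.tokens = tokenize(expression)
        self.pos = 0
        self.text = text

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def b_expression(self):
        # <b-expression> ::= <b-term> [<orop> <b-term>]*
        value = self.b_term()
        while self.peek() in ("$|", "$^"):
            op = self.take()
            right = self.b_term()  # always parse the right side first
            value = (value or right) if op == "$|" else (value != right)
        return value

    def b_term(self):
        # <b-term> ::= <not-factor> [<andop> <not-factor>]*
        value = self.not_factor()
        while self.peek() == "$&":
            self.take()
            right = self.not_factor()
            value = value and right
        return value

    def not_factor(self):
        # <not-factor> ::= [<notop>] <b-factor>
        if self.peek() == "$!":
            self.take()
            return not self.b_factor()
        return self.b_factor()

    def b_factor(self):
        # <b-factor> ::= $(<b-expression>$) | <regular-expression>
        token = self.take()
        if token == "$(":
            value = self.b_expression()
            if self.take() != "$)":
                raise ValueError("expected $)")
            return value
        if token is None or token in OPERATORS:
            raise ValueError(f"expected a regular expression, got {token!r}")
        # Assumption: an operand is true when it matches anywhere in the text.
        return re.search(token, self.text, re.IGNORECASE) is not None


def evaluate(expression, text):
    parser = Evaluator(expression, text)
    result = parser.b_expression()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token {parser.peek()!r}")
    return result
```

Calling evaluate() with the expression above and an article's keywords returns True or False. Because AND sits one grammar level below OR and XOR, it binds tighter, which is where the operator precedence comes from.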
Though it's kind of complex, this system allows us flexibility like never before. We can set up any number of regular expressions and join them together with other regular expressions using boolean operators and grouping. If the whole expression evaluates to true for a given article, we know quite a bit about that article.
The other positive is this system is fast. I can scan all 4,000 documents, running about 100+ expressions on each in a few seconds.
If you've read this far you deserve a cookie. If you've actually understood, you deserve two. If you didn't follow, please let me know through the comments and I'll explain better. I want people to understand, this is some cool stuff, and it points out that "worthless academic" knowledge isn't always worthless and academic.
Sorry, not a tech post, that's later today. I just came from Bible
study. Now I don't really care what you believe when you read this.
You can be Christian, atheist, agnostic, Muslim, Hindu, or other. The
following is touching to say the least. To give you some quick
background. The study I'm involved in is talking about how to be a
strong man with quality character. Common "movie men" that come to
mind are Maximus (from "Gladiator"), William Wallace (from "Braveheart"),
and Hawkeye, a.k.a. Nathaniel Poe (from "Last of the Mohicans").
No one would question any of those men's strength and courage, but
their hearts were tender when needed. Here's another, real-life
example: Sullivan Ballou, a Civil War soldier who wrote a letter home
to his wife Sarah before the Battle of Bull Run. The following stirs
something in me, and I hope it does you too, whether married or not.
Here is his letter:
My very dear Sarah:
The indications are very strong that we will move in a few days -
perhaps tomorrow. Lest I should not be able to write you again, I feel
impelled to write a few lines that may fall under your eye when I shall
be no more.
Our movement may be one of a few days duration and full of pleasure
- or it may be one of severe conflict and death to me. Not my will, but
thine, O God, be done. If it is necessary that I should fall on the
battlefield for my country, I am ready. I have no misgivings about, or
lack of confidence in, the cause in which I am engaged, and my courage
does not halt or falter. I know how strongly American Civilization now
leans upon the triumph of the government, and how great a debt we owe
to those who went before us through the blood and suffering of the
Revolution. And I am willing - perfectly willing - to lay down all my
joys in this life to help maintain this government, and to pay that debt.
But, my dear wife, when I know that with my own joys I lay down
nearly all of yours, and replace them in this life with cares and
sorrows - when, after having eaten for long years the bitter fruit of
orphanage myself, I must offer it as their only sustenance to my dear
little children - is it weak or dishonorable, while the banner of my
purpose floats calmly and proudly in the breeze, that my unbounded
love for you, my darling wife and children, should struggle in fierce,
though useless, contest with my love of country?
I cannot describe to you my feelings on this calm summer night, when
two thousand men are sleeping around me, many of them enjoying the last,
perhaps, before that of death - and I, suspicious that Death is creeping
behind me with his fatal dart, am communing with God, my country, and
thee.
I have sought most closely and diligently, and often in my breast,
for a wrong motive in thus hazarding the happiness of those I loved, and
I could not find one. A pure love of my country and the principles I have
often advocated before the people and "the name of honor that I love more
than I fear death" have called upon me, and I have obeyed.
Sarah, my love for you is deathless, it seems to bind me to you with
mighty cables that nothing but Omnipotence could break; and yet my love of
Country comes over me like a strong wind and bears me irresistibly on,
with all these chains, to the battlefield.
The memories of the blissful moments I have spent with you come
creeping over me, and I feel most gratified to God and to you that I have
enjoyed them so long. And hard for me it is to give them up and burn to
ashes the hopes of future years when, God willing, we might still have
lived and loved together, and seen our sons grow up to honorable
manhood around us. I have, I know, but few and small claims upon Divine
Providence, but something whispers to me - perhaps it is the wafted
prayer of my little Edgar - that I shall return to my loved ones
unharmed. If I do not, my dear Sarah, never forget how much I love you,
and when my last breath escapes me on the battlefield, it will whisper
your name.
Forgive my many faults, and the many pains I have caused you. How
thoughtless and foolish I have often times been! How gladly would I
wash out with my tears every little spot upon your happiness, and
struggle with all the misfortune of this world, to shield you and my
children from harm. But I cannot. I must watch you from the spirit
land and hover near you, while you buffet the storms with your precious
little freight, and wait with sad patience till we meet to part no more.
But, O Sarah! If the dead can come back to this earth and flit
unseen around those they loved, I shall always be near you; in the
garish day and in the darkest night - amidst your happiest scenes and
gloomiest hours - always, always; and if there be a soft breeze upon your
cheek, it shall be my breath; or the cool air fans your throbbing temple,
it shall be my spirit passing by.
Sarah, do not mourn me dead; think I am gone and wait for thee, for
we shall meet again.
As for my little boys, they will grow as I have done, and never know
a father's love and care. Little Willie is too young to remember me long,
and my blue-eyed Edgar will keep my frolics with him among the
dimmest memories of his childhood. Sarah, I have unlimited confidence
in your maternal care and your development of their characters. Tell our
mothers I call God's blessing upon them.
O Sarah, I wait for you there! Come to me, and lead thither my
children.
- Sullivan
(From:
http://www.naciente.com/essay19.htm)
A few years back, for my honors project as a senior in college, I built my own computer language. It wasn't object oriented (that would've been cool) and looked like a cross between C and Pascal. I wrote everything in C++. Looking back, I can't remember what got me interested in compiler theory or made me think that building a compiler would be a good idea. Regardless, I defined a language and built a compiler for it. It was pretty slick. It could handle looping structures (for and while), conditionals (if, if/else), arithmetic, variables, boolean expressions, and functions, basically anything a "normal" language should handle. It was originally built to be used in entry-level programming classes at the college I attended. The computer science department had made the switch from Pascal to Java as the learning language a few years after I had passed through. The problem with Java was that it was somewhat complex for beginners. You have to import libraries at the beginning of each of your .java files. It forced you right into object-oriented thinking before you understood what a struct, conditional, or array was. Nothing against Java, but you're required to trust a lot of the inner workings before you understand them. The same can be said of many languages out there today. My language was supposed to ease all that by allowing a person to learn the basics of programming with a lightweight compiler.
In order to compile the source code into an executable, my lexical parser/scanner would go through the source code, and write Intel assembly to a file which was then processed to make an executable. There was an added benefit to the architecture that allowed the assembly to be viewed after the executable was generated. Upperclassmen, in their advanced courses, could type an expression in my language, and view the subsequent Intel assembly (much like MSIL). It was a great way to bridge the gap when learning assembly.
It was something I was very passionate about at the time. I spent hundreds of hours working on the darn thing. I remember summer nights during break when friends were going out to a movie and I had to pass because I had to figure out how to get the value returned from a function into a variable, all in assembly. When building it I never thought it would help me in the real world; it was simply an exercise in geekiness. Soon after I gave my presentation to the honors committee, all knowledge of the language I had written was relegated to the far reaches of my brain, where Ace of Base and other early '90s pop lyrics are stored, most likely never to serve me again.
Just this past month, I came across the second time that assumption was wrong.
(More in part 2 tomorrow)
I recently read that man (human race), in order to feel fulfillment in
their work, needs to be both "mind tired" and "body tired".
Achieving only one of the states of being tired will not ultimately be
as fulfilling as both. Working around the computer all day I get
"mind tired" all the time, but yet rarely get "body tired". I
think that's why I enjoy mowing my yard so much. It's
physical. It's tangible. After a half hour of work I can
easily see change/progress in my lawn. The same half hour spent
in code will not likely yield the same results.
Yesterday turned out to be a great day. I was home sick from work
Friday. Saturday I still wasn't feeling well which forced me to
stay around the house. As such, I spent the day cleaning.
The garage, the kitchen floor, the stove and stove top, the dishes, the
bathroom from top to bottom, and the bedroom. Come five o'clock I
was tired, and yet fulfilled. After a week of stretching my mind
it was nice to cap it off with some physical work in which I could see
progress.
Just caught this on Digg.
There's a guy, Luke Johnson, who is doing The Luke Johnson Phone Experiment. He's posted a video on YouTube with his phone number. In the video he asks anyone and everyone to call, to see if he can get his phone to ring off the hook 24/7.
At first when I saw this video I laughed, because I could imagine myself doing something like this, only instead of giving my number, I'd give a friend's number. In college we would always prank each other and try to one-up each other. This seems like it would be unbeatable in the prank realm.
After I saw the video I thought about not calling; I mean, why be involved in something like this? But I chose to, because this is why I love the internet. Some college kid with a cheap camera put a video up on a site and can reach the world (and the world can reach back). The free flow of information and the ease with which it spreads. How quickly it spreads. Incredible. I love it.
I called the number, 1-602-435-3694 for the experiment and actually spoke with Luke. We chatted for about 30 seconds and he told me I was caller number 1,063 or something like that in the two days since this started. As this catches on I'm sure he'll get inundated with calls.
I sure hope this guy has unlimited call me minutes and that he doesn't get dropped by his carrier.
I was never a Scripps National Spelling Bee champion, nor am
I a grammatical wizard. In fact in high school, and more acutely in
college, I did everything I could do to avoid English/literature classes.
I got through college taking one required English course, "Fiction to
Film", a course noting the differences between "classics" in
print and how they were presented on screen. I'm only now learning the
value of spelling and grammar. One of the reasons I started this blog was
to become a better writer (shameless plug: if you have suggestions and/or
constructive criticisms, please don't hesitate to leave comments. Comments can
be about anything from punctuation to content to article selection).
In my office, my manager likes to have various developers interview candidates
for open developer positions. She solicits feedback from several of us before
offering a position to any new developer. We have all applicants fill out
a questionnaire about various programming topics (.NET, OO, SQL, CSS). If
the applicant does well enough on the questionnaire, we have them come in for a
face-to-face interview.
Recently I reviewed a questionnaire of an applicant whose answers weren't that
bad, some were misguided, and some were right on. The only problem was
the sheer number of grammatical and spelling errors. 44 errors (if I
caught them all) in a few pages of questions. I'm not talking about poor
word choices or a missing comma here or there, I'm talking about sentences,
that when read, don't make sense. A sentence should be a complete
thought. I was immediately turned off. Is this the guy I want
writing documentation, emails to clients, or CODE??? In fact, on the cover
of his questionnaire, in bold black ink highlighted so many times that the ink
is smudged, it says "Grammatical errors all over the place".
Needless to say I reported back to my manager that I would pass on this
applicant.
Since I'm not the final say on things (thank God), the applicant was brought in
anyway. This was mainly due to others believing that despite the
grammatical errors (they saw them as well) that he answered the questions well
enough to warrant an interview. In the interview, he conducted himself
well. He was very nicely dressed. He was very articulate in his
answers. Some of the technical questions he struggled with, but that's
not the aim of this post. During the interview though, I was handed his
resume for the first time. Scanning through it I found some grammatical
errors and some misspellings. To be honest, I was still surprised.
The questionnaire is timed and I thought that he may have been rushed and in
being rushed made some grammatical/spelling errors, despite being in Word and
having spell/grammar checker. His resume though, theoretically the first
introduction of yourself to people, was laced with errors.
I don't think we'll be offering "Johnny eye cantt spel" a job.
Don't make the same mistake, especially on your resume, where unlimited time
for edits and rewrites can be had. In some ways grammar and spelling,
while not directly related to design patterns, C#, and optimization, say a
whole lot about someone and their attention to detail. If you're not
brushed up in this area, it could cost you a job; it did in this case.
Email addresses to me are like shoes. When I buy a pair of shoes I buy them to be my running shoes. Running shoes are the top of the pyramid for me. After the shoes have hit their life expectancy as running shoes they get replaced by a new pair of running shoes. The original pair is then demoted to my everyday shoes: shoes that I wear to work, to the grocery store, etc. After that they get demoted to the shoes I do yard work in. The last and final stage of a pair of shoes, what's left of them, is to be my disc golfing shoes, where they're likely to get wet in rivers and ponds.
I use email addresses in much the same way. I get a new email address and I protect it from the hordes of spam out there. I don't just give out the new email address willy-nilly. I try to protect it using an older email address as a buffer. When required to give an email, I provide my spam email address (some old Hotmail account) which I can access to answer confirmation emails if need be. Over time, though, I come across what I think is a trusted site where I will be receiving email notifications regularly, so I give them my primary email address. Eventually I get overloaded with spam, give up on that email address, and create a new account on some great new mail service. The once pristine email address I used to own gets demoted, just like my shoes, to the role of spam collector.
To help combat this, I've been pointed to this new service at SpamBox.us where a temporary email is set up. All emails to that account are forwarded to your real account which is hidden. I just signed up for del.icio.us using this technique. I got the confirmation email in my gmail account, but if any email is stored in a database, it'll be the cryptic cWP3rzqxIlLB8A3c@spambox.info that I signed up with. The spambox email is temporary, anywhere from 30 minutes to one year. After that, it's gone, along with any record to my real email address.
Do I trust SpamBox not to sell my email? Not entirely. They say they won't, but you never know. The way I see it, though, is that if they do in fact sell it, the spam will only be coming from one source rather than the 35-50 accounts I have opened around the internet.
I hope this helps you keep your inboxes free of spam.
I just got sent a bug that I need to fix. It was my fault. While fixing the bug, though, I had the chance to look at some old code I had written. I know learning is a continuing thing, but I was surprised to see some of the decisions I made way back when. One thing in particular that caught my eye was some custom collection classes I wrote.
The particular code dealt with a blog module in our product. Having two domain objects, blog and post, I had created a postcollection class which would store a collection of posts (go figure, eh?). I'd like to think I'm becoming better at separating concerns in my classes, in that I've become more aware of places where I should impart some separation. Looking at my old code, though, I found that I had built some static methods inside the postcollection class. For example, if you wanted to get back the list of posts for a blog you could say postcollection.Load(int weblogId) and you would be returned a postcollection.
If I were building the class today I would put that static method into the post class which would do the same thing but leave the implementation details out of the postcollection class. The thinking is, "Post" is the domain object, "PostCollection" is simply a "bucket" that holds multiple posts, it shouldn't do anything more than hold posts (separation of concerns).
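A rough sketch of that arrangement, in Python rather than the C# of the actual product. The names load_for_weblog and rows are hypothetical, and the list of dictionaries stands in for whatever the data layer really returns.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: int
    weblog_id: int
    title: str

    @classmethod
    def load_for_weblog(cls, weblog_id, rows):
        # The loading logic lives on the domain object.
        # 'rows' is a stand-in for a real database query (an assumption).
        posts = PostCollection()
        for row in rows:
            if row["weblog_id"] == weblog_id:
                posts.add(cls(row["post_id"], row["weblog_id"], row["title"]))
        return posts


class PostCollection:
    """A bucket of posts: it stores and iterates, nothing more."""

    def __init__(self):
        self._posts = []

    def add(self, post):
        self._posts.append(post)

    def __iter__(self):
        return iter(self._posts)

    def __len__(self):
        return len(self._posts)
```

With this split, the collection has no idea where posts come from, so it could be filled by a query, a test fixture, or an import without changing.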
If it were you, which way would you do it? Would you put the logic to get all the posts for a blog into the post class, the collection class, or elsewhere? Just curious what others think.
I got into an interesting discussion yesterday with some developers here about the value of the Microsoft developer certifications (i.e. MCSD, MCPD, MCAD). I know there is a mixed bag of feelings out there, so I'm asking: do you have a certification, and why or why not? If you're a hiring manager, do you look for applicants to have certifications? Do certified people "get things done" faster or better than those without?
I have toyed with the idea of working towards a certification. About two months ago I purchased a book to help study for the first test in the track. I'm about 300 pages into an 800 page book and I can categorize the stuff into three categories. 1) Stuff I already know 2) Stuff I kinda knew but knew where to look up 3) Stuff I couldn't care less about. I slowly trudge on though and try to learn it all because the Microsoft tests seem based on syntax rather than large ideas and implementation.
Further, I question the value of any certification when there are companies out there who have knock-off tests that mirror the real tests so closely that someone can take these "practice tests" and then go in and pass the real thing. In my situation I'm tempted to get one of these products to save the hours and hours I'm spending reading about whether it's DataAdapter.Fill(DataSet) or DataSet.Fill(DataAdapter). I suppose I could skip that section/chapter, but I know the tests are filled with simple little questions, like the one above, that in the real world IntelliSense solves for you.
I've been doing some reading and some quick mock-ups to see if I can grasp the idea behind the Model-View-Presenter design pattern. After a few hours of looking today, I'll be honest, I don't get it. Maybe it's just because I can't see how it applies to my specific need. The plus I do see is that you can more easily test presentational logic through NUnit, since that logic has been moved from the codebehind to a testable "object".
Is that it?
I think in most cases the view and presenter need to be extremely coupled. The presenter needs to know exactly what UI elements are on the page so that it can populate and retrieve user data from those UI elements. If that's the case, isn't a codebehind the perfect item to act as the "presenter"? I must be missing something, because I know the answer is a resounding NO. But the way I understand things now, you can't switch views out with any presenter because the presenter needs to know a lot about the view in order to get things to work properly. What am I missing? Everything I read is really trivial, talking about the benefits of TDD with MVP, but that's it.
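For what it's worth, here is the smallest sketch I can make of the testability argument, written in Python instead of C#. Every name in it is hypothetical. The presenter only talks to three methods on the view, so a fake view with no UI at all can stand in for the page.

```python
class GreetingPresenter:
    """Presentation logic that would otherwise sit in the codebehind."""

    def __init__(self, view):
        # The view can be any object with get_user_name(), show_message()
        # and show_error(); the presenter never touches UI controls.
        self.view = view

    def submit(self):
        name = self.view.get_user_name().strip()
        if name:
            self.view.show_message(f"Hello, {name}!")
        else:
            self.view.show_error("Please enter a name.")


class FakeView:
    """Test double: records what the presenter asked the view to display."""

    def __init__(self, user_name):
        self.user_name = user_name
        self.calls = []

    def get_user_name(self):
        return self.user_name

    def show_message(self, text):
        self.calls.append(("message", text))

    def show_error(self, text):
        self.calls.append(("error", text))
```

In this sketch the real page would implement the same three methods by reading and writing its own controls, so the coupling is to that small contract rather than to specific UI elements. Whether that is worth the extra class is exactly the question above.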
As humbling as it may seem, I created this problem, and it was actually more straightforward than I had originally thought. As an enhancement to the last version of the import, I added an XML status file which simply acted like a log file. Where did I put this file? In the bin directory. After some digging I saw that the app was unloading quite often, even when an import wasn't occurring. While debugging, I found that when I deleted the status file (in essence resetting the system) the application would unload: the .NET worker process keeps tabs on the bin directory, and any change to that directory unloads the application. So every update to this file during the import was causing the app to restart, and the excessive restarts were making the import take longer. This would also explain the somewhat sporadic behavior of my errors. If the timing of it all worked out just right, things would import OK, albeit slower than they should (which I only know now because I see the speed increase). However, there would be certain points where the application would quit. Each time it was in a different location and I couldn't pin it down. My thinking at this point is that it couldn't necessarily be pinned down, since it was an internal timing thing.
I've written a fairly slick class library, used from a web app, that imports content from a third-party source into our application. The source provides two types of feeds, full and incremental. The only difference between the feeds is the file size. Structurally they are the same.
I'm attempting to test the process, from downloading the feed to extracting and then importing, but along the way I get a ThreadAbortException thrown when I try to do an import of the full feed (roughly 100MB).
Any ideas where this is coming from? Resource utilization? If I knew where it was coming from I could try to clean it up and work around it.
After much thought, I've ditched the provider model design pattern in a program I'm working on. I thought the provider model was going to be perfect for this situation. It still might be. I made the decision to drop it, though, based on the time it takes to implement versus simply coding it as a one-off.
Why I chose the provider model in the first place:
I'm writing a "module" that imports content from a third party into our content management system. There are certain things that need to be done to a file once I have it in my hands before the import. Many of the things that happen after I have my hands on the content, though, are very standard. I thought the provider model would be perfect since we could theoretically be pulling in content from many different sources. *Where* the information came from was irrelevant to the general page import. I wanted to be able to say, "Wherever I'm getting my content (the provider), retrieve me a collection of pages that I need to import". The idea was that whether I was using an XML file, a flat file, or an Access database shouldn't matter to the importer.
Why I felt this decision was a bad one:
I was able to get the content into the form I wanted easily enough, but then I came to a point where I needed to do some translating on a file: updating HTML image links, for example. Each translation would be different based on what the feed gave me. In order to code with the provider model I would've had to abstract an interface called ITransformation or something like that, and have a collection of them registered with my concrete provider so that my importer could cycle through each ITransformation object calling a single method, Transform().
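The design I was weighing would have looked roughly like the sketch below, shown in Python for brevity instead of C#. Only ITransformation and Transform() come from the description above; the provider classes, the image-link rewrite rule, and the paths are made up for illustration.

```python
from abc import ABC, abstractmethod


class Transformation(ABC):
    """Plays the role of the ITransformation interface."""

    @abstractmethod
    def transform(self, page):
        ...


class RewriteImageLinks(Transformation):
    # Hypothetical rule: point image links at the CMS instead of the feed.
    def __init__(self, old_prefix, new_prefix):
        self.old_prefix = old_prefix
        self.new_prefix = new_prefix

    def transform(self, page):
        return page.replace(f'src="{self.old_prefix}', f'src="{self.new_prefix}')


class ContentProvider(ABC):
    """Where the pages come from: XML file, flat file, database, and so on."""

    transformations = ()  # each concrete provider registers its own

    @abstractmethod
    def get_pages(self):
        ...


class InMemoryProvider(ContentProvider):
    # Stand-in for a real feed reader, so the sketch runs on its own.
    def __init__(self, pages, transformations):
        self._pages = list(pages)
        self.transformations = tuple(transformations)

    def get_pages(self):
        return list(self._pages)


def import_pages(provider):
    # The importer never asks where the content came from.
    imported = []
    for page in provider.get_pages():
        for transformation in provider.transformations:
            page = transformation.transform(page)
        imported.append(page)
    return imported
```

Even in this toy version, nearly all of the code is on the provider side and the importer is a short loop, which is the imbalance that made me reconsider.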
As it turned out, I was doing far more work in each concrete provider than was being done by the importer. In other words, the importer was getting the content to import from the provider and then doing very