This post may strike you in several ways:
- As a surprise, since my posts are usually less verbose
- As a major pain in your belly (from a huge laughing fit) if you were part of the team involved in the situation
- As a major heads-up to everyone who thinks that working beyond your limits is the way to deploy a project
The story is about 4 wasted hours of a team of 5, in the deployment of a critical project.
“What was the cause of that?!” you may ask… and the answer is: “A blank line!”. Actually I could add some bad words to that answer, but I’ll just leave it at that.
The project was successfully deployed after a week of day-and-night effort by the team, and it was finally time to load test it and get some actual readings of the capacity of the infrastructure. For the test I was using Visual Studio 2008 with a test project, all nicely configured with dynamic data sources, CSV files, a nice set of webtests and the load tests to go with them.
The project had been tested against the quality environment and everything had worked smoothly. But as soon as I started hitting the production environment with the simplest test (load the homepage with the login process), I started getting 50% 401s (access denied). OOPS! What the heck is going on?? I tried the webtest by itself (1 shot) and it went smoothly. Then I tried the load test again and 50% were access denied. I lowered the number of concurrent users and the percentage was the same!? Well, we started going through the obvious and not-so-obvious places:
- Event log
- SharePoint logs
- IIS Logs
None of these came up with unexpected information. Well, in the end that wasn’t correct, but I’ll fill in the gaps later on.
It’s time to bring in the artillery, I thought. Fired up WinDbg, which revealed nothing new, just loads of it… have you ever tried to debug a system while it was being load tested? Well, let’s just say: DON’T.
OK, by this time I was throwing in the towel. “Let’s hit another server! Probably there is some problem with this machine…”. And so we did. And the crap hit the fan again.
By now it was official: panic was setting in. Some of the members of the team were starting to roll back code, and the chaos was just around the corner.
In a desperate move, I tried hitting the quality environment again and the problem persisted. OK, we have a common denominator: my machine. This suspicion was confirmed when we used another machine to perform the test. The test came out clean. And suddenly … a light at the end of the tunnel! (and it wasn’t a train heading for us…).
I use a CSV with the list of users that will be logging in to the application. We were using just 1 user at the time, so what would happen if there was 1 line feed too many??? Exactly! Once in every 2 tests, there would be a test trying to log in with A BLANK LINE!!!!!!!!!!!!!!!!!!!!! There it was, the magic number: 50% access denied.
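To make the arithmetic concrete, here is a minimal Python sketch of the effect. It is not how Visual Studio reads its data sources; it only assumes that rows are handed out one after another and that the blank line counts as a row. The CSV content and the function names are made up for illustration.

```python
from itertools import cycle, islice

def load_rows(csv_text):
    """Naive loader: keeps every line after the header, blank ones included."""
    return csv_text.splitlines()[1:]

def load_rows_checked(csv_text):
    """Same loader, but drops lines that contain only whitespace."""
    return [line for line in load_rows(csv_text) if line.strip()]

def blank_login_ratio(rows, iterations):
    """Fraction of test iterations that would log in with an empty row,
    assuming rows are handed out sequentially and wrap around."""
    picked = list(islice(cycle(rows), iterations))
    return sum(1 for row in picked if not row.strip()) / iterations

# Hypothetical data file: one real user plus one line feed too many.
users_csv = "username,password\nuser1,secret\n\n"

naive_ratio = blank_login_ratio(load_rows(users_csv), 100)
checked_ratio = blank_login_ratio(load_rows_checked(users_csv), 100)
```

With one real user and one blank row, every second iteration gets the blank row no matter how many concurrent users you configure, which matches the stubborn 50%. Filtering whitespace-only lines before the run brings the ratio back to zero.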
The thing is, in the IIS log files, some of the access denied lines had no username!!! But, tired as we were, we just missed it.
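A tired pair of eyes misses this kind of thing, so it is worth letting a script look for it. The sketch below assumes the logs are in the W3C extended format, where a `#Fields:` line names the columns and a missing value is written as `-`, and that `cs-username` and `sc-status` are among the logged fields. The function name and the sample lines are invented for illustration; they are not taken from our real logs.

```python
def anonymous_denials(log_lines):
    """Return the entries with status 401 and no user name.
    Column positions are taken from the most recent #Fields line."""
    fields = []
    hits = []
    for line in log_lines:
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#") or not line.strip():
            continue  # skip other directives and empty lines
        entry = dict(zip(fields, line.split()))
        if entry.get("sc-status") == "401" and entry.get("cs-username") == "-":
            hits.append(entry)
    return hits

# Invented sample, trimmed to a handful of fields.
sample_log = [
    "#Fields: date time cs-method cs-uri-stem cs-username sc-status",
    "2009-01-01 10:00:00 POST /login.aspx user1 200",
    "2009-01-01 10:00:01 POST /login.aspx - 401",
]

hits = anonymous_denials(sample_log)
```

Keep in mind that a 401 without a user name can also be a normal part of an authentication handshake, so a hit is a lead to follow rather than proof of a problem.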
So, as a punch line, be rested when you are deploying a system…
PS: Thank you LM, PR, AR and RS for your support :)