Programmer's Log

Sunday, October 01, 2006


Overconfidence has become a bad habit of ours. Our headaches - our DB performance problems - came back. Our database jammed again this Saturday.


Last week, we worked hard to move most of the main database's load out to replication. Going into this Saturday, we were pretty confident that it would be a smooth night. The 9PM peak came with more than 40 live matches from soccer leagues all around the world. As usual, all the performance graphs and SQL Server profilers were turned on. This time I had learnt how to set up a profiler myself. I also needed to monitor Admin 2, since yesterday I had continued moving some pages over to read from the replication instead of the main database. Everything was running smoothly. CPU usage on the main DB was only around 10%. We started to relax a little.

There were a couple of problems with our ASP sites. My supervisor had changed the way we store our odds change time, and his odds-display JavaScript no longer ran correctly. The problems were soon gone, since he is my supervisor ... hehe, if I'm good he must be better ... jeez, starting to get cocky ... (I should look at my first sentence again *winks*)

Since all the servers were running nicely, I went to fix bugs in our .NET member sites and get ready for the launch of 9 more sites in the future. All the sites must pass the acceptance tests. We had failed 3 times already; however, the errors were not in our code, it was just that we didn't have correct or sufficient data for the testing. The data we copied from the data warehouse into our development environment wasn't enough. I decided to deploy the remaining 9 sites into the real production environment for final testing.

By 11PM, the peak time had almost passed and the load was dropping. Everything was running smoothly. My colleague left work. I still had some deployments under way, so I decided to stay a little late. Then I looked at the performance graph for my usual check. Something unexpected had happened: the graph showed unusual behaviour. The database was jammed again. I looked over at my supervisor, and we started receiving complaints.

... SETTLEMENT again ...

The operation that calculates the win/loss after the matches are over is called the settlement process, and the database jammed while the settlement code was executing. The puzzle was why it hadn't happened when my supervisor executed the settlement himself. How to solve this? What could we do now? We are running out of options.

... I was off on Sunday ... but still kept monitoring the system from my house ... next week, we will have to continue to look into this issue again.

Databases are not my specialty ... the only thing I can do is listen to my colleague and my supervisor ... well, I will learn ...

This Saturday, our external consultant also invited a SQL specialist from Taiwan to help us improve the database performance. He introduced us to COM+ ... a technology which Microsoft is replacing with .NET Remoting and Web services. Our first impression of him wasn't too good. Let's see what he will have for us.
