Sunday, February 19, 2012

HELP..Installed production system slows over time

We have a SQL Server 2000 transactional business production system that is
installed at 20 different locations and has been running very well in all
but 1 location that was installed over the weekend. The hardware setup
"seems" to be rock solid according to the MS SQL Server performance
guidelines.
It is a Dell PE6800 quad Xeon processor machine at 3.16 GHz running
Windows Server 2003 Standard with SP1. The machine is dedicated to SQL
Server, the only other database being a Veritas backup DB. There are 4 GB of
400 MHz RAM.
The data files are located on a RAID 5 high-speed disk array and the
transaction log is on a RAID 1 mirrored pair. The server is connected to the
network by a Broadcom NetXtreme Gigabit network controller configured for
a 100 Mb full-duplex link.
Once again let me state that this same system is installed on much less
capable hardware at other locations with up to 30 users accessing it via a
VB Client application running on the workstations. Those other locations are
not experiencing the problem.
Now, the differences...
This location has 4 times as much data as the other locations and 40 client
workstations. One of the tables has 2.2 million records (the Account Detail
table which tracks all activity to Invoices) and the other 13 tables have
record counts that vary but mostly in the 100,000 to 600,000 range. The
system data was converted from a DOS DBF system (on a less capable server)
that held these very same records and was running slowly but did not time
out. We ran multiple reports after the data migration and everything seems
to have come over as expected. The same data migration process has been run
at all the other installations without problem.
Here is an example of what happens. When the Server has been freshly
rebooted a simple query on the largest table takes about 3 seconds. After
the 40 clients run the system for about 2 hours the system gradually slows
to a crawl. When I try to run the same test query it can take up to 3
minutes.
Here is the strange part. Looking at the Task Manager, the Processors are
barely taxed and there is 2GB of free RAM. I don't know how to monitor the
disk activity. This is where it gets stranger. After all users have logged
off, the system STILL runs dog slow using the same test query, BUT after
about 2 hours it is back up to good performance. I am puzzled. It is as if
the Transaction queue is getting backed up, but resolves itself after time.
Or is it maybe the Disk arrays having trouble synchronizing? But, a Server
reboot seems to clear the problem almost immediately.
Has anyone seen such a problem? Any ideas? Microsoft MVPs, please weigh in.
Thanks for your help...|||John,
Could be a blocking issue. Check out blocking using sp_who, sp_who2 and
sp_blockcnt.
Also, take a look at:
How to monitor SQL Server 2000 blocking
http://support.microsoft.com/default.aspx?scid=kb;en-us;271509
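For anyone following along, here is a minimal sketch of how the blocking output could be read once it has been captured. It assumes you have already pulled (spid, blocked) pairs out of master..sysprocesses (the same information sp_who shows in its blk column) into a list; the sample rows and the function names head_blockers and count_victims are made up for illustration and are not part of any Microsoft tool. The idea is to find the sessions at the head of each blocking chain, since those are usually the ones worth looking at first.

```python
from collections import defaultdict

# Hypothetical snapshot of (spid, blocked) pairs, e.g. from
#   SELECT spid, blocked FROM master..sysprocesses
# blocked = 0 means the session is not waiting on anyone.
SAMPLE_ROWS = [(51, 0), (52, 51), (53, 52), (54, 0), (55, 54), (56, 0)]


def head_blockers(rows):
    """Return spids that block others but are not themselves blocked."""
    blocked_by = dict(rows)
    # Skip rows where a session lists itself as its own blocker.
    blockers = {b for s, b in rows if b != 0 and b != s}
    return sorted(s for s in blockers if blocked_by.get(s, 0) in (0, s))


def count_victims(rows, head):
    """Count sessions waiting directly or indirectly on the given spid."""
    waiters = defaultdict(list)
    for spid, blocker in rows:
        if blocker != 0 and blocker != spid:
            waiters[blocker].append(spid)
    seen = set()
    stack = [head]
    while stack:
        current = stack.pop()
        for spid in waiters[current]:
            if spid not in seen:  # guards against cycles in the snapshot
                seen.add(spid)
                stack.append(spid)
    return len(seen)


if __name__ == "__main__":
    for head in head_blockers(SAMPLE_ROWS):
        print(head, "is holding up", count_victims(SAMPLE_ROWS, head), "session(s)")
```

Taking a snapshot like this every minute or so while the slowdown builds should show whether the same spid keeps turning up at the head of the chain.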
HTH
Jerry
|||Your problem is an interesting one. A question I would have relating to
it has to do with the outcome of your migration data validation. You
did not elaborate on the process utilized which would lead me to
suspect it may have been quite an abbreviated validation. Can you be
quite certain that stray characters did not populate some table space
along with valid data? I am currently doing a small bit of research
around validation methodology. Would be quite interested in learning
more about the process you employed for this.|||We ran Invoice balance reports for every Company (Division) in the original
DOS system and again after the data migration. All companies balanced before
and after the migration to the penny. The report output was identical for 5
years of data.
<jeff.livingston@.philips.com> wrote in message
news:1129655929.952358.319620@.f14g2000cwb.googlegroups.com...
> Your problem is an interesting one. A question I would have relating to
> it has to do with the outcome of your migration data validation. You
> did not elaborate on the process utilized which would lead me to
> suspect it may have been quite an abbreviated validation. Can you be
> quite certain that stray characters did not populate some table space
> along with valid data? I am currently doing a small bit of research
> around validation methodology. Would be quite interested in learning
> more about the process you employed for this.
|||Thanks Jerry,
I have been looking at the results of the SPs and finding some strange
things, like a single workstation having 30 SELECT statements running
simultaneously. We are digging deeper.. it's a start.
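A rough sketch of how that kind of pile-up could be spotted from a captured process list. It assumes the (hostname, cmd) columns of master..sysprocesses have already been fetched into a list; the workstation names, the sample data, and the function name hosts_over_limit are hypothetical, and the limit of 10 is an arbitrary choice for the example.

```python
from collections import Counter

# Hypothetical snapshot of (hostname, cmd) pairs, e.g. from
#   SELECT hostname, cmd FROM master..sysprocesses
# The hostname column is space-padded, so values are stripped before counting.
SAMPLE_SESSIONS = (
    [("WS07      ", "SELECT")] * 30
    + [("WS12", "SELECT")] * 2
    + [("WS07", "AWAITING COMMAND"), ("WS03", "UPDATE")]
)


def hosts_over_limit(rows, command, limit):
    """Return (host, count) pairs for hosts running `command` at least `limit` times."""
    counts = Counter(
        host.strip()
        for host, cmd in rows
        if cmd.strip().upper() == command.upper()
    )
    return [(host, n) for host, n in counts.most_common() if n >= limit]


if __name__ == "__main__":
    for host, n in hosts_over_limit(SAMPLE_SESSIONS, "SELECT", 10):
        print(host, "has", n, "SELECT statements running")
```

A single workstation showing up here repeatedly would point at the client application opening connections and not closing them, which is worth ruling out before blaming the hardware.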
"Jerry Spivey" <jspivey@.vestas-awt.com> wrote in message
news:%23EXw0Z$0FHA.1108@.TK2MSFTNGP14.phx.gbl...
> John,
> Could be a blocking issue. Check out blocking using sp_who, sp_who2 and
> sp_blockcnt.
> Also, take a look at:
> How to monitor SQL Server 2000 blocking
> http://support.microsoft.com/default.aspx?scid=kb;en-us;271509
> HTH
> Jerry
|||John Kotuby wrote:
> Here is an example of what happens. When the Server has been freshly
> rebooted a simple query on the largest table takes about 3 seconds. After
> the 40 clients run the system for about 2 hours the system gradually slows
> to a crawl. When I try to run the same test query it can take up to 3
> minutes.
> Here is the strange part. Looking at the Task Manager, the Processors are
> barely taxed and there is 2GB of free RAM. I don't know how to monitor the
> disk activity. This is where it gets stranger. After all users have logged
> off, the system STILL runs dog slow using the same test query, BUT after
> about 2 hours it is back up to good performance. I am puzzled. It is as if
> the Transaction queue is getting backed up, but resolves itself after time.
> Or is it maybe the Disk arrays having trouble synchronizing? But, a Server
> reboot seems to clear the problem almost immediately.
> Has anyone seen such a problem? Any ideas? Microsoft MVPs, please weigh in.
> Thanks for your help...
Do you update statistics? Is auto-update statistics turned on?
There may be contention for tempdb, especially if you have a lot of sorts (ORDER BY and GROUP BY) in your SQL statements.
Supposedly this happens on servers with multiple processors and lots of memory. Search the Microsoft Knowledge Base for
concurrency and tempdb to see if it fits your situation.
Ed
