Wednesday, March 7, 2012

Integrity Check Errors

Hello,
I'm using SQL Server 2000 SP4 on Windows Advanced Server 2000 with latest
patches.
I use Database Maintenance Plans to backup several of my databases and I have
selected the option to check integrity before doing backup. If anything fails,
I get a net send notification.
This has been working extremely well for a year now but only lately,
Integrity checks reveals errors on random tables in random databases.
This morning I got a notification on a database called 'Planning'.
So I ran a DBCC CHECKDB('Planning') WITH NO_INFOMSGS in the Query Analyzer
just to make sure.
The following details were revealed: -
================================================================================ Server: Msg 8928, Level 16, State 1, Line 1
Object ID 1310627712, index ID 7: Page (1:3766) could not be processed. See
other errors for details.
Server: Msg 8944, Level 16, State 1, Line 1
Table error: Object ID 1310627712, index ID 7, page (1:3766), row 18. Test
(ColumnOffsets <= (nextRec - pRec)) failed. Values are 26977 and 22.
CHECKDB found 0 allocation errors and 2 consistency errors in table
'WorkTransaction' (object ID 1310627712).
CHECKDB found 0 allocation errors and 2 consistency errors in database
'Planning'.
repair_allow_data_loss is the minimum repair level for the errors found by
DBCC CHECKDB (Planning ).
================================================================================
I have been able to repair these errors by doing the following: -
1. Restrict access to Single User on this particular database.
2. In the Query Analyzer, run the statement :
USE Planning
go
DBCC CHECKTABLE('WorkTransaction', repair_allow_data_loss) WITH NO_INFOMSGS
this seems to correct the problem.
3. Remove Single User restriction.
However, I have been doing the above procedure 2 - 3 times a day on many
other databases and tables that come up with errors.
I was wondering what could be the root cause of all these sudden errors.
I use Simple Database Backup Recovery Model on all my databases.
Thanks,
Sameer.
--
Message posted via http://www.sqlmonster.comThe most likely cause of db integrity issues (especially recurring ones) is
bad hardware. I would run some intensive hardware tests to find the root
cause.
--
Andrew J. Kelly SQL MVP
"Sameer via SQLMonster.com" <u4996@.uwe> wrote in message
news:57da24b5e8956@.uwe...
> Hello,
> I'm using SQL Server 2000 SP4 on Windows Advanced Server 2000 with latest
> patches.
> I use Database Maintenance Plans to backup several of my databases and I
> have
> selected the option to check integrity before doing backup. If anything
> fails,
> I get a net send notification.
> This has been working extremely well for a year now but only lately,
> Integrity checks reveals errors on random tables in random databases.
> This morning I got a notification on a database called 'Planning'.
> So I ran a DBCC CHECKDB('Planning') WITH NO_INFOMSGS in the Query Analyzer
> just to make sure.
> The following details were revealed: -
> ================================================================================> Server: Msg 8928, Level 16, State 1, Line 1
> Object ID 1310627712, index ID 7: Page (1:3766) could not be processed.
> See
> other errors for details.
> Server: Msg 8944, Level 16, State 1, Line 1
> Table error: Object ID 1310627712, index ID 7, page (1:3766), row 18. Test
> (ColumnOffsets <= (nextRec - pRec)) failed. Values are 26977 and 22.
> CHECKDB found 0 allocation errors and 2 consistency errors in table
> 'WorkTransaction' (object ID 1310627712).
> CHECKDB found 0 allocation errors and 2 consistency errors in database
> 'Planning'.
> repair_allow_data_loss is the minimum repair level for the errors found by
> DBCC CHECKDB (Planning ).
> ================================================================================> I have been able to repair these errors by doing the following: -
> 1. Restrict access to Single User on this particular database.
> 2. In the Query Analyzer, run the statement :
> USE Planning
> go
> DBCC CHECKTABLE('WorkTransaction', repair_allow_data_loss) WITH
> NO_INFOMSGS
> this seems to correct the problem.
> 3. Remove Single User restriction.
>
> However, I have been doing the above procedure 2 - 3 times a day on many
> other databases and tables that come up with errors.
> I was wondering what could be the root cause of all these sudden errors.
> I use Simple Database Backup Recovery Model on all my databases.
> Thanks,
> Sameer.
> --
> Message posted via http://www.sqlmonster.com|||Thanks Andrew,
I forgot to mention that I did a comprehensive hardware diagnostics 2 times
on this past Sunday and did tests especially on Memory like 4 time using the
Dell Diagnostics software and no errors were found.
Should I doubt the results and start changing the hardware components
gradually ?
Andrew J. Kelly wrote:
>The most likely cause of db integrity issues (especially recurring ones) is
>bad hardware. I would run some intensive hardware tests to find the root
>cause.
>> Hello,
>[quoted text clipped - 59 lines]
>> Sameer.
--
Message posted via SQLMonster.com
http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200511/1|||It still sounds like hardware related. Do you have a UPS and are you
confident of it's condition? But I just re-read your original post and
missed the part in which you were using Repair Allow data loss. That does
exactly what it says. So to fix the problems it has to destroy or loose
some data. The problem is that you don't know what data you loose and your
data integrity is not lost. Your best bet for a situation such as that is
to backup from the last known good backup. Since you are using simple mode
you can never do a restore to a point in time. That limits your restores to
the last full or differential backup. If you only do that once a night you
can loose up to 24 hours worth of work. You should rethink your backup
strategy. The fact that you have been doing this for some time now means
that you probably do not have a valid Full backup anymore. They most likely
all have some form of corruption in them. Either from what is causing the
issues or from you doing the repairs and allowing data loss. You might want
to consider creating a new db and copying the data out from one and back
into another with referential integrity constraints to see where you may
have bad data.
--
Andrew J. Kelly SQL MVP
"Sameer via SQLMonster.com" <u4996@.uwe> wrote in message
news:57db2f1476231@.uwe...
> Thanks Andrew,
> I forgot to mention that I did a comprehensive hardware diagnostics 2
> times
> on this past Sunday and did tests especially on Memory like 4 time using
> the
> Dell Diagnostics software and no errors were found.
> Should I doubt the results and start changing the hardware components
> gradually ?
>
> Andrew J. Kelly wrote:
>>The most likely cause of db integrity issues (especially recurring ones)
>>is
>>bad hardware. I would run some intensive hardware tests to find the root
>>cause.
>> Hello,
>>[quoted text clipped - 59 lines]
>> Sameer.
> --
> Message posted via SQLMonster.com
> http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200511/1|||Thanks Andrew,
1. I will check the UPS tomorrow and keep you posted. I'll also have the
batteries in the UPS replaced as soon as I can.
2. I do a backup every hour.
3. If I create a new database, should I use a different name or preferably
keep the same name as the current one with problems ? If I change to a new
name, that would add a lot of tasks for me to change the name wherever it
exists.....all codes, queries, reports, etc.
4. Given I create a new database, regardless of the issue in point 3 above,
how do I copy the data with referential integrity ? use DTS ? or queries ?
Thanks for your guidance....I highly appreciate it.
Sameer.
Andrew J. Kelly wrote:
>It still sounds like hardware related. Do you have a UPS and are you
>confident of it's condition? But I just re-read your original post and
>missed the part in which you were using Repair Allow data loss. That does
>exactly what it says. So to fix the problems it has to destroy or loose
>some data. The problem is that you don't know what data you loose and your
>data integrity is not lost. Your best bet for a situation such as that is
>to backup from the last known good backup. Since you are using simple mode
>you can never do a restore to a point in time. That limits your restores to
>the last full or differential backup. If you only do that once a night you
>can loose up to 24 hours worth of work. You should rethink your backup
>strategy. The fact that you have been doing this for some time now means
>that you probably do not have a valid Full backup anymore. They most likely
>all have some form of corruption in them. Either from what is causing the
>issues or from you doing the repairs and allowing data loss. You might want
>to consider creating a new db and copying the data out from one and back
>into another with referential integrity constraints to see where you may
>have bad data.
>> Thanks Andrew,
>[quoted text clipped - 17 lines]
>> Sameer.
--
Message posted via SQLMonster.com
http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200511/1|||3. The new db is just so that you can get the data in a know condition.
Once you have it where you are comfortable with it and want to use it in
production you can rename it after dropping the original.
4. What I meant was that if you have parent child relationships you want to
add RI constraints on the tables in the new db before you add the data.
That way you will know when you go to insert the data which rows are orphans
etc. It really wouldn't matter how you imported the data as long as you do
it in the right order. Parents first then children and so on.
Andrew J. Kelly SQL MVP
"Sameer via SQLMonster.com" <u4996@.uwe> wrote in message
news:57dfb788bfc8e@.uwe...
> Thanks Andrew,
> 1. I will check the UPS tomorrow and keep you posted. I'll also have the
> batteries in the UPS replaced as soon as I can.
> 2. I do a backup every hour.
> 3. If I create a new database, should I use a different name or preferably
> keep the same name as the current one with problems ? If I change to a new
> name, that would add a lot of tasks for me to change the name wherever it
> exists.....all codes, queries, reports, etc.
> 4. Given I create a new database, regardless of the issue in point 3
> above,
> how do I copy the data with referential integrity ? use DTS ? or queries ?
> Thanks for your guidance....I highly appreciate it.
> Sameer.
>
> Andrew J. Kelly wrote:
>>It still sounds like hardware related. Do you have a UPS and are you
>>confident of it's condition? But I just re-read your original post and
>>missed the part in which you were using Repair Allow data loss. That does
>>exactly what it says. So to fix the problems it has to destroy or loose
>>some data. The problem is that you don't know what data you loose and
>>your
>>data integrity is not lost. Your best bet for a situation such as that is
>>to backup from the last known good backup. Since you are using simple
>>mode
>>you can never do a restore to a point in time. That limits your restores
>>to
>>the last full or differential backup. If you only do that once a night
>>you
>>can loose up to 24 hours worth of work. You should rethink your backup
>>strategy. The fact that you have been doing this for some time now means
>>that you probably do not have a valid Full backup anymore. They most
>>likely
>>all have some form of corruption in them. Either from what is causing the
>>issues or from you doing the repairs and allowing data loss. You might
>>want
>>to consider creating a new db and copying the data out from one and back
>>into another with referential integrity constraints to see where you may
>>have bad data.
>> Thanks Andrew,
>>[quoted text clipped - 17 lines]
>> Sameer.
> --
> Message posted via SQLMonster.com
> http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200511/1|||Hi Andrew,
To give you latest status ...
1. I replaced the UPS batteries and after 4 days, corruption came back on one
table in the production database.
2. I also defragmented the disk and after a couple of days, corruption came
back.
3. My next step is to replace RAM chips and see what happens...
The weird thing is that when I get alerted through the DB Maintenance
Integrity Check failure, I run a DBCC CHECKDB and it does find errors.
However, when I simply stop/start the SQL Server, the error goes away WITHOUT
me even running a repair on that particular table.
Is there a way to flush all the connected users so that I can set the
database to a single user before repairing it ?
--
Message posted via SQLMonster.com
http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200512/1|||> Is there a way to flush all the connected users so that I can set the
> database to a single user before repairing it ?
ALTER DATABASE with the ROLLBACK option.
--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
"Sameer via SQLMonster.com" <u4996@.uwe> wrote in message news:58622451c05f7@.uwe...
> Hi Andrew,
> To give you latest status ...
> 1. I replaced the UPS batteries and after 4 days, corruption came back on one
> table in the production database.
> 2. I also defragmented the disk and after a couple of days, corruption came
> back.
> 3. My next step is to replace RAM chips and see what happens...
> The weird thing is that when I get alerted through the DB Maintenance
> Integrity Check failure, I run a DBCC CHECKDB and it does find errors.
> However, when I simply stop/start the SQL Server, the error goes away WITHOUT
> me even running a repair on that particular table.
> Is there a way to flush all the connected users so that I can set the
> database to a single user before repairing it ?
> --
> Message posted via SQLMonster.com
> http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200512/1|||While the database is online and being used, I would like to stop all future
users from connecting from it, kill all current users logged-in and then
change the db to a single user mode to be repaired.
Could you please elaborate it more ?
Thanks.
--
Message posted via http://www.sqlmonster.com|||ALTER DATABASE dbname SET SINGLE_USER WITH ROILLBACK IMMEDIATE
--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
"Sameer via SQLMonster.com" <u4996@.uwe> wrote in message news:5863a80856c4f@.uwe...
> While the database is online and being used, I would like to stop all future
> users from connecting from it, kill all current users logged-in and then
> change the db to a single user mode to be repaired.
> Could you please elaborate it more ?
> Thanks.
> --
> Message posted via http://www.sqlmonster.com|||Wow, it worked like a charm....
How do I reset it back to multi-user after I finish repairing it ?
Thanks a million.
--
Message posted via SQLMonster.com
http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200512/1|||ALTER DATABASE dbname SET MULTI_USER
--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
"Sameer via SQLMonster.com" <u4996@.uwe> wrote in message news:5863d7195c667@.uwe...
> Wow, it worked like a charm....
> How do I reset it back to multi-user after I finish repairing it ?
> Thanks a million.
> --
> Message posted via SQLMonster.com
> http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200512/1|||Thanks Mr. Karaszi ... I appreciate it !
Now only if this recurring Integrity errors would vanish away after I replace
the RAM, hopefully ....
--
Message posted via SQLMonster.com
http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200512/1|||You are not using a RAM disk by any chance are you?
--
Andrew J. Kelly SQL MVP
"Sameer via SQLMonster.com" <u4996@.uwe> wrote in message
news:58622451c05f7@.uwe...
> Hi Andrew,
> To give you latest status ...
> 1. I replaced the UPS batteries and after 4 days, corruption came back on
> one
> table in the production database.
> 2. I also defragmented the disk and after a couple of days, corruption
> came
> back.
> 3. My next step is to replace RAM chips and see what happens...
> The weird thing is that when I get alerted through the DB Maintenance
> Integrity Check failure, I run a DBCC CHECKDB and it does find errors.
> However, when I simply stop/start the SQL Server, the error goes away
> WITHOUT
> me even running a repair on that particular table.
> Is there a way to flush all the connected users so that I can set the
> database to a single user before repairing it ?
> --
> Message posted via SQLMonster.com
> http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200512/1|||Hi Andrew,
Thanks for coming back....what's a RAM disk ?
We're using a simple Primary Hard Disk being mirrored to a Secondary Hard
Disk using Windows Disk Mirroring functionality.
For Memory, we use 2 X 512 MB chips on 2 dimm slots.
Next plan is to buy 1 X 1GB memory chip and put it in 1 dimm slot and if need
be, buy another one and fill the last dimm slot.
--
Message posted via http://www.sqlmonster.com|||If you don't know then you are most likely not using it<g>. But you are
using software raid vs. hardware. I don't know if I would trust the
software mirroring. Maybe it's OK but that would be my guess. You might
want to look into a lug in card to do the mirroring.
--
Andrew J. Kelly SQL MVP
"Sameer via SQLMonster.com" <u4996@.uwe> wrote in message
news:58683e30c7369@.uwe...
> Hi Andrew,
> Thanks for coming back....what's a RAM disk ?
> We're using a simple Primary Hard Disk being mirrored to a Secondary Hard
> Disk using Windows Disk Mirroring functionality.
> For Memory, we use 2 X 512 MB chips on 2 dimm slots.
> Next plan is to buy 1 X 1GB memory chip and put it in 1 dimm slot and if
> need
> be, buy another one and fill the last dimm slot.
> --
> Message posted via http://www.sqlmonster.com|||1. Unfortunately, Hardware Mirroring is out of question for our small company,
atleast for now.
But just to verify if Software Mirroring is the cause, then I will
temporarily disable it and observe what happens.
2. I usually get corruptions in 2 tables (each are over half-a-million
records). I don't have any additional indexes in those tables apart from the
Primary Key [Id] column.
......2.1 Is it possible that indexes get corrupted for heavy tables ? If
so, how do I safely rebuild indexes
3. The 2 tables I mentioned above, each table gets populated by its own
corresponding Client VB App developed by one of the consultatnts we had hired.
However, even when I do inspect the code and design of the app, I don't seem
to find any flaws.
......3.1 Could those client apps causing the trouble ?
--
Message posted via SQLMonster.com
http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200512/1|||It makes sense that the two most heavily used tables are the ones you see
the most issues with. Since they are changing more than the others there is
more opportunity for issues to arise. But the act of Inserting , Updating
and Deleting too much should not be a concern. Indexes don't get corrupted
after you reach a certain change threshold. I still think the hardware or
now maybe the mirroring software is the root cause. Hardware controllers
that do mirroring these days are extremely inexpensive. Most systems have
the capability built into the mother boards now. I think you will find it
is cheaper than you think and worth exploring.
--
Andrew J. Kelly SQL MVP
"Sameer via SQLMonster.com" <u4996@.uwe> wrote in message
news:586f11a2109af@.uwe...
> 1. Unfortunately, Hardware Mirroring is out of question for our small
> company,
> atleast for now.
> But just to verify if Software Mirroring is the cause, then I will
> temporarily disable it and observe what happens.
>
> 2. I usually get corruptions in 2 tables (each are over half-a-million
> records). I don't have any additional indexes in those tables apart from
> the
> Primary Key [Id] column.
> ......2.1 Is it possible that indexes get corrupted for heavy tables ?
> If
> so, how do I safely rebuild indexes
>
> 3. The 2 tables I mentioned above, each table gets populated by its own
> corresponding Client VB App developed by one of the consultatnts we had
> hired.
> However, even when I do inspect the code and design of the app, I don't
> seem
> to find any flaws.
> ......3.1 Could those client apps causing the trouble ?
> --
> Message posted via SQLMonster.com
> http://www.sqlmonster.com/Uwe/Forums.aspx/sql-server/200512/1

No comments:

Post a Comment