Continuing crashes are driving me nuts!

Robbie Hatley

I'd written here a few weeks ago about the following two

recurring problems on my Win2K system (SP4 w. all latest

updates):



1. Sporadic crashes, every day or two: usb mouse freezes,

network disconnects, and sound cuts in and out.



2. Sometimes system won't wake from standby.



Since then, I've determined that problem 2 is really just

a case of problem 1: the system crashes while in standby

(or at the moment of trying to come out), so of course it

won't "wake".



Recently my system has been crashing 2-3 times a day,

which is driving me up the wall. The crash is always exactly

the same:



- USB freezes (and hence USB mouse and printer go offline)

- Firewire freezes (and hence external hard disk goes offline)

- Ethernet freezes (and hence Internet goes offline)

- orange error LED on NIC is flashing about 3 times per second.

- orange light on router is flashing about 3 times per second.

- sound starts cutting in and out about 3 times per second.

- Serial mouse and PS2 keyboard do NOT freeze.



While in this "crashed" state, I've been able to determine (using

serial mouse and PS2 keyboard) that:



- Device Manager shows no malfunctioning devices

(even though many devices have actually FAILED)



- Event Viewer shows no unusual events have occurred

(even though a VERY unusual event has occurred)



I notice that most of the crashed subsystems are networks.

Is there anything in Windows 2000 system internals that would

affect all 3 networks (USB, firewire, and Ethernet)?



Or is it more likely that this is a hardware issue? I can't

help but notice that all of the affected systems are controlled

by the same IC on the motherboard, namely the Southbridge chip.

I hope it's not the motherboard going south (pardon the pun),

as I can't afford to replace it right now.



It's also interesting that several different things are

oscillating at about 3 per second, as if something is stuck

in a loop. But I don't know what would cause that.



Thoughts? Opinions? Comments? Suggestions?



--

Exasperated,

Robbie Hatley

lonewolf at well dot com

www dot well dot com slant tilde lonewolf slant
 
John John - MVP

Are the "crashes" generating BSODs or memory dump files? Are there any

errors in the Event log? Disable the "automatically restart on system

failure" setting and configure the system to halt and generate a BSOD on

failure. This can be done via the System Properties/Advanced tab or via

the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl key:



http://support.microsoft.com/kb/307973

How to configure system failure and recovery options in Windows



http://www.microsoft.com/windows/wi...ysdm_advancd_startrecover_recovery.htm?id=899

Specify what Windows 2000 does if the system stops unexpectedly



John





 
Robbie Hatley

"John John - MVP" asked of my recent crashes:



> Are the "crashes" generating BSODs or memory dump files?




Nope. Windows doesn't even seem to know that a major crash

has occurred. The crash is apparently either in the network

drivers (USB, Firewire, and Ethernet, all at once), or in

the hardware.



> Are there any errors in the Event log?




Lots of events logged, but none are right at time of crash,

and none seem relevant to the crash. They range from

"indexing service has started", "doing master merge on

F:\System Volume Information", "bad sectors found on

drive X (cdrom)", etc.



> Disable the automatically restart on system failure setting and

> configure the system to halt and generate a BSOD on failure, this

> can be done via the System Properties/Advanced tab or via the

> HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl

> value:

>

> http://support.microsoft.com/kb/307973

> How to configure system failure and recovery options in Windows

>

> http://www.microsoft.com/windows/wi...ysdm_advancd_startrecover_recovery.htm?id=899




Ok, I made sure it's set to write an event, do a memory minidump,

do NOT overwrite old dump files, and do NOT automatically reboot.
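
For anyone who would rather set this through the registry than the dialog, here is a rough sketch (not an official tool) that builds a .reg file with those four choices. It assumes the CrashControl value names listed in KB 307973 (AutoReboot, CrashDumpEnabled, LogEvent, Overwrite) and that a Python interpreter is available somewhere to generate the file; the function and variable names are made up for illustration.

```python
# Sketch: build a .reg file for the crash settings described above.
CRASH_CONTROL_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl"
)

# Value names as given in KB 307973; the data mirror the choices in the post.
SETTINGS = {
    "AutoReboot": 0,        # do NOT automatically reboot
    "CrashDumpEnabled": 3,  # 3 = small memory dump (minidump)
    "LogEvent": 1,          # write an event to the system log
    "Overwrite": 0,         # do NOT overwrite an existing dump file
}


def build_reg_file(key, dwords):
    """Return the text of a .reg file that sets DWORD values under key."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in dwords.items():
        # .reg files store DWORDs as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_reg_file(CRASH_CONTROL_KEY, SETTINGS), end="")
```

Redirect the output to a file, read it over in a text editor, and only then import it with regedit. A reboot is probably needed before the new settings apply.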



> Specify what Windows 2000 does if the system stops unexpectedly




The system eventually DOES "stop", about 30 minutes after the networks

(and hence usb mouse, internet, printer, and external hard disk)

go offline. But when it stops, it just STOPS:

1. System no longer responds to mouse or keyboard

2. Whatever image was on the screen is frozen.

3. No sound from sound card, but on-board beeper often sticks "on"

(screams like banshee).



No BSOD or memory dump occurs.



I see to my surprise that the C:\WINNT\minidump folder is not empty.

There are 3 dumps in there. But the most recent is from about

10 months ago. None of my recent crashes has generated a dump,

even though Windows was set to generate minidumps on hard crashes.
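
A quick way to repeat this check after each crash is to list the dump files by age. The sketch below does that, assuming Python is available (on this machine or on another one the folder has been copied to). The folder path is the one from this post; the function name is invented for illustration.

```python
import os
import time


def list_dumps(folder, now=None):
    """Return (file name, age in days) for each .dmp file, newest first."""
    now = time.time() if now is None else now
    dumps = []
    for entry in os.scandir(folder):
        if entry.is_file() and entry.name.lower().endswith(".dmp"):
            age_days = (now - entry.stat().st_mtime) / 86400
            dumps.append((entry.name, age_days))
    dumps.sort(key=lambda item: item[1])  # smallest age (newest) first
    return dumps


if __name__ == "__main__":
    folder = r"C:\WINNT\minidump"  # minidump folder mentioned in the post
    if os.path.isdir(folder):
        for name, age in list_dumps(folder):
            print(f"{name}: {age:.0f} days old")
    else:
        print(f"{folder} not found")
```

If the newest entry is still months old after a fresh crash, that crash did not produce a dump.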



I fear that these crashes are either at the driver level or at

the hardware level. Even when "crashed", the kernel may still

be running, thinking that everything is still hunky-dory.



--

Cheers,

Robbie Hatley

lonewolf at well dot com

www dot well dot com slant tilde lonewolf slant
 
Robbie Hatley

A few days ago I'd written here:



> [I've been having] Sporadic crashes, every day or two ...

> ... The crash is always exactly the same:

>

> - USB freezes (and hence USB mouse and printer go offline)

> - Firewire freezes (and hence external hard disk goes offline)

> - Ethernet freezes (and hence Internet goes offline)

> - orange error LED on NIC is flashing about 3 times per second.

> - orange light on router is flashing about 3 times per second.

> - sound starts cutting in and out about 3 times per second.

> - Serial mouse and PS2 keyboard do NOT freeze.




Well, I made one very slight change that altered all that dramatically!



I simply removed the following value from my registry:

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run]

"IntelliPoint"="\"C:\\Program Files\\Microsoft IntelliPoint\\ipoint.exe\""



(The only immediate downside is that certain advanced features of

my usb optical mouse go offline, but I never use those features,

so that's no big deal.)
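
Since the plan is to pull startup entries one at a time, it helps to be able to put each one back. Here is a minimal sketch, assuming standard .reg syntax (a value set to "-" is deleted; backslashes and quotes in string data are escaped with a backslash). It generates one fragment that removes a Run value and one that restores it. The helper names are hypothetical.

```python
RUN_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run"
HEADER = "Windows Registry Editor Version 5.00\n\n"


def reg_quote(text):
    """Quote a string for a .reg file, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def removal_fragment(key, name):
    """Return a .reg fragment that deletes one value from key."""
    return f"[{key}]\n{reg_quote(name)}=-\n"


def restore_fragment(key, name, data):
    """Return a .reg fragment that writes the string value back."""
    return f"[{key}]\n{reg_quote(name)}={reg_quote(data)}\n"


if __name__ == "__main__":
    # The command line as it is stored in the registry, quotes included.
    command = '"C:\\Program Files\\Microsoft IntelliPoint\\ipoint.exe"'
    print(HEADER + removal_fragment(RUN_KEY, "IntelliPoint"))
    print(HEADER + restore_fragment(RUN_KEY, "IntelliPoint", command))
```

Saving the restore fragment before importing the removal fragment means any experiment can be undone with a double-click.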



Now the system still crashes repeatedly, but in a very different way.

Internet and sound don't go offline, and usb mouse doesn't go offline.

Instead, when a crash occurs, several OTHER things go offline:



1. Auto-hidden taskbar won't respond to mouse. It can only be unhidden

by pressing ctrl-esc. It pops up, but when I hover mouse pointer

over task bar, pointer is always "vertical resize", whereas it

should be "regular arrow". The icons in my toolbars don't

3d-highlight on rollover as they should. Clicking icons does nothing.



2. Tab controls freeze. In any window with a tab control, I can not

change the "current" tab to a different tab. Clicking the tabs

does nothing.



3. Performance data goes offline. Task manager shows all processes

as using 0% CPU time. All processes are using 0 bytes of memory.

Total memory used: 0 bytes. Total CPU % used: 0%.



COMPLETELY DIFFERENT FROM BEFORE! And yet, all due to removing

one driver from memory! I don't think the driver I removed was

causing the earlier crashes, else the system wouldn't still be

crashing. But I think removing it moved drivers around in memory.

Something was (and still is) over-writing system modules in memory,

of that I'm now sure.



And I now very much think this is NOT a hardware issue, else the

nature of the crashes would not have changed so dramatically.

I think it's a balky service or background daemon that's

overwriting system modules in memory. With that in mind, I think

I'll start stripping away unneeded services and daemons and startups,

one by one, a few days apart, and see what happens.



I'm curious: are there any background daemons or services which

are notorious for causing these kinds of problems?



--

Cheers,

Robbie Hatley

lonewolf at well dot com

www dot well dot com slant tilde lonewolf slant
 
Steve

On Feb 23, 4:19 pm, "Robbie Hatley" wrote:

> Well, I made one very slight change that altered all that dramatically!




This sounds very strongly like a memory problem. If you can somehow

test memory, I'm betting you'll find a bad memory location at the root

of the problem. I've seen such a thing several times, most recently on

my W2K system. Swap the location of memory sticks and the nature /

timing of the problem will change. If at all possible (borrow if

necessary), swap different sticks completely out of the system and see

which one the problem follows. My W2K system is a bit long in the

tooth, and I'm working on a replacement anyway. But I recently had to

remove 1 GB of memory, leaving me with only 0.5 GB. Surprisingly,

especially given the amount of multitasking I do (usually around 10 to

20 apps open at once), the system actually seems to run as fast and

maybe a smidgen faster now. In any case, the crashes are gone.



Another approach to troubleshoot this is to turn off "fast boot" or

similar options in the BIOS, allowing the full memory test to run at

boot-up. When I did that, the POST very clearly and repeatably flagged

a memory error.



Hope this helps.



Steve Hendrix
 
Robbie Hatley

"Steve" wrote:



"Robbie Hatley" wrote:



> > Well, I made one very slight change that altered all that dramatically!


>

> This sounds very strongly like a memory problem...




I'd think a memory problem would cause malfunctions more severe

and far more immediate. For example, if the module for tab

controls (user32.dll, perhaps?) was loaded into bad memory at

startup, then tab controls would be offline IMMEDIATELY. But

they weren't. They went offline some hours later. In other

words, the memory was physically good, but got over-written.



And I now know by what! Yes, I fixed it. More on that in

a separate post.



> ... Another approach to troubleshoot this is to turn off

> "fast boot" or similar options in the BIOS, allowing the full

> memory test to run at boot-up. When I did that, the POST very

> clearly and repeatably flagged a memory error...




:) One of the first things I thought of. But the memory

POST cycles slowly through all one billion bytes of my RAM

and finds no problem.



See my next post in a new thread for more.



... TO BE CONTINUED ...



--

Cheers,

Robbie Hatley

lonewolf at well dot com

www dot well dot com slant tilde lonewolf slant
 