<br><br><div class="gmail_quote">On Wed, Feb 11, 2009 at 11:36 PM, Steven Adeff <span dir="ltr"><<a href="mailto:adeffs.mythtv@gmail.com">adeffs.mythtv@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div></div><div class="Wj3C7c">On Wed, Feb 11, 2009 at 9:10 PM, <<a href="mailto:jarpublic@gmail.com">jarpublic@gmail.com</a>> wrote:<br>
> On Wed, Feb 11, 2009 at 8:16 PM, Brian Wood <<a href="mailto:beww@beww.org">beww@beww.org</a>> wrote:<br>
>> On Wednesday 11 February 2009 17:42:37 <a href="mailto:jarpublic@gmail.com">jarpublic@gmail.com</a> wrote:<br>
>> > At this point I am getting off topic for this list. It is certainly some<br>
>> > hardware failure. When it fails I can't get it to reboot. When I try to<br>
>> > boot from a live CD I get the same kernel panic. However, I would hate to<br>
>> > get<br>
>> > rid of the whole system, just because I am too ignorant to track down<br>
>> > exactly which piece of hardware is failing. Does anybody know a good<br>
>> > linux<br>
>> > list that may be able to help me track down which bit of hardware is<br>
>> > going<br>
>> > bad? It is especially challenging because if I let the system sit for a<br>
>> > while it will boot up and work fine for some indeterminate amount of<br>
>> > time. I have used lm-sensors to track temps and nothing seems to be hot,<br>
>> > all of the fans are running, and I have checked all of the drives for<br>
>> > bad<br>
>> > blocks. I don't know what else to do at this point. I don't want to<br>
>> > bother<br>
>> > the list anymore but does somebody know the right group to bother about<br>
>> > troubleshooting linux hardware?<br>
>><br>
>> A machine that always works after being off for a while probably has some<br>
>> sort<br>
>> of thermal problem. Sensors are seldom helpful, as this could be on just<br>
>> about anything, chips, resistors, or even solder connections.<br>
>><br>
>> You might try cooling various components with freeze-spray, that sometimes<br>
>> helps identify this sort of trouble. Remember that if the problem is on a<br>
>> chip die or the like it will take several seconds at least before things<br>
>> start to work after you spray it. Don't be impatient, or you will have<br>
>> sprayed lots of components and not know which one it was if it starts<br>
>> working.<br>
>><br>
>> Otherwise, unless you have a lab full of test gear, the only practical<br>
>> troubleshooting method is substitution, replace things one by one with<br>
>> known<br>
>> good replacements until you find the problem.<br>
>><br>
>> I'd suspect the PSU first, but YMMV.<br>
><br>
><br>
> A thermal problem seemed to be the most likely problem to me, but I wasn't<br>
> sure how to narrow this thing down. I didn't really consider the power<br>
> supply because it doesn't completely crash. It just freezes on the current<br>
> screen, and I lose all input and network. Even if I had hardware around to<br>
> switch out, the problem is made complicated by the fact that even the bad<br>
> hardware works for some of the time. So it would be hard to say if switching<br>
> a component out helped things work because of that component or because the<br>
> failing component happens to be working at that moment. The kernel panic<br>
> comes up immediately after grub before anything happens. So I was hoping<br>
> that it would be simple to narrow it down to a drive or perhaps there was<br>
> some way to get me some more verbose error messages.<br>
><br>
<br>
</div></div>peripherally following this thread, but I have to agree with Brian<br>
that the first thing I would check is the power supply. I've seen<br>
similar issues arise from power supplies on their last legs.<br>
Other than that, without one of those PCI slot-based hardware testers<br>
it could be very hard to figure out without swapping out hardware<br>
piece by piece.<br>
<br>
-<br></blockquote></div><br>Unfortunately it is an old Dell workstation that was decommissioned from a school. It has a large, flat, proprietary PSU that covers the whole bottom of the case. I don't think it would be easy to replace. It is a P4 beast and is big and loud. Maybe it is time to move on. I just have a hard time getting rid of old hardware if I can keep it working for something.<br>