Dell 2970 – Crashes / Reboots

Error on screen:

HyperTransport error caused a system reset Embedded I/O Bridge Device 2

SEL will show:

Chipset Err: Critical Event Sensor PCI Err (BUS 0 Device 7 Function 0) was asserted

There is a discussion about it here (with no real solution):
http://en.community.dell.com/support-forums/servers/f/946/t/19281276.aspx

I replaced both power supplied with known good units from a Dell 2950. No effect.

My SEL was filled with these errors and, obviously, my server was rebooting frequently.

I noticed NO similarities or triggers. OS did not matter, CPU load did not matter, disk activity did not matter, etc etc.

 

THE SOLUTION THAT FIXED THIS ISSUE WAS A REPLACEMENT MOTHERBOARD FROM DELL.   THE ORIGINAL DESIGN IS CURSED.

VMware Virtual Machine Hosting

ZFS + List all snapshots

zfs list -t snapshot

Example Output:

nas1# zfs list -t snapshot
NAME USED AVAIL REFER MOUNTPOINT
nfs/datastore@rep-init-20110113013713 18K – 21K –
nfs/datastore@rep20110113013825 311M – 311M –
nfs/datastore@rep20110113015314 0 – 68K –

VMware Virtual Machine Hosting

ZFS NFS Export

Activate NFS on the pool:

zfs set sharenfs=on

Now restrict access by IP range:

zfs sharenfs=”-maproot=root -alldirs -network 10.10.40.0 -mask 255.255.255.0″

Verify using showmount:

nas1# showmount -e
Exports list on localhost:
/nfs/datastore 10.10.40.0
/nfs 10.10.40.0

VMware Virtual Machine Hosting