Unresponsive Supermicro X10SDV-TLN4F motherboard

Why does a previously working Supermicro X10SDV-TLN4F motherboard suddenly stop booting or responding to the keyboard?

Background

Back in September 2015 I replaced my home server with a Supermicro SuperServer 5028D-TN4T Xeon D-1540. This is an amazing little box:

  • tiny enclosure;
  • Supermicro X10SDV-TLN4F motherboard with single Xeon D-1540 CPU and integrated IPMI and KVM;
  • room for four hot-swappable 3.5 inch drives, plus space for another internally.

I configured this server with 16GB of ECC RAM, four 3TB SAS drives and an internal SAS SSD, and run SmartOS from a USB Flash drive. The hard drives are configured as two ZFS mirrors providing 6TB of storage and the SSD is used as a ZFS cache. It’s blindingly fast and suits my needs perfectly.

This server has been running untouched for nn years and, apart from a couple of reboots when I upgraded SmartOS, has been running peacefully, and fairly quietly, in my garage cum workshop. However, I noticed that the front cover, which is also the air inlet, was collecting sawdust, so I decided to power down the server and give it a good internal clean.

It’s not coming back

I cleaned the server out and plugged in the power and waited, and waited, and waited for the server to come back on-line. Previously, a reboot took 2-3 minutes for the server to come back on-line. This time, nothing.

I attached a screen and keyboard and could immediately see that the server had not booted: it had tried to boot across the network and finally given up. It was now waiting for a boot disk to be added and a key to be pressed.

To be sure of what was happening, I reset the server and watched what was happening. Sure enough, the server never tried to boot from the USB Flash device and went directly to PXE boot: which failed. It was also ignoring the USB connected keyboard. One more reset confirmed that although there was a brief flash of the keyboard LED, after that, nothing worked. No NumLock, nothing.

Debugging

Stage one

I fired up the built-in KVM and, unexpectedly, found that this also ignored the keyboard: even the virtual keyboard. I could control the power, reset etc. but not interact with the console.
That meant I couldn’t see any POST messages. All I could do was a POST Snoop, and that came back with 00: i.e. all OK. No errors from the onboard USB controller, so why no USB connectivity? Strange.

Stage two

To try and get control back, I decided to start by resetting the onboard CMOS. This is a bit fiddly on this board when it’s installed in the case, but I managed it.

No change.

Stage three

Check all plugs and connections in case I jogged something when I cleaned the server out.
I had to take the server out of its cupboard for this and put it on the bench.

That seemed to work.
This time, the server booted cleanly and came online. Put the case back on, and the server back in the cupboard.

Once again, no boot action.

Stage four

That seems to point at a physical problem. Time to remove and reseat all the boards and memory.

No change.

Time to dig deeper.

Stage five

There’s nothing showing up in the IPMI, but the BIOS is very old: only 1.0a, and I know from this page on tinkertry.com that the current BIOS is 1.1. However, if I can’t boot, I can’t update the BIOS.

Next step is to see what happens if I remove the USB Flash Media and pull all the drives.

That made a difference. I now have control via the KVM and can get into the BIOS. The finger points at the USB Flash drive initially.

Update later the same day

Well, changing the USB Flash drive seemed to do the trick, though I’m at a loss to understand why. I booted another box with the “faulty” USB drive with no problems; but as soon as I try booting this box, it fails. I guess it’s a tolerance issue or something.

I still don’t understand how a Flash disk failure can cause the iKVM to fail though. However, I don’t have the time to pursue this further right now. I’m just happy that the server is up again.

Here’s to another two years of uninterrupted service before it needs to be powered down.

New ZFS based NAS and VM Host – part 3

In part 1 of this series, I covered the requirements and hardware. In part 2 I covered the initial configuration of the new server. In this part I’ll cover setting the server up as a file server.

The main use case for this new server is to be the main file and media server for our home. To achieve this I needed NFS, SMB and AFP access to the imported datasets.

NFS access is available by default in ZFS, but SMB and AFP access requires software to be installed. As indicated in part 2, you are strongly discouraged from installing software in the global zone. The supported approach is to create a new zone and install the software in there.
Zones are sort-of virtual machines in that each thinks it has exclusive use of the hardware, and they are separate security containers. Where SmartOS differs from VMware is its hybrid approach to supporting virtual machines. It does this by supporting multiple “brands” of zones:

  • “joyent” branded zones appear to be running SmartOS itself. They have no kernel of their own, they re-use the global zone’s kernel and just provide resource and security isolation.
  • “lx” branded zones appear to be running a specific version of Linux. As with “joyent” zones, they re-use the global zone’s kernel but translate the brand’s system calls into those supported by SmartOS. This gets you the benefits of the software that normally runs on the brand, but without the overhead of having to run the brand’s kernel on top of the SmartOS kernel. The result is near bare-metal speeds. Currently (Sept 2015), SmartOS supports Ubuntu, CentOS, Debian, Fedora (maybe others).
  • “kvm” branded zones are more like any other KVM virtual machine: allowing just about any other operating system to be installed.

First attempt using an Ubuntu branded zone

This failed, so I’m not going into the detail.

You could install Ubuntu in a kvm branded zone, but using an Ubuntu version of a lx zone avoids running two kernels. Base images for many Ubuntu variants exist in Joyent’s public repository, so I simply followed the instructions in the SmartOS wiki to:

  1. Import the base Ubuntu 14.04 LTS server image.
  2. Create a JSON file that describes the new zone.
  3. Create the new zone using the JSON file.

At the end of this, I had an Ubuntu 14.04 virtual machine called capella, on the same IP address as the old server and with direct access to the ZFS datasets containing the files from the old server.
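
The three steps above can be sketched as a dry run. This is only an illustration: `imgadm import` and `vmadm create -f` are the standard SmartOS tools, but the function name `zone_create_plan`, the UUID placeholder and the file name `capella.json` are assumptions, and the function prints the commands rather than running them.

```shell
#!/bin/sh
# Dry-run sketch: print the SmartOS commands for creating a zone from an image.
# The UUID and manifest path are supplied by the caller; nothing is executed.
zone_create_plan() {
    image_uuid=$1   # UUID of the base image (look it up with "imgadm avail")
    manifest=$2     # path to the JSON file describing the zone
    echo "imgadm import ${image_uuid}"
    echo "vmadm create -f ${manifest}"
}

# Example with placeholder values (not a real image UUID):
zone_create_plan "<image-uuid>" capella.json
```

Once the printed commands look right, they can be run by hand in the global zone.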

I then followed the guide at outcoldman.com to install Samba and the guide at … to install Netatalk.
At the end, I had a functioning Samba server, but I had trouble with Netatalk. My MacBook Air running Mavericks couldn’t connect to capella using AFP. Investigation showed that the 14.04 version of lx-ubuntu was missing the uams_dhx2.so security module that was needed to support Mavericks.
Note: SmartOS freely admit that branded zones are still being developed.

Second attempt using a native SmartOS zone

Rather than spending too much time on this, I exploited one of the major advantages of using SmartOS. I simply deleted the zone, downloaded a basic joyent brand image, created a new JSON file and created a new joyent branded zone. It took 5 minutes! I used the following JSON:

{
  "hostname": "capella.agdon.net",
  "alias": "capella",
  "brand": "joyent",
  "max_physical_memory": 4096,
  "image_uuid": "5c7d0d24-3475-11e5-8e67-27953a8b237e",
  "resolvers": ["172.29.12.7", "8.8.4.4"],
  "nics": [
    {
      "nic_tag": "admin",
      "ip": "172.29.12.11",
      "netmask": "255.255.255.0",
      "gateway": "172.29.12.1",
      "primary": "1"
    }
  ],
  "filesystems": [
    {
      "type": "lofs",
      "source": "/data/media",
      "target": "/import/media"
    },
    {
      "type": "lofs",
      "source": "/data/home",
      "target": "/import/home"
    },
    {
      "type": "lofs",
      "source": "/data/home/git",
      "target": "/import/home/git"
    },
    {
      "type": "lofs",
      "source": "/data/public",
      "target": "/import/public"
    },
    {
      "type": "lofs",
      "source": "/data/software",
      "target": "/import/software"
    }
  ]
}
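
Each lofs entry maps a directory in the global zone into the new zone, so it is worth confirming that the source directories exist before creating the zone. A minimal pre-flight check might look like this; the function name `check_lofs_sources` is hypothetical, and the paths in the usage comment are simply the sources from the JSON above.

```shell
#!/bin/sh
# Report any lofs source directory that does not exist.
# Returns 0 if all directories are present, 1 otherwise.
check_lofs_sources() {
    status=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            status=1
        fi
    done
    return $status
}

# Usage (in the global zone), with the sources from the manifest:
# check_lofs_sources /data/media /data/home /data/home/git /data/public /data/software
```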

I then installed Samba and Netatalk as before. This time all was well and I now had a functioning NFS, SMB and AFP file server.

I reconfigured the clients to access the new server and I was back where I was before I changed hardware. Simples!

Next step: install Plex Media Server, SABnzbd, CouchPotato and Sick Beard to create a fully functioning media server.

New ZFS based NAS and VM Host – part 2

In part one I covered the motivation, requirements and hardware. In this post I will cover software installation and configuration.

Installing and configuring SmartOS

SmartOS differs from many other operating systems, though not FreeNAS, by not requiring installation. You simply copy a disk image to a USB thumb drive and boot from it. SmartOS then creates a RAMDisk, copies itself to the RAMDisk and runs from there. Alternatively you can boot across the network using PXE.

This exposes SmartOS’s primary use case as a data-centre operating system. By not requiring installation, upgrades are quickly deployed by copying a new image to the flash drive and rebooting.

This does have an important side effect however. SmartOS supports the notion of “zones”, first implemented in Solaris. When booted, SmartOS itself runs in the “global” zone. However, because the filesystem is on a RAMdisk, any changes you make do not persist across a reboot. There are ways to get around this so that (e.g.) you can ensure your SSH public key is an authorized_key and you can login without a password; but you are strongly discouraged from installing software in the global zone. More on that later.

Installation

I started by following the instructions in the SmartOS wiki to download the latest SmartOS image and copy it to a 2GB consumer grade USB thumb drive. It’s only 161MB so it didn’t take long.
I booted the server from the thumb drive and, because this was a clean system, I was presented with a wizard that asked for hostname, IP address (or DHCP) and the identities of the drives I wished to use for the “zones” pool; which is used to store all the datasets for the other zones.

Rather than risk getting it all wrong, I chose the first two HGST drives and put them in a mirror. After configuring the zpool, I was presented with the login prompt.

The default login is root/root, so I immediately changed the root password!

zpool status showed

# zpool status
  pool: zones
 state: ONLINE
  scan: resilvered 1.98G in 0h0m with 0 errors on Fri Sep 11 13:22:01 2015
config:

        NAME       STATE      READ WRITE CKSUM
        zones      ONLINE        0     0     0
          mirror   ONLINE        0     0     0
            c1t0d0 ONLINE        0     0     0
            c1t1d0 ONLINE        0     0     0

After this, I added the other two drives as a second mirrored vdev using

zpool add zones mirror c1t2d0 c1t3d0

and then added the SSD as an slog

zpool add zones log c1t4d0

At the end of this, zpool status showed

# zpool status
  pool: zones
 state: ONLINE
  scan: resilvered 4.01G in 0h0m with 0 errors on Fri Sep 11 13:38:15 2015
config:

        NAME       STATE      READ WRITE CKSUM
        zones      ONLINE        0     0     0
          mirror-0 ONLINE        0     0     0
            c1t0d0 ONLINE        0     0     0
            c1t1d0 ONLINE        0     0     0
          mirror-2 ONLINE        0     0     0
            c1t2d0 ONLINE        0     0     0
            c1t3d0 ONLINE        0     0     0
        logs
          c1t4d0   ONLINE        0     0     0

errors: No known data errors
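
Solaris-style device names such as those in the output above follow a c<controller>t<target>d<disk> pattern and end in a digit. As a hedged illustration, a small check like the following (the function name `is_ctd_name` is made up for this sketch) could catch a mistyped name before it is passed to `zpool add`. It also accepts hexadecimal targets, because some controllers report WWN-based names; that is an assumption about what you might encounter, not something from this build.

```shell
#!/bin/sh
# Succeed if the argument looks like a cXtYdZ device name, e.g. c1t2d0.
is_ctd_name() {
    printf '%s\n' "$1" | grep -Eq '^c[0-9]+t[0-9A-Fa-f]+d[0-9]+$'
}

# c1t2do ends in the letter "o" rather than a zero, so it is rejected.
for dev in c1t2d0 c1t2do; do
    if is_ctd_name "$dev"; then
        echo "$dev: ok"
    else
        echo "$dev: does not look like a device name"
    fi
done
```

`zpool add` also has a `-n` option that displays the configuration that would be used without actually adding the devices, which is a useful second check.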

This whole process took about 30 minutes (including downloading and copying the SmartOS image).

Moving data from the old server

Now that I had the new server installed and ready to go, I needed to copy the data across from the old server. There are a number of ways to do this:

  1. Physically move the disks across
  2. Use zfs send/zfs receive to copy the data across the network
  3. Use rsync to send files.

As the old server was still running, I didn’t want to move the disks, but experiment showed it was going to take days to copy the data across my network. So I compromised.

I split the mirror on the old server and moved one disk to the new server. I then imported it:

zpool import rdata

It showed up as a degraded mirror. To get the data across, I did the following:

# zfs create -o mountpoint=/import zones/import
# zfs snapshot -r rdata@export
# for dset in media public software home; do zfs send -R rdata/${dset}@export | zfs recv data/${dset}; done
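
A note on the loop: in a POSIX shell a quoted string is a single word, so the dataset names have to be listed unquoted for the loop to visit each one in turn. The sketch below prints the send/receive pipeline for each dataset rather than running it. `plan_transfer` is a hypothetical helper, and the pool and snapshot names are taken from the commands above.

```shell
#!/bin/sh
# Print one "zfs send | zfs recv" pipeline per dataset (dry run only).
plan_transfer() {
    src_pool=$1
    dst_pool=$2
    snap=$3
    shift 3
    # "$@" now holds the dataset names, one word each.
    for dset in "$@"; do
        echo "zfs send -R ${src_pool}/${dset}@${snap} | zfs recv ${dst_pool}/${dset}"
    done
}

plan_transfer rdata data export media public software home
```

This prints four pipelines, one per dataset, which can be reviewed before running the real transfer.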

After checking the data was there I cleaned up:

zpool export rdata
poweroff

I’ll keep this as an archive disk and re-use the one in the old server.

Next Steps

The initial use for this server is as a file/print server and as a media server. In the next post, I’ll cover how I did this.

New ZFS based NAS and VM Host – part 1

For some time I’ve been using a re-purposed Acer desktop PC as a NAS. It runs OpenIndiana with 2 x 3TB disks in a ZFS mirror. It has Napp-IT installed for administration.

It’s been OK, but it’s not very powerful and only has limited throughput on its single 1000BaseT Ethernet connection. I’ve been considering an upgrade for ages, but finally bit the bullet this week. I’ll cover the requirements and spec in this post and then the build in a followup.

Requirements

The current server was just a file server. I did try to run another zone on it but it couldn’t hack it. I wanted to get back to a position where I could run Virtual Machines as well. The main driver was to be able to use the NAS as a Crashplan host. There is a version of Crashplan for Solaris, but at the last major upgrade they dropped support for being the destination of a Crashplan. I still had the cloud backup but it was nice to have a local replica as well. So, being able to have the Linux version running in a VM would be good.

  • NFS access from a bunch of RaspberryPi devices in the house
  • CIFS access from PCs and the SONOS devices
  • NETATALK access from my Macbook Air and the Apple TV
  • Support for running multiple VMs, including:
      • Windows Home Server V2 to back up the Windows PCs
      • Ubuntu to host Crashplan

Whilst noodling about these requirements I was also thinking about replacing the Thinkpad that runs my radio software in the shack. A lightbulb moment made me reconsider how the IT is structured here.

Currently, I have a CAT6 network throughout the house and down to the shack. The Acer sits in the garage attached to the house and there is a FreeNAS server in the shack whose main purpose is to be a backup for the Acer. It also has a jail running some scripts to keep backups of the RaspberryPi devices. I then have a Thinkpad T60 as my shack computer.

The idea is to move the FreeNAS device to the garage and put the new server in the shack. If I ensured I could do IO virtualisation (so a VM could make best use of a video card and get isochronous access to USB devices) and ensured it had a good video card, then I could also use the new server as my shack computer.

Solution

Hardware

To get the virtualisation features means an Intel Xeon class processor or the AMD equivalent. I’ve been a fan of Supermicro for some time and saw that the X10SDV-TLN4F looked to be perfect. I did some research and came across this post by Benjamin Bryan on a Supermicro Datacentre in a box using a close relative of this board. Perfect.

In the end I opted to buy the SYS-5028D-TN4T barebones server which includes this motherboard in the stunning CSE-721TQ-250B case. This has four front-access 3.5 inch drive slots and two internal positions for 2.5 inch drives. I also bought 32GB ECC memory from Crucial and four HGST 2TB drives from Hitachi.

This is an expensive build but I think it will be worth it. I haven’t bought the video card yet.

(Incidentally, years ago (back in the 90’s), the rule of thumb was that the computer you really lusted after always cost £2000. For a while that hasn’t been true for desktops, but I reckon it’s still good for servers.)

Software

In Ben’s build he opted for the Napp-IT all-in-one approach of illumos/ESXi and then VMs. However, in the comments there were references to SmartOS. Having this would avoid the need for a PCI Host Bus Adapter and would be simpler.

SmartOS supports zones like Solaris and OpenIndiana but adds support for KVM virtual machines. SmartOS is run from a flash drive and builds ZFS zpools from the disks. Because you are starting from a flash drive, SmartOS runs from a RAMDrive and isn’t persistent. This means the global zone should be kept simple: i.e. you don’t install software in it. Instead, create another zone and install software in there.

The beauty of SmartOS zones is that they use the same kernel as the global zone: i.e. you only need space for new software and any data. What happens is that the new zone is created from a ZFS snapshot of the global zone. Elegant!

More on the build itself in part 2.

Update on contact lenses

In short, I gave up.

In the first post I mentioned that I felt OK with the single vision lenses but that I hadn’t tried driving with them in. As soon as I did, I started to notice problems.

The main one was that I couldn’t focus on vehicles following behind using the internal mirror: it just stayed blurred. The other, more troubling, artifact was dazzling at night and coronas around car lights. The first time out at night I had to pull over and take them out. It was downright dangerous.

After a further consultation it became apparent that I am actually left eye dominant whereas the optician (and I) previously thought I was right eye dominant. That explained the problem with the mirror: the left eye had a distance vision lens. After some discussion, we moved to varifocal toric lenses. However, these were monthlies and took 10 days to obtain.

I really tried to get on with these, but the night vision problems remained and my vision just wasn’t as good as with glasses. Eventually I admitted defeat until something new comes along.

Updating my portable station with a KX3

After receiving quite a good annual bonus, I looked at my wish list and decided to buy a new portable HF radio. Based on my massive satisfaction with the Elecraft K2, the decision was quite easy really. Yes, I’ve joined the KX3 brigade.

I ordered it in kit form from Waters & Stanton on Good Friday and it appeared at my front door the following Tuesday. I reckon it took about 3 and a half hours to assemble. Why would anybody buy it assembled?

Included in the purchase were the KXFL3 roofing filter, KXAT3 internal ATU and KXBC3 battery charger. I also bought the MH3 microphone.

I debated about the ATU and charger as I will mostly be using this portable. I nearly always use tuned antennas when portable and I will be using a LiFePO4 battery pack for external power. That said, there are bound to be occasions when I would like to tune up a piece of string; and I’m nervous about taking the (homemade) LiFePO4 4S1P pack through airport security. So, I may end up using some Panasonic Eneloop Pro 2450 mAh NiMH batteries for overseas activations.

First impressions are excellent. The receiver sensitivity is superb: even better than the K2, which is pretty good.

Day One with contact lenses

I’ve been wearing glasses for about 15 years now. I have classic Presbyopia: also known as short arms 🙂 My distance vision is still nearly perfect in one eye and not much worse in the other. My close up vision has been deteriorating for about 16-17 years though.

I moved over to Varifocals years ago because I was fed up with having to take my glasses off to look at somebody across a table and then having to put them back on to read a laptop screen. More recently I’ve had to add dedicated reading glasses and mid-vision computer glasses for when I’m at home and using two big monitors. I’ve been considering switching to contact lenses for some time, and after a friend of mine also made the change and reported back that all was well, I decided to stop hovering and to jump in.

My prescription was pretty current, so it was simply a matter of talking it through with the optometrist in Boots; trying out some test glasses; and saying yes. Pretty painless so far.

I’ve got single-use mono-vision lenses. The right hand one is optimised for distance and the left hand one is a compromise between reading and computer work. I had them fitted yesterday and spent a couple of hours with them in.

I had no major difficulties putting them in or taking them out. Once I had got used to handling them it was straightforward. It appears that I don’t mind touching my eyeball, so that’s good.

It was strange for the first few minutes but then my brain started to adjust for the differences, and by the end of the two hours I could see clearly and focus at all distances. I didn’t drive and I didn’t try the computer monitors though. That’s a job for today.

KSB2 problem found, and solved

I finally solved the problem with my KSB2 board. It turned out to be a poorly soldered joint on one of the transformers that terminate the filter. I found it by writing down everything I knew about the problem and using that to systematically exclude different elements in the system.

The final clue was that receive audio was poorer through the KSB2 filter than through the variable filter on the RF board. Poorer, but not absent. That pointed me back to the filter.

I was ready to rip the filter out and rebuild it from scratch, but decided to have one more look at all the joints using an X3 magnifier. I could just see that one of the leads from T1 primary had an annulus of solder and that I could see the copper end of the wire. Sussed: I obviously hadn’t fully stripped the insulation when I made the transformer. A continuity tester confirmed the problem: high, but not infinite, resistance.

It was a 5 second job to re-solder the joint: leaving the iron on long enough for the remaining insulation to bubble off. Just for completeness, I re-did all the other solder joints as well.

The result is that the board now works fine. I’m not completely happy with the performance, or with the filter alignment, but that’s for the next post.

More tales of woe debugging a KSB2

For a couple of months now, I’ve been trying to get to the bottom of the problem with my KSB2 board: the SSB adapter in my K2. Not continuously of course: I’ve had all sorts of other things to do. I’ve built the KIO2, KDSP2, and KAT2 modules as well. However, these were really just ways of resting my brain from the toil of trying to get the KSB2 working properly.

The problem manifests itself as an almost, but not quite, total absence of RF when I use SSB. The K2 works fine on CW.

Looking at the circuit, when the KSB2 is installed, it intercepts the Rx IF path with a controlled bypass around the built-in variable bandwidth IF filter. On Tx, the audio from the Mic (which is unused on CW) is processed and turned into a DSB (Double Side Band) signal at the intermediate frequency before passing backwards through the new filter (which filters out the unwanted sideband) and then passing back into the main circuit.

The user can choose whether the received signal passes through the variable bandwidth filter on the RF board, or through the fixed filter on the KSB2 board. On transmit, only the fixed filter on the KSB2 board is used.

Thus, the fixed filter is used on Rx and Tx. I reasoned, wrongly I now suspect, that because I could choose the fixed filter on SSB receive, and hear signals, I could exclude the filter as being the source of the problem.

That left the audio processing section of the KSB2 board, and the ALC (Automatic Level Control) circuit. The latter is different from that used on CW and serves to adjust the power of the RF signal to match the level chosen by the user. Because we are talking about a single side band signal, all of the output is in the frequency-translated audio signal. The output power is directly proportional to how loud you talk into the microphone. To get better “punch” to get through interference, fading and other signals, hams often raise the Mic gain to the point where they risk “over-driving” the Power Amplifier and sending out a distorted signal. This is “a bad thing” but is often done by hams.

As the Mic gain is increased, the PA tries to maintain itself in a linear configuration (to minimise distortion) by “feeding back” a proportion of the output signal to an earlier stage and using that signal to lower the gain of the PA. This is the ALC signal.

Most manufacturers have a front panel meter that indicates the magnitude of the ALC signal and leave it to the user to adjust the Mic gain to a point where the ALC is at their desired level. Elecraft has chosen to adopt a smarter approach. The user uses a menu to choose a desired level of over-driving and the rig then operates automatically to maintain that level.

The K2 uses different ALC circuits on CW and SSB, so I suspected this section of the KSB2 because if it wasn’t operating properly, it could suppress the drive to the PA too much and result in insufficient RF power: exactly the symptom I was observing.

I spent ages trying to understand the ALC circuit and determining if it was working properly. To get anywhere, I had to assure myself that all the preceding circuits were working properly: Mic processing, DSB generation in the mixer and sideband removal in the filter. To do this, I needed a standard to work against.

From the Elecraft list I determined that 100mV of AF at 1kHz should produce 100% of rated power at whatever level I chose, i.e. if I wanted 5W I should get 5W. I set up a signal generator to inject a 100Hz sinewave at 100mV pk to pk into the mic socket and then used my oscilloscope to trace the signal through. It soon became apparent that the problem lay after the first mixer. Although I was seeing 1V pp at the output of the first mixer, I was seeing virtually nothing in the filter.
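
When comparing scope readings with a quoted drive level, it helps to be clear whether a figure is peak-to-peak or RMS. For a sine wave, the RMS value is the peak-to-peak value divided by 2√2. A small sketch using awk follows; the function name `vpp_to_vrms` is made up for illustration, and the sketch does not settle which of the two the 100mV figure from the list refers to.

```shell
#!/bin/sh
# Convert a sine wave's peak-to-peak voltage to RMS: Vrms = Vpp / (2 * sqrt(2)).
# Input and output are in the same unit (mV here), rounded to one decimal place.
vpp_to_vrms() {
    awk -v vpp="$1" 'BEGIN { printf "%.1f\n", vpp / (2 * sqrt(2)) }'
}

vpp_to_vrms 100    # 100mV pk-pk is roughly 35.4mV RMS
```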

I don’t have the test gear to debug the filter, so I have reached out to the email list for assistance.

More modules built

IMG_0001.JPG

After a break, I’ve been back building more K2 modules. I’ve built the KIO2, KNB2, KDSP2 and KAT2 modules over Christmas. That’s the lot.

All but the KAT2 module worked straight away, but I’ve got a problem with the latter. I’ve still got to resolve the lack of Mic gain on SSB and absence of RF on 20m before I try and debug the ATU.

The picture shows the KDSP2, KNB2 and KAT2 modules. The KIO2 is already attached to the top cover of the radio.