zerosleeps

Since 2010

New Rails projects still wet the bed

Here we go again. Past all the Ruby/Rubygems/Node.js/Yarn nonsense and on to creating a shiny new Ruby on Rails project:

% ruby --version
ruby 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x86_64-darwin20]
% rails --version
Rails 6.1.4
% rails new foo

At this point you’ll have about 150 gems installed and 687 (!) folders in node_modules 🙄 Anyway, switch to the brand new, everything-as-per-defaults project and run rails test:

% cd foo
% ./bin/rails test

And what do you know?

/Users/scott/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/bootsnap-1.7.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:34:
in `require':
cannot load such file -- rexml/document (LoadError)

There’s an issue about this but it’s been closed with a “not our problem” type of response:

This is an issue with selenium-webdriver … Could you please report there?

But the thing is, Rails developers, the dependency on selenium-webdriver is your dependency. It’s in the Gemfile created by default. Out of the box, Rails does not work.
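For anyone hitting the same error: it is consistent with a change in Ruby 3.0, where rexml stopped being a default gem and became a bundled gem, so under Bundler it only loads if something in the Gemfile pulls it in. Below is a minimal sketch for checking whether a library can be loaded. `loadable?` is a made-up helper, not part of Rails or Bundler, and the Gemfile line in the comment is a commonly suggested workaround rather than an official fix.

```ruby
# Hypothetical helper: true if `require` succeeds for the given feature.
def loadable?(feature)
  require feature
  true
rescue LoadError
  false
end

if loadable?("rexml/document")
  puts "rexml/document loads fine"
else
  # Assumed workaround: declare the gem explicitly in the Gemfile, e.g.
  #   gem "rexml"
  # and then run `bundle install`.
  puts "rexml/document is missing from this bundle"
end
```

Run it with `bundle exec ruby` from the project directory so it sees the same set of gems as `rails test` does.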

I believe this is the first time I’ve attempted to spin up a new Rails application since my last rant back in January, and it’s sad that I wasn’t in the least bit surprised that the defaults don’t work.

I can only assume Rails isn’t interested in attracting new users, because goodness knows why anyone would put up with the amount of shit you need to know about and debug from the first minute of using this stuff.

Ubiquiti USG: the conclusion

Predictably and frustratingly, Ubiquiti’s support staff have just shrugged their shoulders at my USG’s bizarre failure, and pointed me at a document explaining how to return it to them, even though the device is over two years out of warranty (which I was careful to point out in my first contact with Ubiquiti) and therefore not eligible for their RMA process.

I knew going into this that there wasn’t a hope in hell I’d get to talk to someone who could at least hypothesise about what’s gone wrong with my little gateway. Could have been a fun project to replace the busted chip/capacitor/whatever.

CalDigit TS3 Plus, Logitech StreamCam, and macOS

It’s another of my “putting this here in case it’s useful to someone” posts.

I have Logitech’s StreamCam, and when I connect it to one particular port on my CalDigit TS3 Plus - which is in turn connected to my MacBook Pro (16-inch, 2019) running macOS Big Sur 11.1 - I can receive audio from the StreamCam, but no video.

I contacted CalDigit about this, and they got me to triple check the port combination:

MacBook Thunderbolt port   TS3 port                Result
Back left                  Back Thunderbolt port   OK
Back left                  Back USB Type-C port    No video
Back left                  Front USB Type-C port   OK
Front left                 Back Thunderbolt port   OK
Front left                 Back USB Type-C port    No video
Front left                 Front USB Type-C port   OK
Back right                 Back Thunderbolt port   OK
Back right                 Back USB Type-C port    No video
Back right                 Front USB Type-C port   OK
Front right                Back Thunderbolt port   OK
Front right                Back USB Type-C port    No video
Front right                Front USB Type-C port   OK
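The pattern in that table can be checked mechanically. This is only a sketch: the rows are transcribed from the table above, and `results_by` is a made-up helper name. Grouping by dock port gives one consistent result per port, while grouping by MacBook port gives a mix, which points at the dock port rather than the laptop port.

```ruby
# Each row: [MacBook port, TS3 port, result], transcribed from the table.
OBSERVATIONS = [
  ["Back left",   "Back Thunderbolt port", "OK"],
  ["Back left",   "Back USB Type-C port",  "No video"],
  ["Back left",   "Front USB Type-C port", "OK"],
  ["Front left",  "Back Thunderbolt port", "OK"],
  ["Front left",  "Back USB Type-C port",  "No video"],
  ["Front left",  "Front USB Type-C port", "OK"],
  ["Back right",  "Back Thunderbolt port", "OK"],
  ["Back right",  "Back USB Type-C port",  "No video"],
  ["Back right",  "Front USB Type-C port", "OK"],
  ["Front right", "Back Thunderbolt port", "OK"],
  ["Front right", "Back USB Type-C port",  "No video"],
  ["Front right", "Front USB Type-C port", "OK"],
].freeze

# Hypothetical helper: distinct results seen for each value of one column
# (0 = MacBook port, 1 = TS3 port).
def results_by(column)
  OBSERVATIONS.group_by { |row| row[column] }
              .transform_values { |rows| rows.map(&:last).uniq }
end

by_dock_port = results_by(1)
by_mac_port  = results_by(0)

by_dock_port.each { |port, results| puts "TS3 #{port}: #{results.join(', ')}" }
by_mac_port.each  { |port, results| puts "MacBook #{port}: #{results.join(', ')}" }
```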

I also confirmed that the back USB Type-C port works a treat with other devices I connect to it.

CalDigit’s response:

This appears to be a conflict in macOS between the Streamcam and the 10gbps Controller on the dock. The USB-C 10gbps port and the USB-A port beside it are part of a different controller than the other ports on the dock. I’m afraid we don’t have another workaround aside from connecting to one of the other USB-C ports on the dock. As the drives are working through the same port, its likely related to compatibility over macOS.

Thumbs up to CalDigit support by the way. Fast, friendly, and knowledgeable.

Dead (but not dead) Ubiquiti UniFi Security Gateway

This is the story of my USG, which has failed in the weirdest way. The summary, which is repeated at the end of this post, is that after years of flawless operation it suddenly stopped communicating with the rest of my LAN via one particular port, but it can still send data on that port, has other ports which function perfectly, and the problem is definitely not software-related.

This isn’t a how-to or anything, because I can’t remember every command I typed or setting I changed, but I’m putting the general gist out there because it might help some poor sod someday. If you’re hoping for a solution, stop reading, but if you know anything about network equipment, please send help. I sent out a couple of cries for assistance about this on Ubiquiti’s community forums and on reddit, but crickets.

Basic troubleshooting

So one afternoon we suddenly lost internet connectivity. First thing I did was jump onto my UniFi Controller to see where the problem lay, and it told me that the USG had failed to adopt. Since the device was previously working fine, that status was UniFi saying “it’s disappeared from your network”.

I can honestly say that in three years of running Ubiquiti equipment I’ve never had to reboot or reset anything. In fact, it was because I got so fed up with flaky crappy all-in-one consumer “routers” that I went all-in on Ubiquiti in the first place. Anyway, that’s where I started - pull power from the USG, count to 10, plug it back in.

All indications were good: all the status lights and Ethernet link lights illuminated, flashed, and changed colour as they should, but the Controller still couldn’t find the USG.

It’s worth pointing out that as well as routing traffic between WAN and LAN and performing firewall duties the USG also provides DHCP services, so LAN traffic was starting to fail as well: new clients couldn’t join the network and existing DHCP leases couldn’t be renewed. Fabulous.

I tried a few reboots of the entire network stack with no change, so I dug out a paper-clip and did a hardware reset of the USG. Again, all the lights did the correct thing, and the USG appeared ready to adopt, except… it still didn’t appear to be connected to the network. Sure it was physically connected - OSI layers 1 and 2 looked good - but nothing above that. After a reset of that nature the USG sets its LAN port to 192.168.1.1/24 and runs a DHCP server, so at this point it should have been dishing out DHCP leases and routing traffic, even without adoption.
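One quick sanity check after a reset like this is whether a connected client ended up with an address inside 192.168.1.1/24. Here is a rough sketch using Ruby's standard `ipaddr` library. The sample client addresses are invented for illustration, and `on_default_lan?` is a made-up helper. A client that gets no DHCP offer will typically fall back to a self-assigned 169.254.x.x address instead.

```ruby
require "ipaddr"

# Factory-default LAN address and prefix length, as described above.
# IPAddr keeps only the network part, i.e. 192.168.1.0/24.
DEFAULT_LAN = IPAddr.new("192.168.1.1/24")

# Hypothetical helper: is this address inside the default LAN subnet?
def on_default_lan?(address)
  DEFAULT_LAN.include?(IPAddr.new(address))
end

# Made-up example addresses a client might report.
["192.168.1.50", "192.168.2.50", "169.254.10.20"].each do |address|
  verdict = on_default_lan?(address) ? "inside the default LAN" : "not on the default LAN"
  puts "#{address}: #{verdict}"
end
```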

Corrupt storage?

I then ended up on a path to nowhere. I discovered that the USG boots from a USB drive tucked away inside it. There were lots of grumbles online about these internal drives failing. I was reasonably confident that wasn’t my problem, because my USG was booting. Well, the status lights certainly indicated it was booting, and if the boot drive had failed it would have been more obvious, but with no way of communicating with the device via Ethernet I didn’t know for sure. So I replaced the internal drive, farted about with dd and .img files and blah blah blah. Didn’t make the blindest bit of difference so I’m not going to dwell on this part of the story. Sure did learn a lot about USB drive initialisation times and Octeon boot commands though…

There’s nothing wrong with this thing

I bought a rollover cable from eBay. The USG has a console port, and with no packet data it was my only hope of making progress. The cable arrived, and immediately revealed that the USG was indeed properly booting, all services were running, and it correctly reported eth1 link status. It was also connected to the WAN (the USG is a regular DHCP client on that side of the network) and able to communicate with the outside world.

What the hell? At this point I’m starting to suspect a bad port, despite positive link status. But that can’t be a thing that happens, can it? Let’s see if we can confirm that:

As well as console, WAN, and LAN ports, the USG has a third port which is labelled “WAN 2 / LAN 2”, and it shows up in software as just another Ethernet interface. I mucked about for a while trying to work out how to configure this port to assume the duties of eth1. To Ubiquiti’s credit, once you understand a few of EdgeOS’s basic commands it’s actually pretty enjoyable to view the device’s configuration settings and change them.

To my surprise I managed to set up LAN 2 and disable LAN 1, and it worked. Connecting a device to LAN 2 immediately resulted in a DHCP offer, and all traffic flowed as it should. As it did before the start of this story. With the same attached devices and cables!

So there we are: bad port. Well, yes, I’m 90% sure about that, but there’s more…

(By this stage I’d already bought a new USG, and there was no way I was going to put the broken USG back into service. I can’t trust it, and I’m pretty sure the UniFi Controller will fight me at every step of the way if I try to use LAN 2 instead of LAN 1.)

Can speak but not hear

While I was poking about via console during all of the above, I discovered that I could capture traffic flowing through the USG’s ports using a command like show interfaces ethernet eth1 capture, and I could see some activity being logged. Surely if the port was dead nothing would work, or it would fail in some other obvious way: no link light, no traffic, or errors in a log perhaps, but as far as I could tell the USG was reporting all systems green. Remember, the OS was even correctly reporting link status, on top of this trickle of data I was now seeing.

I cracked out Wireshark, and it’s at this point that I gave up. This makes no sense:

I could see that upon connecting my MacBook to the USG’s LAN 1 port, DHCP discover requests were being sent from my Mac, but the USG’s capturing tool didn’t show them ever being received. But what the USG did capture were its own outgoing UniFi discovery requests and they were being received by my Mac. Here’s a screenshot from Wireshark (I’ve filtered out some irrelevant junk caused by my Mac sending out mDNS and ARP probes):

And here’s the corresponding USG capture:

ubnt@ubnt:~$ show interfaces ethernet eth1 capture
Capturing traffic on eth1 ...
10:27:21.461396 IP 192.168.1.1.51602 > 255.255.255.255.10001: UDP, length 145
10:27:31.682842 IP 192.168.1.1.51456 > 255.255.255.255.10001: UDP, length 145
10:27:41.915718 IP 192.168.1.1.59045 > 255.255.255.255.10001: UDP, length 145
10:27:52.156490 IP 192.168.1.1.41723 > 255.255.255.255.10001: UDP, length 145
10:28:02.381654 IP 192.168.1.1.47105 > 255.255.255.255.10001: UDP, length 145
10:28:12.599347 IP 192.168.1.1.41440 > 255.255.255.255.10001: UDP, length 145
10:28:22.819994 IP 192.168.1.1.54372 > 255.255.255.255.10001: UDP, length 145

The purple lines from the screenshot are my Mac’s DHCP requests, which never show up in the USG’s capture, but the 7 yellow lines match up perfectly with the 7 generated and captured by the USG.
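The same conclusion can be pulled out of the USG capture with a few lines of code. This is a sketch that assumes every line has exactly the shape shown above. The regular expression, the `Packet` struct and the `parse` helper are made up for illustration and would need adjusting for other kinds of capture output.

```ruby
CAPTURE = <<~LOG
  10:27:21.461396 IP 192.168.1.1.51602 > 255.255.255.255.10001: UDP, length 145
  10:27:31.682842 IP 192.168.1.1.51456 > 255.255.255.255.10001: UDP, length 145
  10:27:41.915718 IP 192.168.1.1.59045 > 255.255.255.255.10001: UDP, length 145
  10:27:52.156490 IP 192.168.1.1.41723 > 255.255.255.255.10001: UDP, length 145
  10:28:02.381654 IP 192.168.1.1.47105 > 255.255.255.255.10001: UDP, length 145
  10:28:12.599347 IP 192.168.1.1.41440 > 255.255.255.255.10001: UDP, length 145
  10:28:22.819994 IP 192.168.1.1.54372 > 255.255.255.255.10001: UDP, length 145
LOG

# Assumed line shape: "HH:MM:SS.ffffff IP src.port > dst.port: UDP, length N"
LINE = /\A(\d+):(\d+):(\d+\.\d+) IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+): UDP, length (\d+)\z/

Packet = Struct.new(:seconds, :src, :src_port, :dst, :dst_port, :length)

# Hypothetical helper: turn one capture line into a Packet, or nil if it
# does not match the assumed shape.
def parse(line)
  m = LINE.match(line.strip)
  return nil unless m

  seconds = m[1].to_i * 3600 + m[2].to_i * 60 + m[3].to_f
  Packet.new(seconds, m[4], m[5].to_i, m[6], m[7].to_i, m[8].to_i)
end

packets   = CAPTURE.lines.map { |line| parse(line) }.compact
outbound  = packets.select { |pkt| pkt.src == "192.168.1.1" } # sent by the USG itself
inbound   = packets - outbound                                # anything from another host
intervals = packets.each_cons(2).map { |a, b| (b.seconds - a.seconds).round(2) }

puts "packets captured: #{packets.size}"
puts "sent by the USG:  #{outbound.size}"
puts "from other hosts: #{inbound.size}"
puts "seconds between packets: #{intervals.join(', ')}"
```

Every captured packet originates from the USG and is addressed to the broadcast address, roughly ten seconds apart. Nothing from any other host appears.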

So, to summarise, I have an Ethernet device with factory-supplied software and settings, no software-related issues, warnings, or errors, which suddenly stopped receiving packet data on one port, but can still negotiate a link and send data on that port, and has other ports which function perfectly.

🤯

Please, if anyone reading this has any idea how this kind of failure is possible, get in touch.