By Onno VK6FLAB
The other day I had an interesting exchange with a contest manager and it's not the first time I've had this dance. As you might know, pretty much every weekend marks at least one on-air amateur radio contest. Following the rules set out by a contest, the aim is to make contact, or a QSO, with other stations, taking note of each in a process called logging.
Using logging software is one way to keep track of who you talked to; a piece of paper is another. If your station is expecting to make fewer than a dozen contacts per hour, paper is a perfectly valid way of keeping track, but it's likely that most contests expect you to transcribe your scribbles into electronic form. Which electronic form is normally explicitly stated in the rules for that contest.
While I mention rules, you should check the rules for each contest you participate in. Rules change regularly, sometimes significantly, often subtly with little edge cases captured in updated requirements.
On the software side, using electronic logging, even when transcribing your paper log, can lead to unexpected results. I participated in a local contest and logged with a tool I've used before, xlog.
Contests often specify that you must submit logs using something like Cabrillo or ADIF. There are contests that provide a web page where you're expected to paste or manually enter your contacts in some specific format.
Using xlog I exported into each of the available formats: Cabrillo, ADIF, Tab Separated Values or TSV, and a format I'd never heard of, EDI. According to a VHF Handbook I read, EDI, or Electronic Data Interchange, was recommended by IARU Region 1 during a meeting of the VHF/UHF/Microwave committee in Vienna in 1998 and later endorsed by the Executive Committee.
The contest I participated in asked for logs in Excel, Word, ASCII text or the output of electronic logging programs. Based on that I opened up the Cabrillo file and noticed that the export was gibberish. It had entries that bore no relation to the actual contest log entries, so I set about fixing them, one line at a time, to ensure that what I was submitting was actually a true reflection of my log.
So, issue number one is that xlog does not appear to export Cabrillo or ADIF properly. The TSV and EDI files appear, at least at first glance, to have the correct information, and the xlog internal file also contains the correct information. Much food for head-scratching. I'm running the latest version, so I'll dig in further when I have a moment.
In any case, I received a lovely email from the contest manager who apologised for not being able to open up my submitted log because they didn't have access to anything that could open up a Cabrillo file. We exchanged a few emails and I eventually sent a Comma Separated Values, or CSV file, and my log was accepted.
What I discovered was that their computer was "helping" in typical unhelpful "Clippy" style, by refusing to open up a Cabrillo file, claiming that it didn't have software installed that could read it.
Which brings me to issue number two.
All these files, Cabrillo, ADIF, TSV, CSV, EDI, even xlog's internal file, are text files. You can open them up in any text editor, on any platform, even Windows, which for reasons only the developers at Microsoft understand, refuses to open a text file if it has the wrong file extension. This "helpful" aspect of the platform extends into their email service, "Outlook.com", previously called "Hotmail", which refuses to download "unknown" files, like the Cabrillo file with a ".cbr" extension.
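To illustrate that the extension is only a label, here is a minimal Python sketch that reads the first few lines of a file as text, whatever it happens to be called. The function name `peek` and the file name `mylog.cbr` are made up for this example, and the sample contents are just a bare Cabrillo-style header, not a real log.

```python
import tempfile
from pathlib import Path


def peek(path, count=5):
    """Return the first `count` lines of a file, ignoring its extension."""
    # errors="replace" keeps going even if a stray byte is not valid UTF-8
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return text.splitlines()[:count]


if __name__ == "__main__":
    # Hypothetical sample file, written here so the sketch is self-contained
    with tempfile.TemporaryDirectory() as folder:
        sample = Path(folder) / "mylog.cbr"
        sample.write_text(
            "START-OF-LOG: 3.0\nCALLSIGN: VK6FLAB\nEND-OF-LOG:\n",
            encoding="utf-8",
        )
        for line in peek(sample):
            print(line)
```

The same call works unchanged on a file ending in ".adi", ".csv" or ".edi", because nothing in the code looks at the name.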
With the demise of Windows Notepad, another annoying aspect has been removed, that of line-endings. macOS, Windows and Linux have different ideas on how to indicate that a line of text has come to an end. Windows, and DOS before it, uses Carriage Return followed by Linefeed. Unix, including Linux and FreeBSD, uses Linefeed only; OS X also uses Linefeed, but classic Macintosh used Carriage Return. In other words, if you open up a text file and it all runs into one big chunk of text, it's likely that line-endings are the cause.
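If you want to check which line-endings a file uses, or convert between them, a short Python sketch along these lines would do it. The function names `detect_line_endings` and `normalise` are my own invention for this example; it works on raw bytes so that Python does not quietly translate the line-endings while reading.

```python
def detect_line_endings(data: bytes) -> dict:
    """Count each style of line-ending in a block of raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one CR and one LF, so subtract those
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"CRLF": crlf, "LF": lf, "CR": cr}


def normalise(data: bytes, newline: bytes = b"\n") -> bytes:
    """Rewrite every line-ending as `newline` (Linefeed by default)."""
    # Convert CRLF first so its two bytes are not treated as two line-endings
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)


if __name__ == "__main__":
    mixed = b"line one\r\nline two\nline three\rline four"
    print(detect_line_endings(mixed))
    print(normalise(mixed))
```

Passing `newline=b"\r\n"` to `normalise` produces the Windows style instead.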
It also means that you, and contest managers, can rename files with data in Cabrillo, ADIF, CSV, TSV, EDI and plenty of other formats like HTML, CSS, JS, JSON, XML and KML to something ending with "TXT" and open them in your nearest text editor. If this makes you giddy, a KMZ file is actually a ZIP file with a KML file inside, which is also true for several other file formats, DOCX to name one.
Of course, that doesn't fix the issue of broken exports like the ones xlog appears to be producing, but at least it gets everyone on the same page.
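You can see the ZIP file hiding inside a KMZ for yourself with Python's standard `zipfile` module. This is only a sketch: `build_sample_kmz` and `list_zip_members` are names I made up, and the sample archive is built in memory with a placeholder `doc.kml` rather than taken from a real map file.

```python
import io
import zipfile


def build_sample_kmz() -> bytes:
    """Build a tiny stand-in for a KMZ file: a ZIP holding one KML file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Placeholder contents, not a complete KML document
        archive.writestr("doc.kml", "<kml></kml>")
    return buffer.getvalue()


def list_zip_members(data: bytes) -> list:
    """List the file names stored inside ZIP-formatted bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


if __name__ == "__main__":
    kmz = build_sample_kmz()
    # is_zipfile looks at the contents, not at the file name
    print(zipfile.is_zipfile(io.BytesIO(kmz)))
    print(list_zip_members(kmz))
```

Reading a DOCX file's bytes into `list_zip_members` should work the same way, since it is also a ZIP archive.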
Word of caution. In most of these files individual characters matter. Removing an innocuous space or quote might completely corrupt the file for software that is written for that file format. So, tread carefully when you're editing.
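As a small illustration of how much a single character can matter, this Python sketch parses the same CSV row twice, once intact and once with the quotes removed. The row itself is invented for the example and `parse_row` is a hypothetical helper built on the standard `csv` module.

```python
import csv
import io


def parse_row(line: str) -> list:
    """Split one line of CSV text into its fields."""
    return next(csv.reader(io.StringIO(line)))


if __name__ == "__main__":
    # Invented row: callsign, a comment containing a comma, signal report
    intact = 'VK6FLAB,"QSB, weak signal",59'
    damaged = intact.replace('"', "")

    print(parse_row(intact))   # the quoted comment stays together as one field
    print(parse_row(damaged))  # without quotes the comment splits at its comma
```

With the quotes gone, the comma inside the comment is read as a separator, so the row gains an extra field and every column after it shifts along by one.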
What other data wrangling issues have you come across?
I'm Onno VK6FLAB