SQLite to CSV export

Yes, I could have written yet another little Python script to do this, but it turns out there’s a really easy way to export a SQLite database table to a CSV file from the command-line interface.

% sqlite3 databasefile.sqlite
sqlite> .headers on
sqlite> .mode csv
sqlite> .output outputfile.csv
sqlite> select * from table_name;
sqlite> .quit

And that’s it. Couldn’t be easier really.
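If you only need a one-off export, the same thing can be done non-interactively from the shell using the CLI's -header and -csv flags (a sketch – substitute your own database, table, and output file names):

% sqlite3 -header -csv databasefile.sqlite "select * from table_name;" > outputfile.csv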

Hat tip to http://www.sqlitetutorial.net/sqlite-export-csv/


Unbuffered output to stdout from Python programs

I recently had cause to debug a Python program that extracts data from a remote (MySQL) database and stores it in a local SQLite database. I did the usual thing of inserting print statements to check a handful of variables at different stages of the process, but it wasn’t helping because the program would stay obstinately quiet until the end and then print all the information at once.

Wait, but why? Then I remembered that the info being written to stdout was buffered, so that I was only seeing the messages once the buffer had been filled. That’s all well and good, and perfectly efficient and all, but it was getting in the way of my debugging.

The solution I found was to configure stdout in the Python code to be unbuffered. Thanks to this helpful blog entry at Turnkey Linux, I added the following line to the start of my program (plus the required imports):

sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)

Bingo! No more waiting for my print statements to show up. Cue happy dance. Now I can get on with solving the actual problem I’m having with the code…

(I tried the other solution mentioned, stdbuf, but either I was using it wrong, or it wasn’t doing what I expected it to do.)
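One caveat worth adding: that fdopen trick is a Python 2 one – on Python 3, asking for an unbuffered text stream raises a ValueError. As far as I know, the simplest alternatives there are the -u flag or the PYTHONUNBUFFERED environment variable (myscript.py is just a stand-in for your own program):

% python3 -u myscript.py
% PYTHONUNBUFFERED=1 python3 myscript.py

or, inside the code itself, flushing each print explicitly:

print("got this far", flush=True)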

Inserting data from other databases in SQLite

I was given a table of additional data that needed to be added to an existing table in a SQLite database. Since I was processing CSV files exported from the database, I thought I would just concatenate the files – until I realized that the ID columns would contain duplicates. Now, this wasn’t crucial for my case but it could have caused confusion at some point in the future. Therefore I decided to do the appending in the database, and then export the combined data into a new CSV file.

The first step was to be able to read both database files. I began by just opening the original database:

sqlite3 my_existing_db.sqlite

and then “attaching” the new database with the data I wanted to append (the quotes are necessary):

sqlite> attach 'my_other_db.sqlite' as db2;

(Of course, both tables must have exactly the same schema!)

My reading around had suggested that it would be easy to import the new data with a simple insert into ... select * from ... command. However, this gave me an error because the values in the ID column (which is the Primary Key) were not unique:

Error: UNIQUE constraint failed: table1.ID

I scratched my head, and then decided to just select the columns with data. But that didn’t work either because the list of columns being inserted didn’t match the list of columns in the original table (well, d’uh!):

Error: table table1 has 10 columns but 9 values were supplied

I scratched my head some more, and then read the documentation about autoincrementing primary keys in SQLite. It turns out that when you insert into every column of a table, the ROWID (primary key) column must be included in the value list too. A lightbulb switched on as I read that SQLite assigns a new ROWID automatically if a NULL value is given for it. This prompted me to try:

sqlite> insert into table1 select NULL, col1, col2, ... from db2.table1;

Success! I now had my combined table, which I exported to CSV and continued with my analysis.

In short, SQLite advises against declaring an autoincrementing primary key unless you really need one, and instead lets the user give an explicit NULL value to allocate the next available ROWID.
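An alternative that should also work (a sketch, with col1 and col2 standing in for the real column names) is to name the target columns explicitly and leave the primary key out of both lists, so that SQLite allocates the ROWIDs itself:

sqlite> insert into table1 (col1, col2) select col1, col2 from db2.table1;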

Note that all of the above was done via the SQLite command-line interface, and that seems to be the way to go: my attempts to attach a database in DB Browser for SQLite, for example, failed with a message about not being able to attach a database within a transaction, and all the examples I found on Stack Overflow used the command line. Other graphical browsers may differ.

Upgrading from Ubuntu 12.04 LTS to 16.04 LTS

From a precise pangolin to a xenial xerus

I’ve been putting off upgrading my Linux box for months, but now that official support for Ubuntu 12.04 LTS (Precise Pangolin) has just ended, I figure it’s now or never. In the end I found that I should have done it months ago as it went very smoothly.

My first question was, should I just upgrade to 16.04 LTS, or something newer, or maybe even Linux Mint which has been capturing hearts and minds over the past couple of years? In the end I decided to stick with Ubuntu, and given that the Linux box is really just a backup file store, I thought it would be best to go with a distribution that’s not going to need updating again in the next 6 months. I suspect that this will be the last upgrade for this machine: the motherboard, RAM, CPU, and PSU (and fans) all date from 2008. (Which reminds me, the CPU fan is getting cranky too…)

So off I went and downloaded the relevant .iso from Ubuntu. Hmm – server or desktop version? I had vague notions of using VNC to run a desktop session so I went for desktop, knowing that it would be straightforward to add whatever server packages I’d need later. Download, burn to DVD (yes, it seems I needed a DVD – which had to be done on the Windows machine as the burn attempt failed on Linux), insert and reboot. Oh yes – did I mention that I made sure I had backups of any important data from that hard drive? No? Well, remember to back up your stuff before doing OS upgrades!

I don’t have screenshots so bear with my wordy descriptions.

Installation of 16.04

The machine booted up and presented me with a black screen containing two low-resolution icons at the bottom, one of which looked a bit like a keyboard (I think). I waited. Nothing happened, so I hit the space bar. Ta-da! We have liftoff. I have no idea what it was waiting for. Please use words if I need to do something!! The disc spun up to obnoxious levels and eventually I was presented with my next screen with the choice of trying out Ubuntu, or installing Ubuntu. I clicked “Install”, and on the next screen I checked the box to download updates during the installation, and to add some third-party software that (for licensing reasons) can’t be included by default.

And away it went. It recognized that I had an existing Ubuntu installation on my OS drive and gave me 5 options to choose from. I could let it upgrade 12.04 to 16.04, but I wasn’t sure about doing that: I’ve heard that it’s more trouble than it’s worth. Alternatively I could install 16.04 alongside 12.04 – not much point in doing that. Then there were two options that I felt were somewhat ambiguous. The first said: “Erase 12.04 and reinstall”. Reinstall what? Reinstall 12.04? Why would I do that? The other said: “Erase disk and install Ubuntu”. Presumably that means the version of Ubuntu on this disc (i.e. 16.04) but again, not entirely clear. Fortunately I had an escape route: the fifth option was off on its own and said: “Something else”. 🙂

So I clicked “Something else” and got on with the next step, which was to examine the hard disk partitions. It took me a minute (some of which was spent contemplating an error message) to work out exactly what was required here: I had to select the desired file system and mount point for the relevant partitions. Since I had a separate partition for /home, I made sure to select / for the main one, and /home for the other. In doing so I completely forgot about the other two hard drives in the machine (not an issue, but it would have saved me another post-installation step – see below).

With hard drives sorted, the next steps were setting the timezone (easy: it correctly identified me as being in Vancouver) and keyboard (standard US). Next was giving the computer a name, and I stuck with the previous one (my current home computer naming scheme makes use of lakes in the English Lake District). Then I had to create a username and set a password. Then it was time to restart and I was reminded to remove the DVD from the drive and asked to press enter to continue.

Post installation

A few moments later I logged in with my shiny new user account and began the task of getting things set up to taste. First up was a second user account, which left me a bit puzzled to begin with as I had to click on a button that didn’t look clickable in order to bring up the option of actually enabling the account and setting a password.

The first job (actually, this should have been my first job, not the second user account) when installing a new operating system is to apply all available updates. The package manager said I only had 12 updates, so I left it doing its thing and called it a day.

The following morning I picked up with the next round of tasks: installing tcsh (and using that as my preferred shell) as well as Geeqie and Dropbox. Then I wasted a couple of hours trying to get a headless VNC setup to work before I took a moment to think about how often I would actually use that. The answer came back pretty quickly: almost never, so changed my mind and decided to just leave it as a headless server, and start a VNC session if and when I actually needed one.

Headless operation

I’ve been running the Windows and Linux machines with a set of KVM cables for the past few years. Recently, one of the cables has been getting flaky, causing the display to blank from time to time, so I decided that I was going to cut those cords and have the Linux box run without a keyboard, mouse, or monitor. After all, even with the KVM switch, I still logged in over SSH 99% of the time. (Speaking of which, I was stumped by the fact that my first attempt to SSH in to the Linux machine failed. That is, until I remembered to install the SSH server package! D’oh!)

Setting up for headless operation turned out to be trivially easy:
% sudo systemctl disable lightdm.service
In other words, disable the graphical login greeter. (In the old days I would have just run % sudo init 3.) Yay! And into the corner you go, little computer!
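For completeness, I gather the systemd-era equivalent of the old runlevel-3 trick is to set the default boot target, which should take the machine straight to a text console on every boot:

% sudo systemctl set-default multi-user.target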

Other hard drives

Next on my list was mounting my other hard drives in sensible places. By default, Ubuntu had mounted them under /media/andy, and they were only mounted on logging in to a desktop session. Running headless, they were never mounted unless I did so explicitly. On the plus side, that made it very easy to get them set up properly. A quick search led me to this post at Ask Ubuntu with a nice clear layout of the steps required.

I created a couple of mount points for my hard drives under / (literally just % sudo mkdir /mountpoint1 /mountpoint2). After that I ascertained the UUIDs for each hard drive (using blkid), and edited /etc/fstab to add the relevant details in the right place. Then I ran
% sudo mount -a
to (re)mount the disks, and I was good to go.
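For reference, the fstab entries for a setup like this would look something like the following – the UUIDs, mount points, and filesystem type here are placeholders, so use the values that blkid reports for your own drives:

UUID=1234abcd-0000-4000-8000-000000000001  /mountpoint1  ext4  defaults  0  2
UUID=1234abcd-0000-4000-8000-000000000002  /mountpoint2  ext4  defaults  0  2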

Samba

Since this machine’s primary purpose is to be a file store/network drive, I needed to get Samba up and running. Installing Samba was easy, and setting up was just as easy thanks to this post at Linuxbabe. Installing Samba automatically enables it as a service and sets it running, which is very convenient. I tested it out by re-creating my network drive in Windows, and then mounted it from my Macbook Pro too. All seemed to work just fine.
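For the record, a minimal share definition of the sort described in that post looks roughly like this in /etc/samba/smb.conf (the share name, path, and username below are placeholders):

[backup]
   path = /mountpoint1/backup
   read only = no
   valid users = andy

followed by giving the user a Samba password and restarting the service:

% sudo smbpasswd -a andy
% sudo systemctl restart smbd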

Final thoughts

And that was pretty much it. Very easy, very smooth. Had I thought things through a bit more closely, I could have saved myself a bit of time and a couple of steps, but in the end it was no hassle thanks to the articles I mentioned above.

I’m sure I’ll find more things missing, and I haven’t done anything about setting up Python, Perl, Java, or any compilers. But much of my development work is done on my Macbook Pro these days, so I’ll just deal with it when the time comes.

Recovering MySQL after improper shutdown

As an amateur DBA I recently made a classic amateur DBA mistake: I restarted my computer without first shutting down MySQL. (The restart in this case was the result of updating macOS to 10.12.4.) After the reboot I couldn’t work out why XAMPP wasn’t launching MySQL again – Apache started up fine – so it was time to dig into the log files.

A quick check of var/mysql/hostnameredacted.local.err in the XAMPP installation directory revealed a string of errors and a stacktrace. Reading it carefully I realized the key error message was this:

InnoDB: Operating system error number 2 in a file operation

The other clue was a few lines earlier in the logfile:

InnoDB: Database was not shutdown normally!

As ever, Google came to the rescue and led me to a handy blog post, which suggested adding the following line to the [mysqld] section of etc/my.cnf:

innodb_force_recovery = 1
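For context, that line goes underneath the existing [mysqld] header in the file, something like:

[mysqld]
# ... existing settings ...
innodb_force_recovery = 1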

I saved the file, and held my breath as I attempted to restart MySQL. Success!

I went back and removed that line again (actually just commented it out – there’s an outside chance I’ll repeat this error at some point in the future!) and re-restarted MySQL. I fired up SequelPro and was relieved to be able to see all my tables as expected, and ran a few test queries to check things out. Everything looked OK. Phew!

Lesson learned: remember to shutdown MySQL properly before a reboot.

Resuming paused processes on macOS

My Macbook Pro has been annoying the hell out of me lately, running out of application memory on a regular basis. (You’d think that’d be hard to do with 16 GB of RAM, but apparently my Mac is quite good at it…) Sure, I can restart it to clear the memory and have it load things up again as needed, but there always comes a point when that little window pops up and complains about a lack of application memory.

A couple of times I’ve dismissed that window by mistake and wondered how to restart the suspended application(s). A quick search yielded an option to the kill command that I wasn’t aware of. So the next time I do that by mistake and leave an app or two suspended, I can start up the Activity Monitor, check the process ID of the suspended application(s), and then type:

% kill -CONT <PID>

Hey Presto, app gets unstuck. Yay!
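If Activity Monitor itself is being uncooperative, the PIDs of suspended processes can also be found from the terminal – stopped processes show a state beginning with T in the output of ps (a rough sketch, assuming the stock BSD ps on macOS):

% ps ax -o pid,stat,comm | awk '$2 ~ /^T/'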

That is, as long as macOS doesn’t decide to suspend the terminal…

On backups

In my previous post I mentioned that I’d been using CrashPlan for making backups on our Windows PC. Like many, I prefer to get a feel for the software before I pay for a copy, so I installed the free version. It all seemed to set up OK and I didn’t immediately come across any missing features, so I left it alone to do its thing.

And in the absence of any error messages I believed that it was working just fine. Alas, that was not the case.

In checking out my backup drive when dealing with our Trojan infection, I found out that CrashPlan had filled its allotted hard drive and in fact hadn’t backed up anything in weeks. And in the process had utterly failed to alert me to that fact. Hey CrashPlan – how about using the Windows notification area for alerts? I mean, a backup failing is a major error and the user should be told about it as soon as possible.

One of the reasons I installed CrashPlan in the first place was that I wanted something more intelligent than Windows’ own backup. However, the free version of CrashPlan has a set of default settings for the backup frequency and number/age of versions that runs the risk of filling your backup hard drive, after which nothing more will be backed up because nothing is old enough to delete. CrashPlan’s official advice on solving this problem? Get a bigger hard drive. Okaaay….

OK, I could have solved this problem by paying for the basic version, which includes unlimited online backups too. After all, $60 a year isn’t that much to pay for peace of mind, is it? Normally I’d argue that it’s cheap, but in this case all I wanted was a local fire-and-forget backup of our primary documents directories that didn’t rely on me remembering to run rsync. And $60 was too much for just that.

Enter Windows FileHistory. I hadn’t heard of it until I searched again for Windows backup solutions. After a bit of fiddling about (tip: don’t include any folders that have a frequently-updated cache, like that for a web browser!) I seem to have it working just fine. It’s been going for a few days now, and is updating only what it needs to update, which is pretty much all I want in a backup program!

However, I’m not entirely convinced that it’s a long-term solution as it seems a bit fragile. When configuring FileHistory, it seemed to get itself into a state where I couldn’t add new folders to the backup, and removing one folder would remove them all. A round of disabling FileHistory, rebooting, and re-enabling it fixed things, but I shouldn’t have to do that every time. (I shouldn’t have to do it at all, of course…)

So we’ll see how it goes. I didn’t like the fact that CrashPlan forced me to create yet another account even to use the free version – my Windows 10 box uses local logins, so no additional accounts needed to use FileHistory. Plus I’ll keep doing my rsync backups to the Linux box, so at least I’ll always have a second copy. Mind you, even that backup drive is filling up…

Postscript: I should admit some fallibility here. My initial backup plan included a directory tree that mistakenly contained our web browser caches, which of course tend to a) be large and b) change frequently. (Not to mention that they are pointless to back up!) That meant every time the backup ran, it saw that a few GB of files had changed and duly did exactly what I’d asked it to do. Alas, this resulted in my paltry 160 GB drive filling up way sooner than expected. So one of the things to check when you set up a backup is that it really is only backing up the files you wish to keep! I know this now… 🙂