I recently set up a project with hosting on Heroku. However, I had code spread across several repositories that all needed to be deployed to the same place. This is a problem because the process to deploy to Heroku is essentially pushing to a git remote — if I did that across two repositories, they would collide.
One possible solution was git submodules, but they are finicky so I was hoping for something simpler. After a bit of investigation, I discovered that git has a feature called subtrees that could be used to handle this.
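As a self-contained illustration of the technique, the following sketch merges two local repositories into one tree with `git subtree` and pushes the result to a single deploy remote. The Heroku endpoint is simulated here with a local bare repository; all names and paths are placeholders.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# Stand-in for the Heroku app's git endpoint.
git init -q --bare deploy.git

# A second repository whose code must ship with the main app.
git init -q lib
( cd lib && echo "lib code" > lib.rb && git add . \
  && git commit -q -m "lib: initial commit" && git branch -M master )

# The main repository: pull `lib` in under vendor/lib as a subtree.
git init -q app
cd app
echo "app code" > app.rb
git add . && git commit -q -m "app: initial commit" && git branch -M master
git subtree add --prefix=vendor/lib ../lib master --squash

# One push now deploys both codebases together, so the two repositories
# never race to the same remote.
git remote add heroku ../deploy.git
git push -q heroku master
git --git-dir=../deploy.git ls-tree -r --name-only master
```

Later changes to `lib` can be folded in with `git subtree pull` using the same `--prefix`, keeping the deploy repository's history linear.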
Read more on Simpler Deploys with git Subtrees…
Lately, I’ve been working on setting up a Personal Package Archive (PPA) to use when provisioning servers with custom packages.
In order to host packages on a Launchpad PPA, one must first upload signed source packages. Since I use a Mac and keep my PGP signing key on a smartcard, I needed to find a way to connect my smartcard reader to a virtual machine running Ubuntu. After a bit of research, I found an easy way to do this with Vagrant, VirtualBox, and the standard precise64 basebox.
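The core of the trick is a small Vagrantfile customization; here is a hypothetical sketch, not my exact configuration. It enables VirtualBox's USB controller and adds a filter so the reader is captured by the guest; the filter name and vendor ID below are placeholders to be replaced with your reader's values (visible via `VBoxManage list usbhost`).

```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "precise64"

  config.vm.provider :virtualbox do |vb|
    # Turn on the VirtualBox USB controller for this VM.
    vb.customize ["modifyvm", :id, "--usb", "on"]

    # Add a USB filter matching the smartcard reader (placeholder values).
    vb.customize ["usbfilter", "add", "0",
                  "--target", :id,
                  "--name", "Smartcard Reader",
                  "--vendorid", "0x04e6"]
  end
end
```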
Read more on Using a Smartcard with a VirtualBox-based Vagrant Virtual Machine…
Google Docs (and the recently improved Google Sheets) are powerful tools. In the last few years, there have been some awesome additions to these products, one of which is Google Apps Scripting. With the apps scripting tools, you can write your own menus and background tasks for Google Drive, plus general scripts for the Google Apps suite. The interfaces are there; the only limitation to what you can create is you.
In a recent series of posts, I described tools to help plan, build, and maintain a small app as a startup team. While looking at uptime monitors, I wasn’t really impressed with any of the free options. Only after publishing the post did I discover an uptime tracker in Google Sheets. I was aware of other cool uses of Google Sheets, such as an Amazon.com price monitor and Gmail NoResponse tracking, but using it as an uptime monitor had slipped my mind. Read more on Extending Google Sheets: Uptime Monitor…
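To give a flavor of what such a monitor looks like, here is a hedged Apps Script sketch of my own devising (it is not the tracker mentioned above, and it runs only inside Google's Apps Script environment, where `UrlFetchApp` and `SpreadsheetApp` are built-in services; the sheet name and URL are placeholders):

```javascript
function checkUptime() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Log");
  var url = "https://example.com/";  // the site to monitor (placeholder)
  var status;
  try {
    // muteHttpExceptions lets us record 4xx/5xx codes instead of throwing.
    status = UrlFetchApp.fetch(url, { muteHttpExceptions: true }).getResponseCode();
  } catch (e) {
    status = "DOWN: " + e.message;
  }
  sheet.appendRow([new Date(), url, status]);
}
// Attach checkUptime to a time-driven trigger (e.g. every 5 minutes)
// from the script editor to get continuous monitoring.
```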
Often, I want to develop a Chef configuration that can be applied to a whole cluster of systems. During development, I may not have access to the final virtual (or physical) machines that will make up the cluster. To resolve this problem, I construct a Vagrant cluster that allows me to develop locally.
Instead of using a single Vagrant VM, the Vagrant cluster contains at least one VM for each role I am developing. I tweak my Vagrantfile so that it constructs the cluster based on the contents of the standard JSON files used to define Chef nodes. This integrates everything nicely into the Chef server environment and allows me to easily work with a close representation of the final production systems. Read more on Constructing a Vagrant Cluster for Chef Development…
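The shape of that Vagrantfile can be sketched as follows. This is a hypothetical reconstruction, not my actual file: the `nodes/` directory, box name, `ipaddress` attribute, and Chef server URL are all assumptions.

```ruby
# Define one VM per Chef node JSON file found in nodes/.
require "json"

Vagrant.configure("2") do |config|
  config.vm.box = "precise64"

  Dir.glob("nodes/*.json").each do |path|
    node = JSON.parse(File.read(path))
    name = File.basename(path, ".json")

    config.vm.define name do |machine|
      machine.vm.hostname = name
      machine.vm.network :private_network, ip: node["ipaddress"]

      # Register each VM against the Chef server with its node's run list,
      # so the local cluster mirrors the production node definitions.
      machine.vm.provision :chef_client do |chef|
        chef.chef_server_url = "https://chef.example.com"
        chef.node_name = name
        chef.run_list = node["run_list"]
      end
    end
  end
end
```

With this layout, adding a node to the cluster is just a matter of dropping another JSON file into `nodes/` and running `vagrant up <name>`.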
Amazon Web Services (AWS) provides an amazingly flexible platform for running and managing virtual machines in their Elastic Compute Cloud (EC2). With EC2, it is almost effortless to spin up clusters of dozens to hundreds of nodes. This allows for incredible flexibility in setting up various environments, for purposes such as development, testing, and production.
Of course, all of this comes with the hourly cost of running EC2 instances. Some EC2 instances, such as Windows instances, cost a substantial amount per month. While it probably won’t break the bank, it certainly factors into the decision of how many nodes and environments can be spun up and kept active.
The prevailing attitude often seems to be that EC2 instances must be kept running 24/7. This ignores one of the great attractions of EC2 (and other AWS services) — you only pay for the resources that you consume. Keeping an instance running 24/7 when it isn’t actually being utilized consumes unnecessary resources; turning instances off when they won’t be used eliminates this waste and reduces cost. Fortunately, EC2 provides a powerful set of tools that make it easy to configure schedules for turning instances on and off and for re-assigning static IP addresses.
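As one hedged illustration of the idea, a schedule can be as simple as a crontab driving the EC2 API. The instance ID, Elastic IP, and times below are placeholders, and the modern `aws` CLI is used here for brevity in place of the original EC2 API tools:

```
# Stop a development instance each weekday evening, start it each morning.
0 19 * * 1-5  aws ec2 stop-instances  --instance-ids i-0123456789abcdef0
0 7  * * 1-5  aws ec2 start-instances --instance-ids i-0123456789abcdef0

# Re-associate the static (Elastic) IP after start, since the address
# can become detached while the instance is stopped.
5 7  * * 1-5  aws ec2 associate-address --instance-id i-0123456789abcdef0 --public-ip 203.0.113.10
```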
Read more on Making AWS More Affordable with EC2 Scheduling…
Logging for Sinatra applications can be a bit tricky. When in development mode, exceptions are helpfully shown in the browser, or in your terminal where you started the application. In production, however, it takes some additional configuration to properly log requests and errors.
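A minimal sketch of one common approach, assuming a classic-style Sinatra app (the log path is hypothetical, and the app must be able to write to it):

```ruby
require "sinatra"

configure :production do
  log_file = File.new("log/production.log", "a+")
  log_file.sync = true

  # Request logging via Rack's standard middleware.
  use Rack::CommonLogger, log_file

  # Dump exception backtraces, and route them to the log file
  # instead of letting them escape to the client.
  set :dump_errors, true
  set :raise_errors, false
  before { env["rack.errors"] = log_file }
end

get "/" do
  "hello"
end
```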
Read more on Production Logging for Sinatra Applications…
I was recently working on a project where I needed an automated way to tell whether two builds were made from the same source. The project’s version info consisted of a manually hardcoded version number and a timestamp of when the source was built. This was problematic. The manual version number was incremented rarely, so it was useless for day-to-day development. And the timestamp only told you when a particular build was generated, not what source was used to generate it. This made it difficult to track down bugs, as it was nearly impossible to know for sure which source a build was generated from.
So I set about to come up with a good replacement.
What Makes Version Information Useful?
Here are the things we wanted from version information:
- Determine if two builds come from the same source (even if built on different machines, at different times).
- Easily find the code associated with a particular build.
- Easily determine approximately how old a build is; compare age of versions.
- Provide an easy mechanism for the user to tell different versions of builds apart.
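One self-contained way to satisfy these properties (a sketch of my own, with placeholder names and tags) is to derive the version from git itself: the nearest annotated tag, the number of commits since that tag, and the abbreviated commit hash.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git commit -q --allow-empty -m "first"
git tag -a v1.0 -m "release 1.0"
git commit -q --allow-empty -m "second"

# Two builds of the same commit produce the same string; the commit hash
# pins the exact source, and the distance from the tag orders builds by age.
VERSION="$(git describe --tags --long)"
echo "$VERSION"    # e.g. v1.0-1-g<short-hash>
```

Embedding that string into the build at compile time gives every binary a human-readable label that is also machine-comparable.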
Read more on Creating Automated Build Versions During Development…
I recently needed to configure a machine with multiple installations of Ubuntu Server 12.10. One way to go about doing this is to create a separate, small boot partition to store the configuration for an initial Grub boot menu. Each installed OS gets its own partition with its own bootloader configuration. The initial menu stored on the boot partition is used to chainload the bootloader files for whichever OS is selected.
This technique involves using Grub Legacy on the small boot partition to chainload to the Grub2 configuration on each of the OS partitions. Instructions on how to do this can be found in the Ubuntu Lucid Multiple OS Installation guide. The instructions describe a couple of ways to go about getting Grub Legacy on the boot partition. I chose to go the route of installing Ubuntu (12.10 in my case) like normal, then removing Grub2, installing Grub Legacy temporarily, and finally reinstalling Grub2 again once the boot partition had been configured.
I ran into a couple of snags after getting to the Remove Grub 2 and install Grub Legacy part of the instructions, so I am going to duplicate the steps here and insert the couple of additional steps I ended up needing.
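For orientation, the chainloading menu on the boot partition ends up looking roughly like this. This is a hypothetical `menu.lst` (Grub Legacy); the partition numbers are placeholders, and each entry loads the selected OS's own Grub2 via its `core.img`:

```
default 0
timeout 5

title Ubuntu Server 12.10 (install A)
root (hd0,1)
kernel /boot/grub/core.img

title Ubuntu Server 12.10 (install B)
root (hd0,2)
kernel /boot/grub/core.img
```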
Read more on Multiple Ubuntu Installations with Grub…
Nginx is a modern, open-source, high-performance web server. It is capable of handling a huge number of concurrent connections easily (see the C10K problem). Over a year ago, I wrote about using nginx as a load balancer and remote proxy. Since then, my understanding of nginx and best practices in its configuration have progressed significantly. I’ve decided to refresh my blog post to provide some of this additional knowledge.
As I explained in my previous post, nginx relies on a non-blocking, event-driven I/O model, which allows it to handle a large number of incoming concurrent client connections with ease. This makes it an excellent choice as a load balancer and reverse proxy. In contrast, the traditional Apache HTTP Server model relies on a limited number of synchronous workers, which may block on I/O.
Nginx running on a single server handles incoming client requests and distributes them to a pool of upstream application servers that actually fulfill the requests. The pool of application servers can be easily scaled up or down to handle changes in traffic levels. This flexibility provides a way to scale the capacity of almost any web application quite easily.
Following are some specific scenarios and nginx configuration examples that I have used when setting up and maintaining applications and network infrastructures for both Atomic Object and our clients. I lead up to a fairly practical configuration implementation that I’ve used recently.
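The basic shape shared by those scenarios is an `upstream` pool fronted by a proxying `server` block. Here is a minimal sketch (the hostnames, ports, and server name are placeholders, not a configuration from the post):

```nginx
upstream app_servers {
    server 10.0.0.10:8080;
    server 10.0.0.11:8080;
}

server {
    listen 80;
    server_name example.com;

    location / {
        # Distribute requests across the pool, preserving the original
        # host and client address for the application servers.
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Scaling up is then a matter of adding `server` lines to the pool; nginx balances across them round-robin by default.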
Read more on Load Balancing and Reverse Proxying with Nginx, Updated…
I was recently tasked with finding a solution for leader election in a distributed system we are developing. I began to explore the realm of distributed algorithms, which eventually resulted in a more fundamental problem: How can we easily make all the nodes communicate reliably? After exploring ZeroMQ and UDP broadcasting, I decided to explore the Hadoop ecosystem to see if any of their solutions were reusable, and I discovered the excellent Apache ZooKeeper.
ZooKeeper is an open-source server that enables highly reliable distributed coordination. It is a clustered service that can act as a single source of truth for configuration information, distributed synchronization, and other services. ZooKeeper makes a number of guarantees about its data: updates are sequential, atomic, reliable, timely, and consistent across the cluster.
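As a hedged illustration of the original leader-election problem (not code from the post), the third-party kazoo Python client ships an election recipe built on ZooKeeper. The ensemble address, znode path, and identifier below are placeholders, and this requires a running ZooKeeper server:

```python
from kazoo.client import KazooClient

# Connect to a (placeholder) ZooKeeper ensemble.
zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# All candidate nodes contend on the same election path.
election = zk.Election("/myapp/election", identifier="node-1")

def lead():
    # Runs only on the node that currently holds leadership.
    print("this node is now the leader")

election.run(lead)  # blocks until elected, then calls lead()
```

If the current leader disconnects, ZooKeeper's session mechanics cause another contender to be elected automatically, which is exactly the coordination behavior that is hard to get right by hand.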
Read more on Taming Your Cluster with ZooKeeper…