How Sequel and Sinatra Solve Ruby’s API Problem

Introduction

In recent years, the number of JavaScript single-page application frameworks and mobile applications has increased substantially, and with it the demand for server-side APIs. With Ruby on Rails being one of today’s most popular web development frameworks, it is a natural choice among many developers for creating back-end API applications.

Yet while the Ruby on Rails architectural paradigm makes it quite easy to create back-end API applications, using Rails only for the API is overkill, so much so that even the Rails team has recognized this and introduced a new API-only mode in version 5. With this feature, creating API-only applications in Rails has become an even easier and more viable option.

But there are other options too. The most notable are two very mature and powerful gems which, in combination, provide the tools needed for creating server-side APIs: Sinatra and Sequel.

Both of these gems have a very rich feature set: Sinatra serves as the domain specific language (DSL) for web applications, and Sequel serves as the object-relational mapping (ORM) layer. So, let’s take a brief look at each of them.

API With Sinatra and Sequel: Ruby Tutorial

Ruby API on a diet: introducing Sequel and Sinatra.

Sinatra

Sinatra is a Rack-based web application framework. Rack is a well-known Ruby web server interface; it is used by many frameworks, such as Ruby on Rails, and supports a lot of web servers, such as WEBrick, Thin, and Puma. Sinatra provides a minimal interface for writing web applications in Ruby, and one of its most compelling features is support for middleware components. These components lie between the application and the web server, and can monitor and manipulate requests and responses.

To make use of this Rack feature, Sinatra defines an internal DSL for creating web applications. Its philosophy is very simple: a route is declared with an HTTP method, followed by a route-matching pattern and a Ruby block within which the request is processed and the response is formed.

get '/' do
  'Hello from sinatra'
end

The route matching pattern can also include a named parameter. When the route block is executed, the parameter value is passed to the block through the params variable.

get '/players/:sport_id' do
  # Parameter value accessible through params[:sport_id]
end

Matching patterns can also use the splat operator *, which makes the matched values available through params[:splat].

get '/players/*/:year' do
  # /players/performances/2016
  # Parameters - params['splat'] -> ['performances'], params[:year] -> '2016'
end

This is not the end of Sinatra’s route-matching capabilities: it also supports more complex matching logic through regular expressions, as well as custom matchers.

Sinatra understands all of the standard HTTP verbs needed for creating a REST API: GET, POST, PUT, PATCH, DELETE, and OPTIONS. Route priorities are determined by the order in which the routes are defined, and the first route that matches a request is the one that serves it.
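For example (a hypothetical sketch, not taken from the original article), a literal route has to be defined before a parameterized route that would otherwise capture the same requests, and the other verbs use the same DSL:

get '/players/all' do
  'Defined first, so it wins for GET /players/all'
end

get '/players/:sport_id' do
  # Handles every other GET /players/<something>
  "Players for sport ##{params[:sport_id]}"
end

post '/players' do
  status 201
  'Player created'
end

delete '/players/:id' do
  status 204
end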

Sinatra applications can be written in two ways: using the classical or the modular style. The main difference between them is that, with the classical style, we can have only one Sinatra application per Ruby process. The other differences are minor enough that, in most cases, they can be ignored and the default settings used.

Classical Approach

Implementing a classical application is straightforward. We just have to load Sinatra and implement the route handlers:

require 'sinatra'

get '/' do
  'Hello from Sinatra'
end

By saving this code to a demo_api_classic.rb file, we can start the application directly by executing the following command:

ruby demo_api_classic.rb

However, if the application is to be deployed with Rack handlers, like Passenger, it is better to start it with a Rack configuration file, config.ru:

require './demo_api_classic'
run Sinatra::Application

With the config.ru file in place, the application is started with the following command:

rackup config.ru

Modular Approach

Modular Sinatra applications are created by subclassing either Sinatra::Base or Sinatra::Application:

require 'sinatra'

class DemoApi < Sinatra::Application
  # Application code

  run! if app_file == $0
end

The statement beginning with run! is used for starting the application directly, with ruby demo_api.rb, just as with the classical application. On the other hand, if the application is to be deployed with Rack handlers, the content of config.ru must be:

require './demo_api'
run DemoApi

Sequel

Sequel is the second tool in this set. In contrast to ActiveRecord, which is part of Ruby on Rails, Sequel’s dependencies are very small. At the same time, it is quite feature rich and can be used for all kinds of database manipulation tasks. With its simple domain-specific language, Sequel relieves the developer of all the work of maintaining connections, constructing SQL queries, and fetching data from (and sending data back to) the database.

For example, establishing a connection with the database is very simple:

DB = Sequel.connect(adapter: :postgres, database: 'my_db', host: 'localhost', user: 'db_user')

The connect method returns a database object, in this case, Sequel::Postgres::Database, which can be further used to execute raw SQL.

DB['select count(*) from players']

Alternatively, to create a new dataset object:

DB[:players]

Both of these statements create a dataset object, which is a basic Sequel entity.

One of the most important Sequel dataset features is that it does not execute queries immediately. This makes it possible to store datasets for later use and, in most cases, to chain them.

users = DB[:players].where(sport: 'tennis')

So, if a dataset does not hit the database immediately, the question is, when does it? Sequel executes SQL against the database when so-called “executable methods” are used. These methods include, to name a few, all, each, map, first, and last.
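For example (a sketch assuming a players table with sport, year_of_birth, and name columns), a dataset can be refined step by step, and SQL is only sent once an executable method is called:

players = DB[:players]                             # no query executed yet
tennis  = players.where(sport: 'tennis')           # still no query
recent  = tennis.where { year_of_birth > 1995 }.order(:name)

recent.all     # the SELECT runs here, returning an array of hashes
recent.first   # runs another query with LIMIT 1
recent.count   # runs SELECT count(*)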

Sequel is extensible, and its extensibility is a result of a fundamental architectural decision to build a small core complemented by a plugin system. Features are easily added through plugins, which are actually Ruby modules. The most important plugin is the Model plugin. It is an empty plugin which does not define any class or instance methods by itself. Instead, it includes other plugins (submodules) which define class, instance, or model dataset methods. The Model plugin enables the use of Sequel as an object-relational mapping (ORM) tool and is often referred to as the “base plugin”.

class Player < Sequel::Model
end

The Sequel model automatically parses the database schema and sets up accessor methods for all columns. It assumes that the table name is the plural, underscored version of the model name. If there is a need to work with a database that does not follow this naming convention, the table name can be set explicitly when the model is defined.

class Player < Sequel::Model(:player)
end
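With the model defined, typical usage might look like this (a sketch assuming the table has name and sport columns):

player = Player.create(name: 'Novak', sport: 'tennis')   # INSERT
player.name                                               # column accessor => "Novak"

Player.where(sport: 'tennis').count                       # dataset methods work on models too
Player[player.id]                                         # primary key lookup

player.update(sport: 'basketball')                        # UPDATE
player.destroy                                            # DELETE (runs model hooks)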

So, we now have everything we need to start building the back-end API.

Read the full article from Toptal

Meet RxJava: The Missing Reactive Programming Library for Android

If you’re an Android developer, chances are you’ve heard of RxJava. It’s one of the most discussed libraries for enabling Functional Reactive Programming (FRP) in Android development. It’s touted as the go-to framework for simplifying concurrency/asynchronous tasks inherent in mobile programming.

But… what is RxJava and how does it “simplify” things?

Functional Reactive Programming for Android: An Introduction to RxJava

Untangle your Android from too many Java threads with RxJava.

While there are lots of resources already available online explaining what RxJava is, in this article my goal is to give you a basic introduction to RxJava and specifically how it fits into Android development. I’ll also give some concrete examples and suggestions on how you can integrate it in a new or existing project.

Why Consider RxJava?

At its core, RxJava simplifies development because it raises the level of abstraction around threading. That is, as a developer you don’t have to worry too much about the details of how to perform operations that should occur on different threads. This is particularly attractive since threading is challenging to get right and, if not correctly implemented, can cause some of the most difficult bugs to debug and fix.

Granted, this doesn’t mean RxJava is bulletproof when it comes to threading and it is still important to understand what’s happening behind the scenes; however, RxJava can definitely make your life easier.

Let’s look at an example.

Network Call - RxJava vs AsyncTask

Say we want to obtain data over the network and update the UI as a result. One way to do this is to (1) create an inner AsyncTask subclass in our Activity/Fragment, (2) perform the network operation in the background, and (3) take the result of that operation and update the UI in the main thread.

public class NetworkRequestTask extends AsyncTask<Void, Void, User> {

    private final int userId;

    public NetworkRequestTask(int userId) {
        this.userId = userId;
    }

    @Override protected User doInBackground(Void... params) {
        return networkService.getUser(userId);
    }

    @Override protected void onPostExecute(User user) {
        nameTextView.setText(user.getName());
        // ...set other views
    }
}

private void onButtonClicked(Button button) {
    new NetworkRequestTask(123).execute();
}

Harmless as this may seem, this approach has some issues and limitations. Namely, memory/context leaks are easily created since NetworkRequestTask is an inner class and thus holds an implicit reference to the outer class. Also, what if we want to chain another long operation after the network call? We’d have to nest two AsyncTasks which can significantly reduce readability.

In contrast, an RxJava approach to performing a network call might look something like this:

private Subscription subscription;

private void onButtonClicked(Button button) {
    subscription = networkService.getObservableUser(123)
            .subscribeOn(Schedulers.io())
            .observeOn(AndroidSchedulers.mainThread())
            .subscribe(new Action1<User>() {
                @Override public void call(User user) {
                    nameTextView.setText(user.getName());
                    // ... set other views
                }
            });
}

@Override protected void onDestroy() {
    if (subscription != null && !subscription.isUnsubscribed()) {
        subscription.unsubscribe();
    }
    super.onDestroy();
}

Using this approach, we solve the problem (of potential memory leaks caused by a running thread holding a reference to the outer context) by keeping a reference to the returned Subscription object. That Subscription is then unsubscribed in the Activity/Fragment’s #onDestroy() method, guaranteeing that the Action1#call operation does not execute after the Activity/Fragment is destroyed.

Also, notice that the return type of #getObservableUser(...) (i.e., an Observable<User>) is chained with further calls on it. Through this fluent API, we solve the second issue with AsyncTask: RxJava makes it easy to chain further network calls or long operations onto the first. Pretty neat, huh?
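For example, a second call can be chained with the flatMap operator. (This is only a sketch: getObservableFriends(...), User#getId(), the Friend class, and friendCountTextView are assumed here purely for illustration.)

subscription = networkService.getObservableUser(123)
        .subscribeOn(Schedulers.io())
        // Start a second asynchronous call using the result of the first
        .flatMap(new Func1<User, Observable<List<Friend>>>() {
            @Override public Observable<List<Friend>> call(User user) {
                return networkService.getObservableFriends(user.getId());
            }
        })
        .observeOn(AndroidSchedulers.mainThread())
        .subscribe(new Action1<List<Friend>>() {
            @Override public void call(List<Friend> friends) {
                friendCountTextView.setText(String.valueOf(friends.size()));
            }
        });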

Let’s dive deeper into some RxJava concepts.

Observable, Observer, and Operator - The 3 O’s of RxJava Core

In the RxJava world, everything can be modeled as streams. A stream emits item(s) over time, and each emission can be consumed/observed.

If you think about it, a stream is not a new concept: click events can be a stream, location updates can be a stream, push notifications can be a stream, and so on.

The stream abstraction is implemented through three core constructs which I like to call “the three O’s”: the Observable, the Observer, and the Operator. The Observable emits items (the stream), and the Observer consumes those items. Emissions from Observable objects can further be modified, transformed, and manipulated by chaining Operator calls.

Observable

An Observable is the stream abstraction in RxJava. It is similar to an Iterator in that, given a sequence, it iterates through and produces those items in an orderly fashion. A consumer can then consume those items through the same interface, regardless of the underlying sequence.

Say we wanted to emit the numbers 1, 2, 3, in that order. To do so, we can use the Observable<T>#create(OnSubscribe<T>) method.

Observable<Integer> observable = Observable.create(new Observable.OnSubscribe<Integer>() {
    @Override public void call(Subscriber<? super Integer> subscriber) {
        subscriber.onNext(1);
        subscriber.onNext(2);
        subscriber.onNext(3);
        subscriber.onCompleted();
    }
});

Invoking subscriber.onNext(Integer) emits an item in the stream and, when the stream is finished emitting, subscriber.onCompleted() is then invoked.

This approach to creating an Observable is fairly verbose. For this reason, there are convenience methods for creating Observable instances which should be preferred in almost all cases.

The simplest way to create an Observable is using Observable#just(...). As the method name suggests, it just emits the item(s) that you pass into it as method arguments.

Observable.just(1, 2, 3); // 1, 2, 3 will be emitted, respectively

Observer

The next component to the Observable stream is the Observer (or Observers) subscribed to it. Observers are notified whenever something “interesting” happens in the stream. Observers are notified via the following events:

  • Observer#onNext(T) - invoked when an item is emitted from the stream
  • Observer#onError(Throwable) - invoked when an error has occurred within the stream
  • Observer#onCompleted() - invoked when the stream is finished emitting items.

To subscribe to a stream, simply call Observable<T>#subscribe(...) and pass in an Observer instance.

Observable<Integer> observable = Observable.just(1, 2, 3);
observable.subscribe(new Observer<Integer>() {
    @Override public void onCompleted() {
        Log.d("Test", "In onCompleted()");
    }
    @Override public void onError(Throwable e) {
        Log.d("Test", "In onError()");
    }
    @Override public void onNext(Integer integer) {
        Log.d("Test", "In onNext(): " + integer);
    }
});

The above code will emit the following in Logcat:

In onNext(): 1
In onNext(): 2
In onNext(): 3
In onCompleted()

There may also be some instances where we are no longer interested in the emissions of an Observable. This is particularly relevant in Android when, for example, an Activity/Fragment needs to be reclaimed in memory.

To stop observing items, we simply need to call Subscription#unsubscribe() on the returned Subscription object.

Subscription subscription = someInfiniteObservable.subscribe(new Observer<Integer>() {
    @Override public void onCompleted() {
        // ...
    }
    @Override public void onError(Throwable e) {
        // ...
    }
    @Override public void onNext(Integer integer) {
        // ...
    }
});

// Call unsubscribe when appropriate
subscription.unsubscribe();

As seen in the code snippet above, upon subscribing to an Observable we hold on to the returned Subscription object and later invoke subscription.unsubscribe() when necessary. In Android, this is best done within Activity#onDestroy() or Fragment#onDestroy().

Operator

Items emitted by an Observable can be transformed, modified, and filtered through Operators before notifying the subscribed Observer object(s). Some of the most common operations found in functional programming (such as map, filter, reduce, etc.) can also be applied to an Observable stream. Let’s look at map as an example:

Observable.just(1, 2, 3, 4, 5).map(new Func1<Integer, Integer>() {
    @Override public Integer call(Integer integer) {
        return integer * 2;
    }
}).subscribe(new Observer<Integer>() {
    @Override public void onCompleted() {
        // ...
    }
    @Override public void onError(Throwable e) {
        // ...
    }
    @Override public void onNext(Integer integer) {
        // ...
    }
});

The code snippet above would take each emission from the Observable and multiply each by 2, producing the stream 2, 4, 6, 8, 10, respectively. Applying an Operator typically returns another Observable as a result, which is convenient as this allows us to chain multiple operations to obtain a desired result.

Given the stream above, say we wanted to only receive even numbers. This can be achieved by chaining a filter operation.

Observable.just(1, 2, 3, 4, 5).map(new Func1<Integer, Integer>() {
    @Override public Integer call(Integer integer) {
        return integer * 2;
    }
}).filter(new Func1<Integer, Boolean>() {
    @Override public Boolean call(Integer integer) {
        return integer % 2 == 0;
    }
}).subscribe(new Observer<Integer>() {
    @Override public void onCompleted() {
        // ...
    }
    @Override public void onError(Throwable e) {
        // ...
    }
    @Override public void onNext(Integer integer) {
        // ...
    }
});


   

Service Oriented Architecture with AWS Lambda: A Step-by-Step Tutorial

When building web applications, there are many choices to be made that can either help or hinder your application in the future once you commit to them. Choices such as language, framework, hosting, and database are crucial.

One such choice is whether to create a service-based application using Service Oriented Architecture (SOA) or a traditional, monolithic application. This is a common architectural decision affecting startups, scale-ups, and enterprise companies alike.

Service Oriented Architecture is used by a large number of well-known unicorns and top-tech companies such as Google, Facebook, Twitter, Instagram and Uber. Seemingly, this architecture pattern works for large companies, but can it work for you?

Service Oriented Architecture with AWS Lambda: A Step-By-Step Tutorial


In this article, we will introduce Service Oriented Architecture and show how AWS Lambda, in combination with Python, can be leveraged to easily build scalable, cost-efficient services. To demonstrate these ideas, we will build a simple image uploading and resizing service using Python, AWS Lambda, Amazon S3, and a few other relevant tools and services.

What is Service Oriented Architecture?

Service Oriented Architecture (SOA) isn’t new; it has roots going back several decades. In recent years, its popularity as a pattern has been growing because it offers many benefits for web-facing applications.

SOA is, in essence, the abstraction of one large application into many communicating smaller applications. This follows several best practices of software engineering, such as decoupling, separation of concerns, and single responsibility.

Implementations of SOA vary in terms of granularity: from very few services that cover large areas of functionality to many dozens or hundreds of small applications in what is termed “microservice” architecture. Regardless of the level of granularity, what is generally agreed amongst practitioners of SOA is that it is by no means a free lunch. Like many good practices in software engineering, it is an investment that will require extra planning, development and testing.

What is AWS Lambda?

AWS Lambda is a service offered by the Amazon Web Services platform. It allows you to upload code that will be run on an on-demand container managed by Amazon. AWS Lambda handles the provisioning and management of the servers that run the code, so all that is needed from the user is a packaged set of code to run and a few configuration options defining the context in which the code runs. These managed applications are referred to as Lambda functions.

AWS Lambda has two main modes of operation:

Asynchronous / Event-Driven:

Lambda functions can be run in response to an event in asynchronous mode. The event source, such as S3 or SNS, does not block while the function runs, and Lambda functions can take advantage of this in many ways, such as establishing a processing pipeline for some chain of events. There are many event sources and, depending on the source, events will either be pushed to a Lambda function by the event source or polled for by AWS Lambda.

Synchronous / Request->Response:

For applications that require a response to be returned synchronously, Lambda can be run in synchronous mode. Typically this is used in conjunction with a service called API Gateway to return HTTP responses from AWS Lambda to an end user; however, Lambda functions can also be called synchronously via a direct call to AWS Lambda.

AWS Lambda functions are uploaded as a zip file containing handler code in addition to any dependencies required for the operation of the handler. Once uploaded, AWS Lambda will execute this code when needed and scale the number of servers from zero to thousands, without any extra intervention required by the consumer.
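The handler itself is just a Python function that takes an event and a context argument. A minimal (hypothetical) example follows; the file and function names are whatever you reference in the Lambda configuration:

# handler.py -- configured in AWS Lambda as "handler.handle"
def handle(event, context):
    # "event" is a dict describing what triggered the invocation;
    # anything printed here ends up in CloudWatch Logs (if permitted).
    print("Received event: %s" % event)
    return {"status": "ok"}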

Lambda Functions as an Evolution of SOA

Basic SOA is a way to structure your code-base into small applications in order to benefit an application in the ways described earlier in this article. Arising from this, the method of communication between these applications comes into focus. Event-driven SOA (aka SOA 2.0) allows for not only the traditional direct service-to-service communication of SOA 1.0, but also for events to be propagated throughout the architecture in order to communicate change.

Event-driven architecture is a pattern that naturally promotes loose coupling and composability. By creating and reacting to events, services can be added ad-hoc to add new functionality to an existing event, and several events can be composed to provide richer functionality.

AWS Lambda can be used as a platform to easily build SOA 2.0 applications. There are many ways to trigger a Lambda function; from the traditional message-queue approach with Amazon SNS, to events created by a file being uploaded to Amazon S3, or an email being sent with Amazon SES.

Implementing a Simple Image Uploading Service

We will be building a simple application to upload and retrieve images utilizing the AWS stack. This example project will contain two Lambda functions: one running in request->response mode that will be used to serve our simple web frontend, and another that will detect uploaded images and resize them.

One of the two Lambda functions will run asynchronously in response to a file-upload event triggered on the S3 bucket that will house the uploaded images. It will take the uploaded image and resize it to fit within a 400x400 frame.

The other Lambda function will serve the HTML page, providing both the functionality for a user to view the images resized by the first function and an interface for uploading an image.
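As a rough sketch of what the resizing function could look like (not the project’s actual code; see the repository linked at the end, and assume the Pillow imaging library is packaged into the uploaded zip alongside the handler):

import io

import boto3
from PIL import Image

s3 = boto3.client('s3')

def handle_resize(event, context):
    # The S3 event record tells us which object was just uploaded
    record = event['Records'][0]['s3']
    bucket = record['bucket']['name']          # e.g. "test-upload"
    key = record['object']['key']

    # Download the original and resize it in memory to fit within 400x400
    original = s3.get_object(Bucket=bucket, Key=key)['Body'].read()
    image = Image.open(io.BytesIO(original))
    image.thumbnail((400, 400))

    resized = io.BytesIO()
    image.save(resized, format=image.format or 'JPEG')

    # Store the result in the resized bucket and remove the raw upload
    s3.put_object(Bucket='test-resized', Key=key, Body=resized.getvalue())
    s3.delete_object(Bucket=bucket, Key=key)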

Initial AWS Configuration

Before we can begin, we will need to configure some necessary AWS services such as IAM and S3. These will be configured using the web-based AWS console. However, most of the configuration can also be achieved by using the AWS command-line utility, which we will use later.

Creating S3 Buckets

S3 (or Simple Storage Service) is an Amazon object-store service that offers reliable and cost-efficient storage of any data. We will be using S3 to store the images that will be uploaded, as well as the resized versions of the images we have processed.

The S3 service can be found under the “Services” drop-down in the AWS console under the “Storage & Content Delivery” sub-section. When creating a bucket you will be prompted to enter both the bucket name as well as to select a region. Selecting a region close to your users will allow S3 to optimize for latency and cost, as well as some regulatory factors. For this example we will select the “US Standard” region. This same region will later be used for hosting the AWS Lambda functions.

It is worth noting that S3 bucket names are required to be unique, so if the name chosen is taken you will be required to choose a new, unique name.

For this example project, we will create two storage buckets named “test-upload” and “test-resized”. The “test-upload” bucket will be used for uploading images and storing the uploaded image before it is processed and resized. Once resized, the image will be saved into the “test-resized” bucket, and the raw uploaded image removed.

S3 Upload Permissions

By default, S3 Permissions are restrictive and will not allow external users or even non-administrative users to read, write, update, or delete any permissions or objects on the bucket. In order to change this, we will need to be logged in as a user with the rights to manage AWS bucket permissions.

Assuming we are on the AWS console, we can view the permissions for our upload bucket by selecting the bucket by name, clicking on the “Properties” button in the top-right of the screen, and opening the collapsed “Permissions” section.

In order to allow anonymous users to upload to this bucket, we will need to edit the bucket policy to grant the specific permission that allows uploads. This is accomplished through a JSON-based configuration policy. These kinds of JSON policies are used widely throughout AWS in conjunction with the IAM service. Upon clicking the “Edit Bucket Policy” button, simply paste the following text and click “Save” to allow public image uploads:

{
  "Version": "2008-10-17",
  "Id": "Policy1346097257207",
  "Statement": [
    {
      "Sid": "Allow anonymous upload to /",
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::test-upload/*"
    }
  ]
}

After doing this, we can verify the bucket policy is correct by attempting to upload an image to the bucket. The following cURL command will do the trick:

curl https://test-upload.s3.amazonaws.com -F 'key=test.jpeg' -F 'file=@test.jpeg'

If a 200-range response is returned, we will know that the configuration for the upload bucket has been successfully applied. Our S3 buckets should now be (mostly) configured. We will return later to this service in the console in order to connect our image upload events to the invocation of our resize function.

IAM Permissions for Lambda

Lambda functions all run within a permission context, in this case a “role” defined by the IAM service. This role defines any and all permissions that the Lambda function has during its invocation. For the purposes of this example project, we will create a generic role that will be shared by both Lambda functions. However, in a production scenario, finer granularity in permission definitions is recommended to ensure that any security exploits are isolated to only the permission context that was defined.

The IAM service can be found within the “Security & Identity” sub-section of the “Services” drop-down. The IAM service is a very powerful tool for managing access across AWS services, and the interface provided may be a bit overwhelming at first if you are not familiar with similar tools.

Once on the IAM dashboard page, the “Roles” sub-section can be found on the left-hand side of the page. From here, we can use the “Create New Role” button to bring up a multi-step wizard to define the permissions of the role. Let’s use “lambda_role” as the name of our generic role. After continuing from the name definition page, you will be presented with the option to select a role type. As we only require S3 access, click on “AWS Service Roles” and within the selection box select “AWS Lambda”. You will be presented with a page of policies that can be attached to this role. Select the “AmazonS3FullAccess” policy and continue to the next step to confirm the role to be created.

It is important to note the name and the ARN (Amazon Resource Name) of the created role. This will be used when creating a new Lambda function to identify the role that is to be used for function invocation.

Note: AWS Lambda will automatically log all output from function invocations in AWS Cloudwatch, a logging service. If this functionality is desired, which is recommended for a production environment, permission to write to a Cloudwatch log stream must be added to the policies for this role.

The Code!

Overview

Now we are ready to start coding. We will assume at this point you have set up the “awscli” command. If you have not, you can follow the instructions at https://aws.amazon.com/cli/ to set up awscli on your computer.

Note: the code used in these examples is made shorter for ease of screen-viewing. For a more complete version visit the repository at https://github.com/gxx/aws-lambda-python/.

Read the full article on Toptal 

The Six Commandments of Good Code: Write Code that Stands the Test of Time

Humans have only been grappling with the art and science of computer programming for roughly half a century. Compared to most arts and sciences, computer science is in many ways still just a toddler, walking into walls, tripping over its own feet, and occasionally throwing food across the table. As a consequence of its relative youth, I don’t believe we have a consensus yet on what a proper definition of “good code” is, as that definition continues to evolve. Some will say “good code” is code with 100% test coverage. Others will say it’s super fast, has killer performance, and will run acceptably on 10-year-old hardware.

While these are all laudable goals for software developers, I venture to throw another target into the mix: maintainability. Specifically, “good code” is code that is easily and readily maintainable by an organization (not just by its author!) and will live for longer than just the sprint it was written in. The following are some things I’ve discovered in my career as an engineer at big companies and small, in the USA and abroad, that seem to correlate with maintainable, “good” software.

Never settle for code that just "works." Write superior code.

Commandment #1: Treat Your Code the Way You Want Others’ Code to Treat You

I’m far from the first person to write that the primary audience for your code is not the compiler/computer, but whoever next has to read, understand, maintain, and enhance the code (which will not necessarily be you 6 months from now). Any engineer worth their pay can produce code that “works”; what distinguishes a superb engineer is the ability to write maintainable code efficiently, code that supports a business over the long term, and the skill to solve problems simply, clearly, and maintainably.

In any programming language, it is possible to write good code or bad code. Assuming we judge a programming language by how well it facilitates writing good code (it should at least be one of the top criteria, anyway), any programming language can be “good” or “bad” depending on how it is used (or abused).

An example of a language considered by many to be ‘clean’ and readable is Python. The language itself enforces some level of whitespace discipline, and the built-in APIs are plentiful and fairly consistent. That said, it’s possible to create unspeakable monsters. For example, one can define a class and define/redefine/undefine any and every method on that class during runtime (often referred to as monkey patching). This technique naturally leads to, at best, an inconsistent API and, at worst, an impossible-to-debug monster. One might naively think, “Sure, but nobody does that!” Unfortunately they do, and it doesn’t take long browsing PyPI before you run into substantial (and popular!) libraries that (ab)use monkey patching extensively as the core of their APIs. I recently used a networking library whose entire API changes depending on the network state of an object. Imagine, for example, calling client.connect() and sometimes getting a MethodDoesNotExist error instead of HostNotFound or NetworkUnavailable.
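As a toy illustration of the problem (not taken from any real library), consider an object that grows a public method only after another call has succeeded:

class Client:
    def connect(self):
        print("connecting...")
        # Attach a *public* method to this one instance at runtime:
        # disconnect() only exists after connect() has been called.
        self.disconnect = lambda: print("disconnecting...")

client = Client()
# client.disconnect()  # AttributeError -- the method isn't there yet
client.connect()
client.disconnect()    # now it "works"; the API changed under the caller's feet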

Commandment #2: Good Code Is Easily Read and Understood, in Part and in Whole

Good code is easily read and understood, in part and in whole, by others (as well as by the author in the future, trying to avoid the “Did I really write that?” syndrome).

By “in part” I mean that, if I open up some module or function in the code, I should be able to understand what it does without having to also read the entire rest of the codebase. It should be as intuitive and self-documenting as possible.

Code that constantly references minute details that affect behavior from other (seemingly irrelevant) portions of the codebase is like reading a book where you have to reference the footnotes or an appendix at the end of every sentence. You’d never get through the first page!

Some other thoughts on “local” readability:

  • Well encapsulated code tends to be more readable, separating concerns at every level.

  • Names matter. Activate the “System 2” way in which the brain forms thoughts (as described in Thinking, Fast and Slow) and put some actual, careful thought into variable and method names. The few extra seconds can pay significant dividends. A well-named variable can make the code much more intuitive, whereas a poorly-named variable can lead to head-fakes and confusion.

  • Cleverness is the enemy. When using fancy techniques, paradigms, or operations (such as list comprehensions or ternary operators), be careful to use them in a way that makes your code more readable, not just shorter.

  • Consistency is a good thing. Consistency in style, both in terms of how you place braces and in terms of operations, improves readability greatly.

  • Separation of concerns. A given project manages an innumerable number of locally important assumptions at various points in the codebase. Expose each part of the codebase to as few of those concerns as possible. Say you had a people management system where a person object may sometimes have a null last name. To somebody writing code in a page that displays person objects, that could be really awkward! And unless you maintain a handbook of “awkward and non-obvious assumptions our codebase has” (I know I don’t), your display page programmer is not going to know that last names can be null and is probably going to write code with a null pointer exception in it when the last-name-is-null case shows up. Instead, handle these cases with well-thought-out APIs and contracts that different pieces of your codebase use to interact with each other, as in the sketch after this list.
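Here is a small sketch of that idea (hypothetical names): the awkward “last name may be null” assumption lives in exactly one place, behind a method the display code can trust.

class Person:
    def __init__(self, first_name, last_name=None):
        self.first_name = first_name
        self.last_name = last_name          # may legitimately be None

    def display_name(self):
        # The single place that knows last_name can be missing
        if self.last_name is None:
            return self.first_name
        return "%s %s" % (self.first_name, self.last_name)

# Display code never touches last_name directly, so it cannot trip over None
print(Person("Cher").display_name())               # "Cher"
print(Person("Ada", "Lovelace").display_name())    # "Ada Lovelace"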

Commandment #3: Good Code Has a Well Thought-out Layout and Architecture to Make Managing State Obvious

State is the enemy. Why? Because it is the single most complex part of any application and needs to be dealt with very deliberately and thoughtfully. Common problems include database inconsistencies, partial UI updates where new data isn’t reflected everywhere, out of order operations, or just mind numbingly complex code with if statements and branches everywhere leading to difficult to read and even harder to maintain code. Putting state on a pedestal to be treated with great care, and being extremely consistent and deliberate with regard to how state is accessed and modified, dramatically simplifies your codebase. Some languages (Haskell for example) enforce this at a programmatic and syntactic level. You’d be amazed how much the clarity of your codebase can improve if you have libraries of pure functions that access no external state, and then a small surface area of stateful code which references the outside pure functionality.
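A small sketch of that shape (a hypothetical example; db and its methods are assumed): the calculation is a pure function with no access to external state, and only a thin outer layer reads and writes anything.

# Pure: same inputs always give the same output; nothing is read or written
def discounted_total(order_total, loyalty_years):
    rate = min(0.05 * loyalty_years, 0.25)
    return round(order_total * (1 - rate), 2)

# Stateful shell: the only code that talks to the outside world
def checkout(db, user_id):
    total = db.load_order_total(user_id)                      # read state
    years = db.load_loyalty_years(user_id)                    # read state
    db.save_charge(user_id, discounted_total(total, years))   # write state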

Commandment #4: Good Code Doesn’t Reinvent the Wheel, it Stands on the Shoulders of Giants

Before potentially reinventing a wheel, think about how common the problem is you’re trying to solve or the function is you’re trying to perform. Somebody may have already implemented a solution you can leverage. Take the time to think about and research any such options, if appropriate and available.

That said, a completely reasonable counter-argument is that dependencies don’t come for “free” without any downside. By using a 3rd party or open source library that adds some interesting functionality, you are making the commitment to, and becoming dependent upon, that library. That’s a big commitment; if it’s a giant library and you only need a small bit of functionality do you really want the burden of updating the whole library if you upgrade, for example, to Python 3.x? And moreover, if you encounter a bug or want to enhance the functionality, you’re either dependent on the author (or vendor) to supply the fix or enhancement, or, if it’s open source, find yourself in the position of exploring a (potentially substantial) codebase you’re completely unfamiliar with trying to fix or modify an obscure bit of functionality.

Certainly the more well used the code you’re dependent upon is, the less likely you’ll have to invest time yourself into maintenance. The bottom line is that it’s worthwhile for you to do your own research and make your own evaluation of whether or not to include outside technology and how much maintenance that particular technology will add to your stack.

Below are some of the more common examples of things you should probably not be reinventing in the modern age in your project (unless these ARE your projects).

Databases

Figure out which parts of CAP (consistency, availability, partition tolerance) you need for your project, then choose the database with the right properties. Database doesn’t just mean MySQL anymore; you can choose from:

  • “Traditional” Schema’ed SQL: Postgres / MySQL / MariaDB / MemSQL / Amazon RDS, etc.
  • Key Value Stores: Redis / Memcache / Riak
  • NoSQL: MongoDB/Cassandra
  • Hosted DBs: AWS RDS / DynamoDB / AppEngine Datastore
  • Heavy lifting: Amazon MR / Hadoop (Hive/Pig) / Cloudera / Google Big Query
  • Crazy stuff: Erlang’s Mnesia, iOS’s Core Data

Data Abstraction Layers

You should, in most circumstances, not be writing raw queries to whatever database you happen to choose to use. More likely than not, there exists a library to sit between the DB and your application code, separating the concerns of managing concurrent database sessions and details of the schema from your main code. At the very least, you should never have raw queries or SQL inline in the middle of your application code. Rather, wrap it in a function and centralize all the functions in a file called something really obvious (e.g., "queries.py"). A line like users = load_users(), for example, is infinitely easier to read than users = db.query("SELECT username, foo, bar FROM users ORDER BY id LIMIT 10"). This type of centralization also makes it much easier to have a consistent style in your queries, and limits the number of places to go to change the queries should the schema change.
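A sketch of what that might look like (db, its query method, and the schema are assumed here purely for illustration):

# queries.py -- the one place that knows what the users table looks like
def load_users(db, limit=10):
    return db.query(
        "SELECT username, foo, bar FROM users ORDER BY id LIMIT %s", (limit,)
    )

# application code elsewhere
users = load_users(db)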

Other Common Libraries and Tools to Consider Leveraging

  • Queuing or Pub/Sub Services. Take your pick of AMQP providers, ZeroMQ, RabbitMQ, Amazon SQS
  • Storage. Amazon S3, Google Cloud Storage
  • Monitoring. Graphite / Hosted Graphite, AWS CloudWatch, New Relic
  • Log Collection / Aggregation. Loggly, Splunk
  • Auto Scaling. Heroku, AWS Elastic Beanstalk, AppEngine, AWS OpsWorks, DigitalOcean

Commandment #5: Don’t Cross the Streams!

There are many good models for program design: pub/sub, actors, MVC, etc. Choose whichever you like best, and stick to it. Different kinds of logic dealing with different kinds of data should be physically isolated in the codebase (again, this separation of concerns concept and reducing cognitive load on the future reader). The code which updates your UI should be physically distinct from the code that calculates what goes into the UI, for example.

Commandment #6: When Possible, Let the Computer Do the Work

If the compiler can catch logical errors in your code and prevent either bad behavior, bugs, or outright crashes, we absolutely should take advantage of that. Of course, some languages have compilers that make this easier than others. Haskell, for example, has a famously strict compiler that results in programmers spending most of their effort just getting code to compile. Once it compiles, though, “it just works.” For those of you who have never written in a strongly typed functional language, this may seem ridiculous or impossible, but don’t take my word for it: it really is possible to live in a world without runtime errors. And it really is that magical.

Admittedly, not every language has a compiler or a syntax that lends itself to much (or in some cases any!) compile-time checking. For those that don’t, take a few minutes to research what optional strictness checks you can enable in your project and evaluate if they make sense for you. A short, non-comprehensive list of some common ones I’ve used lately for languages with lenient runtimes include:

Conclusion

This is by no means an exhaustive or perfect list of commandments for producing “good” (i.e., easily maintainable) code. That said, if every codebase I ever have to pick up in the future followed even half of the concepts in this list, I would have many fewer gray hairs and might even be able to add an extra five years to the end of my life. And I would certainly find work more enjoyable and less stressful.

This article is from Toptal