WSGI: The Server-Application Interface for Python

In 1993, the web was still in its infancy, with about 14 million users and a hundred websites. Pages were static but there was already a need to produce dynamic content, such as up-to-date news and data. Responding to this, Rob McCool and other contributors implemented the Common Gateway Interface (CGI) in the National Center for Supercomputing Applications (NCSA) HTTPd web server (the forerunner of Apache). This was the first web server that could serve content generated by a separate application.

Since then, the number of users on the Internet has exploded, and dynamic websites have become ubiquitous. When first learning a new language, or even first learning to code, developers soon enough want to know how to hook their code into the web.

Python on the Web and the Rise of WSGI

Since the creation of CGI, much has changed. The CGI approach became impractical, as it required the creation of a new process for each request, wasting memory and CPU. Some other low-level approaches emerged, like [FastCGI](http://www.fastcgi.com/) (1996) and mod_python (2000), providing different interfaces between Python web frameworks and the web server. As different approaches proliferated, the developer’s choice of framework ended up restricting the choice of web server, and vice versa.

To address this problem, in 2003 Phillip J. Eby proposed PEP-0333, the Python Web Server Gateway Interface (WSGI). The idea was to provide a high-level, universal interface between Python applications and web servers.

In 2010, PEP-3333 updated the WSGI interface to add Python 3 support. Nowadays, almost all Python frameworks use WSGI as a means, if not the only means, to communicate with their web servers. This is how Django, Flask and many other popular frameworks do it.

This article intends to provide the reader with a glimpse into how WSGI works, and allow the reader to build a simple WSGI application or server. It is not meant to be exhaustive, though, and developers intending to implement production-ready servers or applications should take a more thorough look into the WSGI specification.

The Python WSGI Interface

WSGI specifies simple rules that the server and application must conform to. Let’s start by reviewing this overall pattern.

The Python WSGI server-application interface.

Application Interface

In Python 3.5, the application interface looks like this:

def application(environ, start_response):
    body = b'Hello world!\n'
    status = '200 OK'
    headers = [('Content-type', 'text/plain')]
    start_response(status, headers)
    return [body]

In Python 2.7, this interface wouldn’t be much different; the only change would be that the body is represented by a str object, instead of a bytes one.

Though we’ve used a function in this case, any callable will do. The rules for the application object here are:

  • Must be a callable with environ and start_response parameters.
  • Must call the start_response callback before sending the body.
  • Must return an iterable with pieces of the document body.

Another example of an object that satisfies these rules and would produce the same effect is:

class Application:
    def __init__(self, environ, start_response):
        self.environ = environ
        self.start_response = start_response

    def __iter__(self):
        body = b'Hello world!\n'
        status = '200 OK'
        headers = [('Content-type', 'text/plain')]
        self.start_response(status, headers)
        yield body

Server Interface

A WSGI server might interface with this application like this:

def write(chunk):
    '''Write data back to client'''
    ...

def send_status(status):
    '''Send HTTP status code'''
    ...

def send_headers(headers):
    '''Send HTTP headers'''
    ...

def start_response(status, headers):
    '''WSGI start_response callable'''
    send_status(status)
    send_headers(headers)
    return write

# Make request to application
response = application(environ, start_response)
try:
    for chunk in response:
        write(chunk)
finally:
    if hasattr(response, 'close'):
        response.close()

As you may have noticed, the start_response callable returned a write callable that the application may use to send data back to the client, but that was not used by our application code example. This write interface is deprecated, and we can ignore it for now. It will be briefly discussed later in the article.

Another peculiarity of the server’s responsibilities is to call the optional close method on the response iterator, if it exists. As pointed out in Graham Dumpleton’s article here, it is an often-overlooked feature of WSGI. Calling this method, if it exists, allows the application to release any resources that it may still hold.
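
For illustration (this example is not from the original article, and the file path is hypothetical), here is an application whose response iterable defines close() so that a compliant server can trigger resource cleanup:

class FileResponse:
    def __init__(self, path):
        self._file = open(path, 'rb')

    def __iter__(self):
        # Stream the file in 8 KB chunks until EOF.
        return iter(lambda: self._file.read(8192), b'')

    def close(self):
        # Called by a compliant server once the response has been sent.
        self._file.close()

def application(environ, start_response):
    start_response('200 OK', [('Content-type', 'application/octet-stream')])
    return FileResponse('/tmp/report.bin')  # hypothetical path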

The Application Callable’s environ Argument

The environ parameter should be a dictionary object. It is used to pass request and server information to the application, much in the same way CGI does. In fact, all CGI environment variables are valid in WSGI and the server should pass all that apply to the application.

While there are many optional keys that can be passed, several are mandatory. Taking as an example the following GET request:

$ curl 'http://localhost:8000/auth?user=obiwan&token=123'

These are the keys that the server must provide, and the values they would take:

Key                 Value                      Comments
REQUEST_METHOD      "GET"
SCRIPT_NAME         ""                         server setup dependent
PATH_INFO           "/auth"
QUERY_STRING        "user=obiwan&token=123"
CONTENT_TYPE        ""
CONTENT_LENGTH      ""
SERVER_NAME         "127.0.0.1"                server setup dependent
SERVER_PORT         "8000"
SERVER_PROTOCOL     "HTTP/1.1"
HTTP_(...)                                     client-supplied HTTP headers
wsgi.version        (1, 0)                     tuple with the WSGI version
wsgi.url_scheme     "http"
wsgi.input                                     file-like object
wsgi.errors                                    file-like object
wsgi.multithread    False                      True if the server is multithreaded
wsgi.multiprocess   False                      True if the server runs multiple processes
wsgi.run_once       False                      True if the server expects this script to run only once (e.g., in a CGI environment)

The exception to this rule is that if one of these keys is empty (like CONTENT_TYPE in the table above), it can be omitted from the dictionary, and it will be assumed to correspond to the empty string.
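
To make the table concrete, a server handling the request above could assemble the dictionary roughly like this (a sketch, not a complete or production-grade environ):

import io
import sys

# A sketch of the environ a server might build for
# GET /auth?user=obiwan&token=123 on localhost:8000.
environ = {
    'REQUEST_METHOD': 'GET',
    'SCRIPT_NAME': '',
    'PATH_INFO': '/auth',
    'QUERY_STRING': 'user=obiwan&token=123',
    'SERVER_NAME': '127.0.0.1',
    'SERVER_PORT': '8000',
    'SERVER_PROTOCOL': 'HTTP/1.1',
    'wsgi.version': (1, 0),
    'wsgi.url_scheme': 'http',
    'wsgi.input': io.BytesIO(b''),  # empty body for a GET request
    'wsgi.errors': sys.stderr,      # errors end up in the server log
    'wsgi.multithread': False,
    'wsgi.multiprocess': False,
    'wsgi.run_once': False,
}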

wsgi.input and wsgi.errors

Most environ keys are straightforward, but two of them deserve a little more clarification: wsgi.input, which must contain a stream with the request body from the client, and wsgi.errors, where the application reports any errors it encounters. Errors sent from the application to wsgi.errors typically would be sent to the server error log.

These two keys must contain file-like objects; that is, objects that provide interfaces to be read or written to as streams, just like the object we get when we open a file or a socket in Python. This may seem tricky at first, but fortunately, Python gives us good tools to handle this.

First, what kind of streams are we talking about? As per WSGI definition, wsgi.input and wsgi.errors must handle bytes objects in Python 3 and str objects in Python 2. In either case, if we’d like to use an in-memory buffer to pass or get data through the WSGI interface, we can use the class io.BytesIO.

As an example, if we are writing a WSGI server, we could provide the request body to the application like this:

  • For Python 2.7
import io
...
request_data = 'some request body'
environ['wsgi.input'] = io.BytesIO(request_data)

  • For Python 3.5
import io
...
request_data = 'some request body'.encode('utf-8') # bytes object
environ['wsgi.input'] = io.BytesIO(request_data)

On the application side, if we wanted to turn a stream input we’ve received into a string, we’d want to write something like this:

  • For Python 2.7
readstr = environ['wsgi.input'].read() # returns str object

  • For Python 3.5
readbytes = environ['wsgi.input'].read() # returns bytes object
readstr = readbytes.decode('utf-8') # decode to get a str object
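
To tie the two sides together, here is a minimal, runnable sketch that serves the "Hello world!" application using the wsgiref reference server from the standard library (a convenience server, not meant for production use):

from wsgiref.simple_server import make_server

def application(environ, start_response):
    body = b'Hello world!\n'
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [body]

# Serve on http://localhost:8000 until interrupted.
httpd = make_server('', 8000, application)
print('Serving on http://localhost:8000 ...')
httpd.serve_forever()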


                

            


Python Design Patterns: For Sleek And Fashionable Code

Let’s say it again: Python is a high-level programming language with dynamic typing and dynamic binding. I would describe it as a powerful, high-level dynamic language. Many developers are in love with Python because of its clear syntax, well-structured modules and packages, and its enormous flexibility and range of modern features.

In Python, nothing obliges you to write classes and instantiate objects from them. If you don’t need complex structures in your project, you can just write functions. Even better, you can write a flat script for executing some simple and quick task without structuring the code at all.

At the same time Python is a 100 percent object-oriented language. How’s that? Well, simply put, everything in Python is an object. Functions are objects, first class objects (whatever that means). This fact about functions being objects is important, so please remember it.

So, you can write simple scripts in Python, or just open the Python terminal and execute statements right there (that’s so useful!). But at the same time, you can create complex frameworks, applications, libraries and so on. You can do so much in Python. There are of course a number of limitations, but that’s not the topic of this article.

However, because Python is so powerful and flexible, we need some rules (or patterns) when programming in it. So, let's see what patterns are, and how they relate to Python. We will also proceed to implement a few essential Python design patterns.

Why Is Python Good For Patterns?

Any programming language is good for patterns. In fact, patterns should be considered in the context of any given programming language. Both patterns and the language's syntax and nature impose limitations on our programming. The limitations that come from the language syntax and language nature (dynamic, functional, object oriented, and the like) can differ, as can the reasons behind their existence. The limitations coming from patterns are there for a reason; they are purposeful. That's the basic goal of patterns: to tell us how to do something and how not to do it. We'll speak about patterns, and especially Python design patterns, later.

Python is a dynamic and flexible language. Python design patterns are a great way of harnessing its vast potential.


Python’s philosophy is built on top of the idea of well thought out best practices. Python is a dynamic language (did I already say that?) and as such, already implements, or makes it easy to implement, a number of popular design patterns with a few lines of code. Some design patterns are built into Python, so we use them even without knowing. Other patterns are not needed, due to the nature of the language.

For example, Factory is a creational Python design pattern aimed at creating new objects, hiding the instantiation logic from the user. But creation of objects in Python is dynamic by design, so additions like Factory are not necessary. Of course, you are free to implement it if you want to. There might be cases where it would be really useful, but they’re an exception, not the norm.

What is so good about Python’s philosophy? Let’s start with this (explore it in the Python terminal):

>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

These might not be patterns in the traditional sense, but these are rules that define the “Pythonic” approach to programming in the most elegant and useful fashion.

We also have the PEP-8 code guidelines that help us structure our code. Following it is a must for me, with some appropriate exceptions, of course. By the way, these exceptions are encouraged by PEP-8 itself:

“But most importantly: know when to be inconsistent – sometimes the style guide just doesn’t apply. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don’t hesitate to ask!”

Combine PEP-8 with The Zen of Python (also a PEP - PEP-20), and you’ll have a perfect foundation to create readable and maintainable code. Add Design Patterns and you are ready to create every kind of software system with consistency and evolvability.

Python Design Patterns

What Is A Design Pattern?

Everything starts with the Gang of Four (GOF). Do a quick online search if you are not familiar with the GOF.

Design patterns are a common way of solving well known problems. Two main principles lie at the basis of the design patterns defined by the GOF:

  • Program to an interface not an implementation.
  • Favor object composition over inheritance.

Let’s take a closer look at these two principles from the perspective of Python programmers.

Program to an interface not an implementation

Think about Duck Typing. In Python, we don’t like to define interfaces and program classes according to these interfaces, do we? But, listen to me! This doesn’t mean we don’t think about interfaces; in fact, with Duck Typing we do that all the time.

Let’s say some words about the infamous Duck Typing approach to see how it fits in this paradigm: program to an interface.

If it looks like a duck and quacks like a duck, it's a duck!


We don’t bother with the nature of the object, we don’t have to care what the object is; we just want to know if it’s able to do what we need (we are only interested in the interface of the object).

Can the object quack? So, let it quack!

try:
    bird.quack()
except AttributeError:
    self.lol()

Did we define an interface for our duck? No! Did we program to the interface instead of the implementation? Yes! And, I find this so nice.

As Alex Martelli points out in his well known presentation about Design Patterns in Python, “Teaching the ducks to type takes a while, but saves you a lot of work afterwards!”
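
To make the idea concrete, here is a small self-contained sketch (the Duck and Robot classes are made up for illustration):

class Duck:
    def quack(self):
        print("Quack!")

class Robot:
    def beep(self):
        print("Beep.")

def make_it_quack(bird):
    # We never check the type; we only care whether the object can quack.
    try:
        bird.quack()
    except AttributeError:
        print("This one can't quack; moving on.")

make_it_quack(Duck())   # Quack!
make_it_quack(Robot())  # This one can't quack; moving on.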

Favor object composition over inheritance

Now, that’s what I call a Pythonic principle! I have created far fewer classes/subclasses by inheritance than I have by wrapping one class (or more often, several classes) in another.

Instead of doing this:

class User(DbObject):
    pass

We can do something like this:

class User:
    _persist_methods = ['get', 'save', 'delete']

    def __init__(self, persister):
        self._persister = persister

    def __getattr__(self, attribute):
        if attribute in self._persist_methods:
            return getattr(self._persister, attribute)

The advantages are obvious. We can restrict which methods of the wrapped class we expose. We can inject the persister instance at runtime! For example, today it’s a relational database, but tomorrow it could be whatever else, with the interface we need (again those pesky ducks).

Composition is elegant and natural to Python.
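
As a quick usage sketch (the JsonFilePersister below is hypothetical and only for illustration), the User class above can be handed any persister that quacks the right way:

import json

class JsonFilePersister:
    """A hypothetical persister that stores a single record in a JSON file."""
    def __init__(self, path):
        self._path = path

    def save(self, data):
        with open(self._path, 'w') as f:
            json.dump(data, f)

    def get(self):
        with open(self._path) as f:
            return json.load(f)

    def delete(self):
        open(self._path, 'w').close()

# The User class defined above simply delegates to whatever persister we hand it.
user = User(JsonFilePersister('user.json'))
user.save({'name': 'obiwan'})
print(user.get())  # {'name': 'obiwan'}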

Behavioral Patterns

Behavioural patterns involve communication between objects: how objects interact and fulfil a given task. According to the GOF, there are a total of 11 behavioral patterns: Chain of Responsibility, Command, Interpreter, Iterator, Mediator, Memento, Observer, State, Strategy, Template, Visitor.

Behavioural patterns deal with inter-object communication, controlling how various objects interact and perform different tasks.


I find these patterns very useful, but this does not mean the other pattern groups are not.

Iterator

Iterators are built into Python. This is one of the most powerful characteristics of the language. Years ago, I read somewhere that iterators make Python awesome, and I think this is still the case. Learn enough about Python iterators and generators and you’ll know everything you need about this particular Python pattern.
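
For example, a plain generator function is already a full-fledged iterator:

def countdown(n):
    # Each next() call resumes execution here and yields the next value.
    while n > 0:
        yield n
        n -= 1

for value in countdown(3):
    print(value)  # prints 3, 2, 1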

Chain of responsibility

This pattern gives us a way to treat a request using different methods, each one addressing a specific part of the request. You know, one of the best principles for good code is the Single Responsibility principle.

Every piece of code must do one, and only one, thing.

This principle is deeply integrated in this design pattern.

For example, if we want to filter some content we can implement different filters, each one doing one precise and clearly defined type of filtering. These filters could be used to filter offensive words, ads, unsuitable video content, and so on.

class ContentFilter(object):
    def __init__(self, filters=None):
        self._filters = list()
        if filters is not None:
            self._filters += filters

    def filter(self, content):
        for filter in self._filters:
            content = filter(content)
        return content

filter = ContentFilter([
    offensive_filter,
    ads_filter,
    porno_video_filter])
filtered_content = filter.filter(content)
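
The filters referenced above (offensive_filter and friends) are left undefined here; a hypothetical sketch of what they might look like:

# Hypothetical filters, each doing one precise, clearly defined job.
def offensive_filter(content):
    return content.replace('darn', '****')

def ads_filter(content):
    return content.replace('BUY NOW!!!', '')

def porno_video_filter(content):
    return content  # a real filter would inspect embedded video links

pipeline = ContentFilter([offensive_filter, ads_filter, porno_video_filter])
print(pipeline.filter('darn, BUY NOW!!! great content'))  # "****,  great content"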

Command

This is one of the first Python design patterns I implemented as a programmer. That reminds me: Patterns are not invented, they are discovered. They exist, we just need to find and put them to use. I discovered this one for an amazing project we implemented many years ago: a special purpose WYSIWYM XML editor. After using this pattern intensively in the code, I read more about it on some sites.

The command pattern is handy in situations when, for some reason, we need to start by preparing what will be executed and then to execute it when needed. The advantage is that encapsulating actions in such a way enables Python developers to add additional functionalities related to the executed actions, such as undo/redo, or keeping a history of actions and the like.

Let’s see what a simple and frequently used example looks like:

import os

class RenameFileCommand(object):
    def __init__(self, from_name, to_name):
        self._from = from_name
        self._to = to_name

    def execute(self):
        os.rename(self._from, self._to)

    def undo(self):
        os.rename(self._to, self._from)

class History(object):
    def __init__(self):
        self._commands = list()

    def execute(self, command):
        self._commands.append(command)
        command.execute()

    def undo(self):
        self._commands.pop().undo()

history = History()
history.execute(RenameFileCommand('docs/cv.doc', 'docs/cv-en.doc'))
history.execute(RenameFileCommand('docs/cv1.doc', 'docs/cv-bg.doc'))
history.undo()
history.undo()


Creational Patterns

Let’s start by pointing out that creational patterns are not commonly used in Python. Why? Because of the dynamic nature of the language.

Someone wiser than I once said that Factory is built into Python. It means that the language itself provides us with all the flexibility we need to create objects in a sufficiently elegant fashion; we rarely need to implement anything on top, like Singleton or Factory.

In one Python Design Patterns tutorial, I found a description of the creational design patterns that stated these design “patterns provide a way to create objects while hiding the creation logic, rather than instantiating objects directly using a new operator.”

That pretty much sums up the problem: We don’t have a new operator in Python!

Nevertheless, let’s see how we can implement a few, should we feel we might gain an advantage by using such patterns.

Singleton

The Singleton pattern is used when we want to guarantee that only one instance of a given class exists during runtime. Do we really need this pattern in Python? Based on my experience, it’s easier to simply create one instance intentionally and then use it instead of implementing the Singleton pattern.

But should you want to implement it, here is some good news: In Python, we can alter the instantiation process (along with virtually anything else). Remember the __new__() method I mentioned earlier? Here we go:

class Logger(object):
    def __new__(cls, *args, **kwargs):
        if not hasattr(cls, '_logger'):
            cls._logger = super(Logger, cls).__new__(cls, *args, **kwargs)
        return cls._logger

In this example, Logger is a Singleton.
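
A quick check confirms that every instantiation yields the same object:

logger1 = Logger()
logger2 = Logger()
print(logger1 is logger2)  # True: both names refer to the single shared instance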

These are the alternatives to using a Singleton in Python:

  • Use a module.
  • Create one instance somewhere at the top-level of your application, perhaps in the config file.
  • Pass the instance to every object that needs it. That's dependency injection, and it's a powerful and easily mastered mechanism.

Dependency Injection

I don’t intend to get into a discussion on whether dependency injection is a design pattern, but I will say that it’s a very good mechanism of implementing loose couplings, and it helps make our application maintainable and extendable. Combine it with Duck Typing and the Force will be with you. Always.

Duck? Human? Python does not care. It's flexible!


I listed it in the creational pattern section of this post because it deals with the question of when (or even better: where) the object is created. It’s created outside. Better to say that the objects are not created at all where we use them, so the dependency is not created where it is consumed. The consumer code receives the externally created object and uses it. For further reference, please read the most upvoted answer to this Stackoverflow question.

It’s a nice explanation of dependency injection and gives us a good idea of the potential of this particular technique. Basically the answer explains the problem with the following example: Don’t get things to drink from the fridge yourself, state a need instead. Tell your parents that you need something to drink with lunch.

Python offers us all we need to implement that easily. Think about its possible implementation in other languages such as Java and C#, and you’ll quickly realize the beauty of Python.

Let’s think about a simple example of dependency injection:

class Command:
    def __init__(self, authenticate=None, authorize=None):
        self.authenticate = authenticate or self._not_authenticated
        self.authorize = authorize or self._not_autorized

    def execute(self, user, action):
        self.authenticate(user)
        self.authorize(user, action)
        return action()

if in_sudo_mode:
    command = Command(always_authenticated, always_authorized)
else:
    command = Command(config.authenticate, config.authorize)
command.execute(current_user, delete_user_action)

We inject the authenticator and authorizer methods in the Command class. All the Command class needs is to execute them successfully without bothering with the implementation details. This way, we may use the Command class with whatever authentication and authorization mechanisms we decide to use in runtime.

We have shown how to inject dependencies through the constructor, but we can easily inject them by setting the object properties directly, unlocking even more potential:

command = Command()

if in_sudo_mode:
    command.authenticate = always_authenticated
    command.authorize = always_authorized
else:
    command.authenticate = config.authenticate
    command.authorize = config.authorize
command.execute(current_user, delete_user_action)

There is much more to learn about dependency injection; curious people would search for IoC, for example.

But before you do that, read another Stackoverflow answer, the most upvoted one to this question.

Again, we just demonstrated how implementing this wonderful design pattern in Python is just a matter of using the built-in functionalities of the language.

Let’s not forget what all this means: The dependency injection technique allows for very flexible and easy unit-testing. Imagine an architecture where you can change data storing on-the-fly. Mocking a database becomes a trivial task, doesn’t it? For further information, you can check out Toptal’s Introduction to Mocking in Python.
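
For example, with the Command class from above, a hypothetical unit test only needs to inject stubs in place of the real authentication and authorization callables:

# Hypothetical stubs standing in for real authentication and authorization.
calls = []

def fake_authenticate(user):
    calls.append('authenticate')

def fake_authorize(user, action):
    calls.append('authorize')

command = Command(fake_authenticate, fake_authorize)
result = command.execute('obiwan', lambda: 'user deleted')

assert result == 'user deleted'
assert calls == ['authenticate', 'authorize']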

You may also want to research the Prototype, Builder and Factory design patterns.

Structural Patterns

Facade

This may very well be the most famous Python design pattern.




Migrate Legacy Data Without Screwing It Up

Migrating legacy data is hard.

Many organizations have old and complex on-premise business CRM systems. Today, there are plenty of cloud SaaS alternatives, which come with many benefits: pay as you go and pay only for what you use. Therefore, they decide to move to the new systems.

Nobody wants to leave valuable data about customers in the old system and start with the empty new system, so we need to migrate this data. Unfortunately, data migration is not an easy task, as around 50 percent of deployment effort is consumed by data migration activities. According to Gartner, Salesforce is the leader of cloud CRM solutions. Therefore, data migration is a major topic for Salesforce deployment.

10 Tips For Successful Legacy Data Migration To Salesforce

How to ensure successful transition of legacy data into a new system
while preserving all history.

So, how can we ensure a successful transition of legacy data into a shiny new system and ensure we will preserve all of its history? In this article, I provide 10 tips for successful data migration. The first five tips apply to any data migration, regardless of the technology used.

Data Migration in General

1. Make Migration a Separate Project

In the software deployment checklist, data migration is not an “export and import” item handled by a clever “push one button” data migration tool that has predefined mapping for target systems.

Data migration is a complex activity, deserving a separate project, plan, approach, budget, and team. An entity level scope and plan must be created at the project’s beginning, ensuring no surprises, such as “Oh, we forgot to load those clients’ visit reports, who will do that?” two weeks before the deadline.

The data migration approach defines whether we will load the data in one go (also known as the big bang), or whether we will load small batches every week.

This is not an easy decision, though. The approach must be agreed upon and communicated to all business and technical stakeholders so that everybody is aware of when and what data will appear in the new system. This applies to system outages too.

2. Estimate Realistically

Do not underestimate the complexity of the data migration. Many time-consuming tasks accompany this process, which may be invisible at the project’s beginning.

For example, loading specific data sets for training purposes with a bunch of realistic data, but with sensitive items obfuscated, so that training activities do not generate email notifications to clients.

The basic factor for estimation is the number of fields to be transferred from a source system to a target system.

Some amount of time is needed in different stages of the project for every field, including understanding the field, mapping the source field to the target field, configuring or building transformations, performing tests, measuring data quality for the field, and so on.

Using clever tools, such as Jitterbit, Informatica Cloud Data Wizard, Starfish ETL, Midas, and the like, can reduce this time, especially in the build phase.

In particular, understanding the source data – the most crucial task in any migration project – cannot be automated by tools, but requires analysts to take time going through the list of fields one by one.

The simplest estimate of the overall effort is one man-day for every field transferred from the legacy system.

An exception is data replication between the same source and target schemas without further transformation – sometimes known as 1:1 migration – where we can base the estimate on the number of tables to copy.

A detailed estimate is an art of its own.

3. Check Data Quality

Do not overestimate the quality of source data, even if no data quality issues are reported from the legacy systems.

New systems have new rules, which may be violated with legacy data. Here’s a simple example. Contact email can be mandatory in the new system, but a 20-year-old legacy system may have a different point of view.

There can be mines hidden in historical data that have not been touched for a long time but could activate when transferring to the new system. For example, old data using European currencies that no longer exist needs to be converted to Euros; otherwise, the old currencies must be added to the new system.

Data quality significantly influences effort, and the simple rule is: The further we go in history, the bigger mess we will discover. Thus, it is vital to decide early on how much history we want to transfer into the new system.

4. Engage Business People

Business people are the only ones who truly understand the data and who can therefore decide what data can be thrown away and what data to keep.

It is important to have somebody from the business team involved during the mapping exercise, and for future backtracking, it is useful to record mapping decisions and the reasons for them.

Since a picture is worth more than a thousand words, load a test batch into the new system, and let the business team play with it.

Even if data migration mapping is reviewed and approved by the business team, surprises can appear once the data shows up in the new system’s UI.

“Oh, now I see, we have to change it a bit,” becomes a common phrase.

Failing to engage subject matter experts, who are usually very busy people, is the most common cause of problems after a new system goes live.

5. Aim for Automated Migration Solution

Data migration is often viewed as a one-time activity, and developers tend to end up with solutions full of manual actions hoping to execute them only once. But there are many reasons to avoid such an approach.

  • If migration is split into multiple waves, we have to repeat the same actions multiple times.
  • Typically, there are at least three migration runs for every wave: a dry run to test the performance and functionality of data migration, a full data validation load to test the entire data set and to perform business tests, and of course, production load. The number of runs increases with poor data quality. Improving data quality is an iterative process, so we need several iterations to reach the desired success ratio.

Thus, even if migration is a one-time activity by nature, having manual actions can significantly slow down your operations.

Salesforce Data Migration

Next we will cover five tips for a successful Salesforce migration. Keep in mind, these tips are likely applicable to other cloud solutions as well.

6. Prepare for Lengthy Loads

Performance is one of the biggest tradeoffs, if not the biggest, when moving from an on-premise to a cloud solution – Salesforce not excluded.

On-premise systems usually allow for direct data load into an underlying database, and with good hardware, we can easily reach millions of records per hour.

But, not in the cloud. In the cloud, we are heavily limited by several factors.

  • Network latency – Data is transferred via the internet.
  • Salesforce application layer – Data is moved through a thick API multitenancy layer until it lands in the underlying Oracle databases.
  • Custom code in Salesforce – Custom validations, triggers, workflows, duplication detection rules, and so on – many of which disable parallel or bulk loads.

As a result, load performance can be thousands of accounts per hour.

It can be less, or it can be more, depending on factors such as the number of fields, validations, and triggers. But it is several grades slower than a direct database load.

Performance degradation, which is dependent on the volume of the data in Salesforce, must also be considered.

It is caused by indexes in the underlying RDBMS (Oracle) used for checking foreign keys, unique fields, and the evaluation of duplication rules. The basic formula is approximately a 50 percent slowdown for every grade of 10, caused by the O(log N) time-complexity portion of sort and B-tree operations.

Moreover, Salesforce has many resource usage limits.

One of them is the Bulk API limit, set to 5,000 batches in 24-hour rolling windows, with a maximum of 10,000 records in each batch.

So, the theoretical maximum is 50 million records loaded in 24 hours.

In a real project, the maximum is much lower due to limited batch size when using, for example, custom triggers.

This has a strong impact on the data migration approach.

Even for medium-sized datasets (from 100,000 to 1 million accounts), the big bang approach is out of the question, so we must split data into smaller migration waves.

This, of course, impacts the entire deployment process and increases the migration complexity because we will be adding data increments into a system already populated by previous migrations and data entered by users.

We must also consider this existing data in the migration transformations and validations.

Further, lengthy loads can mean we cannot perform migrations during a system outage.

If all users are located in one country, we can leverage an eight-hour outage during the night.

But for a company, such as Coca-Cola, with operations all over the world, that is not possible. Once we have U.S., Japan, and Europe in the system, we span all time zones, so Saturday is the only outage option that doesn’t affect users.

And that may not be enough, so, we must load while online, when users are working with the system.

7. Respect Migration Needs in Application Development

Application components, such as validations and triggers, should be able to handle data migration activities. Hard disablement of validations at the time of the migration load is not an option if the system must be online. Instead, we have to implement different logic in validations for changes performed by a data migration user.

  • Date fields should not be compared to the actual system date because that would disable the loading of historical data. For example, validation must allow entering a past account start date for migrated data.
  • Mandatory fields, which may not be populated with historical data, must be implemented as non-mandatory, but with validation sensitive to the user, thus allowing empty values for data coming from the migration, but rejecting empty values coming from regular users via the GUI.
  • Triggers, especially those sending new records to the integration, must be able to be switched on/off for the data migration user in order to prevent flooding the integration with migrated data.

Another trick is using a Legacy ID or Migration ID field in every migrated object. There are two reasons for this. The first is obvious: to keep the ID from the old system for backtracking; after the data is in the new system, people may still want to search for their accounts using the old IDs, found in places such as emails, documents, and bug-tracking systems. Bad habit? Maybe. But users will thank you if you preserve their old IDs. The second reason is technical and comes from the fact that Salesforce does not accept explicitly provided IDs for new records (unlike Microsoft Dynamics) but generates them during the load. The problem arises when we want to load child objects, because we have to assign them the IDs of their parent objects. Since we will know those IDs only after the parents are loaded, the load becomes a cumbersome multi-step exercise.

Let’s use Accounts and their Contacts, for example:

  1. Generate data for Accounts.
  2. Load Accounts into Salesforce, and receive generated IDs.
  3. Incorporate new Account IDs in Contact data.
  4. Generate data for Contacts.
  5. Load Contacts in Salesforce.

We can do this more simply by loading Accounts with their Legacy IDs stored in a special external field. This field can be used as a parent reference, so when loading Contacts, we simply use the Account Legacy ID as a pointer to the parent Account:

  1. Generate data for Accounts, including Legacy ID.
  2. Generate data for Contacts, including Account Legacy ID.
  3. Load Accounts into Salesforce.
  4. Load Contacts in Salesforce, using Account Legacy ID as parent reference.

The nice thing here is that we have separated the generation and loading phases, which allows for better parallelism, decreased outage time, and so on.
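
As a rough sketch in plain Python (all field names here, such as Legacy_ID__c, are hypothetical and depend on your actual data model and loading tool), the generation phase can prepare both payloads up front:

# Hypothetical legacy extracts; all field and column names are illustrative
# and depend on your data model and loading tool.
legacy_accounts = [
    {'legacy_id': 'A-001', 'name': 'Acme Ltd.'},
    {'legacy_id': 'A-002', 'name': 'Globex Corp.'},
]
legacy_contacts = [
    {'legacy_id': 'C-100', 'account_legacy_id': 'A-001', 'last_name': 'Kenobi'},
]

# Generation phase: both payloads are produced before anything is loaded,
# because a Contact points to its parent through the Account's Legacy ID,
# not through a Salesforce ID that exists only after the Account load.
account_payload = [
    {'Legacy_ID__c': a['legacy_id'], 'Name': a['name']}
    for a in legacy_accounts
]
contact_payload = [
    {'Legacy_ID__c': c['legacy_id'],
     'Account_Legacy_ID__c': c['account_legacy_id'],  # resolved to the parent by the load tool
     'LastName': c['last_name']}
    for c in legacy_contacts
]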

8. Be Aware of Salesforce Specific Features

Like any system, Salesforce has plenty of tricky parts of which we should be aware in order to avoid unpleasant surprises during data migration. Here are a handful of examples:

  • Some changes on active Users automatically generate email notifications to user emails. Thus, if we want to play with user data, we need to deactivate users first and activate after changes are completed. In test environments, we scramble user emails so that notifications are not fired at all. Since active users consume costly licenses, we are not able to have all users active in all test environments. We have to manage subsets of active users, for example, to activate just those in a training environment.
  • Inactive users, for some standard objects such as Account or Case, can be assigned only after granting the system permission “Update Records with Inactive Owners,” but they can be assigned, for example, to Contacts and all custom objects.
  • When Contact is deactivated, all opt out fields are silently turned on.
  • When loading a duplicate Account Team Member or Account Share object, the existing record is silently overwritten. However, when loading a duplicate Opportunity Partner, the record is simply added resulting in a duplicate.
  • System fields, such as Created Date, Created By ID, Last Modified Date, and Last Modified By ID, can be explicitly written only after granting a new system permission “Set Audit Fields upon Record Creation.”
  • The history of field value changes cannot be migrated at all.
  • Owners of knowledge articles cannot be specified during the load but can be updated later.
  • The tricky part is the storing of content (documents, attachments) into Salesforce. There are multiple ways to do it (using Attachments, Files, Feed attachments, Documents), and each way has its pros and cons, including different file size limits.
  • Picklist fields force users to select one of the allowed values, for example, a type of account. But when loading data using Salesforce API (or any tool built upon it, such as Apex Data Loader or Informatica Salesforce connector), any value will pass.

The list goes on, but the bottom line is: Get familiar with the system, and learn what it can do and what it cannot do before you make assumptions. Do not assume standard behavior, especially for core objects. Always research and test.

9. Do Not Use Salesforce as a Data Migration Platform

It is very tempting to use Salesforce as a platform for building a data migration solution, especially for Salesforce developers. It is the same technology for the data migration solution as for the Salesforce application customization, the same GUI, the same Apex programming language, the same infrastructure. Salesforce has objects which can act as tables, and a kind of SQL language, Salesforce Object Query Language (SOQL). However, please do not use it; it would be a fundamental architectural flaw.

Salesforce is an excellent SaaS application with a lot of nice features, such as advanced collaboration and rich customization, but mass processing of data is not one of them. The three most significant reasons are:

  • Performance – Processing of data in Salesforce is several grades slower than in RDBMS.
  • Lack of analytical features – Salesforce SOQL does not support complex queries and analytical functions; these would have to be implemented in Apex, which would degrade performance even more.
  • Architecture – Putting a data migration platform inside a specific Salesforce environment is not very convenient. Usually, we have multiple environments for specific purposes, often created ad hoc, so we can end up spending a lot of time just on code synchronization. Plus, you would also be relying on the connectivity and availability of that specific Salesforce environment.

Instead, build a data migration solution in a separate instance (it could be a cloud or on-premise) using an RDBMS or ETL platform. Connect it with source systems and target the Salesforce environments you want, move the data you need into your staging area and process it there. This will allow you to:

  • Leverage the full power and capabilities of the SQL language or ETL features.
  • Have all code and data in one place so that you can run analyses across all systems.
    • For example, you can combine the newest configuration from the most up-to-date test Salesforce environment with real data from the production Salesforce environment.
  • You are not so dependent upon the technology of the source and target systems and you can reuse your solution for the next project.

10. Keep Salesforce Metadata Under Oversight

At the project's beginning, we usually grab a list of Salesforce fields and start the mapping exercise. During the project, it often happens that new fields are added into Salesforce by the application development team, or that some field properties are changed. We can ask the application team to notify the data migration team about every data model change, but that doesn't always work. To be safe, we need to have all data model changes under supervision.

A common way to do this is to download, on a regular basis, migration-relevant metadata from Salesforce into some metadata repository. Once we have this, we can not only detect changes in the data model, but we can also compare data models of two Salesforce environments.

What metadata to download:

  • A list of objects with their labels and technical names and attributes such as creatable or updatable.
  • A list of fields with their attributes (better to get all of them).
  • A list of picklist values for picklist fields. We will need them to map or validate input data for correct values.
  • A list of validations, to make sure new validations are not creating problems for migrated data.

How to download metadata from Salesforce? Well, there is no standard metadata method, but there are multiple options:

  • Generate Enterprise WSDL – In the Salesforce web application, navigate to the Setup / API menu and download the strongly typed Enterprise WSDL, which describes all the objects and fields in Salesforce (but not picklist values or validations).
  • Call the Salesforce describeSObjects web service, directly or by using a Java or C# wrapper (consult the Salesforce API). This way, you get what you need, and this is the recommended way to export the metadata; a rough sketch follows this list.
  • Use any of the numerous alternative tools available on the internet.
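
As a sketch of that route, the following uses the REST describe resource (the REST counterpart of describeSObjects) to pull fields and picklist values; the instance URL, access token, API version and object list are assumptions you would replace with your own:

import requests

# Assumptions (replace with your own): instance URL, access token, API
# version and the list of objects to describe.
INSTANCE_URL = 'https://yourInstance.salesforce.com'
ACCESS_TOKEN = '...'  # obtained beforehand, e.g. via OAuth
HEADERS = {'Authorization': 'Bearer ' + ACCESS_TOKEN}

def describe_object(name):
    url = '{}/services/data/v52.0/sobjects/{}/describe'.format(INSTANCE_URL, name)
    response = requests.get(url, headers=HEADERS)
    response.raise_for_status()
    return response.json()

metadata = {}
for sobject in ['Account', 'Contact']:  # illustrative object list
    description = describe_object(sobject)
    metadata[sobject] = {
        # Field name, type and whether it is updateable.
        'fields': [(f['name'], f['type'], f['updateable']) for f in description['fields']],
        # Allowed values for picklist fields, for mapping and validation.
        'picklists': {f['name']: [v['value'] for v in f['picklistValues']]
                      for f in description['fields'] if f['type'] == 'picklist'},
    }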

Prepare for the Next Data Migration

Cloud solutions, such as Salesforce, are ready instantly. If you are happy with the built-in functionalities, just log in and use them. However, Salesforce, like any other cloud CRM solution, brings specific data migration problems to be aware of, in particular regarding performance and resource limits.

Moving legacy data into a new system is always a journey, sometimes a journey into history hidden in data from past years. In this article, based on a dozen migration projects, I presented 10 tips for migrating legacy data while successfully avoiding the most common pitfalls.

The key is to understand what the data reveals. So, before you start the data migration, make sure you are well prepared for the potential problems your data may hold.

This article was originally posted on Toptal 

 



Application Development with Rapid Application Development Framework AllcountJS

The idea of Rapid Application Development (RAD) was born as a response to traditional waterfall development models. Many variations of RAD exist; for example, Agile development and the Rational Unified Process. However, all such models have one thing in common: they aim to yield maximum business value with minimal development time through prototyping and iterative development. To accomplish this, the Rapid Application Development model relies on tools that ease the process. In this article, we shall explore one such tool, and how it can be used to focus on business value and optimization of the development process.

AllcountJS is an emerging open source framework built with rapid application development in mind. It is based on the idea of declarative application development using JSON-like configuration code that describes the structure and behavior of the application. The framework has been built on top of Node.js, Express, MongoDB and relies heavily on AngularJS and Twitter Bootstrap. Although it is reliant on declarative patterns, the framework still allows further customization through direct access to API where needed.


Why AllcountJS as Your RAD Framework?

According to Wikipedia, there are at least one hundred tools that promise rapid application development, but this raises the question: how rapid is “rapid.” Do these tools allow a particular data-centric application to be developed in a few hours? Or, perhaps, it is “rapid” if the application can be developed in a few days or a few weeks. Some of these tools even claim that a few minutes is all it takes to produce a working application. However, it is unlikely that you could build a useful application in under five minutes and still claim to have satisfied every business need. AllcountJS doesn’t claim to be such a tool; what AllcountJS offers is a way to prototype an idea in a short period of time.

With AllcountJS framework, it is possible to build an application with a themeable auto-generated user interface, user management features, RESTful API and a handful of other features with minimal effort and time. It is possible to use AllcountJS for a wide variety of use cases, but it best suits applications where you have different collections of objects with different views for them. Typically, business applications are a good fit for this model.

AllcountJS has been used to build allcountjs.com, plus a project tracker for it. It is worth noting that allcountjs.com is a customized AllcountJS application, and that AllcountJS allows both static and dynamic views to be combined with little hassle. It even allows dynamically loaded parts to be inserted into static content. For example, AllcountJS manages a collection of demo application templates. There is a demo widget on the main page of allcountjs.com that loads a random application template from that collection. A handful of other sample applications are available in the gallery at allcountjs.com.

Getting Started

To demonstrate some of the capabilities of RAD framework AllcountJS, we will create a simple application for Toptal, which we will call Toptal Community. If you follow our blog you may already know that a similar application was built using Hoodie as part of one of our earlier blog posts. This application will allow community members to sign up, create events and apply to attend them.

In order to set up the environment, you should install Node.js, MongoDB and Git. Then, install the AllcountJS CLI by invoking an “npm install” command and performing project init:

npm install -g allcountjs-cli
allcountjs init toptal-community-allcount
cd toptal-community-allcount
npm install

AllcountJS CLI will ask you to enter some info about your project in order to pre-fill package.json.

AllcountJS can be used as standalone server or as a dependency. In our first example we aren’t going to extend AllcountJS, so a standalone server should just work for us.

Inside the newly created app-config directory, we will replace the contents of the main.js JavaScript file with the following snippet of code:

A.app({
  appName: "Toptal Community",
  onlyAuthenticated: true,
  allowSignUp: true,
  appIcon: "rocket",
  menuItems: [{
    name: "Events",
    entityTypeId: "Event",
    icon: "calendar"
  }, {
    name: "My Events",
    entityTypeId: "MyEvent",
    icon: "calendar"
  }],
  entities: function(Fields) {
    return {
      Event: {
        title: "Events",
        fields: {
          eventName: Fields.text("Event").required(),
          date: Fields.date("Date").required(),
          time: Fields.text("Starts at").masked("99:99").required(),
          appliedUsers: Fields.relation("Applied users", "AppliedUser", "event")
        },
        referenceName: "eventName",
        sorting: [['date', -1], ['time', -1]],
        actions: [{
          id: "apply",
          name: "Apply",
          actionTarget: 'single-item',
          perform: function (User, Actions, Crud) {
            return Crud.actionContextCrud().readEntity(Actions.selectedEntityId()).then(function (eventToApply) {
              var userEventCrud = Crud.crudForEntityType('UserEvent');
              return userEventCrud.find({filtering: {"user": User.id, "event": eventToApply.id}}).then(function (events) {
                if (events.length) {
                  return Actions.modalResult("Can't apply to event", "You've already applied to this event");
                } else {
                  return userEventCrud.createEntity({
                    user: {id: User.id},
                    event: {id: eventToApply.id},
                    date: eventToApply.date,
                    time: eventToApply.time
                  }).then(function () { return Actions.navigateToEntityTypeResult("MyEvent") });
                }
              });
            })
          }
        }]
      },
      UserEvent: {
        fields: {
          user: Fields.fixedReference("User", "OnlyNameUser").required(),
          event: Fields.fixedReference("Event", "Event").required(),
          date: Fields.date("Date").required(),
          time: Fields.text("Starts at").masked("99:99").required()
        },
        filtering: function (User) { return {"user.id": User.id} },
        sorting: [['date', -1], ['time', -1]],
        views: {
          MyEvent: {
            title: "My Events",
            showInGrid: ['event', 'date', 'time'],
            permissions: {
              write: [],
              delete: null
            }
          },
          AppliedUser: {
            permissions: {
              write: []
            },
            showInGrid: ['user']
          }
        }
      },
      User: {
        views: {
          OnlyNameUser: {
            permissions: {
              read: null,
              write: ['admin']
            }
          },
          fields: {
            username: Fields.text("User name")
          }
        }
      }
    }
  }
});

Although AllcountJS works with Git repositories, for the sake of simplicity we will not use one in this tutorial. To run the Toptal Community application, all we have to do is invoke the AllcountJS CLI run command in the toptal-community-allcount directory.

allcountjs run

It is worth noting that MongoDB should be running when this command is executed. If all goes well, the application should be up and running at http://localhost:9080.

To log in, please use the username "admin" and the password "admin".

Less Than 100 Lines

You may have noticed that the application defined in main.js took only 91 lines of code. These lines include the declaration of all the behaviors that you may observe when you navigate to http://localhost:9080. So, what exactly is happening, under the hood? Let us take a closer look at each aspect of the application, and see how the code relates to them.

Sign in & Sign up

The first page you see after opening the application is a sign in. This doubles as a sign up page, assuming that the checkbox - labelled “Sign Up” - is checked before submitting the form.


This page is shown because the main.js file declares that only authenticated users may use this application. Moreover, it enables the ability for users to sign up from this page. The following two lines are all that was necessary for this:

A.app({
  ...,
  onlyAuthenticated: true,
  allowSignUp: true,
  ...
})

Welcome Page

After signing in, you’ll be redirected to a welcome page with an application menu. This portion of the application is generated automatically, based on the menu items defined under the “menuItems” key.


Along with a couple of other relevant configurations, the menu is defined in the main.js file as follows:

A.app({
  ...,
  appName: "Toptal Community",
  appIcon: "rocket",
  menuItems: [{
    name: "Events",
    entityTypeId: "Event",
    icon: "calendar"
  }, {
    name: "My Events",
    entityTypeId: "MyEvent",
    icon: "calendar"
  }],
  ...
});

AllcountJS uses Font Awesome icons, so all icon names referenced in the configuration are mapped to Font Awesome icon names.

Browsing & Editing Events

After clicking on “Events” from the menu, you’ll be taken to the Events view shown in the screenshot below. It is a standard AllcountJS view that provides some generic CRUD functionalities on the corresponding entities. Here, you may search for events, create new events and edit or delete existing ones. There are two modes of this CRUD interface: list and form. This portion of the application is configured through the following few lines of JavaScript code.

A.app({
  ...,
  entities: function(Fields) {
    return {
      Event: {
        title: "Events",
        fields: {
          eventName: Fields.text("Event").required(),
          date: Fields.date("Date").required(),
          time: Fields.text("Starts at").masked("99:99").required(),
          appliedUsers: Fields.relation("Applied users", "AppliedUser", "event")
        },
        referenceName: "eventName",
        sorting: [['date', -1], ['time', -1]],
        ...
      }
    }
  }
});

This example shows how entity descriptions are configured in AllcountJS. Notice how we are using a function to define the entities; every property of an AllcountJS configuration can be a function. These functions can request dependencies to be resolved through their argument names: before a function is called, the appropriate dependencies are injected. Here, “Fields” is one of the AllcountJS configuration APIs used to describe entity fields. The property “entities” contains name-value pairs where the name is an entity-type identifier and the value is its description. An entity-type for events is described in this example, where the title is “Events.” Other configurations, such as the default sort ordering, the reference name, and the like, may also be defined here. The default sort order is defined through an array of field names and directions, while the reference name is defined through a string (read more here).


This particular entity-type has been defined as having four fields: “eventName,” “date,” “time” and “appliedUsers,” the first three of which are persisted in the database. These fields are mandatory, as indicated by the use of “required().” Values in these fields with such rules are validated before the form is submitted on the front-end as shown in the screenshot below. AllcountJS combines both client-side and server-side validations to provide the best user experience. The fourth field is a relationship that bears a list of users who have applied to attend the event. Naturally, this field is not persisted in the database, and is populated by selecting only those AppliedUser entities relevant to the event.


Applying to Attend Events

When a user selects a particular event, the toolbar shows a button labelled “Apply.” Clicking on it adds the event to the user’s schedule. In AllcountJS, actions similar to this can be configured by simply declaring them in the configuration:

actions: [{
  id: "apply",
  name: "Apply",
  actionTarget: 'single-item',
  perform: function (User, Actions, Crud) {
    return Crud.actionContextCrud().readEntity(Actions.selectedEntityId()).then(function (eventToApply) {
      var userEventCrud = Crud.crudForEntityType('UserEvent');
      return userEventCrud.find({filtering: {"user": User.id, "event": eventToApply.id}}).then(function (events) {
        if (events.length) {
          return Actions.modalResult("Can't apply to event", "You've already applied to this event");
        } else {
          return userEventCrud.createEntity({
            user: {id: User.id},
            event: {id: eventToApply.id},
            date: eventToApply.date,
            time: eventToApply.time
          }).then(function () { return Actions.navigateToEntityTypeResult("MyEvent") });
        }
      });
    })
  }
}]

The property “actions” of any entity type takes an array of objects that describe the behavior of each custom action. Each object has an “id” property which defines a unique identifier for the action, the property “name” defines the display name and the property “actionTarget” is used to define the action context. Setting “actionTarget” to “single-item” indicates that the action should be performed with a particular event. A function defined under the property “perform” is the logic executed when this action is performed, typically when the user clicks on the corresponding button.

Dependencies may be requested by this function. For instance, in this example the function depends on “User,” “Actions” and “Crud.” When an action occurs, a reference to the user invoking the action can be obtained by requiring the “User” dependency. The “Crud” dependency, which allows the manipulation of database state for these entities, is also requested here. Two methods return an instance of a Crud object: “actionContextCrud()” returns CRUD for the “Event” entity-type, since the action “Apply” belongs to it, while “crudForEntityType()” returns CRUD for any entity type identified by its type ID.


The implementation of the action begins by checking if this event is already scheduled for the user, and if not, it creates one. If it is already scheduled, a dialog box is shown by returning the value from calling “Actions.modalResult()”. Besides showing a modal, an action may perform different types of operations in a similar way, such as “navigate to view,” “refresh view,” “show dialog,” and so on.

implementation of the action

User Schedule of Applied Events

After successfully applying to an event, the browser is redirected to the “My Events” view, which shows a list of events the user has applied to. The view is defined by the following configuration:

UserEvent: {
    fields: {
        user: Fields.fixedReference("User", "OnlyNameUser").required(),
        event: Fields.fixedReference("Event", "Event").required(),
        date: Fields.date("Date").required(),
        time: Fields.text("Starts at").masked("99:99").required()
    },
    filtering: function (User) { return {"user.id": User.id} },
    sorting: [['date', -1], ['time', -1]],
    views: {
        MyEvent: {
            title: "My Events",
            showInGrid: ['event', 'date', 'time'],
            permissions: {
                write: [],
                delete: null
            }
        },
        AppliedUser: {
            permissions: {
                write: []
            },
            showInGrid: ['user']
        }
    }
},

In this case, we are using a new configuration property, “filtering.” As with our earlier example, this function also relies on the “User” dependency. If the function returns an object, it is treated as a MongoDB query; the query filters the collection for events that belong only to the current user.

Another interesting property is “views.” A view is a regular entity-type, but its MongoDB collection is the same as that of its parent entity-type. This makes it possible to create visually different views for the same data in the database. In fact, we used this feature to create two different views for “UserEvent”: “MyEvent” and “AppliedUser.” Since the prototype of the sub-views is set to the parent entity type, properties that are not overridden are “inherited” from the parent type.

views

Listing Event Attendees

After applying to an event, other users may see a list of all the users planning to attend. This is generated as a result of the following configuration elements in main.js:

AppliedUser: {
    permissions: {
        write: []
    },
    showInGrid: ['user']
}
// ...
appliedUsers: Fields.relation("Applied users", "AppliedUser", "event")

“AppliedUser” is a read-only view for a “MyEvent” entity-type. This read-only permission is enforced by assigning an empty array to the “write” property of the permissions object. Also, as the “read” permission isn’t restricted, any user can see the list of users who have applied to attend the event.



Usability for Conversion: Stop Using Fads, Start Using Data

When it comes to creating and designing a product, we are looking for the best solution to ensure we meet our goal. Ultimately, our goal will always be to convince the customer to buy our product or use our service; in other words, to convert leads into sales. But what can we do to ensure the highest possible conversion rate? When we look around for ways to understand what works for conversion and what doesn’t, we encounter plenty of fads and trends that presumptuously claim to know exactly what we need to do: change a button to a particular color, use a particular picture or icon, employ a certain layout. However, there is no one-size-fits-all “magic bullet” for conversion. Every demographic is different, so we need to use our data and our knowledge of our specific target audience to create designs that convert. If there is one single piece of advice that’s most important, it’s to focus on usability.

Usability and Conversion

Stop following trends to achieve your conversion rates.

Building your product and setting it loose.

You or your client have just launched your new website or product, but you are noticing that your conversion rate is dramatically low. To use an example for this exercise, let’s give a percentage: 0.3%. That’s only 3 out of every 1,000 leads converting into customers. Presumably not what you’re looking for.

Concerned, you run off to Google and search ways to convert users and you find articles that say:

“Red converts better than green!” “Orange beats any color!” “Cat pictures! Everybody loves kittens!” “Pictures of people convert better!” “Pictures of products convert better!” “Company logos make you $$”

While each of these approaches may have in fact been useful in one or more scenarios, the likelihood that these “magic answers” are right for you is often slim at best. There’s no data behind the claim that making the button orange in all of our products will help our product convert better.

Another thing to point out when we read articles like the ones above is the quality of leads that the company is receiving. Although we want as many leads as possible, we also want to make sure that these are quality leads, so we can get better data and keep improving our product for that target audience. Having a ton of leads might sound exciting, but if we end up wasting our time on leads that don’t go anywhere, we are wasting money and losing opportunities on the leads that will move our company forward and help our product grow.

How do you identify your quality leads?

There are a couple of things to think about before you start the process of optimizing your site for conversion. Here’s a list that you should consider before you start optimizing your site:

  • Know Your Audience – What is your audience like? Their demographics? Their location? Is your website tailored to them?

  • Know Your Goals – What is the ultimate goal for the site? Are you looking to store emails? Get people to sign up for a service? Buy a product?

  • Know Your Usability Scores – How is your site performing on mobile? Is it responsive? How fast does the site load in the browser? How’s the navigation?

  • Check Your Content – Is your content easy to read? Is the language geared to the personality and education level of your targeted audience? Does the content clearly communicate your message/goal?

  • Check Your Fallout – Where are you losing your audience? What is your bounce rate? What is the average time visitors are spending on your pages? What are the high performing pages? What are the low performing pages?

Once you have all of these questions answered, you can start optimizing your site. You will notice that the checklist doesn’t touch on your colors or designs. That’s deliberate: once you define your audience, analyze your website, and have clear goals, you will quickly see whether your design reflects them or misses the mark.

Find your audience.

The most important thing is your audience. To find your audience you have to look at the process before you started working on the site.


Who are you targeting?

It is important that you have a precise definition of who you are targeting. If your product is intended for people around the age of 18 - 24, your content, design, and usability should reflect that. The easiest way to come up with all these descriptions is to create your own personas. Personas are profiles of fictitious (or real) people that describe a specific member of your audience. Write up everything that defines them, like name, age, ethnicity, occupation, tech savviness, etc.

You can use tools like Google Analytics or other paid analytics tools to obtain good in-depth information about your users. You can also perform user testing, either through services like UserTesting.com or in person, and develop your personas from the results.

What are you targeting them for?

Another thing you need to be clear about before you even start the design is the purpose of the site. Are you selling the user goods or are you providing a service? How does the site align with the company’s mission and vision? How does the goal align with your personas?

Defining usability data.

Once you have all this data written down, you can then proceed to check your usability stats. When it comes to mobile websites, there is a great tool I like to use to check my user experience (you’ve probably heard of it too): Google PageSpeed.

As a rule of thumb, you want your User Experience grade to be above 90. This means that things are clear and easy to tap/click, see, and navigate through your site. You also want to make sure your site speed score is at least 80. If you are at 70, check the suggestions it offers to help you optimize your page for speed. Use services like CloudFlare, Highwinds, Akamai, etc. for caching and CDN (Content Delivery Network) to help improve speed.

For your desktop, I would suggest you use tools like Crazy Egg, Google Analytics, Visual Web Optimizer, or any other heat map or visual guide software. These will help you find where people are focusing the most and identify pitfalls such as areas that don’t get much attention. If you combine some of these products (heat maps, trackers, and Google Analytics), you can identify your fallouts and the pages that aren’t performing the way you want them to.

Another great way to test your product is by performing live user testing either at your site or at a formal user testing facility. UserTesting.com is a great tool that you can use to test your websites or mobile applications. You set up your questionnaire and identify your audience, and just wait for the results. Listen to their feedback, but watch what they do. Sometimes an action gives much more data than an answer to a question.

The next step on our list will be to check the content. This could be an incredible source of valuable information. Sometimes we focus on the design, when in reality changing a few words here and there will give you the conversion that you desire. Check your content before you decide to make any other design changes.

Everything checks out, then what is going on?

When you look at the design and start wondering what to change, keep in mind that you want to test for usability and not for conversion. I’ll explain this in a bit. One of the keys to a site that converts well is that the user trusts it. It bears repeating: one of the keys to a site that converts well is that the user trusts it.

Your presentation, colors, branding, content, everything creates an impact on the user and, in just a matter of seconds, you can lose a user or gain their full confidence.


Your product colors

For colors, make sure they are all consistent with your brand and your company. Design for what you want the user to perceive when they first look at the site. Remember, you only have a few seconds before they go away. My general recommendation is that you create a 3 color palette.

  • Your primary color. This color is what most of the site will have. The color will portray your company/product’s vision.

  • Your secondary color. This color is for the items you use to draw attention to other sections of the site while the user reads and digests your content. These would be the colors for your links, navigation, etc.

  • Your call-to-action color. This color is extremely important. The color of this button or link lets the user know that it performs an action (in our case, converting them). Normally this color should complement the rest of the colors. You want this color to stand out, but not clash with or take away from your brand.

To give you an example of a fad, some sites have claimed in the past that turning a button from green to red, or vice versa, will automatically increase your conversion rate, citing an example of how it worked for their site. Before you rush to change your colors, though, look at the design. Is your primary or secondary color red? If so, a red button will just blend in with the rest of your product, and people will ignore it. Are your colors clashing with red? That creates a distraction, not a conversion.

Layouts

Layouts are going to be important for you to convert your users and these need to be very specific in terms of usability. You need to create harmony between your content strategy and the design emotion you want to evoke from the user when they see the landing page. Remember what we talked about before? Trust. Make sure your page engenders trust in the user.

A lot of people ask about hero banners and how effective they are in terms of improving conversion rates. The answer to this, like most things we’ve discussed, depends on your audience. Does your hero banner easily explain what the user wants? Then go for it. Test it out. Otherwise, consider other options that might fit better with your message.

Another example of a fad is hero carousels. Some websites show a big banner on the page, but just as you are reading it, the banner switches over and shows you more information. This rarely works well for usability. You are creating a stressful situation for the user, because now they have a time limit to finish reading what they first saw upon arrival. If you want to use carousels, give the user plenty of time to finish reading the content of each slide, or just don’t auto-animate them.

Building forms

If you need the user to sign up for something, make that process obvious, easy, and readily accessible.

  • Do you really need all the fields you have on your sign up form?

  • Could more information be collected once you start building a relationship with your user rather than requiring it of them upfront?

If you need a lot of fields for your product form, consider splitting the form into different steps. Make the user follow your flow. Create a funnel that is enjoyable for the user.

Be clear on why you are asking for information. Do you need an address? Tell the user why you need it. You need a user to provide a phone number? Tell the user why. If you don’t need that information right away and you can build a relationship with an email, then go that route. It may take a little longer for you to “secure” that lead, but in the end, it will provide so much more quality to your brand and your business, and will probably yield more leads.

Elements inside your pages

Work with your content team (if you are not writing the content yourself) to discover things that you want to emphasize to the user to communicate that you are looking out for their best interest.

Typography plays an important role. Make sure that your main headline is the easiest thing to read and quickly answers the question the user has when landing on your page. Use bullet points to engage the user with simple and quick answers. Give the user what they want, but entice them; educate them on why they need to know more. Once you build that trust and interest, your leads will start converting at a higher rate.

While no single image is magical, using imagery to complement your message is a valuable technique. Are you trying to offer a service that is good for the family? Look for images that complement this, like a happy family. Imagery can play a big role depending on your audience so, again, it is very important that you know your audience before you choose an image.

NOTE: A quick note about stock photography: it should be common sense by now, but just in case, make sure you’re using stock photography that looks natural. You want the photo to enhance the story of your page, not just display people looking eerily happy while they look at the camera.

YET another example of a fad is showing a professional looking person as a “customer rep” which, 9 out of 10 times, is just a stock image to give a user a sense of trust. Users will be able to identify that these aren’t really the people taking care of them. Additionally, the product should be about your user, not you. How is the image going to make them feel? How does the image relate to the product you’re selling or the service you are providing?

Don’t have imagery? An illustration can help provide more information and instill confidence in your user. Again, when designing an illustration, focus on what the user needs, the emotion they should feel upon looking at it, and so on.


Usability instead of conversion

So how do you measure the success of your new designs? The main thing that you have to understand about conversion is that it can’t be completely broken down into categories. You have to test and test often. However, when you test, be specific on what you are trying to test. The more specific you can make your test, the better data you will collect to keep improving.

So why should you test for usability rather than conversion? Because when you test for usability you are by definition looking at things from the user’s perspective. The user will notice this and, if you can reach a level of trust between the user and your brand, you will be able to get a conversion. The keyword here for you is trust. If you build only to try and “trick” the user into converting, you will end up damaging a relationship with that user, which will cause you to lose the confidence and trust from that user and many others.

Build trust and build relationships. This can’t be emphasized enough. When you build trust with your users, you keep them coming back and you help promote your business indirectly by word of mouth. People are very active in social media and other areas of their lives. Getting positive reviews will help you get more confidence with new users and better leads.

What is another great thing about usability? SEO. In order to start gaining more leads, you need to drive more people to your website. Usability will not only create a great user experience, it can also help you stand out from your competitors in Google searches. Google has put a huge focus on giving users what they need, so sites that demonstrate the capability to provide that information get ahead of others who are just trying to game the system.

Let’s recap. Evaluate your site and follow the checklist provided above. Test for usability and test often. Focus your design on helping the users reach their goals. Design to build trust, not to trick. The more trust you can build with the user, the stronger the relationship and the higher quality conversion you will receive.

Happy converting! :D

This article was originally posted on Toptal 



Control Your Laptop with an Android Phone using Python, Twisted, and Django

Introduction

It’s always fun to put your Android or Python programming skills on display. A while back, I figured it’d be cool to try and control my laptop via my Android mobile device. Think about it: remote laptop access including being able to play and pause music, start and stop programming jobs or downloads, etc., all by sending messages from your phone. Neat, huh?

Before you keep on reading, please bear in mind that this is a pet project, still in its early stages—but the basic platform is there. By gluing together some mainstream tools, I was able to set up my Android phone to control my laptop via a Python interpreter.

By the way: the project is open source. You can check out the client code here, and the server code here.

The Remote Laptop Access Tool Belt: Python, Twisted, Django, and Amarok

This project involves the following technologies, some of which you may be familiar with, some of which are quite specific to the task at hand:

  • Python 2.7+
  • Twisted: an excellent event-driven framework especially crafted for network hackers.
  • Django: I used v1.4, so you’ll have to adjust the location of some files if you want to run a lower version.
  • Amarok: a D-Bus (more on this below) manageable media player. This could be subbed out for other such media players (Clementine, VLC, or anything that supports MPRIS) if you know their messaging structures. I chose Amarok because it comes with my KDE distribution by default. Plus, it’s fast and easily configurable.
  • An Android phone with Python for Android installed (more on this below). The process is pretty straightforward—even for Py3k!
  • Remote Amarok and Remote Amarok Web.

At a High Level

At a high level, we consider our Android phone to be the client and our laptop, the server. I’ll go through this remote access architecture in-depth below, but the basic flow of the project is as follows:

  1. The user types some command into the Python interpreter.
  2. The command is sent to the Django instance.
  3. Django then passes the command along to Twisted.
  4. Twisted then parses the command and sends a new command via D-Bus to Amarok.
  5. Amarok interacts with the actual laptop, controlling the playing/pausing of music.

Using this toolbelt, learn how to control a laptop with Python, Twisted, and Django.

Now, let’s dig in.

Python on Android

So one good day, I started looking at Python interpreters that I could run on my Android phone (Droid 2, back then). Soon after, I discovered the excellent SL4A package that brought Python For Android to life. It’s a really nifty package: you click a couple buttons and suddenly you have an almost fully functional Python environment on your mobile or tablet device that can both run your good ol’ Python code and access the Android API (I say almost because some stuff probably is missing and the Android API isn’t 100% accessible, but for most use-cases, it’s sufficient).

If you prefer, you can also build your own Python distribution to run on your Android device, which has the advantage that you can then run any version of the interpreter you desire. The process involves cross-compiling Python to be run on ARM (the architecture used on Android devices and other tablets). It’s not easy, but it’s certainly doable. If you’re up for the challenge, check here or here.

Once you have your interpreter set up, you can do basically whatever you like by combining Python with the Android API, including controlling your laptop remotely. For example, you can do any of the following (a minimal sketch follows the list):

  • Send and read SMS.
  • Interact with third-party APIs around the Internet via urllib and other libraries.
  • Display native look and feel prompts, spinning dialogs, and the like.
  • Change your ringtone.
  • Play music or videos.
  • Interact with Bluetooth—this one in particular paves the way for a lot of opportunities. For example, I once played around with using my phone as a locker-unlocker application for my laptop (e.g., unlock my laptop via Bluetooth when my phone was nearby).
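To give a flavor of what this looks like in practice, here is a minimal SL4A sketch that shows a toast notification and reads unread SMS messages. The android facade method names are taken from SL4A and may vary slightly between releases, so treat this as an illustrative sketch rather than production code:

import android

droid = android.Android()

# Show a native Android toast from Python.
droid.makeToast('Python is running on this phone!')

# Read unread SMS messages and print their bodies.
messages = droid.smsGetMessages(True).result
for message in messages:
    print(message['body'])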

How Using Your Phone to Control Your Laptop Works

The Architecture

Our project composition is as follows:

  • A client-side application built on Twisted, useful if you want to test the server code (below) without having to run the Django application at all.

  • A server-side Django application, which reads in commands from the Android device and passes them along to Twisted. As it stands, Amarok is the only laptop application that the server can interact with (i.e., to control music), but that’s a sufficient proof-of-concept, as the platform is easily extensible.

  • A server-side Twisted ‘instance’ which communicates with the laptop’s media player via D-Bus, sending along commands as they come in from Django (currently, I support ‘next’, ‘previous’, ‘play’, ‘pause’, ‘stop’, and ‘mute’). Why not just pass the commands directly from Django to Amarok? Twisted’s event-driven, non-blocking attributes take away all the hard work of threading (more below). If you’re interested in marrying the two, see here.

Twisted is excellent, event-driven, and versatile. It operates using a callback system, deferred objects, and some other techniques. I’d definitely recommend that you try it out: the amount of work that you avoid by using Twisted is seriously impressive. For example, it provides boilerplate code for lots of protocols, including IRC, HTTP, SSH, etc., without you having to deal with non-blocking mechanisms (threads, select, etc.). A tiny standalone example follows this list.
  • The client-side Android code, uploaded to your device with a customized URL to reach your Django application. It’s worth mentioning that this particular piece of code runs on Python 2.7+, including Py3k.
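As a small taste of how little code Twisted needs, here is the classic echo server built on its protocol and reactor APIs. This is a standalone sketch, not part of the project’s actual code:

from twisted.internet import protocol, reactor

class Echo(protocol.Protocol):
    # Echo back whatever bytes a client sends.
    def dataReceived(self, data):
        self.transport.write(data)

class EchoFactory(protocol.Factory):
    def buildProtocol(self, addr):
        return Echo()

# The reactor drives all I/O from a single event loop; no explicit threads needed.
reactor.listenTCP(8000, EchoFactory())
reactor.run()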

What’s D-Bus?

I’ve mentioned D-Bus several times, so it’s probably worth discussing it in more detail. Broadly speaking, D-Bus is a messaging bus system for communicating between applications (e.g., on a laptop computer and Android phone) easily through specially crafted messages.

It’s mainly composed of two buses: the system bus, for system-wide stuff; and the session bus, for userland stuff. Typical messages to the system bus would be “Hey, I’ve added a new printer, notify my D-Bus enabled applications that a new printer is online”, while typical Inter-Process Communication (IPC) among applications would go to the session bus.

We use the session bus to communicate with Amarok. It’s very likely that most modern applications (under Linux environments, at least) will support this type of messaging and generally all the commands/functions that they can process are well documented. As any application with D-Bus support can be controlled under this architecture, the possibilities are nearly endless.
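For instance, with the dbus-python bindings, pausing or skipping a track in an MPRIS2-compliant player boils down to a few calls over the session bus. This is only a sketch: the bus name below assumes a recent Amarok registering itself as org.mpris.MediaPlayer2.amarok, and other players or versions use different names.

import dbus

# Connect to the per-user session bus.
bus = dbus.SessionBus()

# MPRIS2-compliant players register under org.mpris.MediaPlayer2.<player>.
player = bus.get_object('org.mpris.MediaPlayer2.amarok', '/org/mpris/MediaPlayer2')
controls = dbus.Interface(player, dbus_interface='org.mpris.MediaPlayer2.Player')

controls.PlayPause()   # toggle playback
controls.Next()        # skip to the next track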

More info can be found here.

Behind the Scenes

Having set up all the infrastructure, you can fire off the Android application and it will enter an infinite loop to read incoming messages, process them with some sanity checks, and, if valid, send them to a predefined URL (i.e., the URL of your Django app), which will in turn process the input and act accordingly. The Android client then marks the message as read and the loop continues until a message with the exact contents “exitclient” (clever, huh?) is processed, in which case the client will exit.

On the server, the Django application picks up a command to be processed and checks if it starts with a valid instruction. If so, it connects to the Twisted server (using telnetlib to connect via telnet) and sends the command along. Twisted then parses the input, transforms it into something suitable for Amarok, and lets Amarok do its magic! Finally, your laptop responds by playing songs, pausing, skipping, etc.
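To make that hand-off a bit more concrete, a Django view forwarding a command to Twisted over telnet could look roughly like the sketch below. The view name, host, and port are hypothetical placeholders; the real project code differs in its details.

import telnetlib

from django.http import HttpResponse

TWISTED_HOST = '127.0.0.1'   # hypothetical: where the Twisted instance listens
TWISTED_PORT = 8001          # hypothetical port

VALID_COMMANDS = ('next', 'previous', 'play', 'pause', 'stop', 'mute')

def forward_command(request):
    # Validate the incoming command and relay it to the Twisted server via telnet.
    command = request.GET.get('command', '').strip().lower()
    if command not in VALID_COMMANDS:
        return HttpResponse('Unknown command', status=400)
    connection = telnetlib.Telnet(TWISTED_HOST, TWISTED_PORT)
    connection.write(command + '\n')   # on Python 3, encode to bytes first
    connection.close()
    return HttpResponse('OK')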

Regarding the “predefined URL”: if you want to be controlling your computer from afar, this will have to be a public URL (reachable over the Internet). Be aware that, currently, the code doesn’t implement any layer of security (SSL, etc.)—such improvements are exercises for the reader, at the moment.

What Else Can I Do With This?

Everything looks really simple so far, huh? You may be asking yourself: “Can this be extended to support nifty feature [X]?” The answer is: Yes (probably)! Given that you know how to interact with your computer using your phone properly, you can supplement the server-side code to do whatever you like. Before you know it, you’ll be shooting off lengthy processes on your computer remotely. Or, if you can cope with the electronics, you could build an interface between your computer and your favorite appliance, controlling that via SMS instructions (“Make me coffee!” comes to mind). 

 

This article originally appeared on Toptal

 



The Vital Guide to Python Interviewing

The Challenge

As a rough order of magnitude, Giles Thomas (co-founder of PythonAnywhere) estimates that there are between 1.8 and 4.3 million Python developers in the world.

So how hard can it be to find a Python developer? Well, not very hard at all if the goal is just to find someone who can legitimately list Python on their resume. But if the goal is to find a Python guru who has truly mastered the nuances and power of the language, then the challenge is most certainly a formidable one.

First and foremost, a highly-effective recruiting process is needed, as described in our post In Search of the Elite Few – Finding and Hiring the Best Developers in the Industry. Such a process can then be augmented with targeted questions and techniques, such as those provided here, that are specifically geared toward ferreting out Python virtuosos from the plethora of some-level-of-Python-experience candidates.

Python Guru or Snake in the Grass?

So you’ve found what appears to be a strong Python developer. How do you determine if he or she is, in fact, in the elite top 1% of candidates that you’re looking to hire? While there’s no magic or foolproof technique, there are certainly questions you can pose that will help determine the depth and sophistication of a candidate’s knowledge of the language. A brief sampling of such questions is provided below.

It is important to bear in mind, though, that these sample questions are intended merely as a guide. Not every “A” candidate worth hiring will be able to properly answer them all, nor does answering them all guarantee an “A” candidate. At the end of the day, hiring remains as much of an art as it does a science.

Python in the Weeds…

While it’s true that the best developers don’t waste time committing to memory that which can easily be found in a language specification or API document, there are certain key features and capabilities of any programming language that any expert can, and should, be expected to be well-versed in. Here are some Python-specific examples:

Q: Why use function decorators? Give an example.

A decorator is essentially a callable Python object that is used to modify or extend a function or class definition. One of the beauties of decorators is that a single decorator definition can be applied to multiple functions (or classes). Much can thereby be accomplished with decorators that would otherwise require lots of boilerplate (or, even worse, redundant!) code. Flask, for example, uses decorators as the mechanism for adding new endpoints to a web application. Examples of some of the more common uses of decorators include adding synchronization, type enforcement, logging, or pre/post conditions to a class or function.
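For example, one common illustration is a simple logging decorator:

import functools

def log_calls(func):
    # Decorator that logs every call to the wrapped function.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print('Calling %s with %r %r' % (func.__name__, args, kwargs))
        result = func(*args, **kwargs)
        print('%s returned %r' % (func.__name__, result))
        return result
    return wrapper

@log_calls
def add(x, y):
    return x + y

add(2, 3)   # logs the call and the result (5)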

Q: What are lambda expressions, list comprehensions and generator expressions? What are the advantages and appropriate uses of each?

Lambda expressions are a shorthand technique for creating single line, anonymous functions. Their simple, inline nature often – though not always – leads to more readable and concise code than the alternative of formal function declarations. On the other hand, their terse inline nature, by definition, very much limits what they are capable of doing and their applicability. Being anonymous and inline, the only way to use the same lambda function in multiple locations in your code is to specify it redundantly.

List comprehensions provide a concise syntax for creating lists. List comprehensions are commonly used to make lists where each element is the result of some operation(s) applied to each member of another sequence or iterable. They can also be used to create a subsequence of those elements whose members satisfy a certain condition. In Python, list comprehensions provide an alternative to using the built-in map() and filter() functions.

As the applied usage of lambda expressions and list comprehensions can overlap, opinions vary widely as to when and where to use one vs. the other. One point to bear in mind, though, is that a list comprehension executes somewhat faster than a comparable solution using map and lambda (some quick tests yielded a performance difference of roughly 10%). This is because calling a lambda function creates a new stack frame while the expression in the list comprehension is evaluated without doing so.

Generator expressions are syntactically and functionally similar to list comprehensions but there are some fairly significant differences between the ways the two operate and, accordingly, when each should be used. In a nutshell, iterating over a generator expression or list comprehension will essentially do the same thing, but the list comprehension will create the entire list in memory first while the generator expression will create the items on the fly as needed. Generator expressions can therefore be used for very large (and even infinite) sequences and their lazy (i.e., on demand) generation of values results in improved performance and lower memory usage. It is worth noting, though, that the standard Python list methods can be used on the result of a list comprehension, but not directly on that of a generator expression.
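A quick side-by-side illustration of the three constructs:

numbers = [1, 2, 3, 4, 5]

# Lambda expression: an anonymous, single-expression function.
double = lambda x: x * 2

# List comprehension: builds the entire list in memory up front.
doubled_list = [x * 2 for x in numbers]      # [2, 4, 6, 8, 10]

# Generator expression: yields values lazily, one at a time.
doubled_gen = (x * 2 for x in numbers)
total = sum(doubled_gen)                     # 30, without materializing a list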

Q: Consider the two approaches below for initializing an array and the arrays that will result. How will the resulting arrays differ and why should you use one initialization approach vs. the other?

>>> # INITIALIZING AN ARRAY -- METHOD 1
...
>>> x = [[1,2,3,4]] * 3
>>> x
[[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
>>>
>>>
>>> # INITIALIZING AN ARRAY -- METHOD 2
...
>>> y = [[1,2,3,4] for _ in range(3)]
>>> y
[[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
>>>
>>> # WHICH METHOD SHOULD YOU USE AND WHY?

While both methods appear at first blush to produce the same result, there is an extremely significant difference between the two. Method 2 produces, as you would expect, an array of 3 elements, each of which is itself an independent 4-element array. In method 1, however, the members of the array all point to the same object. This can lead to what is most likely unanticipated and undesired behavior as shown below.

>>> # MODIFYING THE x ARRAY FROM THE PRIOR CODE SNIPPET:
>>> x[0][3] = 99
>>> x
[[1, 2, 3, 99], [1, 2, 3, 99], [1, 2, 3, 99]]
>>> # UH-OH, DON’T THINK YOU WANTED THAT TO HAPPEN!
...
>>>
>>> # MODIFYING THE y ARRAY FROM THE PRIOR CODE SNIPPET:
>>> y[0][3] = 99
>>> y
[[1, 2, 3, 99], [1, 2, 3, 4], [1, 2, 3, 4]]
>>> # THAT’S MORE LIKE WHAT YOU EXPECTED!
...

Q: What will be printed out by the second append() statement below?

>>> def append(list=[]):
...     # append the length of a list to the list
...     list.append(len(list))
...     return list
...
>>> append(['a','b'])
['a', 'b', 2]
>>>
>>> append()  # calling with no arg uses default list value of []
[0]
>>>
>>> append()  # but what happens when we AGAIN call append with no arg?

When the default value for a function argument is an expression, the expression is evaluated only once, not every time the function is called. Thus, once the list argument has been initialized to an empty array, subsequent calls to append without any argument specified will continue to use the same array to which list was originally initialized. This will therefore yield the following, presumably unexpected, behavior:

>>> append()  # first call with no arg uses default list value of []
[0]
>>> append()  # but then look what happens...
[0, 1]
>>> append()  # successive calls keep extending the same default list!
[0, 1, 2]
>>> append()  # and so on, and so on, and so on...
[0, 1, 2, 3]

Q: How might one modify the implementation of the ‘append’ method in the previous question to avoid the undesirable behavior described there?

The following alternative implementation of the append method would be one of a number of ways to avoid the undesirable behavior described in the answer to the previous question:

>>> def append(list=None):
...     if list is None:
...         list = []
...     # append the length of a list to the list
...     list.append(len(list))
...     return list
...
>>> append()
[0]
>>> append()
[0]

Q: How can you swap the values of two variables with a single line of Python code?

Consider this simple example:

>>> x = 'X'
>>> y = 'Y'

In many other languages, swapping the values of x and y requires that you do the following:

>>> tmp = x
>>> x = y
>>> y = tmp
>>> x, y
('Y', 'X')

But Python makes it possible to do the swap with a single line of code (thanks to implicit tuple packing and unpacking) as follows:

>>> x,y = y,x
>>> x,y
('Y', 'X')

Q: What will be printed out by the last statement below?

>>> flist = []
>>> for i in range(3):
...     flist.append(lambda: i)
...
>>> [f() for f in flist]   # what will this print out?

In any closure in Python, variables are bound by name. Thus, the above line of code will print out the following:

[2, 2, 2]

Presumably not what the author of the above code intended!

A workaround is to either create a separate function or to pass the args by name; e.g.:

>>> flist = []
>>> for i in range(3):
...     flist.append(lambda i = i : i)
...
>>> [f() for f in flist]
[0, 1, 2]

Q: What are the key differences between Python 2 and 3?

Although Python 2 is formally considered legacy at this point, its use is still widespread enough that it is important for a developer to recognize the differences between Python 2 and 3.

Here are some of the key differences that a developer should be aware of (a short illustration follows the list):

  • Text and Data instead of Unicode and 8-bit strings. Python 3.0 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings. The biggest ramification of this is that any attempt to mix text and data in Python 3.0 raises a TypeError (to combine the two safely, you must decode bytes or encode Unicode, but you need to know the proper encoding, e.g. UTF-8)
    • This addresses a longstanding pitfall for naïve Python programmers. In Python 2, mixing Unicode and 8-bit data would work if the string happened to contain only 7-bit (ASCII) bytes, but you would get UnicodeDecodeError if it contained non-ASCII values. Moreover, the exception would happen at the combination point, not at the point at which the non-ASCII characters were put into the str object. This behavior was a common source of confusion and consternation for neophyte Python programmers.
  • print function. The print statement has been replaced with a print() function
  • xrange – buh-bye. xrange() no longer exists (range() now behaves like xrange() used to behave, except it works with values of arbitrary size)
  • API changes:
    • zip(), map() and filter() all now return iterators instead of lists
    • dict.keys(), dict.items() and dict.values() now return “views” instead of lists
    • dict.iterkeys(), dict.iteritems() and dict.itervalues() are no longer supported
  • Comparison operators. The ordering comparison operators (<, <=, >=, >) now raise a TypeError exception when the operands don’t have a meaningful natural ordering. Some examples of the ramifications of this include:
    • Expressions like 1 < '', 0 > None or len <= len are no longer valid
    • None < None now raises a TypeError instead of returning False
    • Sorting a heterogeneous list no longer makes sense – all the elements must be comparable to each other
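A short illustration of a few of these differences when run under a Python 3 interpreter:

# Text and binary data no longer mix implicitly:
# b'raw bytes' + 'text'   # raises TypeError in Python 3
text = b'raw bytes'.decode('utf-8') + ' and text'   # decode explicitly instead

# print is now a function, not a statement:
print('hello', 'world')

# range() behaves like the old xrange(): it is lazy rather than a list.
squares = [n * n for n in range(5)]   # materialize explicitly when a list is needed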

More details on the differences between Python 2 and 3 are available here.

Q: Is Python interpreted or compiled?

As noted in Why Are There So Many Pythons?, this is, frankly, a bit of a trick question in that it is malformed. Python itself is nothing more than an interface definition (as is true with any language specification) of which there are multiple implementations. Accordingly, the question of whether “Python” is interpreted or compiled does not apply to the Python language itself; rather, it applies to each specific implementation of the Python specification.

Further complicating the answer to this question is the fact that, in the case of CPython (the most common Python implementation), the answer really is “sort of both”. Specifically, with CPython, code is first compiled and then interpreted. More precisely, it is not precompiled to native machine code, but rather to bytecode. While machine code is certainly faster, bytecode is more portable and secure. The bytecode is then interpreted in the case of CPython (or both interpreted and compiled to optimized machine code at runtime in the case of PyPy).
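One easy way to see CPython’s compilation step for yourself is the dis module, which disassembles the bytecode a function was compiled to:

import dis

def greet(name):
    return 'Hello, ' + name

# Prints the bytecode instructions CPython compiled greet() into;
# the exact opcodes shown vary between interpreter versions.
dis.dis(greet)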

Q: What are some alternative implementations to CPython? When and why might you use them?

One of the more prominent alternative implementations is Jython, a Python implementation written in Java that utilizes the Java Virtual Machine (JVM). While CPython produces bytecode to run on the CPython VM, Jython produces Java bytecode to run on the JVM.

Another is IronPython, written in C# and targeting the .NET stack. IronPython runs on Microsoft’s Common Language Runtime (CLR).

As also pointed out in Why Are There So Many Pythons?, it is entirely possible to survive without ever touching a non-CPython implementation of Python, but there are advantages to be had from switching, most of which are dependent on your technology stack.

Another noteworthy alternative implementation is PyPy whose key features include:

  • Speed. Thanks to its Just-in-Time (JIT) compiler, Python programs often run faster on PyPy.
  • Memory usage. Large, memory-hungry Python programs might end up taking less space with PyPy than they do in CPython.
  • Compatibility. PyPy is highly compatible with existing python code. It supports cffi and can run popular Python libraries like Twisted and Django.
  • Sandboxing. PyPy provides the ability to run untrusted code in a fully secure way.
  • Stackless mode. PyPy comes by default with support for stackless mode, providing micro-threads for massive concurrency.

Q: What’s your approach to unit testing in Python?

The most fundamental answer to this question centers around Python’s unittest testing framework. Basically, if a candidate doesn’t mention unittest when answering this question, that should be a huge red flag.

unittest supports test automation, sharing of setup and shutdown code for tests, aggregation of tests into collections, and independence of the tests from the reporting framework. The unittest module provides classes that make it easy to support these qualities for a set of tests.

Assuming that the candidate does mention unittest (if they don’t, you may just want to end the interview right then and there!), you should also ask them to describe the key elements of the unittest framework; namely, test fixtures, test cases, test suites and test runners.
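As a refresher, a minimal test case exercising a fixture (setUp), a couple of assertions, and the built-in test runner could look like this:

import unittest

class StackTest(unittest.TestCase):
    def setUp(self):
        # Test fixture: a fresh list-backed "stack" for every test method.
        self.stack = []

    def test_push_appends_item(self):
        self.stack.append(42)
        self.assertEqual(self.stack, [42])

    def test_pop_from_empty_stack_raises(self):
        with self.assertRaises(IndexError):
            self.stack.pop()

if __name__ == '__main__':
    unittest.main()   # acts as the test runner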

A more recent addition to the unittest framework is mock. mock allows you to replace parts of your system under test with mock objects and make assertions about how they are to be used. mock is now part of the Python standard library, available as unittest.mock in Python 3.3 onwards.

The value and power of mock are well explained in An Introduction to Mocking in Python. As noted therein, system calls are prime candidates for mocking: whether writing a script to eject a CD drive, a web server which removes antiquated cache files from /tmp, or a socket server which binds to a TCP port, these calls all feature undesired side-effects in the context of unit tests. Similarly, keeping your unit-tests efficient and performant means keeping as much “slow code” as possible out of the automated test runs, namely filesystem and network access.
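For example, a test can patch out os.remove so that no file is actually deleted while still asserting that the call was made. A minimal sketch (on Python 2, the same API comes from the third-party mock package):

import os
import unittest
from unittest import mock   # Python 3.3+; on Python 2 use the `mock` package

def delete_file(path):
    os.remove(path)

class DeleteFileTest(unittest.TestCase):
    @mock.patch('os.remove')
    def test_delete_file_calls_os_remove(self, mock_remove):
        delete_file('/tmp/some-file.txt')
        # The real os.remove was never invoked; the mock recorded the call instead.
        mock_remove.assert_called_once_with('/tmp/some-file.txt')

if __name__ == '__main__':
    unittest.main()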

[Note: This question is for Python developers who are also experienced in Java.]
Q: What are some key differences to bear in mind when coding in Python vs. Java?

Disclaimer #1. The differences between Java and Python are numerous and would likely be a topic worthy of its own (lengthy) post. Below is just a brief sampling of some key differences between the two languages.

Disclaimer #2. The intent here is not to launch into a religious battle over the merits of Python vs. Java (as much fun as that might be!). Rather, the question is really just geared at seeing how well the developer understands some practical differences between the two languages. The list below therefore deliberately avoids discussing the arguable advantages of Python over Java from a programming productivity perspective.

With the above two disclaimers in mind, here is a sampling of some key differences to bear in mind when coding in Python vs. Java:

  • Dynamic vs static typing. One of the biggest differences between the two languages is that Java is restricted to static typing whereas Python supports dynamic typing of variables.
  • Static vs. class methods. A static method in Java does not translate to a Python class method.
    • In Python, calling a class method involves an additional memory allocation that calling a static method or function does not.
    • In Java, dotted names (e.g., foo.bar.method) are looked up by the compiler, so at runtime it really doesn’t matter how many of them you have. In Python, however, the lookups occur at runtime, so “each dot counts”.
  • Method overloading. Whereas Java requires explicit specification of multiple same-named functions with different signatures, the same can be accomplished in Python with a single function that includes optional arguments with default values if not specified by the caller.
  • Single vs. double quotes. Whereas the use of single quotes vs. double quotes has significance in Java, they can be used interchangeably in Python (but no, it won’t allow beginning the same string with a double quote and trying to end it with a single quote, or vice versa!).
  • Getters and setters (not!). Getters and setters in Python are superfluous; rather, you should use the ‘property’ built-in (that’s what it’s for!). In Python, getters and setters are a waste of both CPU and programmer time; see the sketch after this list.
  • Classes are optional. Whereas Java requires every function to be defined in the context of an enclosing class definition, Python has no such requirement.
  • Indentation matters… in Python. This bites many a newbie Python programmer.
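To illustrate the getters-and-setters point from the list above, here is a small sketch of the property built-in replacing explicit accessor methods:

class Temperature(object):
    def __init__(self, celsius=0.0):
        self._celsius = celsius

    @property
    def celsius(self):
        # "Getter": callers use plain attribute access, no get_celsius() needed.
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # "Setter": validation can be added later without changing any callers.
        if value < -273.15:
            raise ValueError('Below absolute zero')
        self._celsius = value

t = Temperature()
t.celsius = 21.5      # goes through the setter
print(t.celsius)      # goes through the getter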

The Big Picture

An expert knowledge of Python extends well beyond the technical minutia of the language. A Python expert will have an in-depth understanding and appreciation of Python’s benefits as well as its limitations. Accordingly, here are some sample questions that can help assess this dimension of a candidate’s expertise:

Q: What is Python particularly good for? When is using Python the “right choice” for a project?

Although likes and dislikes are highly personal, a developer who is “worth his or her salt” will highlight features of the Python language that are generally considered advantageous (which also helps answer the question of what Python is “particularly good for”). Some of the more common valid answers to this question include:

  • Ease of use and ease of refactoring, thanks to the flexibility of Python’s syntax, which makes it especially useful for rapid prototyping.
  • More compact code, thanks again to Python’s syntax, along with a wealth of functionally-rich Python libraries (distributed freely with most Python language implementations).
This article originally appeared on Toptal


How To Improve ASP.NET App Performance In Web Farm With Caching

There are only two hard things in Computer Science: cache invalidation and naming things. – Phil Karlton

A Brief Introduction to Caching

Caching is a powerful technique for increasing performance through a simple trick: Instead of doing expensive work (like a complicated calculation or complex database query) every time we need a result, the system can store – or cache – the result of that work and simply supply it the next time it is requested without needing to reperform that work (and can, therefore, respond tremendously faster).

Of course, the whole idea behind caching works only as long the result we cached remains valid. And here we get to the actual hard part of the problem: How do we determine when a cached item has become invalid and needs to be recreated?

Caching is a powerful technique for increasing performance

The ASP.NET in-memory cache is extremely fast and perfect for solving the distributed web farm caching problem.

Usually, a typical web application has to deal with a much higher volume of read requests than write requests. That is why a typical web application that is designed to handle a high load is architected to be scalable and distributed, deployed as a set of web tier nodes, usually called a farm. All these facts have an impact on the applicability of caching.

In this article, we focus on the role caching can play in assuring high throughput and performance of web applications designed to handle a high load, and I am going to use the experience from one of my projects and provide an ASP.NET-based solution as an illustration.

The Problem of Handling a High Load

The actual problem I had to solve wasn’t an original one. My task was to make an ASP.NET MVC monolithic web application prototype be capable of handling a high load.

The necessary steps towards improving throughput capabilities of a monolithic web application are:

  • Enable it to run multiple copies of the web application in parallel, behind a load balancer, and serve all concurrent requests effectively (i.e., make it scalable).
  • Profile the application to reveal current performance bottlenecks and optimize them.
  • Use caching to increase read request throughput, since reads typically constitute a significant part of the overall application’s load.

Caching strategies often involve use of some middleware caching server, like Memcached or Redis, to store the cached values. Despite their high adoption and proven applicability, there are some downsides to these approaches, including:

  • Network latencies introduced by accessing the separate cache servers can be comparable to the latencies of reaching the database itself.
  • The web tier’s data structures can be unsuitable for serialization and deserialization out of the box. To use cache servers, those data structures should support serialization and deserialization, which requires ongoing additional development effort.
  • Serialization and deserialization add runtime overhead with an adverse effect on performance.

All these issues were relevant in my case, so I had to explore alternative options.

How caching works

The built-in ASP.NET in-memory cache (System.Web.Caching.Cache) is extremely fast and can be used without serialization and deserialization overhead, both during development and at runtime. However, the ASP.NET in-memory cache also has its own drawbacks:

  • Each web tier node needs its own copy of cached values. This could result in higher database tier consumption upon node cold start or recycling.
  • Each web tier node should be notified when another node invalidates any portion of the cache by writing updated values. Since the cache is distributed, without proper synchronization most of the nodes will return old values, which is typically unacceptable.

If the additional database tier load won’t lead to a bottleneck by itself, then implementing a properly distributed cache seems like an easy task to handle, right? Well, it’s not an easy task, but it is possible. In my case, benchmarks showed that the database tier shouldn’t be a problem, as most of the work happened in the web tier. So, I decided to go with the ASP.NET in-memory cache and focus on implementing the proper synchronization.

Introducing an ASP.NET-based Solution

As explained, my solution was to use the ASP.NET in-memory cache instead of the dedicated caching server. This entails each node of the web farm having its own cache, querying the database directly, performing any necessary calculations, and storing results in a cache. This way, all cache operations will be blazing fast thanks to the in-memory nature of the cache. Typically, cached items have a clear lifetime and become stale upon some change or writing of new data. So, from the web application logic, it is usually clear when the cache item should be invalidated.

The only problem left here is that when one of the nodes invalidates a cache item in its own cache, no other node will know about this update. So, subsequent requests serviced by other nodes will deliver stale results. To address this, each node should share its cache invalidations with the other nodes. Upon receiving such invalidation, other nodes could simply drop their cached value and get a new one at the next request.

Here, Redis can come into play. The power of Redis, compared to other solutions, comes from its Pub/Sub capabilities. Every client of a Redis server can create a channel and publish some data on it. Any other client is able to listen to that channel and receive the related data, very similar to any event-driven system. This functionality can be used to exchange cache invalidation messages between the nodes, so all nodes will be able to invalidate their cache when it is needed.

A group of ASP.NET web tier nodes using a Redis backplane

ASP.NET’s in-memory cache is straightforward in some ways and complex in others. In particular, it is straightforward in that it works as a map of key/value pairs, yet there is a lot of complexity related to its invalidation strategies and dependencies.

Fortunately, typical use cases are simple enough, and it’s possible to use a default invalidation strategy for all the items, enabling each cache item to have only a single dependency at most. In my case, I ended with the following ASP.NET code for the interface of the caching service. (Note that this is not the actual code, as I omitted some details for the sake of simplicity and the proprietary license.)

public interface ICacheKey
{
    string Value { get; }
}

public interface IDataCacheKey : ICacheKey { }

public interface ITouchableCacheKey : ICacheKey { }

public interface ICacheService
{
    int ItemsCount { get; }

    T Get<T>(IDataCacheKey key, Func<T> valueGetter);
    T Get<T>(IDataCacheKey key, Func<T> valueGetter, ICacheKey dependencyKey);
}

Here, the cache service basically allows two things. First, it enables storing the result of some value getter function in a thread safe manner. Second, it ensures that the then-current value is always returned when it is requested. Once the cache item becomes stale or is explicitly evicted from the cache, the value getter is called again to retrieve a current value. The cache key was abstracted away by ICacheKey interface, mainly to avoid hard-coding of cache key strings all over the application.

To invalidate cache items, I introduced a separate service, which looked like this:

public interface ICacheInvalidator
{
    bool IsSessionOpen { get; }

    void OpenSession();
    void CloseSession();

    void Drop(IDataCacheKey key);
    void Touch(ITouchableCacheKey key);
    void Purge();
}

Besides the basic methods for dropping data items, touching keys (which only have dependent data items), and purging everything, there are a few methods related to some kind of “session”.

Our web application used Autofac for dependency injection, which is an implementation of the inversion of control (IoC) design pattern for dependencies management. This feature allows developers to create their classes without the need to worry about dependencies, as the IoC container manages that burden for them.

The cache service and cache invalidator have drastically different lifecycles regarding IoC. The cache service was registered as a singleton (one instance, shared between all clients), while the cache invalidator was registered as an instance per request (a separate instance was created for each incoming request). Why?

The answer has to do with an additional subtlety we needed to handle. The web application is using a Model-View-Controller (MVC) architecture, which helps mainly in the separation of UI and logic concerns. So, a typical controller action is wrapped into a subclass of an ActionFilterAttribute. In the ASP.NET MVC framework, such C#-attributes are used to decorate the controller’s action logic in some way. That particular attribute was responsible for opening a new database connection and starting a transaction at the beginning of the action. Also, at the end of the action, the filter attribute subclass was responsible for committing the transaction in case of success and rolling it back in the event of failure.

If cache invalidation happened right in the middle of the transaction, there could be a race condition whereby the next request to that node would successfully put the old (still visible to other transactions) value back into the cache. To avoid this, all invalidations are postponed until the transaction is committed. After that, cache items are safe to evict and, in the case of a transaction failure, there is no need for cache modification at all.

That was the exact purpose of the “session”-related parts in the cache invalidator. Also, that is the purpose of its lifetime being bound to the request. The ASP.NET code looked like this:

class HybridCacheInvalidator : ICacheInvalidator
{
    ...
    public void Drop(IDataCacheKey key)
    {
        if (key == null)
            throw new ArgumentNullException("key");
        if (!IsSessionOpen)
            throw new InvalidOperationException("Session must be opened first.");
        _postponedRedisMessages.Add(new Tuple<string, string>("drop", key.Value));
    }
    ...
    public void CloseSession()
    {
        if (!IsSessionOpen)
            return;
        _postponedRedisMessages.ForEach(m => PublishRedisMessageSafe(m.Item1, m.Item2));
        _postponedRedisMessages = null;
    }
    ...
}

The PublishRedisMessageSafe method here is responsible for sending the message (second argument) to a particular channel (first argument). In fact, there are separate channels for drop and touch, so the message handler for each of them knew exactly what to do - drop/touch the key equal to the received message payload.

One of the tricky parts was to manage the connection to the Redis server properly. In the case of the server going down for any reason, the application should continue to function correctly. When Redis is back online again, the application should seamlessly start to use it again and exchange messages with other nodes again. To achieve this, I used the StackExchange.Redis library and the resulting connection management logic was implemented as follows:

class HybridCacheService : ...
{
...
public void Initialize()
{
try
{
Multiplexer = ConnectionMultiplexer.Connect(_configService.Caching.BackendServerAddress);
...
Multiplexer.ConnectionFailed += (sender, args) => UpdateConnectedState();
Multiplexer.ConnectionRestored += (sender, args) => UpdateConnectedState();
...
}
catch (Exception ex)
{
...
}
}
private void UpdateConnectedState()
{
if (Multiplexer.IsConnected && _currentCacheService is NoCacheServiceStub) {
_inProcCacheInvalidator.Purge();
_currentCacheService = _inProcCacheService;
_logger.Debug("Connection to remote Redis server restored, switched to in-proc mode.");
} else if (!Multiplexer.IsConnected && _currentCacheService is InProcCacheService) {
_currentCacheService = _noCacheStub;
_logger.Debug("Connection to remote Redis server lost, switched to no-cache mode.");
}
}
}

Here, ConnectionMultiplexer is a type from the StackExchange.Redis library that transparently handles the communication with the underlying Redis server. The important part is that, when a particular node loses its connection to Redis, it falls back to no-cache mode to make sure no request receives stale data. After the connection is restored, the node starts to use the in-memory cache again.

Here are examples of an action that does not use the cache service (SomeActionWithoutCaching) and an identical operation that does (SomeActionUsingCache):

class SomeController : Controller
{
public ISomeService SomeService { get; set; }
public ICacheService CacheService { get; set; }
...
public ActionResult SomeActionWithoutCaching()
{
return View(
SomeService.GetModelData()
);
}
...
public ActionResult SomeActionUsingCache()
{
return View(
CacheService.Get(
/* Cache key creation omitted */,
() => SomeService.GetModelData()
)
);
}
}

A code snippet from an ISomeService implementation could look like this:

class DefaultSomeService : ISomeService
{
public ICacheInvalidator _cacheInvalidator;
...
public SomeModel GetModelData()
{
return /* Do something to get model data. */;
}
...
public void SetModelData(SomeModel model)
{
/* Do something to set model data. */
_cacheInvalidator.Drop(/* Cache key creation omitted */);
}
}

Benchmarking and Results

After the ASP.NET caching code was all set, it was time to use it in the existing web application logic, and benchmarking came in handy for deciding where to focus the effort of rewriting the code to use the cache. It’s crucial to pick out a few of the most common or critical use cases to benchmark. After that, a tool like Apache JMeter can be used for two things:

  • To benchmark these key use cases via HTTP requests.
  • To simulate high load for the web node under test.

To get a performance profile, any profiler capable of attaching to the IIS worker process can be used. In my case, I used JetBrains dotTrace Performance. After some time spent experimenting to determine the correct JMeter parameters (such as concurrency and request count), it became possible to collect performance snapshots, which are very helpful in identifying hotspots and bottlenecks.

In my case, some use cases showed that about 15-45% of overall code execution time was spent on database reads, with obvious bottlenecks. After I applied caching, performance nearly doubled (i.e., it was twice as fast) for most of them.

Conclusion

As you may see, my case could seem like an example of what is usually called “reinventing the wheel”: Why bother to try to create something new, when there are already best practices widely applied out there? Just set up a Memcached or Redis, and let it go.

I definitely agree that usage of best practices is usually the best option. But before blindly applying any best practice, one should ask oneself: How applicable is this “best practice”? Does it fit my case well?

The way I see it, a proper analysis of the options and tradeoffs is a must when making any significant decision, and that is the approach I took because the problem was not an easy one. In my case, there were many factors to consider, and I did not want to take a one-size-fits-all solution when it might not be the right approach for the problem at hand.

In the end, with the proper caching in place, I did get almost 50% performance increase over the initial solution.

Source: Toptal  



Tips & Tricks for Any Developer's Successful Online Portfolio

At Toptal we screen a lot of designers, so over time we have learned what goes into making a captivating and coherent portfolio. Each designer’s portfolio is like an introduction to an individual designer’s skill set and strengths, and represents them to future employers, clients, and other designers. It shows not only past work, but also future direction. There are several things to keep in mind when building a portfolio, so here is the Toptal guide of tips and common mistakes for portfolio design.

1. Content Comes First

The main use of the portfolio is to present your design work. Thus, the content should inform the layout and composition of the document. Consider what kind of work you have, and how it might be best presented. A UX designer may require a series of animations to describe a set of actions, whereas the visual designer may prefer spreads of full images.

The portfolio design itself is an opportunity to display your experience and skills. However, excessive graphic flourishes shouldn’t impede the legibility of the content. Instead, consider how the backgrounds of your portfolio can augment or enhance your work. Using background colors similar to those in the content will enhance the details of your project, and lighter content will stand out against dark backgrounds. Legibility is critical, so ensure that your portfolio can be experienced in any medium, and consider accessibility issues such as color palettes and readability.

You should approach your portfolio in the same manner you would any project. What is the goal here? Present it in a way that makes sense to viewers who are not necessarily visually savvy. Edit out projects that may be unnecessary. Your portfolio should essentially be a taster of what you can do, preparing the client for what to expect to see more of in the interview. The more efficiently you can communicate who you are as a designer, the better.

2. Consider Your Target Audience

A portfolio for a client should likely be different from a portfolio shown to a blog editor or an art director. Your professional portfolio should always cater to your target audience. Edit it accordingly. If your client needs branding, then focus on your branding work. If your client needs UX strategy, then make sure to showcase your process.

Even from client to client, or project to project, your portfolio will need tweaking. If you often float between several design disciplines, as many designers do, it would be useful to curate a print design portfolio separately from a UX or visual design portfolio.

3. Tell the Stories of Your Projects

As the design industry has evolved, so have our clients, and their appreciation for our expertise and what they hire us to do. Our process is often as interesting and important to share with them, as the final deliverables. Try to tell the story of your product backwards, from final end point through to the early stages of the design process. Share your sketches, your wireframes, your user journeys, user personas, and so on.

Showing your process allows the reader to understand how you think and work through problems. Consider this an additional opportunity to show that you have an efficient and scalable process.

4. Be Professional in Your Presentation

Attention to detail, in both textual and design content, is an important aspect of any visual presentation, so keep an eye on alignment, image compression, embedded fonts, and other elements, as you would with any project. The careful treatment of your portfolio should reflect how you will handle your client’s work.

With any presentation, your choice of typeface will impact the impression you give, so do research the meaning behind a font family, and when in doubt, ask your typography savvy friends for advice.

5. Words Are As Important As Work

Any designer should be able to discuss their projects as avidly as they can design them. Therefore your copywriting is essential. True, your work is the main draw of the portfolio - however the text, and how you write about your work can give viewers insight into your portfolio.

Not everyone who sees your work comes from a creative, or visual industry. Thus, the descriptive text that you provide for images is essential. At the earlier stages of a project, where UX is the main focus, often you will need to complement your process with clearly defined content, both visual diagrams, and textual explanation.

Text can also be important for providing the context of the project. Often much of your work is done in the background, so why not present it somehow? What was the brief, how did the project come about?

Avoid These Common Mistakes

The culture of portfolio networks like Behance or Dribbble has cultivated many bad habits and trends in portfolio design. A popular trend is the perspective view of a product on a device. However, these images often do little to effectively represent the project, and they hide details and content. Clients need to see what you have worked on before, with the most logical visualisation possible. Showcasing your products in a frontal view, with an “above the fold” approach, often makes more sense to the non-visual user. Usually, the best web pages and other digital content are presented with no scrolling required. Avoid sending your website portfolio as one long strip, as this is only appropriate for communicating with developers.

Ensure that you cover all portfolio formats. Today you are expected to have an online presence; however, some clients prefer that you send a classic A4 or US Letter-sized PDF. You need to have the content ready for any type of presentation.

Try to use a consistent presentation style and content throughout the projects in your portfolio. Differentiate each project with simple solutions like different coloured backgrounds, or textures, yet within the same language.

 

Source: Toptal 

 



Getting Started with Elixir Programming Language

If you have been reading blog posts, Hacker News threads, your favorite developers’ tweets, or listening to podcasts, at this point you’ve probably heard about the Elixir programming language. The language was created by José Valim, a well-known developer in the open-source world. You may know him from the Ruby on Rails MVC framework, or from the devise and simple_form Ruby gems that he and his co-workers at Plataformatec have been working on over the last few years.

According to José Valim, Elixir was born in 2011. He had the idea to build the new language due to the lack of good tools for solving concurrency problems in the Ruby world. At that time, after spending time studying concurrency- and distribution-focused languages, he found two languages that he liked: Erlang and Clojure, the latter of which runs on the JVM. He liked everything he saw in Erlang (the Erlang VM), but he hated the things it lacked, like polymorphism, metaprogramming, and language extensibility, which Clojure was good at. So, Elixir was born with that in mind: to be an alternative to Clojure and a dynamic language which runs on the Erlang Virtual Machine with good extensibility support.


Elixir describes itself as a dynamic, functional language with immutable state and an actor-based approach to concurrency, designed for building scalable and maintainable applications with a simple, modern, and tidy syntax. The language runs on the Erlang Virtual Machine, a battle-proven, high-performance, distributed virtual machine known for its low-latency and fault-tolerance characteristics.

Before we see some code, it’s worth saying that Elixir has been embraced by a growing community. If you want to learn Elixir today, you will easily find books, libraries, conferences, meetups, podcasts, blog posts, newsletters, and all sorts of other learning resources out there; it has even been welcomed by the Erlang creators.

Let’s see some code!

Install Elixir:

Installing Elixir is super easy on all major platforms and is a one-liner on most of them.

Arch Linux

Elixir is available on Arch Linux through the official repositories:

pacman -S elixir

Ubuntu

Installing Elixir on Ubuntu is a bit more tedious, but it is easy enough nonetheless.

wget https://packages.erlang-solutions.com/erlang-solutions_1.0_all.deb && sudo dpkg -i erlang-solutions_1.0_all.deb
sudo apt-get update
sudo apt-get install esl-erlang
sudo apt-get install elixir

OS X

Install Elixir in OS X using Homebrew.

brew install elixir

Meet IEx

After the installation is completed, it’s time to open your shell. You will spend a lot of time in your shell if you want to develop in Elixir.

Elixir’s interactive shell or IEx is a REPL - (Read Evaluate Print Loop) where you can explore Elixir. You can input expressions there and they will be evaluated giving you immediate feedback. Keep in mind that your code is truly evaluated and not compiled, so make sure not to run profiling nor benchmarks in the shell.

The Break Command

There’s an important thing you need to know before you start the IEx REPL - how to exit it.

You’re probably used to hitting CTRL+C to close programs running in the terminal. If you hit CTRL+C in the IEx REPL, you will open up the Break Menu. Once in the break menu, you can either hit CTRL+C again or press a to quit the shell.

I’m not going to dive into the break menu functions. But, let’s see a few IEx helpers!

Helpers

IEx provides a bunch of helpers, in order to list all of them type: h().

IEx will print the list of available helpers, along with a short description of each.

Here are some of my favorites; I think they will become yours as well.

  • h - as we just saw, this helper prints the help message.
  • h/1 - the same helper, but it expects one argument and prints the documentation for it.

For instance, whenever you want to see the documentation of the String.strip/2 function, you can simply type h String.strip/2 in the shell.

Probably the second most useful IEx helper you’re going to use while programming in Elixir is c/2, which compiles a given Elixir file (or a list of files) and accepts, as a second parameter, a path to write the compiled files to.

Let’s say you are working on one of the http://exercism.io/ Elixir exercises, the Anagram exercise.

So you have implemented the Anagram module, which has the method match/2 in the anagram.exs file. As the good developer you are, you have written a few specs to make sure everything works as expected as well.

Your working directory now contains the anagram.exs file along with the spec file you wrote for it.

Now, in order to run your tests against the Anagram module you need to run/compile the tests.

In order to compile and run a file, simply invoke the elixir executable, passing the path of the file you want to run as an argument.

Now let’s say you want to run the IEx REPL with the Anagram module accessible in the session context. There are two commonly used options. The first: you can require the file by using the -r option, something like iex -r anagram.exs. The second: you can compile the file right from the IEx session with the c helper, for instance c "anagram.exs".

Simple, just like that!

Ok, what if you want to recompile a module? Should you exit IEx, run it again, and compile the file again? Nope! If you have a good memory, you will remember that when we listed all the helpers available in the IEx REPL, we saw something about a recompile helper, r/1. Let’s see how it works.

Notice that this time you pass the module itself as the argument - r Anagram - and not the file path.

As we saw, IEx has a bunch of other useful helpers that will help you learn and understand better how an Elixir program works.

Basics of Elixir Types

Numbers

There are two types of numbers: arbitrary-sized integers and floating-point numbers.

Integers

Integers can be written in the decimal base, hexadecimal, octal and binary.

As in Ruby, you can use underscores to separate groups of three digits when writing large numbers. For instance, you could write a hundred million like this:

100_000_000

Octal:

0o444

Hexadecimal:

0xabc

Binary:

0b1011

Floats

Floats are IEEE 754 double precision. They have 16 digits of accuracy and a maximum exponent of around 10^308.

Floats are written using a decimal point. There must be at least one digit before and after the point. You can also append a trailing exponent. For instance: 1.0, 0.314159e1, and 314159.0e-5.

Atoms

Atoms are constants that represent names. They are immutable values. You write an atom with a leading colon : and a sequence of letters, digits, underscores, and at signs @. You can also write them with a leading colon : and an arbitrary sequence of characters enclosed by quotes.

Atoms are a very powerful tool; they are used to reference Erlang modules and functions, as well as for keys and Elixir module names.

Here are a few valid atoms.

:name, :first_name, :"last name",  :===, :is_it_@_question?

Booleans

Of course, booleans are true and false values. But the nice thing about them is at the end of the day, they’re just atoms.

Strings

By default, strings in Elixir are UTF-8 compliant. To use them you can have an arbitrary number of characters enclosed by " or '. You can also have interpolated expressions inside the string as well as escaped characters.

Be aware that single-quoted strings are not really strings at all: they are character lists (lists of character codes).

Anonymous Functions

As a functional language, Elixir has anonymous functions as a basic type. A simple way to write a function is fn (argument_list) -> body end. But a function can have multiple bodies with multiple argument lists, guard clauses, and so on.

Dave Thomas, in the Programming Elixir book, suggests we think of fn…end as being the quotes that surround a string literal, where instead of returning a string value we are returning a function.

Tuples

A tuple is an immutable indexed array. Tuples are fast when returning their size but slow when appending new values, due to their immutable nature. When updating a tuple, you are actually creating a whole new copy of the tuple itself.

Tuples are very often used as the return value of a function. While coding in Elixir you will very often see this: {:ok, something_else_here}.

Here’s how we write a tuple: {?a,?b,?c}.

Pattern Matching

I won’t be able to explain everything you need to know about Pattern Matching, however what you are about to read covers a lot of what you need to know to get started.

Elixir uses = as a match operator. To understand this, we kind of need to unlearn what we know about = in other traditional languages. In traditional languages, the equals operator is for assignment. In Elixir, the equals operator is for pattern matching.

So, this is how it works for values on the left-hand side: if they are variables, they are bound to the right-hand side; if they are not variables, Elixir tries to match them against the right-hand side.

Pin Operator

Elixir provides a way to always force a match against the existing value of a variable on the left-hand side, instead of rebinding it: the pin operator (^).

Lists

In Elixir, lists look like the arrays we know from other languages, but they are not. Lists are linked structures which consist of a head and a tail.

Keyword Lists

Keyword Lists are a list of Tuple pairs.

You simply write them as lists of two-element tuples. For instance: [{:one, 1}, {:two, 2}, {:three, 3}]. There’s a shortcut for defining such lists; here’s how it looks: [one: 1, two: 2, three: 3].

In order to retrieve an item from a keyword list you can either use:

Keyword.get([{:one, 1}, {:two, 2}, {:three, 3}], :one)

Or use the shortcut:

[{:one, 1}, {:two, 2}, {:three, 3}][:one]

Because keyword lists are implemented as linked lists, retrieving a value is an expensive operation, so if you are storing data that needs fast access you should use a Map.

Maps

Maps are an efficient collection of key/value pairs. The key can be a value of any type, though usually all the keys are of the same type. Unlike keyword lists, Maps allow only one entry for a given key. They are efficient as they grow, and they can be used in Elixir pattern matching. In general, use maps when you need an associative array.

Here’s how you can write a Map:

%{ :one => 1, :two => 2, 3 => 3, "four" => 4, [] => %{}, {} => [k: :v]}

Conclusion

Elixir is awesome, easy to understand, has simple but powerful types and very useful tooling around it which will help you when beginning to learn. In this first part, we have covered the various data types Elixir programs are built on and the operators that power them. In later parts we will dive deeper into the world of Elixir - functional and concurrent programming.

Source: Toptal 



How Sequel and Sinatra Solve Ruby’s API Problem

Introduction

In recent years, the number of JavaScript single-page application frameworks and mobile applications has increased substantially. This imposes a correspondingly increased demand for server-side APIs. With Ruby on Rails being one of today’s most popular web development frameworks, it is a natural choice among many developers for creating back-end API applications.

Yet while the Ruby on Rails architectural paradigm makes it quite easy to create back-end API applications, using Rails only for the API is overkill. In fact, it’s overkill to the point that even the Rails team has recognized this and has therefore introduced a new API-only mode in version 5. With this new feature in Ruby on Rails, creating API-only applications in Rails became an even easier and more viable option.

But there are other options too. The most notable are two very mature and powerful gems, which in combination provide powerful tools for creating server-side APIs. They are Sinatra and Sequel.

Both of these gems have a very rich feature set: Sinatra serves as the domain specific language (DSL) for web applications, and Sequel serves as the object-relational mapping (ORM) layer. So, let’s take a brief look at each of them.

API With Sinatra and Sequel: Ruby Tutorial

Ruby API on a diet: introducing Sequel and Sinatra.

Sinatra

Sinatra is a Rack-based web application framework. Rack is a well-known Ruby web server interface. It is used by many frameworks, like Ruby on Rails, for example, and supports a lot of web servers, like WEBrick, Thin, or Puma. Sinatra provides a minimal interface for writing web applications in Ruby, and one of its most compelling features is support for middleware components. These components lie between the application and the web server, and can monitor and manipulate requests and responses.

To make use of this Rack feature, Sinatra defines an internal DSL for creating web applications. Its philosophy is very simple: routes are represented by an HTTP method, followed by a route matching pattern and a Ruby block within which the request is processed and the response is formed.

get '/' do
'Hello from sinatra'
end

The route matching pattern can also include a named parameter. When route block is executed, a parameter value is passed to the block through the params variable.

get '/players/:sport_id' do
# Parameter value accessible through params[:sport_id]
end

Matching patterns can use splat operator * which makes parameter values available through params[:splat].

get '/players/*/:year' do
# /players/performances/2016
# Parameters - params['splat'] -> ['performances'], params[:year] -> 2016
end

This is not the end of Sinatra’s possibilities related to route matching. It can use more complex matching logic through regular expressions, as well as custom matchers.

Sinatra understands all of the standard HTTP verbs needed for creating a REST API: GET, POST, PUT, PATCH, DELETE, and OPTIONS. Route priorities are determined by the order in which they are defined, and the first route that matches a request is the one that serves that request.

Sinatra applications can be written in two ways; using classical or modular style. The main difference between them is that, with the classical style, we can have only one Sinatra application per Ruby process. Other differences are minor enough that, in most cases, they can be ignored, and the default settings can be used.

Classical Approach

Implementing a classical application is straightforward. We just have to load Sinatra and implement the route handlers:

require 'sinatra'
get '/' do
'Hello from Sinatra'
end

By saving this code to demo_api_classic.rb file, we can start the application directly by executing the following command:

ruby demo_api_classic.rb

However, if the application is to be deployed with Rack handlers, like Passenger, it is better to start it with the Rack configuration config.ru file.

require './demo_api_classic'
run Sinatra::Application

With the config.ru file in place, the application is started with the following command:

rackup config.ru

Modular Approach

Modular Sinatra applications are created by subclassing either Sinatra::Base or Sinatra::Application:

require 'sinatra'
class DemoApi < Sinatra::Application
# Application code
run! if app_file == $0
end

The statement beginning with run! is used for starting the application directly, with ruby demo_api.rb, just as with the classical application. On the other hand, if the application is to be deployed with Rack handlers, the content of config.ru must be:

require './demo_api'
run DemoApi

Sequel

Sequel is the second tool in this set. In contrast to ActiveRecord, which is part of Ruby on Rails, Sequel’s dependencies are very small. At the same time, it is quite feature-rich and can be used for all kinds of database manipulation tasks. With its simple domain-specific language, Sequel relieves the developer from all the problems of maintaining connections, constructing SQL queries, and fetching data from (and sending data back to) the database.

For example, establishing a connection with the database is very simple:

DB = Sequel.connect(adapter: :postgres, database: 'my_db', host: 'localhost', user: 'db_user')

The connect method returns a database object, in this case, Sequel::Postgres::Database, which can be further used to execute raw SQL.

DB['select count(*) from players']

Alternatively, to create a new dataset object:

DB[:players]

Both of these statements create a dataset object, which is a basic Sequel entity.

One of the most important Sequel dataset features is that it does not execute queries immediately. This makes it possible to store datasets for later use and, in most cases, to chain them.

users = DB[:players].where(sport: 'tennis')

So, if a dataset does not hit the database immediately, the question is, when does it? Sequel executes SQL on the database when so-called “executable methods” are used. These methods are, to name a few, all, each, map, first, and last.

Sequel is extensible, and its extensibility is a result of a fundamental architectural decision to build a small core complemented with a plugin system. Features are easily added through plugins which are, actually, Ruby modules. The most important plugin is the Model plugin. It is an empty plugin which does not define any class or instance methods by itself. Instead, it includes other plugins (submodules) which define a class, instance or model dataset methods. The Model plugin enables the use of Sequel as the object-relational-mapping (ORM) tool and is often referred to as the “base plugin”.

class Player < Sequel::Model
end

The Sequel model automatically parses the database schema and sets up all necessary accessor methods for all columns. It assumes that table name is plural and is an underscored version of the model name. In case there is a need to work with databases that do not follow this naming convention, the table name can be explicitly set when the model is defined.

class Player < Sequel::Model(:player)
end

So, we now have everything we need to start building the back-end API.

Read the full article from Toptal



Meet RxJava: The Missing Reactive Programming Library for Android

If you’re an Android developer, chances are you’ve heard of RxJava. It’s one of the most discussed libraries for enabling Functional Reactive Programming (FRP) in Android development. It’s touted as the go-to framework for simplifying concurrency/asynchronous tasks inherent in mobile programming.

But… what is RxJava and how does it “simplify” things?

Functional Reactive Programming for Android: An Introduction to RxJava

Untangle your Android from too many Java threads with RxJava.

While there are lots of resources already available online explaining what RxJava is, in this article my goal is to give you a basic introduction to RxJava and specifically how it fits into Android development. I’ll also give some concrete examples and suggestions on how you can integrate it in a new or existing project.

Why Consider RxJava?

At its core, RxJava simplifies development because it raises the level of abstraction around threading. That is, as a developer you don’t have to worry too much about the details of how to perform operations that should occur on different threads. This is particularly attractive since threading is challenging to get right and, if not correctly implemented, can cause some of the most difficult bugs to debug and fix.

Granted, this doesn’t mean RxJava is bulletproof when it comes to threading and it is still important to understand what’s happening behind the scenes; however, RxJava can definitely make your life easier.

Let’s look at an example.

Network Call - RxJava vs AsyncTask

Say we want to obtain data over the network and update the UI as a result. One way to do this is to (1) create an inner AsyncTask subclass in our Activity/Fragment, (2) perform the network operation in the background, and (3) take the result of that operation and update the UI in the main thread.

public class NetworkRequestTask extends AsyncTask<Void, Void, User> {
private final int userId;
public NetworkRequestTask(int userId) {
this.userId = userId;
}
@Override protected User doInBackground(Void... params) {
return networkService.getUser(userId);
}
@Override protected void onPostExecute(User user) {
nameTextView.setText(user.getName());
// ...set other views
}
}
private void onButtonClicked(Button button) {
new NetworkRequestTask(123).execute();
}

Harmless as this may seem, this approach has some issues and limitations. Namely, memory/context leaks are easily created since NetworkRequestTask is an inner class and thus holds an implicit reference to the outer class. Also, what if we want to chain another long operation after the network call? We’d have to nest two AsyncTasks which can significantly reduce readability.

In contrast, an RxJava approach to performing a network call might look something like this:

private Subscription subscription;
private void onButtonClicked(Button button) {
subscription = networkService.getObservableUser(123)
.subscribeOn(Schedulers.io())
.observeOn(AndroidSchedulers.mainThread())
.subscribe(new Action1<User>() {
@Override public void call(User user) {
nameTextView.setText(user.getName());
// ... set other views
}
});
}
@Override protected void onDestroy() {
if (subscription != null && !subscription.isUnsubscribed()) {
subscription.unsubscribe();
}
super.onDestroy();
}

Using this approach, we solve the problem (of potential memory leaks caused by a running thread holding a reference to the outer context) by keeping a reference to the returned Subscription object. This Subscription object is then tied to the Activity/Fragment object’s #onDestroy() method to guarantee that the Action1#call operation does not execute when the Activity/Fragment needs to be destroyed.

Also, notice that the return type of #getObservableUser(...) (i.e., an Observable<User>) is chained with further calls to it. Through this fluent API, we’re able to solve the second issue with using an AsyncTask, namely that it allows further network call/long operation chaining. Pretty neat, huh?

Let’s dive deeper into some RxJava concepts.

Observable, Observer, and Operator - The 3 O’s of RxJava Core

In the RxJava world, everything can be modeled as streams. A stream emits item(s) over time, and each emission can be consumed/observed.

If you think about it, a stream is not a new concept: click events can be a stream, location updates can be a stream, push notifications can be a stream, and so on.

The stream abstraction is implemented through 3 core constructs which I like to call “the 3 O’s”; namely: the Observable, the Observer, and the Operator. The Observable emits items (the stream); the Observer consumes those items. Emissions from Observable objects can further be modified, transformed, and manipulated by chaining Operator calls.

Observable

An Observable is the stream abstraction in RxJava. It is similar to an Iterator in that, given a sequence, it iterates through and produces those items in an orderly fashion. A consumer can then consume those items through the same interface, regardless of the underlying sequence.

Say we wanted to emit the numbers 1, 2, 3, in that order. To do so, we can use the Observable<T>#create(OnSubscribe<T>) method.

Observable<Integer> observable = Observable.create(new Observable.OnSubscribe<Integer>() {
@Override public void call(Subscriber<? super Integer> subscriber) {
subscriber.onNext(1);
subscriber.onNext(2);
subscriber.onNext(3);
subscriber.onCompleted();
}
});

Invoking subscriber.onNext(Integer) emits an item in the stream and, when the stream is finished emitting, subscriber.onCompleted() is then invoked.

This approach to creating an Observable is fairly verbose. For this reason, there are convenience methods for creating Observable instances which should be preferred in almost all cases.

The simplest way to create an Observable is using Observable#just(...). As the method name suggests, it just emits the item(s) that you pass into it as method arguments.

Observable.just(1, 2, 3); // 1, 2, 3 will be emitted, respectively

Observer

The next component to the Observable stream is the Observer (or Observers) subscribed to it. Observers are notified whenever something “interesting” happens in the stream. Observers are notified via the following events:

  • Observer#onNext(T) - invoked when an item is emitted from the stream
  • Observer#onError(Throwable) - invoked when an error has occurred within the stream
  • Observer#onCompleted() - invoked when the stream is finished emitting items.

To subscribe to a stream, simply call Observable<T>#subscribe(...) and pass in an Observer instance.

Observable<Integer> observable = Observable.just(1, 2, 3);
observable.subscribe(new Observer<Integer>() {
@Override public void onCompleted() {
Log.d("Test", "In onCompleted()");
}
@Override public void onError(Throwable e) {
Log.d("Test", "In onError()");
}
@Override public void onNext(Integer integer) {
Log.d("Test", "In onNext():" + integer);
}
});

The above code will emit the following in Logcat:

In onNext(): 1
In onNext(): 2
In onNext(): 3
In onCompleted()

There may also be some instances where we are no longer interested in the emissions of an Observable. This is particularly relevant in Android when, for example, an Activity/Fragment needs to be reclaimed in memory.

To stop observing items, we simply need to call Subscription#unsubscribe() on the returned Subscription object.

Subscription subscription = someInfiniteObservable.subscribe(new Observer<Integer>() {
@Override public void onCompleted() {
// ...
}
@Override public void onError(Throwable e) {
// ...
}
@Override public void onNext(Integer integer) {
// ...
}
});
// Call unsubscribe when appropriate
subscription.unsubscribe();

As seen in the code snippet above, upon subscribing to an Observable, we hold the reference to the returned Subscription object and later invoke subscription#unsubscribe() when necessary. In Android, this is best invoked within Activity#onDestroy() or Fragment#onDestroy().

Operator

Items emitted by an Observable can be transformed, modified, and filtered through Operators before notifying the subscribed Observer object(s). Some of the most common operations found in functional programming (such as map, filter, reduce, etc.) can also be applied to an Observable stream. Let’s look at map as an example:

Observable.just(1, 2, 3, 4, 5).map(new Func1<Integer, Integer>() {
@Override public Integer call(Integer integer) {
return integer * 2;
}
}).subscribe(new Observer<Integer>() {
@Override public void onCompleted() {
// ...
}
@Override public void onError(Throwable e) {
// ...
}
@Override public void onNext(Integer integer) {
// ...
}
});

The code snippet above would take each emission from the Observable and multiply each by 2, producing the stream 2, 4, 6, 8, 10, respectively. Applying an Operator typically returns another Observable as a result, which is convenient as this allows us to chain multiple operations to obtain a desired result.

Given the stream above, say we wanted to only receive even numbers. This can be achieved by chaining a filter operation.

Observable.just(1, 2, 3, 4, 5).map(new Func1<Integer, Integer>() {
@Override public Integer call(Integer integer) {
return integer * 2;
}
}).filter(new Func1<Integer, Boolean>() {
@Override public Boolean call(Integer integer) {
return integer % 2 == 0;
}
}).subscribe(new Observer<Integer>() {
@Override public void onCompleted() {
// ...
}
@Override public void onError(Throwable e) {
// ...
}
@Override public void onNext(Integer integer) {
// ...
}
});


                    


    Service Oriented Architecture with AWS Lambda: A Step-by-Step Tutorial

    When building web applications, there are many choices to be made that can either help or hinder your application in the future once you commit to them. Choices such as language, framework, hosting, and database are crucial.

    One such choice is whether to create a service-based application using Service Oriented Architecture (SOA) or a traditional, monolithic application. This is a common architectural decision affecting startups, scale-ups, and enterprise companies alike.

    Service Oriented Architecture is used by a large number of well-known unicorns and top-tech companies such as Google, Facebook, Twitter, Instagram and Uber. Seemingly, this architecture pattern works for large companies, but can it work for you?


    In this article we will introduce the topic of Service Oriented architecture, and how AWS Lambda in combination with Python can be leveraged to easily build scalable, cost-efficient services. To demonstrate these ideas, we will build a simple image uploading and resizing service using Python, AWS Lambda, Amazon S3 and a few other relevant tools and services.

    What is Service Oriented Architecture?

    Service Oriented Architecture (SOA) isn’t new, having roots from several decades ago. In recent years its popularity as a pattern has been growing due to offering many benefits for web-facing applications.

    SOA is, in essence, the abstraction of one large application into many communicating smaller applications. This follows several best practices of software engineering such as de-coupling, separation of concerns and single-responsibility architecture.

    Implementations of SOA vary in terms of granularity: from very few services that cover large areas of functionality to many dozens or hundreds of small applications in what is termed “microservice” architecture. Regardless of the level of granularity, what is generally agreed amongst practitioners of SOA is that it is by no means a free lunch. Like many good practices in software engineering, it is an investment that will require extra planning, development and testing.

    What is AWS Lambda?

    AWS Lambda is a service offered by the Amazon Web Services platform. AWS Lambda allows you to upload code that will be run on an on-demand container managed by Amazon. AWS Lambda will manage the provisioning and managing of servers to run the code, so all that is needed from the user is a packaged set of code to run and a few configuration options to define the context in which the server runs. These managed applications are referred to as Lambda functions.

    AWS Lambda has two main modes of operation:

    Asynchronous / Event-Driven:

    Lambda functions can be run in response to an event in asynchronous mode. Any source of events, such as S3 or SNS, will not block, and Lambda functions can take advantage of this in many ways, such as establishing a processing pipeline for some chain of events. There are many event sources and, depending on the source, events will either be pushed to a Lambda function from the event source or polled for by AWS Lambda.

    Synchronous / Request->Response:

    For applications that require a response to be returned synchronously, Lambda can be run in synchronous mode. Typically this is used in conjunction with a service called API Gateway to return HTTP responses from AWS Lambda to an end-user, however Lambda functions can also be called synchronously via a direct call to AWS Lambda.

    AWS Lambda functions are uploaded as a zip file containing handler code in addition to any dependencies required for the operation of the handler. Once uploaded, AWS Lambda will execute this code when needed and scale the number of servers from zero to thousands when required, without any extra intervention required by the consumer.

    Lambda Functions as an Evolution of SOA

    Basic SOA is a way to structure your code-base into small applications in order to benefit an application in the ways described earlier in this article. Arising from this, the method of communication between these applications comes into focus. Event-driven SOA (aka SOA 2.0) allows for not only the traditional direct service-to-service communication of SOA 1.0, but also for events to be propagated throughout the architecture in order to communicate change.

    Event-driven architecture is a pattern that naturally promotes loose coupling and composability. By creating and reacting to events, services can be added ad-hoc to add new functionality to an existing event, and several events can be composed to provide richer functionality.

    AWS Lambda can be used as a platform to easily build SOA 2.0 applications. There are many ways to trigger a Lambda function; from the traditional message-queue approach with Amazon SNS, to events created by a file being uploaded to Amazon S3, or an email being sent with Amazon SES.

    Implementing a Simple Image Uploading Service

    We will be building a simple application to upload and retrieve images utilizing the AWS stack. This example project will contain two lambda functions: one running in request->response mode that will be used to serve our simple web frontend, and another that will detect uploaded images and resize them.

    The resizing Lambda function will run asynchronously in response to a file-upload event triggered on the S3 bucket that will house the uploaded images. It will take the image provided and resize it to fit within a 400x400 frame.

    The other lambda function will serve the HTML page, providing both the functionality for a user to view the images resized by our other Lambda function as well as an interface for uploading an image.

    Initial AWS Configuration

    Before we can begin, we will need to configure some necessary AWS services such as IAM and S3. These will be configured using the web-based AWS console. However, most of the configuration can also be achieved by using the AWS command-line utility, which we will use later.

    Creating S3 Buckets

    S3 (or Simple Storage Service) is an Amazon object-store service that offers reliable and cost-efficient storage of any data. We will be using S3 to store the images that will be uploaded, as well as the resized versions of the images we have processed.

    The S3 service can be found under the “Services” drop-down in the AWS console under the “Storage & Content Delivery” sub-section. When creating a bucket you will be prompted to enter both the bucket name as well as to select a region. Selecting a region close to your users will allow S3 to optimize for latency and cost, as well as some regulatory factors. For this example we will select the “US Standard” region. This same region will later be used for hosting the AWS Lambda functions.

    It is worth noting that S3 bucket names are required to be unique, so if the name chosen is taken you will be required to choose a new, unique name.

    For this example project, we will create two storage buckets named “test-upload” and “test-resized”. The “test-upload” bucket will be used for uploading images and storing the uploaded image before it is processed and resized. Once resized, the image will be saved into the “test-resized” bucket, and the raw uploaded image removed.

    S3 Upload Permissions

    By default, S3 Permissions are restrictive and will not allow external users or even non-administrative users to read, write, update, or delete any permissions or objects on the bucket. In order to change this, we will need to be logged in as a user with the rights to manage AWS bucket permissions.

    Assuming we are on the AWS console, we can view the permissions for our upload bucket by selecting the bucket by name, clicking on the “Properties” button in the top-right of the screen, and opening the collapsed “Permissions” section.

    In order to allow anonymous users to upload to this bucket, we will need to edit the bucket policy to grant the specific permission that allows uploads. This is accomplished through a JSON-based configuration policy. These kinds of JSON policies are used widely throughout AWS in conjunction with the IAM service. Upon clicking on the “Edit Bucket Policy” button, simply paste the following text and click “Save” to allow public image uploads:

    {
    "Version": "2008-10-17",
    "Id": "Policy1346097257207",
    "Statement": [
    {
    "Sid": "Allow anonymous upload to /",
    "Effect": "Allow",
    "Principal": {
    "AWS": "*"
    },
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::test-upload/*"
    }
    ]
    }
    
    

    After doing this, we can verify the bucket policy is correct by attempting to upload an image to the bucket. The following cURL command will do the trick:

    curl https://test-upload.s3.amazonaws.com -F 'key=test.jpeg' -F 'file=@test.jpeg'
    
    

    If a 200-range response is returned, we will know that the configuration for the upload bucket has been successfully applied. Our S3 buckets should now be (mostly) configured. We will return later to this service in the console in order to connect our image upload events to the invocation of our resize function.

    IAM Permissions for Lambda

    Lambda functions all run within a permission context, in this case a “role” defined by the IAM service. This role defines any and all permissions that the Lambda function has during its invocation. For the purposes of this example project, we will create a generic role that will be used by both of the Lambda functions. However, in a production scenario, finer granularity in permission definitions is recommended to ensure that any security exploits are isolated to only the permission context that was defined.

    The IAM service can be found within the “Security & Identity” sub-section of the “Services” drop-down. The IAM service is a very powerful tool for managing access across AWS services, and the interface provided may be a bit over-whelming at first if you are not familiar with similar tools.

    Once on the IAM dashboard page, the “Roles” sub-section can be found on the left-hand side of the page. From here we can use the “Create New Role” button to bring up a multi-step wizard to define the permissions of the role. Let’s use “lambda_role” as the name of our generic permission. After continuing from the name definition page, you will be presented with the option to select a role type. As we only require S3 access, click on “AWS Service Roles” and within the selection box select “AWS Lambda”. You will be presented with a page of policies that can be attached to this role. Select the “AmazonS3FullAccess” policy and continue to the next step to confirm the role to be created.

    It is important to note the name and the ARN (Amazon Resource Name) of the created role. This will be used when creating a new Lambda function to identify the role that is to be used for function invocation.

    Note: AWS Lambda will automatically log all output from function invocations in AWS Cloudwatch, a logging service. If this functionality is desired, which is recommended for a production environment, permission to write to a Cloudwatch log stream must be added to the policies for this role.

    The Code!

    Overview

    Now we are ready to start coding. We will assume at this point you have set up the “awscli” command. If you have not, you can follow the instructions at https://aws.amazon.com/cli/ to set up awscli on your computer.

    Note: the code used in these examples is made shorter for ease of screen-viewing. For a more complete version visit the repository at https://github.com/gxx/aws-lambda-python/.
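
    The full listing lives in the repository linked above, but to make the flow concrete, here is a rough sketch of what the resize handler could look like. This is only an illustration, not the repository's code: it assumes a Python 3 runtime, that the Pillow imaging library is packaged in the deployment zip alongside boto3, and the "test-resized" bucket created earlier.

    import io
    from urllib.parse import unquote_plus

    import boto3
    from PIL import Image  # assumes Pillow is bundled in the deployment package

    s3 = boto3.client('s3')
    RESIZED_BUCKET = 'test-resized'  # bucket name from the example setup above

    def handler(event, context):
        # An S3 event notification may contain several records; handle them all.
        for record in event['Records']:
            bucket = record['s3']['bucket']['name']
            key = unquote_plus(record['s3']['object']['key'])

            # Download the raw upload into memory.
            body = s3.get_object(Bucket=bucket, Key=key)['Body'].read()

            # Resize the image so that it fits within a 400x400 box.
            image = Image.open(io.BytesIO(body))
            image.thumbnail((400, 400))
            buffer = io.BytesIO()
            image.save(buffer, format=image.format or 'JPEG')

            # Store the resized version and remove the raw upload.
            s3.put_object(Bucket=RESIZED_BUCKET, Key=key, Body=buffer.getvalue())
            s3.delete_object(Bucket=bucket, Key=key)

    Nothing is written to disk in this sketch: both the original and the resized image are held in memory, which keeps the function stateless and easy for AWS Lambda to scale.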

    Read the full article on Toptal 



    The Six Commandments of Good Code: Write Code that Stands the Test of Time

    Humans have only been grappling with the art and science of computer programming for roughly half a century. Compared to most arts and sciences, computer science is in many ways still just a toddler, walking into walls, tripping over its own feet, and occasionally throwing food across the table. As a consequence of its relative youth, I don’t believe we have a consensus yet on what a proper definition of “good code” is, as that definition continues to evolve. Some will say “good code” is code with 100% test coverage. Others will say it’s super fast and has a killer performance and will run acceptably on 10 year old hardware. While these are all laudable goals for software developers, however I venture to throw another target into the mix: maintainability. Specifically, “good code” is code that is easily and readily maintainable by an organization (not just by its author!) and will live for longer than just the sprint it was written in. The following are some things I’ve discovered in my career as an engineer at big companies and small, in the USA and abroad, that seem to correlate with maintainable, “good” software.

    Never settle for code that just "works." Write superior code.

    Commandment #1: Treat Your Code the Way You Want Others’ Code to Treat You

    I’m far from the first person to write that the primary audience for your code is not the compiler/computer, but whomever next has to read, understand, maintain, and enhance the code (which will not necessarily be you 6 months from now). Any engineer worth their pay can produce code that “works”; what distinguishes a superb engineer is that they can write maintainable code efficiently that supports a business long term, and have the skill to solve problems simply and in a clear and maintainable way.

    In any programming language, it is possible to write good code or bad code. Assuming we judge a programming language by how well it facilitates writing good code (it should at least be one of the top criteria, anyway), any programming language can be “good” or “bad” depending on how it is used (or abused).

    An example of a language that many consider ‘clean’ and readable is Python. The language itself enforces some level of whitespace discipline, and the built-in APIs are plentiful and fairly consistent. That said, it’s possible to create unspeakable monsters. For example, one can define a class and define/redefine/undefine any and every method on that class during runtime (often referred to as monkey patching). This technique naturally leads to, at best, an inconsistent API and, at worst, an impossible-to-debug monster. One might naively think, “sure, but nobody does that!” Unfortunately they do, and it doesn’t take long browsing PyPI before you run into substantial (and popular!) libraries that (ab)use monkey patching extensively as the core of their APIs. I recently used a networking library whose entire API changes depending on the network state of an object. Imagine, for example, calling client.connect() and sometimes getting a MethodDoesNotExist error instead of HostNotFound or NetworkUnavailable.
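
    To make the danger concrete, here is a deliberately contrived sketch of that kind of runtime patching; the class and function names are hypothetical and not taken from any real library:

    class Client:
        def connect(self):
            return "connected"

    def enable_offline_mode():
        # Monkey patching: the method is removed from the class at runtime,
        # so the API surface changes for every caller in the process.
        del Client.connect

    client = Client()
    print(client.connect())  # "connected"

    enable_offline_mode()
    try:
        client.connect()
    except AttributeError as error:
        # Callers now get a missing-method error instead of a meaningful
        # network error such as HostNotFound or NetworkUnavailable.
        print(error)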

    Commandment #2: Good Code Is Easily Read and Understood, in Part and in Whole

    Good code is easily read and understood, in part and in whole, by others (as well as by the author in the future, trying to avoid the “Did I really write that?” syndrome).

    By “in part” I mean that, if I open up some module or function in the code, I should be able to understand what it does without having to also read the entire rest of the codebase. It should be as intuitive and self-documenting as possible.

    Code that constantly references minute details that affect behavior from other (seemingly irrelevant) portions of the codebase is like reading a book where you have to reference the footnotes or an appendix at the end of every sentence. You’d never get through the first page!

    Some other thoughts on “local” readability:

    • Well encapsulated code tends to be more readable, separating concerns at every level.

    • Names matter. Activate the Thinking Fast and Slow system 2 way in which the brain forms thoughts and put some actual, careful thought into variable and method names. The few extra seconds can pay significant dividends. A well-named variable can make the code much more intuitive, whereas a poorly-named variable can lead to headfakes and confusion.

    • Cleverness is the enemy. When using fancy techniques, paradigms, or operations (such as list comprehensions or ternary operators), be careful to use them in a way that makes your code more readable, not just shorter.

    • Consistency is a good thing. Consistency in style, both in terms of how you place braces but also in terms of operations, improves readability greatly.

    • Separation of concerns. A given project manages innumerable locally important assumptions at various points in the codebase. Expose each part of the codebase to as few of those concerns as possible. Say you had a people management system where a person object may sometimes have a null last name. To somebody writing code in a page that displays person objects, that could be really awkward! And unless you maintain a handbook of “awkward and non-obvious assumptions our codebase has” (I know I don’t), your display page programmer is not going to know last names can be null and is probably going to write code with a null pointer exception in it when the last-name-being-null case shows up. Instead, handle these cases with well-thought-out APIs and contracts that different pieces of your codebase use to interact with each other, as in the sketch below.
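
    For instance, here is a small sketch of what such a contract might look like; the names are illustrative rather than taken from any particular codebase:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Person:
        first_name: str
        last_name: Optional[str] = None  # the awkward assumption lives here, once

        def display_name(self) -> str:
            # Display code calls this and never needs to know last_name can be null.
            if self.last_name:
                return f"{self.first_name} {self.last_name}"
            return self.first_name

    print(Person("Ada", "Lovelace").display_name())  # Ada Lovelace
    print(Person("Plato").display_name())            # Plato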

    Commandment #3: Good Code Has a Well Thought-out Layout and Architecture to Make Managing State Obvious

    State is the enemy. Why? Because it is the single most complex part of any application and needs to be dealt with very deliberately and thoughtfully. Common problems include database inconsistencies, partial UI updates where new data isn’t reflected everywhere, out of order operations, or just mind numbingly complex code with if statements and branches everywhere leading to difficult to read and even harder to maintain code. Putting state on a pedestal to be treated with great care, and being extremely consistent and deliberate with regard to how state is accessed and modified, dramatically simplifies your codebase. Some languages (Haskell for example) enforce this at a programmatic and syntactic level. You’d be amazed how much the clarity of your codebase can improve if you have libraries of pure functions that access no external state, and then a small surface area of stateful code which references the outside pure functionality.
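
    As a small illustration of that shape (the function names and the db object below are made up for the example), the bulk of the logic can live in pure functions, with a thin stateful layer around them:

    # Pure core: no external state, trivial to test and reason about.
    def apply_discount(price, discount_rate):
        return round(price * (1 - discount_rate), 2)

    def order_total(prices, discount_rate):
        return sum(apply_discount(price, discount_rate) for price in prices)

    # Thin stateful shell: the only code that touches the database or the UI.
    def checkout(order_id, db):
        prices = db.load_prices(order_id)                # stateful read
        total = order_total(prices, discount_rate=0.1)   # pure computation
        db.save_total(order_id, total)                   # stateful write

    print(order_total([10.0, 20.0], discount_rate=0.1))  # 27.0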

    Commandment #4: Good Code Doesn’t Reinvent the Wheel, it Stands on the Shoulders of Giants

    Before potentially reinventing a wheel, think about how common the problem is you’re trying to solve or the function is you’re trying to perform. Somebody may have already implemented a solution you can leverage. Take the time to think about and research any such options, if appropriate and available.

    That said, a completely reasonable counter-argument is that dependencies don’t come for “free” without any downside. By using a 3rd party or open source library that adds some interesting functionality, you are making the commitment to, and becoming dependent upon, that library. That’s a big commitment; if it’s a giant library and you only need a small bit of functionality do you really want the burden of updating the whole library if you upgrade, for example, to Python 3.x? And moreover, if you encounter a bug or want to enhance the functionality, you’re either dependent on the author (or vendor) to supply the fix or enhancement, or, if it’s open source, find yourself in the position of exploring a (potentially substantial) codebase you’re completely unfamiliar with trying to fix or modify an obscure bit of functionality.

    Certainly, the more widely used the code you depend upon is, the less likely you'll have to invest time in maintenance yourself. The bottom line is that it's worthwhile to do your own research and make your own evaluation of whether to include outside technology, and how much maintenance that particular technology will add to your stack.

    Below are some of the more common examples of things you should probably not be reinventing in the modern age (unless building one of these is itself your project).

    Databases

    Figure out which of the CAP properties (consistency, availability, partition tolerance) you need for your project, then choose a database with the right trade-offs. "Database" doesn't just mean MySQL anymore; you can choose from:

    • "Traditional" Schema'ed SQL: Postgres / MySQL / MariaDB / MemSQL / Amazon RDS, etc.
    • Key-Value Stores: Redis / Memcache / Riak
    • NoSQL: MongoDB / Cassandra
    • Hosted DBs: AWS RDS / DynamoDB / AppEngine Datastore
    • Heavy lifting: Amazon MR / Hadoop (Hive/Pig) / Cloudera / Google BigQuery
    • Crazy stuff: Erlang's Mnesia, iOS's Core Data

    Data Abstraction Layers

    You should, in most circumstances, not be writing raw queries to whatever database you happen to choose. More likely than not, there exists a library to sit between the DB and your application code, separating the concerns of managing concurrent database sessions and the details of the schema from your main code. At the very least, you should never have raw queries or SQL inline in the middle of your application code. Rather, wrap it in a function and centralize all the functions in a file called something really obvious (e.g., "queries.py"). A line like users = load_users(), for example, is infinitely easier to read than users = db.query("SELECT username, foo, bar FROM users ORDER BY id LIMIT 10"). This type of centralization also makes it much easier to keep a consistent style in your queries, and it limits the number of places you have to change queries should the schema change.
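    For instance, a centralized queries.py might look roughly like the sketch below; the db.query(sql, params) helper is an assumed DB-API-style wrapper, not a specific library:

    # queries.py -- every raw query lives here and nowhere else.
    def load_users(db, limit=10):
        return db.query(
            "SELECT username, foo, bar FROM users ORDER BY id LIMIT %s",
            (limit,),
        )

    def load_user_by_name(db, username):
        return db.query(
            "SELECT username, foo, bar FROM users WHERE username = %s",
            (username,),
        )

    # Application code then reads cleanly and never sees SQL inline:
    #     users = load_users(db)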

    Other Common Libraries and Tools to Consider Leveraging

    • Queuing or Pub/Sub Services: take your pick of AMQP providers, ZeroMQ, RabbitMQ, Amazon SQS
    • Storage: Amazon S3, Google Cloud Storage
    • Monitoring: Graphite / Hosted Graphite, AWS CloudWatch, New Relic
    • Log Collection / Aggregation: Loggly, Splunk
    • Auto Scaling: Heroku, AWS Beanstalk, AppEngine, AWS OpsWorks, Digital Ocean

    Commandment #5: Don’t Cross the Streams!

    There are many good models for program design: pub/sub, actors, MVC, etc. Choose whichever you like best, and stick to it. Different kinds of logic dealing with different kinds of data should be physically isolated in the codebase (again, this is the separation-of-concerns concept, reducing cognitive load on the future reader). The code which updates your UI should be physically distinct from the code that calculates what goes into the UI, for example.
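    A tiny Python sketch of that separation (the dashboard names are made up for illustration): one function computes what should appear in the UI, and a separate function only renders it:

    # Calculation layer: pure, knows nothing about how things are displayed.
    def build_dashboard_model(orders):
        revenue = sum(order["amount"] for order in orders)
        return {"order_count": len(orders), "revenue": revenue}

    # UI layer: only formats and displays, does no business math.
    def render_dashboard(model):
        print(f"Orders: {model['order_count']}")
        print(f"Revenue: ${model['revenue']:.2f}")

    render_dashboard(build_dashboard_model([{"amount": 19.99}, {"amount": 5.00}]))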

    Commandment #6: When Possible, Let the Computer Do the Work

    If the compiler can catch logical errors in your code and prevent either bad behavior, bugs, or outright crashes, we should absolutely take advantage of that. Of course, some languages have compilers that make this easier than others. Haskell, for example, has a famously strict compiler that results in programmers spending most of their effort just getting code to compile. Once it compiles, though, "it just works." For those of you who have never written in a strongly typed functional language, this may seem ridiculous or impossible, but don't take my word for it: it really is possible to live in a world without runtime errors. And it really is that magical.

    Admittedly, not every language has a compiler or a syntax that lends itself to much (or, in some cases, any!) compile-time checking. For those that don't, take a few minutes to research what optional strictness checks you can enable in your project and evaluate whether they make sense for you. For languages with lenient runtimes, common options include optional static type checkers, linters, and strict modes.
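    One concrete example from the Python world: optional type hints, checked by an external tool such as mypy, catch a whole class of mistakes before the code ever runs (the parse_port function below is just an illustration):

    from typing import Optional

    def parse_port(value: Optional[str]) -> int:
        # Fall back to a default when no port is configured.
        if value is None:
            return 8080
        return int(value)

    parse_port("8000")   # fine
    # parse_port([])     # would crash at runtime; `mypy` reports the bad argument first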

    Conclusion

    This is by no means an exhaustive or perfect list of commandments for producing "good" (i.e., easily maintainable) code. That said, if every codebase I ever have to pick up in the future followed even half of the concepts in this list, I would have many fewer gray hairs and might even be able to add an extra five years to the end of my life. And I'd certainly find work more enjoyable and less stressful.

    This article is from Toptal



    The Most Common Mobile App Mistakes

    The mobile app market is saturated with competition. Trends turn over quickly, but no niche can last very long without several competitors jumping onto the bandwagon. These conditions result in a high failure rate across the board for the mobile app market. Only 20% of downloaded apps see users return after the first use, and only 3% of apps remain in use after a month.

    If any part of an app is undesirable, or slow to get the hang of, users are more likely to install a new one, rather than stick it out with the imperfect product. Nothing is wasted for the consumer when disposing of an app - except for the efforts of the designers and developers, that is. So, why is it that so many apps fail? Is this a predictable phenomenon that app designers and developers should accept? For clients, is this success rate acceptable? What does it take to bring your designs into the top 3% of prosperous apps?

    The common mistakes range from failing to maintain consistency throughout the lifespan of an app to failing to attract users in the first place. How can apps be designed with intuitive simplicity, without becoming repetitive and boring? How can an app offer pleasing details, without losing sight of a greater purpose? Most apps live and die in the first few days, so here are the top ten most common mistakes that designers can avoid.

    Only 3% of mobile apps are in use after being downloaded.

    Common Mistake #1: A Poor First Impression

    Often the first use, or first day with an app, is the most critical period for hooking a potential user. The first impression is so critical that it could be an umbrella point for the rest of this top ten. If anything goes wrong, or appears confusing or boring, potential users quickly lose interest. The proper balance for first impressions is tricky to strike, though. In some cases, a lengthy onboarding or feature-discovery process can bore users, while an instantly stimulating app may skip a proper tutorial and promote confusion. Find the balance between an app that is immediately intuitive and one that quickly introduces users to its most exciting, engaging features. Keep in mind that when users come to your app, they're seeing it for the first time. Go through a proper beta testing process to learn how others perceive your app from the beginning. What seems obvious to the design team may not be obvious to newcomers.

    Improper Onboarding

    Onboarding is the step-by-step process of introducing a user to your app. Although it can be a good way to get someone quickly oriented, onboarding can also become a drawn-out process that stands between your users and their content. Often these tutorials are too long, and are likely swiped through blindly.

    Sometimes users have seen your app used in public or elsewhere, so they already get the point and just want to jump in. So allow for a quick exit strategy, rather than entirely blocking off the app upon first use. To ensure that the onboarding process is actually effective, consider which values it should communicate, and how. The onboarding process should demonstrate the value of the app in order to hook a user, rather than merely explain it.

    Go Easy on the Intro Animation

    Some designers address the issue of a good first impression with gripping intro animations to dazzle new users. But keep in mind that every time someone runs the app, they're going to sit through the same thing over and over. If the app serves a daily function, this will tire your users quickly. Ten seconds of someone's day for a logo to swipe across the screen and maybe spin a couple of times doesn't really seem worth it after a while.

    Common Mistake #2: Designing an App Without Purpose

    Avoid entering the design process without succinct intentions. Apps are often designed and developed in order to follow trends, rather than to solve a problem, fill a niche, or offer a distinct service. What is the ambition for the app? For the designer and their team, the sense of purpose will affect every step of a project. This sensibility will guide each decision from the branding or marketing of an app, to the wireframe format, and button aesthetic. If the purpose is clear, each piece of the app will communicate and function as a coherent whole. Therefore, have the design and development team continually consider their decisions within a greater goal. As the project progresses, the initial ambition may change. This is okay, as long as the vision remains coherent.

    Conveying this vision to your potential users means that they will understand what value the app brings to their lives. Thus, this vision is an important thing to communicate in a first impression. The question becomes: how quickly can you convince users of your vision for the app, of how it will improve their lives or provide some sort of enjoyment or comfort? If this ambition is conveyed quickly, then, as long as your app is in fact useful, it will make it into the 3%.

    Often joining a pre-existing market, or app niche, means that there are apps to study while designing your own. Thus, be careful how you choose to ‘re-purpose’ what is already out there. Study the existing app market, rather than skimming over it. Then, improve upon existing products with intent, rather than thoughtlessly imitating.

    Common Mistake #3: Missing Out On UX Design Mapping

    Be careful not to skip thoughtful planning of an app's UX architecture before jumping into design work. Even before getting to a wireframing stage, the flow and structure of an app should be mapped out. Designers are often too eager to produce aesthetics and details. This results in a culture of designers who generally underappreciate UX, and the necessary logic or navigation within an app. Slow down. Sketch out the flow of the app first before worrying too much about the finer brush strokes. Apps often fail from an overarching lack of flow and organization, rather than imperfect details. However, once the design process takes off, always keep the big picture in mind. The details and aesthetic should then clearly evoke the greater concept.

    Common Mistake #4: Disregarding App Development Budget

    As soon as the basis of the app is sketched out, it is a good time to get a budget from the development team. That way, you don't reach the end of the project and suddenly need to start cutting critical features. As your design career develops, keep track of the average costs of building your concepts so that your design thinking responds to economic constraints. Budgets should be useful design constraints to work within.

    Many failed apps try to cram too many features in from launch.

    Common Mistake #5: Cramming in Design Features

    Hopefully, rigorous wireframing will make the distinction between necessary and excessive functions clear. The platform is already the ultimate Swiss Army knife, so your app doesn't need to be. Not only will cramming an app with features lead to a likely disorienting user experience, but an overloaded app will also be difficult to market. If the use of the app is difficult to explain in a concise way, it's likely trying to do too much. Paring down features is always hard, but it's necessary. Often, the best strategy is to gain trust in the beginning with one or a few features, and only later in the life of the app 'test' new ones. That way, the additional features are less likely to interfere with the crucial first few days of an app's life.

    Common Mistake #6: Dismissing App Context

    Although most design offices practically operate within a vacuum, app designers must be aware of wider contexts. Purpose and ambition are important, but they become irrelevant if not directed within the proper context. Remember that although you and your design team may know your app very well, and find its interface obvious, this may not be the case for first-time users or for different demographics.

    Consider the immediate context or situation in which the app is intended to be used. Given the social situation, how long might the person expect to be on the app? What else might be helpful for them to stumble upon given the circumstance? For example, UBER's interface excels at being used very quickly. This means that, for the most part, there isn't much room for other content. This is perfect, because when users are out with friends and need to book a ride, their conversation is hardly interrupted in the process. UBER hides a lot of support content deep within the app, and it only appears once the scenario calls for it.

    Who is the target audience for the app? How might the type of user affect the design of the app? An app targeted at younger users may be able to take more liberties in assuming a certain level of intuition, whereas many functions may need to be pointed out for a less tech-savvy user. Is your app meant to be accessed quickly and for a short period of time? Or is this an app with lots of content that allows users to stay a while? How will the design convey this form of use?

    A good app design should consider the context in which it is used.

    Common Mistake #7: Underestimating Crossing Platforms

    Often apps are developed quickly as a response to changing markets or advancing competitors. This often results in web content being dragged onto the mobile platform. A constant issue, which you'd think would be widely understood by now, is that apps and other mobile content often make poor transitions between desktop and mobile platforms. Mobile design can no longer get away with scaling down web content in the hope of getting a business quickly into the mobile market. The web-to-mobile transition doesn't just mean scaling everything down; it also means being able to work with less. Functions, navigation, and content must all be conveyed with a more minimal strategy. Another common issue appears when an app development team aspires to release a product simultaneously on all platforms and through different app stores. This often results in poor compatibility, or a generally buggy, unpolished app. The gymnastics of balancing multiple platforms may be too much to add onto the launch of an app. It doesn't hurt to take it slowly, one OS at a time, and iron out the major issues before worrying about compatibility between platforms.

    Common Mistake #8: Overcomplicating App Design

    The famous architect Mies van der Rohe once said, "It's better to be good than to be unique." Ensure that your design meets the brief before you start breaking the box or adding flourishes. When a designer finds themselves adding things just to make a composition more appealing or exciting, those choices will likely add little value. Keep asking throughout the design process: how much can I remove? Instead of designing additively, design reductively. What isn't needed? This method applies as much to content, concept, and function as it does to aesthetics.

    Overcomplexity is often the result of a design unnecessarily breaking conventions. Many symbols and interfaces are standard within our visual and tactile language. Will your product really benefit from reworking these standards? Standard icons have proven themselves to be universally intuitive, so they are often the quickest way to provide visual cues without cluttering a screen. Don't let your design flourishes get in the way of the actual content or function of the app.

    Often, apps are not given enough white space. The need for white space is a graphic concept that has transcended both digital and print, so it shouldn't be underrated. Give elements on the screen room to breathe so that all the work you put into navigation and UX can be felt.

    The app design process can be reductive, rather than additive.

    Common Mistake #9: Design Inconsistencies

    To the point on simplicity: if a design is going to introduce new standards, they have to at least be consistent across the app. Each new function or piece of content doesn't have to be an opportunity to introduce a new design concept. Are texts uniformly formatted? Do UI elements behave in predictable, yet pleasing, ways throughout the app? Design consistency must balance staying within a common visual language against becoming aesthetically stagnant. The line between intuitive consistency and boredom is a fine one.

    Common Mistake #10: Underutilizing App Beta Testing

    All designers should analyze the use of their apps with some sort of feedback loop in order to learn what is and isn't working. A common mistake is for a team to do its beta testing in-house. You need to bring in fresh eyes to really dig into the drafts of the app. Send out an ad for beta testers and work with a select audience before going public. This can be a great way to iron out details, edit down features, and find what's missing. Although beta testing can be time-consuming, it may be a better alternative to developing an app that flops. Anticipate that doing testing properly often takes around eight weeks. Avoid using friends or colleagues as testers, as they may not criticize the app with the honesty you need. Having app blogs or websites review your app is another way to test it in a public setting without a full launch. If you're having a hard time paring down features for your app, this is a good opportunity to see which elements matter and which don't.

    The app design market is a battleground, so designing products that are merely adequate just isn't enough. Find a way to hook users from the beginning: communicate and demonstrate the critical values and features as soon as you can. To do this, your design team must have a coherent vision of what the app hopes to achieve. To establish this ambition, a rigorous storyboarding process can iron out what is and isn't imperative. Consider which types of users your app may best fit. Then refine and refine until absolutely nothing else can be taken away from the project without it falling apart.

    This article was written by KENT MUNDLE, Toptal Technical Editor



