The Vital Guide to Python Interviewing

The Challenge

As a rough order of magnitude, Giles Thomas (co-founder of PythonAnywhere) estimates that there are between 1.8 and 4.3 million Python developers in the world.

So how hard can it be to find a Python developer? Well, not very hard at all if the goal is just to find someone who can legitimately list Python on their resume. But if the goal is to find a Python guru who has truly mastered the nuances and power of the language, then the challenge is most certainly a formidable one.

First and foremost, a highly-effective recruiting process is needed, as described in our post In Search of the Elite Few – Finding and Hiring the Best Developers in the Industry. Such a process can then be augmented with targeted questions and techniques, such as those provided here, that are specifically geared toward ferreting out Python virtuosos from the plethora of some-level-of-Python-experience candidates.

Python Guru or Snake in the Grass?

So you’ve found what appears to be a strong Python developer. How do you determine if he or she is, in fact, in the elite top 1% of candidates that you’re looking to hire? While there’s no magic or foolproof technique, there are certainly questions you can pose that will help determine the depth and sophistication of a candidate’s knowledge of the language. A brief sampling of such questions is provided below.

It is important to bear in mind, though, that these sample questions are intended merely as a guide. Not every “A” candidate worth hiring will be able to properly answer them all, nor does answering them all guarantee an “A” candidate. At the end of the day, hiring remains as much of an art as it does a science.

Python in the Weeds…

While it’s true that the best developers don’t waste time committing to memory that which can easily be found in a language specification or API document, there are certain key features and capabilities of any programming language that any expert can, and should, be expected to be well-versed in. Here are some Python-specific examples:

Q: Why use function decorators? Give an example.

A decorator is essentially a callable Python object that is used to modify or extend a function or class definition. One of the beauties of decorators is that a single decorator definition can be applied to multiple functions (or classes). Much can thereby be accomplished with decorators that would otherwise require lots of boilerplate (or even worse redundant!) code. Flask, for example, uses decorators as the mechanism for adding new endpoints to a web application. Examples of some of the more common uses of decorators include adding synchronization, type enforcement, logging, or pre/post conditions to a class or function.

Q: What are lambda expressions, list comprehensions and generator expressions? What are the advantages and appropriate uses of each?

Lambda expressions are a shorthand technique for creating single line, anonymous functions. Their simple, inline nature often – though not always – leads to more readable and concise code than the alternative of formal function declarations. On the other hand, their terse inline nature, by definition, very much limits what they are capable of doing and their applicability. Being anonymous and inline, the only way to use the same lambda function in multiple locations in your code is to specify it redundantly.

List comprehensions provide a concise syntax for creating lists. List comprehensions are commonly used to make lists where each element is the result of some operation(s) applied to each member of another sequence or iterable. They can also be used to create a subsequence of those elements whose members satisfy a certain condition. In Python, list comprehensions provide an alternative to using the built-in map()and filter() functions.

As the applied usage of lambda expressions and list comprehensions can overlap, opinions vary widely as to when and where to use one vs. the other. One point to bear in mind, though, is that a list comprehension executes somewhat faster than a comparable solution using map and lambda (some quick tests yielded a performance difference of roughly 10%). This is because calling a lambda function creates a new stack frame while the expression in the list comprehension is evaluated without doing so.

Generator expressions are syntactically and functionally similar to list comprehensions but there are some fairly significant differences between the ways the two operate and, accordingly, when each should be used. In a nutshell, iterating over a generator expression or list comprehension will essentially do the same thing, but the list comprehension will create the entire list in memory first while the generator expression will create the items on the fly as needed. Generator expressions can therefore be used for very large (and even infinite) sequences and their lazy (i.e., on demand) generation of values results in improved performance and lower memory usage. It is worth noting, though, that the standard Python list methods can be used on the result of a list comprehension, but not directly on that of a generator expression.

Q: Consider the two approaches below for initializing an array and the arrays that will result. How will the resulting arrays differ and why should you use one initialization approach vs. the other?

>>> # INITIALIZING AN ARRAY -- METHOD 1
...
>>> x = [[1,2,3,4]] * 3
>>> x
[[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
>>>
>>>
>>> # INITIALIZING AN ARRAY -- METHOD 2
...
>>> y = [[1,2,3,4] for _ in range(3)]
>>> y
[[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
>>>
>>> # WHICH METHOD SHOULD YOU USE AND WHY?

While both methods appear at first blush to produce the same result, there is an extremely significant difference between the two. Method 2 produces, as you would expect, an array of 3 elements, each of which is itself an independent 4-element array. In method 1, however, the members of the array all point to the same object. This can lead to what is most likely unanticipated and undesired behavior as shown below.

>>> # MODIFYING THE x ARRAY FROM THE PRIOR CODE SNIPPET:
>>> x[0][3] = 99
>>> x
[[1, 2, 3, 99], [1, 2, 3, 99], [1, 2, 3, 99]]
>>> # UH-OH, DON’T THINK YOU WANTED THAT TO HAPPEN!
...
>>>
>>> # MODIFYING THE y ARRAY FROM THE PRIOR CODE SNIPPET:
>>> y[0][3] = 99
>>> y
[[1, 2, 3, 99], [1, 2, 3, 4], [1, 2, 3, 4]]
>>> # THAT’S MORE LIKE WHAT YOU EXPECTED!
...

Q: What will be printed out by the second append() statement below?

>>> def append(list=[]):
...     # append the length of a list to the list
...     list.append(len(list))
...     return list
...
>>> append(['a','b'])
['a', 'b', 2]
>>>
>>> append()  # calling with no arg uses default list value of []
[0]
>>>
>>> append()  # but what happens when we AGAIN call append with no arg?

When the default value for a function argument is an expression, the expression is evaluated only once, not every time the function is called. Thus, once the list argument has been initialized to an empty array, subsequent calls to append without any argument specified will continue to use the same array to which list was originally initialized. This will therefore yield the following, presumably unexpected, behavior:

>>> append()  # first call with no arg uses default list value of []
[0]
>>> append()  # but then look what happens...
[0, 1]
>>> append()  # successive calls keep extending the same default list!
[0, 1, 2]
>>> append()  # and so on, and so on, and so on...
[0, 1, 2, 3]

Q: How might one modify the implementation of the ‘append’ method in the previous question to avoid the undesirable behavior described there?

The following alternative implementation of the append method would be one of a number of ways to avoid the undesirable behavior described in the answer to the previous question:

>>> def append(list=None):
...     if list is None:
list = []
# append the length of a list to the list
...     list.append(len(list))
...     return list
...
>>> append()
[0]
>>> append()
[0]

Q: How can you swap the values of two variables with a single line of Python code?

Consider this simple example:

>>> x = 'X'
>>> y = 'Y'

In many other languages, swapping the values of x and y requires that you to do the following:

>>> tmp = x
>>> x = y
>>> y = tmp
>>> x, y
('Y', 'X')

But in Python, makes it possible to do the swap with a single line of code (thanks to implicit tuple packing and unpacking) as follows:

>>> x,y = y,x
>>> x,y
('Y', 'X')

Q: What will be printed out by the last statement below?

>>> flist = []
>>> for i in range(3):
...     flist.append(lambda: i)
...
>>> [f() for f in flist]   # what will this print out?

In any closure in Python, variables are bound by name. Thus, the above line of code will print out the following:

[2, 2, 2]

Presumably not what the author of the above code intended!

workaround is to either create a separate function or to pass the args by name; e.g.:

>>> flist = []
>>> for i in range(3):
...     flist.append(lambda i = i : i)
...
>>> [f() for f in flist]
[0, 1, 2]

Q: What are the key differences between Python 2 and 3?

Although Python 2 is formally considered legacy at this point, its use is still widespread enough that is important for a developer to recognize the differences between Python 2 and 3.

Here are some of the key differences that a developer should be aware of:

  • Text and Data instead of Unicode and 8-bit strings. Python 3.0 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings. The biggest ramification of this is that any attempt to mix text and data in Python 3.0 raises a TypeError (to combine the two safely, you must decode bytes or encode Unicode, but you need to know the proper encoding, e.g. UTF-8)
    • This addresses a longstanding pitfall for naïve Python programmers. In Python 2, mixing Unicode and 8-bit data would work if the string happened to contain only 7-bit (ASCII) bytes, but you would get UnicodeDecodeError if it contained non-ASCII values. Moreover, the exception would happen at the combination point, not at the point at which the non-ASCII characters were put into the str object. This behavior was a common source of confusion and consternation for neophyte Python programmers.
  • print function. The print statement has been replaced with a print() function
  • xrange – buh-bye. xrange() no longer exists (range() now behaves like xrange() used to behave, except it works with values of arbitrary size)
  • API changes:
    • zip()map() and filter() all now return iterators instead of lists
    • dict.keys()dict.items() and dict.values() now return “views” instead of lists
    • dict.iterkeys()dict.iteritems() and dict.itervalues() are no longer supported
  • Comparison operators. The ordering comparison operators (<<=>=>) now raise a TypeErrorexception when the operands don’t have a meaningful natural ordering. Some examples of the ramifications of this include:
    • Expressions like 1 < ''0 > None or len <= len are no longer valid
    • None < None now raises a TypeError instead of returning False
    • Sorting a heterogeneous list no longer makes sense – all the elements must be comparable to each other

More details on the differences between Python 2 and 3 are available here.

Q: Is Python interpreted or compiled?

As noted in Why Are There So Many Pythons?, this is, frankly, a bit of a trick question in that it is malformed. Python itself is nothing more than an interface definition (as is true with any language specification) of which there are multiple implementations. Accordingly, the question of whether “Python” is interpreted or compiled does not apply to the Python language itself; rather, it applies to each specific implementation of the Python specification.

Further complicating the answer to this question is the fact that, in the case of CPython (the most common Python implementation), the answer really is “sort of both”. Specifically, with CPython, code is first compiled and then interpreted. More precisely, it is not precompiled to native machine code, but rather to bytecode. While machine code is certainly faster, bytecode is more portable and secure. The bytecode is then interpreted in the case of CPython (or both interpreted and compiled to optimized machine code at runtime in the case of PyPy).

Q: What are some alternative implementations to CPython? When and why might you use them?

One of the more prominent alternative implementations is Jython, a Python implementation written in Java that utilizes the Java Virtual Machine (JVM). While CPython produces bytecode to run on the CPython VM, Jython produces Java bytecode to run on the JVM.

Another is IronPython, written in C# and targeting the .NET stack. IronPython runs on Microsoft’s Common Language Runtime (CLR).

As also pointed out in Why Are There So Many Pythons?, it is entirely possible to survive without ever touching a non-CPython implementation of Python, but there are advantages to be had from switching, most of which are dependent on your technology stack.

Another noteworthy alternative implementation is PyPy whose key features include:

  • Speed. Thanks to its Just-in-Time (JIT) compiler, Python programs often run faster on PyPy.
  • Memory usage. Large, memory-hungry Python programs might end up taking less space with PyPy than they do in CPython.
  • Compatibility. PyPy is highly compatible with existing python code. It supports cffi and can run popular Python libraries like Twisted and Django.
  • Sandboxing. PyPy provides the ability to run untrusted code in a fully secure way.
  • Stackless mode. PyPy comes by default with support for stackless mode, providing micro-threads for massive concurrency.

Q: What’s your approach to unit testing in Python?

The most fundamental answer to this question centers around Python’s unittest testing framework. Basically, if a candidate doesn’t mention unittest when answering this question, that should be a huge red flag.

unittest supports test automation, sharing of setup and shutdown code for tests, aggregation of tests into collections, and independence of the tests from the reporting framework. The unittest module provides classes that make it easy to support these qualities for a set of tests.

Assuming that the candidate does mention unittest (if they don’t, you may just want to end the interview right then and there!), you should also ask them to describe the key elements of the unittest framework; namely, test fixtures, test cases, test suites and test runners.

A more recent addition to the unittest framework is mock. mock allows you to replace parts of your system under test with mock objects and make assertions about how they are to be used. mock is now part of the Python standard library, available as unittest.mock in Python 3.3 onwards.

The value and power of mock are well explained in An Introduction to Mocking in Python. As noted therein, system calls are prime candidates for mocking: whether writing a script to eject a CD drive, a web server which removes antiquated cache files from /tmp, or a socket server which binds to a TCP port, these calls all feature undesired side-effects in the context of unit tests. Similarly, keeping your unit-tests efficient and performant means keeping as much “slow code” as possible out of the automated test runs, namely filesystem and network access.

[Note: This question is for Python developers who are also experienced in Java.]
Q: What are some key differences to bear in mind when coding in Python vs. Java?

Disclaimer #1. The differences between Java and Python are numerous and would likely be a topic worthy of its own (lengthy) post. Below is just a brief sampling of some key differences between the two languages.

Disclaimer #2. The intent here is not to launch into a religious battle over the merits of Python vs. Java (as much fun as that might be!). Rather, the question is really just geared at seeing how well the developer understands some practical differences between the two languages. The list below therefore deliberately avoids discussing the arguable advantages of Python over Java from a programming productivity perspective.

With the above two disclaimers in mind, here is a sampling of some key differences to bear in mind when coding in Python vs. Java:

  • Dynamic vs static typing. One of the biggest differences between the two languages is that Java is restricted to static typing whereas Python supports dynamic typing of variables.
  • Static vs. class methods. A static method in Java does not translate to a Python class method.
    • In Python, calling a class method involves an additional memory allocation that calling a static method or function does not.
    • In Java, dotted names (e.g., foo.bar.method) are looked up by the compiler, so at runtime it really doesn’t matter how many of them you have. In Python, however, the lookups occur at runtime, so “each dot counts”.
  • Method overloading. Whereas Java requires explicit specification of multiple same-named functions with different signatures, the same can be accomplished in Python with a single function that includes optional arguments with default values if not specified by the caller.
  • Single vs. double quotes. Whereas the use of single quotes vs. double quotes has significance in Java, they can be used interchangeably in Python (but no, it won’t allow beginnning the same string with a double quote and trying to end it with a single quote, or vice versa!).
  • Getters and setters (not!). Getters and setters in Python are superfluous; rather, you should use the ‘property’ built-in (that’s what it’s for!). In Python, getters and setters are a waste of both CPU and programmer time.
  • Classes are optional. Whereas Java requires every function to be defined in the context of an enclosing class definition, Python has no such requirement.
  • Indentation matters… in Python. This bites many a newbie Python programmer.

The Big Picture

An expert knowledge of Python extends well beyond the technical minutia of the language. A Python expert will have an in-depth understanding and appreciation of Python’s benefits as well as its limitations. Accordingly, here are some sample questions that can help assess this dimension of a candidate’s expertise:

Q: What is Python particularly good for? When is using Python the “right choice” for a project?

Although likes and dislikes are highly personal, a developer who is “worth his or her salt” will highlight features of the Python language that are generally considered advantageous (which also helps answer the question of what Python is “particularly good for”). Some of the more common valid answers to this question include:

  • Ease of use and ease of refactoring, thanks to the flexibility of Python’s syntax, which makes it especially useful for rapid prototyping.
  • More compact code, thanks again to Python’s syntax, along with a wealth of functionally-rich Python libraries (distributed freely with most Python language implementations).
This article originally appeared on Toptal

How To Improve ASP.NET App Performance In Web Farm With Caching

There are only two hard things in Computer Science: cache invalidation and naming things.

A Brief Introduction to Caching

Caching is a powerful technique for increasing performance through a simple trick: Instead of doing expensive work (like a complicated calculation or complex database query) every time we need a result, the system can store – or cache – the result of that work and simply supply it the next time it is requested without needing to reperform that work (and can, therefore, respond tremendously faster).

Of course, the whole idea behind caching works only as long the result we cached remains valid. And here we get to the actual hard part of the problem: How do we determine when a cached item has become invalid and needs to be recreated?

Caching is a powerful technique for increasing performance

The ASP.NET in-memory cache is extremely fast
and perfect to solve distributed web farm caching problem.

Usually, a typical web application has to deal with a much higher volume of read requests than write requests. That is why a typical web application that is designed to handle a high load is architected to be scalable and distributed, deployed as a set of web tier nodes, usually called a farm. All these facts have an impact on the applicability of caching.

In this article, we focus on the role caching can play in assuring high throughput and performance of web applications designed to handle a high load, and I am going to use the experience from one of my projects and provide an ASP.NET-based solution as an illustration.

The Problem of Handling a High Load

The actual problem I had to solve wasn’t an original one. My task was to make an ASP.NET MVC monolithic web application prototype be capable of handling a high load.

The necessary steps towards improving throughput capabilities of a monolithic web application are:

  • Enable it to run multiple copies of the web application in parallel, behind a load balancer, and serve all concurrent requests effectively (i.e., make it scalable).
  • Profile the application to reveal current performance bottlenecks and optimize them.
  • Use caching to increase read request throughput, since this typically constitutes a significant part of the overall applications load.

Caching strategies often involve use of some middleware caching server, like Memcached or Redis, to store the cached values. Despite their high adoption and proven applicability, there are some downsides to these approaches, including:

  • Network latencies introduced by accessing the separate cache servers can be comparable to the latencies of reaching the database itself.
  • The web tier’s data structures can be unsuitable for serialization and deserialization out of the box. To use cache servers, those data structures should support serialization and deserialization, which requires ongoing additional development effort.
  • Serialization and deserialization add runtime overhead with an adverse effect on performance.

All these issues were relevant in my case, so I had to explore alternative options.

How caching works

The built-in ASP.NET in-memory cache (System.Web.Caching.Cache) is extremely fast and can be used without serialization and deserialization overhead, both during the development and at the runtime. However, ASP.NET in-memory cache has also its own drawbacks:

  • Each web tier node needs its own copy of cached values. This could result in higher database tier consumption upon node cold start or recycling.
  • Each web tier node should be notified when another node makes any portion of the cache invalid by writing updated values. Since the cache is distributed and without proper synchronization, most of the nodes will return old values which is typically unacceptable.

If the additional database tier load won’t lead to a bottleneck by itself, then implementing a properly distributed cache seems like an easy task to handle, right? Well, it’s not an easy task, but it is possible. In my case, benchmarks showed that the database tier shouldn’t be a problem, as most of the work happened in the web tier. So, I decided to go with the ASP.NET in-memory cache and focus on implementing the proper synchronization.

Introducing an ASP.NET-based Solution

As explained, my solution was to use the ASP.NET in-memory cache instead of the dedicated caching server. This entails each node of the web farm having its own cache, querying the database directly, performing any necessary calculations, and storing results in a cache. This way, all cache operations will be blazing fast thanks to the in-memory nature of the cache. Typically, cached items have a clear lifetime and become stale upon some change or writing of new data. So, from the web application logic, it is usually clear when the cache item should be invalidated.

The only problem left here is that when one of the nodes invalidates a cache item in its own cache, no other node will know about this update. So, subsequent requests serviced by other nodes will deliver stale results. To address this, each node should share its cache invalidations with the other nodes. Upon receiving such invalidation, other nodes could simply drop their cached value and get a new one at the next request.

Here, Redis can come into play. The power of Redis, compared to other solutions, comes from its Pub/Sub capabilities. Every client of a Redis server can create a channel and publish some data on it. Any other client is able to listen to that channel and receive the related data, very similar to any event-driven system. This functionality can be used to exchange cache invalidation messages between the nodes, so all nodes will be able to invalidate their cache when it is needed.

A group of ASP.NET web tier nodes using a Redis backplane

ASP.NET’s in-memory cache is straightforward in some ways and complex in others. In particular, it is straightforward in that it works as a map of key/value pairs, yet there is a lot of complexity related to its invalidation strategies and dependencies.

Fortunately, typical use cases are simple enough, and it’s possible to use a default invalidation strategy for all the items, enabling each cache item to have only a single dependency at most. In my case, I ended with the following ASP.NET code for the interface of the caching service. (Note that this is not the actual code, as I omitted some details for the sake of simplicity and the proprietary license.)

public interface ICacheKey
{
string Value { get; }
}
public interface IDataCacheKey : ICacheKey { }
public interface ITouchableCacheKey : ICacheKey { }
public interface ICacheService
{
int ItemsCount { get; }
T Get<T>(IDataCacheKey key, Func<T> valueGetter);
T Get<T>(IDataCacheKey key, Func<T> valueGetter, ICacheKey dependencyKey);
}

Here, the cache service basically allows two things. First, it enables storing the result of some value getter function in a thread safe manner. Second, it ensures that the then-current value is always returned when it is requested. Once the cache item becomes stale or is explicitly evicted from the cache, the value getter is called again to retrieve a current value. The cache key was abstracted away by ICacheKey interface, mainly to avoid hard-coding of cache key strings all over the application.

To invalidate cache items, I introduced a separate service, which looked like this:

public interface ICacheInvalidator
{
bool IsSessionOpen { get; }
void OpenSession();
void CloseSession();
void Drop(IDataCacheKey key);
void Touch(ITouchableCacheKey key);
void Purge();
}

Besides basic methods of dropping items with data and touching keys, which only had dependent data items, there are a few methods related to some kind of “session”.

Our web application used Autofac for dependency injection, which is an implementation of the inversion of control (IoC) design pattern for dependencies management. This feature allows developers to create their classes without the need to worry about dependencies, as the IoC container manages that burden for them.

The cache service and cache invalidator have drastically different lifecycles regarding IoC. The cache service was registered as a singleton (one instance, shared between all clients), while the cache invalidator was registered as an instance per request (a separate instance was created for each incoming request). Why?

The answer has to do with an additional subtlety we needed to handle. The web application is using a Model-View-Controller (MVC) architecture, which helps mainly in the separation of UI and logic concerns. So, a typical controller action is wrapped into a subclass of an ActionFilterAttribute. In the ASP.NET MVC framework, such C#-attributes are used to decorate the controller’s action logic in some way. That particular attribute was responsible for opening a new database connection and starting a transaction at the beginning of the action. Also, at the end of the action, the filter attribute subclass was responsible for committing the transaction in case of success and rolling it back in the event of failure.

If cache invalidation happened right in the middle of the transaction, there could be race condition whereby the next request to that node would successfully put the old (still visible to other transactions) value back into the cache. To avoid this, all invalidations are postponed until the transaction is committed. After that, cache items are safe to evict and, in the case of a transaction failure, there is no need for cache modification at all.

That was the exact purpose of the “session”-related parts in the cache invalidator. Also, that is the purpose of its lifetime being bound to the request. The ASP.NET code looked like this:

class HybridCacheInvalidator : ICacheInvalidator
{
...
public void Drop(IDataCacheKey key)
{
if (key == null)
throw new ArgumentNullException("key");
if (!IsSessionOpen)
throw new InvalidOperationException("Session must be opened first.");
_postponedRedisMessages.Add(new Tuple<string, string>("drop", key.Value));
}
...
public void CloseSession()
{
if (!IsSessionOpen)
return;
_postponedRedisMessages.ForEach(m => PublishRedisMessageSafe(m.Item1, m.Item2));
_postponedRedisMessages = null;
}	
...
}

The PublishRedisMessageSafe method here is responsible for sending the message (second argument) to a particular channel (first argument). In fact, there are separate channels for drop and touch, so the message handler for each of them knew exactly what to do - drop/touch the key equal to the received message payload.

One of the tricky parts was to manage the connection to the Redis server properly. In the case of the server going down for any reason, the application should continue to function correctly. When Redis is back online again, the application should seamlessly start to use it again and exchange messages with other nodes again. To achieve this, I used the StackExchange.Redis library and the resulting connection management logic was implemented as follows:

class HybridCacheService : ...
{
...
public void Initialize()
{
try
{
Multiplexer = ConnectionMultiplexer.Connect(_configService.Caching.BackendServerAddress);
...
Multiplexer.ConnectionFailed += (sender, args) => UpdateConnectedState();
Multiplexer.ConnectionRestored += (sender, args) => UpdateConnectedState();
...
}
catch (Exception ex)
{
...
}
}
private void UpdateConnectedState()
{
if (Multiplexer.IsConnected && _currentCacheService is NoCacheServiceStub) {
_inProcCacheInvalidator.Purge();
_currentCacheService = _inProcCacheService;
_logger.Debug("Connection to remote Redis server restored, switched to in-proc mode.");
} else if (!Multiplexer.IsConnected && _currentCacheService is InProcCacheService) {
_currentCacheService = _noCacheStub;
_logger.Debug("Connection to remote Redis server lost, switched to no-cache mode.");
}
}
}

Here, ConnectionMultiplexer is a type from the StackExchange.Redis library, which is responsible for transparent work with underlying Redis. The important part here is that, when a particular node loses connection to Redis, it falls back to no cache mode to make sure no request will receive stale data. After the connection is restored, the node starts to use the in-memory cache again.

Here are examples of action without usage of the cache service (SomeActionWithoutCaching) and an identical operation which uses it (SomeActionUsingCache):

class SomeController : Controller
{
public ISomeService SomeService { get; set; }
public ICacheService CacheService { get; set; }
...
public ActionResult SomeActionWithoutCaching()
{
return View(
SomeService.GetModelData()
);
}
...
public ActionResult SomeActionUsingCache()
{
return View(
CacheService.Get(
/* Cache key creation omitted */,
() => SomeService.GetModelData()
);
);
}
}

A code snippet from an ISomeService implementation could look like this:

class DefaultSomeService : ISomeService
{
public ICacheInvalidator _cacheInvalidator;
...
public SomeModel GetModelData()
{
return /* Do something to get model data. */;
}
...
public void SetModelData(SomeModel model)
{
/* Do something to set model data. */
_cacheInvalidator.Drop(/* Cache key creation omitted */);
}
}

Benchmarking and Results

After the caching ASP.NET code was all set, it was time to use it in the existing web application logic, and benchmarking can be handy to decide where to put most efforts of rewriting the code to use the caching. It’s crucial to pick out a few most operationally common or critical use cases to be benchmarked. After that, a tool like Apache jMeter could be used for two things:

  • To benchmark these key use cases via HTTP requests.
  • To simulate high load for the web node under test.

To get a performance profile, any profiler which is capable of attaching to the IIS worker process could be used. In my case, I used JetBrains dotTrace Performance. After some time spent experimenting to determine the correct jMeter parameters (such as concurrent and requests count), it becomes possible to start to collect performance snapshots, which are very helpful in identifying the hotspots and bottlenecks.

In my case, some use cases showed that about 15%-45% overall code execution time was spent in the database reads with the obvious bottlenecks. After I applied caching, performance nearly doubled (i.e., was twice as fast) for most of them.

Conclusion

As you may see, my case could seem like an example of what is usually called “reinventing the wheel”: Why bother to try to create something new, when there are already best practices widely applied out there? Just set up a Memcached or Redis, and let it go.

I definitely agree that usage of best practices is usually the best option. But before blindly applying any best practice, one should ask oneself: How applicable is this “best practice”? Does it fit my case well?

The way I see it, proper options and tradeoff analysis is a must upon making any significant decision, and that was the approach I chose because the problem was not so easy. In my case, there were many factors to consider, and I did not want to take a one-size-fits-all solution when it might not be the right approach for the problem at hand.

In the end, with the proper caching in place, I did get almost 50% performance increase over the initial solution.

Source: Toptal  

Tips & Tricks for Any Developers Successful Online Portfolio

At Toptal we screen a lot of designers, so over time we have learned what goes into making a captivating and coherent portfolio. Each designer’s portfolio is like an introduction to an individual designer’s skill set and strengths and represents them to future employers, clients and other designers. It shows both past work, but also future direction. There are several things to keep in mind when building a portfolio, so here is the Toptal Guide of tips and common mistakes for portfolio design.

1. Content Comes First

The main use of the portfolio is to present your design work. Thus, the content should inform the layout and composition of the document. Consider what kind of work you have, and how it might be best presented. A UX designer may require a series of animations to describe a set of actions, whereas the visual designer may prefer spreads of full images.

The portfolio design itself is an opportunity to display your experiences and skills. However, excessive graphic flourishes shouldn’t impede the legibility of the content. Instead, consider how the backgrounds of your portfolio can augment or enhance your work. The use of similar colors as the content in the background will enhance the details of your project. Lighter content will stand out against dark backgrounds. Legibility is critical, so ensure that your portfolio can be experienced in any medium, and considers all accessibility issues such as color palettes and readability.

You should approach your portfolio in the same manner you would any project. What is the goal here? Present it in a way that makes sense to viewers who are not essentially visually savvy. Edit out projects that may be unnecessary. Your portfolio should essentially be a taster of what you can do, a preparation for the client of what to expect to see more of in the interview. The more efficiently that you can communicate who you are as a designer, the better.

2. Consider Your Target Audience

A portfolio for a client should likely be different than a portfolio shown to a blog editor, or an art director. Your professional portfolio should always cater to your target audience. Edit it accordingly. If your client needs branding, then focus on your branding work. If your client needs UX Strategy than make sure to showcase your process.

Even from client to client, or project to project your portfolio will need tweaking. If you often float between several design disciplines, as many designers do, it would be useful to curate a print designer portfolio separately from a UX or visual design portfolio.

3. Tell the Stories of Your Projects

As the design industry has evolved, so have our clients, and their appreciation for our expertise and what they hire us to do. Our process is often as interesting and important to share with them, as the final deliverables. Try to tell the story of your product backwards, from final end point through to the early stages of the design process. Share your sketches, your wireframes, your user journeys, user personas, and so on.

Showing your process allows the reader to understand how you think and work through problems. Consider this an additional opportunity to show that you have an efficient and scalable process..

4. Be Professional in Your Presentation

Attention to detail, both in textual and design content are important aspects of any visual presentation, so keep an eye on alignment, image compression, embedded fonts and other elements, as you would any project. The careful treatment of your portfolio should reflect how you will handle your client’s work.

With any presentation, your choice of typeface will impact the impression you give, so do research the meaning behind a font family, and when in doubt, ask your typography savvy friends for advice.

5. Words Are As Important As Work

Any designer should be able to discuss their projects as avidly as they can design them. Therefore your copywriting is essential. True, your work is the main draw of the portfolio - however the text, and how you write about your work can give viewers insight into your portfolio.

Not everyone who sees your work comes from a creative, or visual industry. Thus, the descriptive text that you provide for images is essential. At the earlier stages of a project, where UX is the main focus, often you will need to complement your process with clearly defined content, both visual diagrams, and textual explanation.

Text can also be important for providing the context of the project. Often much of your work is done in the background, so why not present it somehow? What was the brief, how did the project come about?

Avoid These Common Mistakes

The culture of the portfolio networks like Behance or Dribble have cultivated many bad habits and trends in portfolio design. A popular trend is the perspective view of a product on a device. However, these images often do little to effectively represent the project, and hide details and content. Clients need to see what you have worked on before, with the most logical visualisation possible. Showcasing your products in a frontal view, with an “above the fold” approach often makes more sense to the non-visual user. Usually, the best web pages and other digital content are presented with no scrolling required. Avoid sending your website portfolio as one long strip, as this is only appropriate for communicating with developers.

Ensure that you cover the bases on all portfolio formats. Today it is expected for you to have an online presence, however some clients prefer that you send a classic A4 or US letterhead sized PDF. You need to have the content ready for any type of presentation.

Try to use a consistent presentation style and content throughout the projects in your portfolio. Differentiate each project with simple solutions like different coloured backgrounds, or textures, yet within the same language.

 

Source: Toptal 

 

Getting Started with Elixir Programming Language

If you have been reading blog posts, hacker news threads, your favorite developers tweets or listening to podcasts, at this point you’ve probably heard about the Elixir programming language. The language was created by José Valim, a well known developer in the open-source world. You may know him from the Ruby on Rails MVC framework or from devise and simple_form ruby gems him and his co-workers from the Plataformatec have been working on in the last few years.

According the José Valim, Elixir was born in 2011. He had the idea to build the new language due the lack of good tools to solve the concurrency problems in the ruby world. At that time, after spending time studying concurrency and distributed focused languages, he found two languages that he liked, Erlang and Clojure which run in the JVM. He liked everything he saw in the Erlang language (Erlang VM) and he hated the things he didn’t see, like polymorphism, metaprogramming and language extendability attributes which Clojure was good at. So, Elixir was born with that in mind, to have an alternative for Clojure and a dynamic language which runs in the Erlang Virtual Machine with good extendability support.

Getting Started with Elixir Programming Language

Elixir describes itself as a dynamic, functional language with immutable state and an actor based approach to concurrency designed for building scalable and maintainable applications with a simple, modern and tidy syntax. The language runs in the Erlang Virtual Machine, a battle proof, high-performance and distributed virtual machine known for its low latency and fault tolerance characteristics.

Before we see some code, it’s worth saying that Elixir has been accepted by the community which is growing. If you want to learn Elixir today you will easily find books, libraries, conferences, meetups, podcasts, blog posts, newsletters and all sorts of learning sources out there as well as it was accepted by the Erlang creators.

Let’s see some code!

Install Elixir:

Installing Elixir is super easy in all major platforms and is an one-liner in most of them.

Arch Linux

Elixir is available on Arch Linux through the official repositories:

pacman -S elixir

Ubuntu

Installing Elixir in Ubuntu is a bit tidious. But it is easy enough nonetheless.

wget https://packages.erlang-solutions.com/erlang-solutions_1.0_all.deb && sudo dpkg -i erlang-solutions_1.0_all.deb
apt-get update
apt-get install esl-erlang
apt-get install elixir

OS X

Install Elixir in OS X using Homebrew.

brew install elixir

Meet IEx

After the installation is completed, it’s time to open your shell. You will spend a lot of time in your shell if you want to develop in Elixir.

Elixir’s interactive shell or IEx is a REPL - (Read Evaluate Print Loop) where you can explore Elixir. You can input expressions there and they will be evaluated giving you immediate feedback. Keep in mind that your code is truly evaluated and not compiled, so make sure not to run profiling nor benchmarks in the shell.

The Break Command

There’s an important thing you need to know before you start the IEx RELP - how to exit it.

You’re probably used to hitting CTRL+C to close the programs running in the terminal. If you hit CTRL+C in the IEx RELP, you will open up the Break Menu. Once in the break menu, you can hit CTRL+C again to quit the shell as well as pressing a.

I’m not going to dive into the break menu functions. But, let’s see a few IEx helpers!

Helpers

IEx provides a bunch of helpers, in order to list all of them type: h().

And this is what you should see:

Those are some of my favorites, I think they will be yours as well.

  • h as we just saw, this function will print the helper message.
  • h/1 which is the same function, but now it expects one argument.

For instance, whenever you want to see the documentation of the String strip/2 method you can easily do:

Probably the second most useful IEx helper you’re going to use while programming in Elixir is the c/2, which compiles a given elixir file (or a list) and expects as a second parameter a path to write the compiled files to.

Let’s say you are working in one of the http://exercism.io/ Elixir exersices, the Anagram exercise.

So you have implemented the Anagram module, which has the method match/2 in the anagram.exs file. As the good developer you are, you have written a few specs to make sure everything works as expected as well.

This is how your current directory looks:

Now, in order to run your tests against the Anagram module you need to run/compile the tests.

As you just saw, in order to compile a file, simply invoke the elixir executable passing as argument path to the file you want to compile.

Now let’s say you want to run the IEx REPL with the Anagram module accessible in the session context. There are two commonly used options. The first is you can require the file by using the options -r, something like iex -r anagram.exs. The second one, you can compile right from the IEx session.

Simple, just like that!

Ok, what about if you want to recompile a module? Should you exit the IEx, run it again and compile the file again? Nope! If you have a good memory, you will remember that when we listed all the helpers available in the IEx RELP, we saw something about a recompile command. Let’s see how it works.

Notice that this time, you passed as an argument the module itself and not the file path.

As we saw, IEx has a bunch of other useful helpers that will help you learn and understand better how an Elixir program works.

Basics of Elixir Types

Numbers

There are two types of numbers. Arbitrary sized integers and floating points numbers.

Integers

Integers can be written in the decimal base, hexadecimal, octal and binary.

As in Ruby, you can use underscore to separate groups of three digits when writing large numbers. For instance you could right a hundred million like this:

100_000_000

Octal:

0o444

Hexdecimal:

0xabc

Binary:

0b1011

Floats

Floare are IEEE 754 double precision. They have 16 digits of accuracy and a maximum exponent of around 10308.

Floats are written using a decimal point. There must be at least one digit before and after the point. You can also append a trailing exponent. For instance 1.0, 0.3141589e1, and 314159.0-e.

Atoms

Atoms are constants that represent names. They are immutable values. You write an atom with a leading colon : and a sequence of letters, digits, underscores, and at signs @. You can also write them with a leading colon : and an arbitrary sequence of characters enclosed by quotes.

Atoms are a very powerful tool, they are used to reference erlang functions as well as keys and Elixir methods.

Here are a few valid atoms.

:name, :first_name, :"last name",  :===, :is_it_@_question?

Booleans

Of course, booleans are true and false values. But the nice thing about them is at the end of the day, they’re just atoms.

Strings

By default, strings in Elixir are UTF-8 compliant. To use them you can have an arbitrary number of characters enclosed by " or '. You can also have interpolated expressions inside the string as well as escaped characters.

Be aware that single quoted strings are actually a list of binaries.

Anonymous Functions

As a functional language, Elixir has anonymous functions as a basic type. A simple way to write a function is fn (argument_list) -> body end. But a function can have multiple bodies with multiple argument lists, guard clauses, and so on.

Dave Thomas, in the Programming Elixir book, suggests we think of fn…end as being the quotes that surround a string literal, where instead of returning a string value we are returning a function.

Tuples

Tuple is an immutable indexed array. They are fast to return its size and slow to append new values due its immutable nature. When updating a tuple, you are actually creating a whole new copy of the tuple self.

Tuples are very often used as the return value of an array. While coding in Elixir you will very often see this, {:ok, something_else_here}.

Here’s how we write a tuple: {?a,?b,?c}.

Pattern Matching

I won’t be able to explain everything you need to know about Pattern Matching, however what you are about to read covers a lot of what you need to know to get started.

Elixir uses = as a match operator. To understand this, we kind of need to unlearn what we know about = in other traditional languages. In traditional languages the equals operator is for assignment. In Elixir, the equals operators is for pattern matching.

So, that’s the way it works values in the left hand side. If they are variables they are bound to the right hand side, if they are not variables elixir tries to match them with the right hand side.

Pin Operator

Elixir provides a way to always force pattern matching against the variable in the left hand side, the pin operator.

Lists

In Elixir, Lists look like arrays as we know it from other languages but they are not. Lists are linked structures which consist of a head and a tail.

Keyword Lists

Keyword Lists are a list of Tuple pairs.

You simply write them as lists. For instance: [{:one, 1}, 2, {:three, 3}]. There’s a shortcut for defining lists, here’s how it looks: [one: 1, three: 3].

In order to retrieve an item from a keyword list you can either use:

Keyword.get([{:one, 1}, 2, {:three, 3}], :one)

Or use the shortcut:

[{:one, 1}, 2, {:three, 3}][:one]

Because keyword lists are slow when retrieving a value, in it is an expensive operation, so if you are storing data that needs fast access you should use a Map.

Maps

Maps are an efficient collection of key/value pairs. The key can have any value you want as a key, but usually should be the same type. Different from keyword lists, Maps allow only one entry for a given key. They are efficient as they grow and they can be used in the Elixir pattern matching in general use maps when you need an associative array.

Here’s how you can write a Map:

%{ :one => 1, :two => 2, 3 => 3, "four" => 4, [] => %{}, {} => [k: :v]}

Conclusion

Elixir is awesome, easy to understand, has simple but powerful types and very useful tooling around it which will help you when beginning to learn. In this first part, we have covered the various data types Elixir programs are built on and the operators that power them. In later parts we will dive deeper into the world of Elixir - functional and concurrent programming.

Source: Toptal