The new HTML sanitizer in Rails 4.2

The article below was originally written by Kasper Timm Hansen (@kaspth on github & twitter) about his work during the Google Summer of Code 2013.

Kasper and I worked a lot on changing the underlying implementation of the sanitize helper to give Rails developers a more robust, faster and more secure solution for sanitizing user input.

This new implementation should be fully backward compatible, with no changes to the API, which should make the update easier.

You can find more information about the previous and the new implementations in this talk I presented at a Brazilian conference this year (the slides are in English).

Now, I’ll let Kasper share his words with you.

Scrubbing Rails Free of HTML-scanner

Everyone has, at some point, needed to use the sanitize method to scrub some pesky HTML away.

<%= sanitize @article.body %>

If you were to run this on Rails 4.1 (or before), it would take advantage of html-scanner, a vendored library inside Rails, for the sanitization. Since the summer of 2013 I have been working to destroy that notion by wiping the traces of html-scanner throughout Rails. Before you become concerned about my mental health, I didn’t do this unwarranted. I’m one of the Google Summer of Code students working on Ruby on Rails. My project proposal was to kick html-scanner to the curb (technical term) and grab a hold of Loofah instead. Why did the old library need replacing, though?

The washed-out html-scanner

html-scanner has been with us for a long time now. The copyright notice in the library clocks it in at 2006, when Assaf Arkin created it. The library relies on regular expressions to recognize HTML (and XML) elements, which makes the code brittle: complex regular expressions make it easy to introduce errors, and they also give the library a higher potential for security issues.

The Rails team wanted something more robust and faster, so we picked Loofah. Loofah uses Nokogiri for parsing, which provides a Ruby interface to either a C or a Java parser depending on the Ruby implementation you use. This means Loofah is fast: 60 to 100% faster than html-scanner on larger documents and fragments.

I started by taking a look at the SanitizeHelper in Action View, which consists of four methods and some settings. The four methods are sanitize, sanitize_css, strip_tags and strip_links.

Let’s take a look at the sanitize method.

Compared with the old implementation, sanitize still uses the WhiteListSanitizer class to do its HTML stripping. However, since Action View was pulled out of Action Pack and both needed this functionality, we’ve extracted it into its own gem.

Developers, meet Rails::Html::WhiteListSanitizer

When you use sanitize, you’re really using WhiteListSanitizer‘s sanitize method. Let me show you the new version.

def sanitize(html, options = {})
  return nil unless html
  return html if html.empty?

No surprises here.

  loofah_fragment = Loofah.fragment(html)

The first trace of Loofah. A fragment is a part of a document, without a DOCTYPE declaration or html and body tags; a piece of a document, essentially. Internally, Nokogiri creates a document and pulls the parsed HTML out of the body tag, leaving us with a fragment.

  if scrubber = options[:scrubber]
    # No duck typing, Loofah ensures subclass of Loofah::Scrubber
    loofah_fragment.scrub!(scrubber)

You can pass your own Scrubber to sanitize, giving you the power to choose if and how elements are sanitized. As the comment alludes, any scrubber has to be either a subclass of Loofah::Scrubber or it can wrap a block. I’ll show an example later.

  elsif allowed_tags(options) || allowed_attributes(options)
    @permit_scrubber.tags = allowed_tags(options)
    @permit_scrubber.attributes = allowed_attributes(options)
    loofah_fragment.scrub!(@permit_scrubber)

We have been very keen on maintaining backwards compatibility throughout this project, so you can still supply Enumerables of tags and attributes to sanitize. That’s what the PermitScrubber used here handles. It manages these options and makes them work independently: if you pass only one of them, it’ll use the standard behavior for the other (see the documentation for what the standard behavior is).
You can also set the allowed tags and attributes at the class level, like this:

Rails::Html::Sanitizer.allowed_tags = Set.new %w(for your health)

That’s what the allowed_tags and allowed_attributes methods are there for: they return the tags or attributes from the options and fall back to the class-level setting, if any.

  else
    remove_xpaths(loofah_fragment, XPATHS_TO_REMOVE)
    loofah_fragment.scrub!(:strip)
  end

The StripScrubber built into Loofah will strip the tags but leave the contents of elements, which is usually what we want. In the few instances where we don’t, we use remove_xpaths to remove elements along with their subtrees. If the syntax above looks unfamiliar, those are XPath selectors.

  loofah_fragment.to_s
end

Lastly we’ll take the elements and extract the remaining markup with to_s. Internally Nokogiri will call either to_xml or to_html depending on the kind of document or fragment you have.
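As an aside, the backwards-compatible options mentioned earlier look like this at the call site (a sketch; the tag and attribute lists are arbitrary choices of ours):

<%= sanitize @article.body, tags: %w(strong em a), attributes: %w(href) %>

Tags outside the list are stripped while their contents are kept, following the standard behavior described above.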

Rub, buff or clean it off, however you like

So there you have it. I could go through how the other sanitizers work, but they’re not that complex. So go code spelunking in the source.

If this was the first time you’ve seen a Loofah::Scrubber, be sure to check out the source for PermitScrubber and see an example of how to implement one. You can also subclass PermitScrubber and get the sanitization you need without worrying about the implementation details of stripping elements and scrubbing attributes. Take a look at TargetScrubber – the weird PermitScrubber – and how it builds on PermitScrubber to get scrubbing fast.
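To give a taste, a subclass can be as small as this (a hypothetical scrubber; the class name and the allowed lists are ours, assuming the Rails::Html namespace used throughout this post):

class CommentScrubber < Rails::Html::PermitScrubber
  def initialize
    super
    # Only these tags and attributes survive; everything else is stripped
    self.tags = %w(p em strong)
    self.attributes = %w(class)
  end
end

You’d then pass it along, the same way as the block-based example below:

<%= sanitize @comment.body, scrubber: CommentScrubber.new %>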

Before I scrub off though, I promised you an example of a custom scrubber. I’ll use the option that wraps a block here, but you could easily create a subclass of Loofah::Scrubber (in a helper maybe?) and override scrub(node). So here goes:

<%= sanitize @article.body,
  scrubber: Loofah::Scrubber.new { |node| node.name = "script" } %>

The code above renames every HTML tag in the article body to <script>.

<sarcasm>
If you’re going to introduce bugs, why not make everything a potential risk of running arbitrary code?
</sarcasm>


Rails 4 and PostgreSQL Arrays

Rails 4 supports array fields for PostgreSQL in a nice way, although it is not a very well-known feature. In order to demonstrate its usage, it’s useful to explain the context where we used it.

PostgreSQL Arrays and Rails Migrations

Suppose we have a Product model with the following fields: name, category_id and tags. The name field will be a simple string, category_id will be the foreign key of a record in the Category model and tags will be created by inputting a string of comma-separated words, so: “one, two, forty two” will become the tags: “one”, “two” and “forty two” respectively.
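Rails won’t split that comma-separated input for us; a small writer on the model can do the conversion (a hypothetical sketch, not something the rest of the example depends on):

class Product < ActiveRecord::Base
  # Accept "one, two, forty two" and store ["one", "two", "forty two"]
  def tags=(value)
    value = value.split(',').map(&:strip) if value.is_a?(String)
    super(value)
  end
end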

Creating these tables via migrations is nothing new, except for the column tags which will have the Array type in this case. To create this kind of column we use the following syntax in our migration:

create_table :categories do |t|
  t.string :name, null: false
end
 
create_table :products do |t|
  t.string :name, null: false
  t.references :category, null: false
  t.text :tags, array: true, default: []
end

Let’s explore what we can do with this kind of field using the postgres console:

$ rails db
> INSERT INTO products(name, category_id, tags) VALUES('T-Shirt', 3, '{clothing, summer}');
> INSERT INTO products(name, category_id, tags) VALUES('Sweater', 3, ARRAY['clothing', 'winter']);
> SELECT * FROM products;
1  |  T-Shirt  |  3  | {clothing, summer}
2  |  Sweater  |  3  | {clothing, winter}

As we can see, we need to specify the tags using one of the following syntaxes:

‘{ val1, val2, … }’ or ARRAY['val1', 'val2', ...]

Let’s play a little more to understand how this column behaves when queried:

> SELECT * FROM products WHERE tags = '{clothing, summer}';
1  |  T-Shirt  |  3  | {clothing, summer}
 
> SELECT * FROM products WHERE tags = '{summer, clothing}';
(0 rows)
 
> SELECT * FROM products WHERE 'winter' = ANY(tags);
2  |  Sweater  |  3  |  {clothing, winter}

As this example demonstrates, searching for records by an array with its values in the order they were inserted works, but with the same values in a different order does not. We were also able to find a record searching for a specific tag using the ANY function.

There’s a lot more to say about arrays in PostgreSQL, but for our example this is enough. You can find more information in the official PostgreSQL documentation about arrays and their functions.

How Rails treats PostgreSQL arrays

It’s also valuable to see how to use the array field within Rails. Let’s try:

$ rails c
 
Product.create(name: 'Shoes', category: Category.first, tags: ['a', 'b', 'c'])
#> 
 
Product.find(26).tags
#> ["a", "b", "c"]

So Rails treats an array column in PostgreSQL as an Array in Ruby, pretty reasonable!
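Those console queries translate to Active Record with bound parameters (a sketch; @> is PostgreSQL’s array containment operator, which matches regardless of order):

# Products tagged "winter" -- same as the ANY query above
Product.where("? = ANY(tags)", "winter")

# Products whose tags include all of the given values, in any order
Product.where("tags @> ARRAY[?]::text[]", %w(clothing winter))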

Validations

We want each product to be unique. Let’s see some examples to clarify this concept.

Given we have the following product:

Product.create(name: 'Shoes', category: Category.first, tags: ['a', 'b', 'c'])

We can easily create another one if we change the name attribute:

Product.create(name: 'Slippers', category: Category.first, tags: ['a', 'b', 'c'])

We can also create another product with different tags:

Product.create(name: 'Shoes', category: Category.first, tags: ['a', 'b'])

But we don’t want to create a product with the same attributes, even if the tags are in a different order:

Product.create(name: 'Shoes', category: Category.first, tags: ['a', 'c', 'b'])
#> false

Since PostgreSQL only finds records by tags given the exact order in which they were inserted, how can we ensure the uniqueness of a product with tags in an order-independent way?

After much thought we decided that a good approach would involve creating a unique index over the relevant columns in the products table, but with the tags sorted when a row is inserted into the database. Something like:

CREATE UNIQUE INDEX index_products_on_category_id_and_name_and_tags
ON products USING btree (category_id, name, sort_array(tags));

And sort_array is our custom function responsible for sorting the array, since PostgreSQL does not have a built-in function for this.

Creating a custom function in PostgreSQL using PL/pgSQL

To create a custom function we used the PL/pgSQL language, and since we are adding database-specific code like this, we can’t use the default schema.rb anymore. Let’s change this in config/application.rb:

# Use SQL instead of AR schema dumper when creating the database
config.active_record.schema_format = :sql

With this configuration set, our schema.rb file will be replaced by a structure.sql file without side effects; our current migrations don’t need to be changed at all. Now we can create a migration with our sort_array code:

def up
  execute <<-SQL
    CREATE FUNCTION sort_array(unsorted_array anyarray) RETURNS anyarray AS $$
      BEGIN
        RETURN (SELECT ARRAY_AGG(val) AS sorted_array
        FROM (SELECT UNNEST(unsorted_array) AS val ORDER BY val) AS sorted_vals);
      END;
    $$ LANGUAGE plpgsql IMMUTABLE STRICT;
 
    CREATE UNIQUE INDEX index_products_on_category_id_and_name_and_tags ON products USING btree (category_id, name, sort_array(tags));
  SQL
end
 
def down
  execute <<-SQL
    DROP INDEX IF EXISTS index_products_on_category_id_and_name_and_tags;
    DROP FUNCTION IF EXISTS sort_array(unsorted_array anyarray);
  SQL
end

Now, let’s take it slow and understand it step by step.

CREATE FUNCTION sort_array(unsorted_array anyarray) RETURNS anyarray

The line above says that we are creating a function named sort_array, which receives a parameter named unsorted_array of type anyarray and returns something of that same type. This anyarray is, in fact, a pseudo-type indicating that the function accepts any array data type.

RETURN (SELECT ARRAY_AGG(val) AS sorted_array
FROM (SELECT UNNEST(unsorted_array) AS val ORDER BY val) AS sorted_vals);

The trick here is the use of the unnest function, which expands an array into a set of rows. We can then order these rows, and after that we use another function, array_agg, which aggregates the rows back into a new array.

$$ LANGUAGE plpgsql IMMUTABLE STRICT;

The last trick is the use of the keywords IMMUTABLE and STRICT. The first guarantees that our function will always return the same output given the same input; we can’t use the function in our index unless we declare this. The second tells PostgreSQL that our function returns null whenever any of its arguments is null.
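As a quick sanity check, the function can be exercised from the Rails console (a sketch; the ::text[] cast just pins down the polymorphic argument type):

ActiveRecord::Base.connection.select_value("SELECT sort_array(ARRAY['b','c','a']::text[])")
#=> "{a,b,c}"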

And that’s it! With this we can check for uniqueness in a performant way with some method like:

def duplicate_product_exists?
  relation = self.class.
    where(category_id: category_id).
    where('lower(name) = lower(?)', name).
    where('sort_array(tags) = sort_array(ARRAY[?])', tags)
 
  relation = relation.where.not(id: id) if persisted?
 
  relation.exists?
end
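To surface that check as a validation error rather than a database exception, it could be wired into the model like this (a hypothetical sketch; the validation and message names are ours):

class Product < ActiveRecord::Base
  validate :product_uniqueness

  private

  # Reject the record when an order-independent duplicate already exists
  def product_uniqueness
    errors.add(:base, 'an equivalent product already exists') if duplicate_product_exists?
  end
end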

Case insensitive arrays

There is still a problem with our code, though: the index is not case-insensitive! What if a user inserts a product with tags ['a', 'b'] and another one inserts the same product but with tags ['A', 'b']? Now we have duplication in our database! We have to deal with this, but unfortunately it will increase the complexity of our sort_array function a little bit. To fix this problem we only need to change a single line.

From this:

FROM (SELECT UNNEST(unsorted_array) AS val ORDER BY val) AS sorted_vals);

To:

FROM
(SELECT
  UNNEST(string_to_array(LOWER(array_to_string(unsorted_array, ',')), ','))
  AS val ORDER BY val)
AS sorted_vals);

The difference is that instead of passing unsorted_array directly to unnest, we are transforming it into a String, calling lower on it, and transforming it back into an Array before passing it on. With this change it doesn’t matter whether the user inserts ['a'] or ['A']: every tag will be stored in lowercase in the index. Problem solved!

As we can see, it’s not an easy task to deal with uniqueness and arrays in the database, but the overall result was great.

Would you solve this problem in a different way? Share with us!

Ruby blocks precedence

When we start programming with Ruby, one of the first niceties we learn about are Ruby blocks. In the beginning it’s easy to get tricked by the two existing forms of blocks and when to use each:

%w(a b c).each { |char| puts char }
%w(a b c).each do |char| puts char end

The Ruby community has sort of created a “guideline” for when to use one versus the other: for short or inline blocks, use curly brackets {..}; for longer or multiline blocks, use the do..end format. But did you know there is actually a slight difference between them? So sit tight, we’ll cover it now.

Operators Precedence

Languages contain operators, and these operators must obey precedence rules so that the interpreter knows the order of execution: an operator is evaluated first when it has higher precedence than the others in a piece of code. Consider the following example:

a || b && c

What operation gets executed first, a || b, or b && c? This is where operator precedence takes action. In this case, the code is the same as this:

a || (b && c)

Which means && has higher precedence than || in Ruby. However, if you want the condition a || b to be evaluated first, you can enforce it with the use of ():

(a || b) && c

This way you are explicitly telling the interpreter that the condition inside the () should be executed first.

What about blocks?

It turns out blocks have precedence too! Let’s see an example that mimics the Rails router with the redirect method:

def get(path, options = {}, &block)
  puts "get received block? #{block_given?}"
end
 
def redirect(&block)
  puts "redirect received block? #{block_given?}"
end
 
puts '=> brackets { }'
get 'eggs', to: redirect { 'eggs and bacon' }
 
puts
 
puts '=> do..end'
get 'eggs', to: redirect do 'eggs and bacon' end

This example shows rather common code in Rails apps: a get route that redirects to some other route in the app (some arguments from the real redirect block were omitted for clarity). All these methods do is output whether they received a block or not.

At a glance these two calls to get + redirect could be considered exactly the same; however, they behave differently because of block precedence. Can you guess the output? Take a look:

=> brackets { }
redirect received block? true
get received block? false
 
=> do..end
redirect received block? false
get received block? true

The curly brackets have higher precedence than the do..end, which means the block with {..} will attach to the inner method, in this example redirect, whereas the do..end will attach to the outer method, get.
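If you do want a do..end block to reach the inner call, parentheses make the binding explicit, since they close the outer method’s argument list. A quick sketch reusing the get and redirect methods defined above:

# Parentheses close get's argument list, so do..end can only bind to redirect
get('eggs', to: redirect do 'eggs and bacon' end)
# redirect received block? true
# get received block? false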

Wrapping up

This blog post originated from a real Rails issue, where you can read a little bit more about the subject and see that even Rails got it wrong in its documentation (which is now fixed). The precedence is a subtle but important difference between {..} and do..end blocks, so be careful not to be caught off guard by it.

Do you know any other interesting fact about Ruby blocks that people may not be aware of? Or maybe you learned something tricky about Ruby recently? Please share it on the comments section, we would love to hear.


The Symptoms of Low Internal Software Quality

This post is part of a collection of posts we’re publishing on the subjects of low internal software quality, refactoring and rewrite.

Not only physical matter deteriorates, software does too

It’s known that physical matter deteriorates. People accept that and have always dealt with it. What people don’t accept so easily is that software “deteriorates” too. Unlike physical matter, it doesn’t happen due to some physical or chemical phenomenon. It usually happens because of some business change or people change. Let me give you an example.

Imagine you’re leading the tech or product team of a startup; you’re the CTO. You already launched your product’s first version, and it was a success. Your business model was validated, and now you’re in a growth stage. That’s awesome! But it has its costs, and it brings a new set of challenges.

The first version of your product is working, but the codebase is not in the shape you’ll need from now on. Maybe your team’s velocity is not as good as it used to be. Your team keeps complaining about the code quality. The CEO and the product director want new features, and your current projections will not meet the business needs.

It’s not uncommon that one of the main sources of all these problems is the poor quality of your product’s codebase. You may need a refactor[1] or a rewrite.

When the codebase is not in good shape, everyone can get frustrated

If the internal quality of your product is not good, everyone becomes frustrated.

Your whole team, including developers, will get frustrated because they would like to ship features faster, but the current code quality and architecture are not helping.

The IT, product, and software departments suffer because they’re not able to meet the expectations of the other departments.

The customer also suffers because of frequent bugs, how long they take to be resolved, and how long new features take to be launched.

You get the picture.

Identifying the symptoms

It’s the leader’s job (let’s say the CTO’s) to identify when a refactor or a rewrite is needed. In order to do that, he or she can look for some symptoms, like the ones below:

  • Everything is hard: Almost every feature or bug fix your team needs to do is hard. It was not always like that. You remember the good old days when your team was fast and everything ran smoothly.
  • Slow velocity: Your team’s velocity decreased or is decreasing. When you were building the first version of your product, it was fast to develop a new feature, and your team used to build lots of them every iteration. Now it’s different.
  • Slow test suite: Your test suite takes 10x, 20x, 30x more time to run than before.
  • Bugs that don’t go away: Your team fixes a bug, then in a week or so it appears again. Every now and then your team is fixing a regression bug.
  • Your team is demotivated: Your team keeps complaining that working on the project is not as productive as it was in the past. A single person can’t build a feature alone; there are too many moving parts.
  • Knowledge silos: There are some parts of the software that only a single developer knows well enough to maintain. It’s difficult for the rest of the team to work with that specific code.
  • New developer ramp-up time is taking too long: When new developers join the team, it takes too much time for them to be fully productive.

The reason you got into one of these situations is probably not a technical one. Maybe you needed to deliver too much, too fast while you were building the first version of your product. Maybe your team didn’t have the maturity and experience then that it has now. Analyzing the root cause is important too, but you need to do something else: you need to solve your problem.

If you’re experiencing the symptoms above, you probably have a low internal software quality problem. Recognizing the symptoms is already a big step. The next step is to think of solutions. Some solutions you may take include refactoring or a rewrite process.

Refactor or rewrite?

There’s no definitive guide about when you should do a big refactor or a rewrite, because it depends a lot on your context. That said, there are some rules of thumb that you should consider when evaluating which solution to go with:

When to rewrite

  • The technology you use is outdated, and it’s not maintained anymore.
  • Your software is really slow, and changing the architecture is not enough or is not viable.
  • The supply of software developers that know the technology you use is low and decreasing.
  • There are new technologies that offer a significant advantage compared to what you’re using.

When to refactor

  • The technology you use is still maintained and relevant.
  • It’s viable to improve your application in an incremental fashion.
  • The problem you’re solving is just technical and not a business one.

Choosing one of these options is not an easy decision, and once you go with one of them, there will be an entirely new set of concerns to deal with. Stay tuned; in our next blog posts we’ll talk about what to consider when doing a big refactor or a rewrite.

Now I would like to know about your experiences. Have you ever been in a similar situation? How did you identify that your problem was low internal software quality? Please share with us!


  [1] I prefer the term “code refurbishment”, but people aren’t generally used to it. So I’ll use refactoring in this blog post for the sake of clarity. 

Comparing protocols and extensions in Swift and Elixir

Swift has recently been announced by Apple, and I have been reading the docs and playing with the language out of curiosity. I was pleasantly surprised by many features in the language, like the handling of optional values (and types) and immutability being promoted throughout the language.

The language also feels extensible. For extensibility, I am using the same criteria we use for Elixir, which is the ability to implement language constructs using the language itself.

For example, in many languages the short-circuiting && operator is defined as a special part of the language. In those languages, you can’t reimplement the operator using the constructs provided by the language.

In Elixir, however, you can implement the && operator as a macro:

defmacro left && right do
  quote do
    case unquote(left) do
      false -> false
      _ -> unquote(right)
    end
  end
end

In Swift, you can also implement operators and easily define the && operator with the help of the @auto_closure attribute:

func &&(lhs: LogicValue, rhs: @auto_closure () -> LogicValue) -> Bool {
    if lhs {
        if rhs() == true {
            return true
        }
    }
    return false
}

The @auto_closure attribute automatically wraps the tagged argument in a closure, allowing you to control when it is executed and therefore implement the short-circuiting property of the && operator.

However, one of the features I suspect will actually hurt extensibility in Swift is the Extensions feature. I have compared the protocols implementation in Swift with the ones found in Elixir and Clojure on Twitter and, as developers have asked for a more detailed explanation, I am writing this blog post as a result!

Extensions

The extension feature in Swift has many use cases. You can read them all in more detail in their documentation. For now, we will cover the general case and discuss the protocol case, which is the bulk of this blog post.

Following the example in Apple’s documentation itself:

extension Double {
    var km: Double { return self * 1_000.0 }
    var m: Double { return self }
    var cm: Double { return self / 100.0 }
    var mm: Double { return self / 1_000.0 }
    var ft: Double { return self / 3.28084 }
}

let oneInch = 25.4.mm
println("One inch is \(oneInch) meters")
// prints "One inch is 0.0254 meters"

let threeFeet = 3.ft
println("Three feet is \(threeFeet) meters")
// prints "Three feet is 0.914399970739201 meters"

In the example above, we are extending the Double type, adding our own computed properties. Those extensions are global and, if you are a Ruby developer, they will remind you of monkey patching in Ruby. However, in Ruby classes are always open, while here the extension is always explicit (which I personally consider to be a benefit).

What troubles extensions is exactly the fact that they are global. While I understand some extensions would be useful to define globally, they always come with the possibility of namespace pollution and name conflicts. Two libraries can define the same extensions to the Double type that behave slightly differently, leading to bugs.

This has always been a hot topic in the Ruby community with Refinements being proposed in late 2010 as a solution to the problem. At this moment, it is unclear if extensions can be scoped in any way in Swift.

The case for protocols

Protocols are a fantastic feature in Swift. Per the documentation: “a protocol defines a blueprint of methods, properties, and other requirements that suit a particular task or piece of functionality”.

Let’s see their example:

protocol FullyNamed {
    var fullName: String { get }
}

struct Person: FullyNamed {
    var fullName: String
}

let john = Person(fullName: "John Appleseed")
// john.fullName is "John Appleseed"

In the example above we defined a FullyNamed protocol and implemented it while defining the Person struct. The benefit of protocols is that the compiler can now guarantee the struct complies with the definitions specified in the protocol. In case the protocol changes in the future, you will know immediately by recompiling your project.

I have long been advocating this feature for Ruby. For example, imagine you have the following Ruby code:

class Person
  attr_accessor :first, :last

  def full_name
    first + " " + last
  end
end

And you have a method somewhere that expects an object that implements full_name:

def print_full_name(obj)
  puts obj.full_name
end

At some point, you may want to print the title too:

def print_full_name(obj)
  if title = obj.title
    puts title + " " + obj.full_name
  else
    puts obj.full_name
  end
end

Your contract has now changed but there is no mechanism to notify implementations of such change. This is particularly cumbersome because sometimes such changes are done by accident, when you don’t want to actually modify the contract.

This issue has happened multiple times in Rails. Before Rails 3, there was no official contract between the controller and the model and between the view and the model. This meant that, while Rails worked fine with Active Record (Rails’ built-in model layer), every Rails release could possibly break integration with other models because the contract suddenly became larger due to changes in the implementation.

Since Rails 3, we actually define a contract for those interactions, but there is still no way to:

  • guarantee an object complies with the contract (besides extensive use of tests)
  • guarantee controllers and views obey the contract (besides extensive use of tests)

Similar to real-life contracts, unless you write it down and sign it, there is no guarantee both parts will actually maintain it.

The ideal solution is to be able to define multiple, tiny protocols. Someone using Swift would rather define multiple protocols for the controller and view layers:

protocol URL {
    func toParam() -> String
}

protocol FormErrors {
    var errors: Dictionary<String, Array<String>> { get }
}

The interesting aspect of Swift protocols is that you can define and implement protocols for any given type, at any time. The trouble, though, is that the implementation of a protocol is defined in the class/struct itself and, as such, changes the class/struct globally.

Protocols and Extensions

Since protocols in Swift are implemented directly in the class/struct, be it during definition or via extension, the protocol implementation ends up changing the class/struct globally. To see the issue with this, imagine that you have two different libraries relying on different JSON protocols:

protocol JSONA {
    func toJSON(precision: Integer) -> String
}

protocol JSONB {
    func toJSON(scale: Integer) -> String
}

If the protocols above have different specifications on how the precision argument must be handled, we will be able to implement only one of the two protocols above. That’s because implementing any of the protocols above means adding a toJSON(Integer) method to the class/struct and there can be only one of them per class/struct.

Furthermore, if implementing protocols means globally adding methods to classes and structs, it can actually hinder the use of protocols as a whole, as the concerns about avoiding name clashes and namespace pollution will speak louder than the protocol benefits.

Let’s contrast this with protocols in Elixir:

defprotocol JSONA do
  def to_json(data, precision)
end

defprotocol JSONB do
  def to_json(data, scale)
end

defimpl JSONA, for: Integer do
  def to_json(data, _precision) do
    Integer.to_string(data)
  end
end

JSONA.to_json(1, 10)
#=> "1"

Elixir protocols are heavily influenced by Clojure protocols where the implementation of a protocol is tied to the protocol itself and not to the data type implementing the protocol. This means you can implement both JSONA and JSONB protocols for the same data types and they won’t clash!

Protocols in Elixir work by dispatching on the first argument of the protocol function. So when you invoke JSONA.to_json(1, 10), Elixir checks the first argument, sees it is an integer and dispatches to the appropriate implementation.

What is interesting is that we can actually emulate this functionality in Swift! In Swift we can define the same method multiple times, as long as the type signatures do not clash. So if we use static methods and extensions, we can emulate the behaviour above:

// Define a class to act as protocol dispatch
class JSON {
}

// Implement it for Double
extension JSON {
    class func toJSON(double: Double) -> String {
        return String(double)
    }
}

// Someone may implement it later for Float too
extension JSON {
    class func toJSON(float: Float) -> String {
        return String(float)
    }
}

JSON.toJSON(2.3)

The example above emulates the dynamic dispatch ability found in Elixir and Clojure, which guarantees no clashes between multiple implementations. After all, if someone defines a JSONB class, all the implementations would live in the JSONB class.

Since dynamic dispatch is already available, we hope protocols in Swift are improved to support local implementations instead of changing classes/structs globally.

Summing up

Swift is a very new language and in active development. The documentation so far doesn’t cover topics like exceptions, the module system and concurrency, which indicates there are many more exciting aspects to build, discuss and develop.

It is the first time I am excited to do some mobile development. Plus the Swift playground may become a fantastic way to introduce programming.

Finally, I would personally love it if Swift protocols evolved to support non-global implementations. Protocols are a very extensible mechanism for defining and implementing contracts, and it would be a pity to see their potential hindered due to the global side-effects they may cause to the codebase.

How to deal with user stories that are not finished in one sprint

One of the most common questions discussed in the Agile community is: what should be done when a team doesn’t finish a user story (US) in a sprint? How can people track the progress made on an incomplete user story? In this blog post, I’ll share our approach to this question.

According to the community, when a developer finishes their work in the last few hours of an iteration, they must first try to help their teammates finish their work. Otherwise, it’s recommended they help prepare the next cycle of work: analyzing the next user stories, refactoring a piece of code that could be better implemented, or writing tests. It is not advisable for a developer to start a new user story if they won’t be able to finish it in the same cycle. However, this first approach is not always possible, because user stories can be underestimated or something can happen that delays the delivery of the user story.

A second alternative is to split the user story into two smaller ones and develop the one that can be finished on time. The first user story’s points are credited in the current cycle; the second one’s are credited in the next cycle. This approach improves the visibility of what was done in the current cycle. However, it hurts the agile philosophy: in some sense it would be a delivery without business value for the customer.

The third way is for the unfinished user story to go to the next cycle with the original estimate. When it gets completed, the user story’s full effort estimate gets credited to the velocity of the new iteration. This could skew the average velocity metric, so be careful, because this metric is important to the Product Owner (PO) for forecasting and planning. Also beware of having a bunch of backlog items that are almost done: one user story delivered has more value than a lot of user stories that are 90% complete.

How do we do it?

Usually, we use the first and third approaches in the following way:

  • We try to concentrate efforts on work that is closest to delivery. As soon as a developer finishes the first US, they will verify if someone needs help with finishing a task or if some user story in the current cycle has defects that need to be fixed. Keeping the work-in-progress as low as possible helps to focus on what matters most. This process is repeated until the end.
  • If this list is empty and the cycle is almost over, the developer looks for the smallest or most valuable user story (depending on the project’s context) to work on.
  • If the user story has not been finished by the end of the cycle, the US shifts to the next cycle with the original estimate. However, when we plan the next cycle, we consider just the points missing to finish the US.
  • When this US gets finished, we credit the whole user story’s estimate in our velocity.
  • If, after resuming work on the US in the next iteration, the developer realizes that the user story was overestimated or underestimated, we usually don’t change the estimate on the story itself, but we update our estimation ruler with the real estimate, as a lesson learned.

Note that we do not follow these exact steps every single time; everything depends on and adapts to the context of the project and the moment. The most important thing is to prioritize delivering maximum business value to the customer.

And you? What do you do when you have an unfinished user story in your cycle? Share your experiences with us!
