Posts by Rafael França

The article below was originally written by Kasper Timm Hansen (@kaspth on github & twitter) about his work during the Google Summer of Code 2013.

Kasper and I worked a lot changing the underlying implementation of the sanitize helper to give Rails developers a more robust, faster and secure solution to sanitize user input.

This new implementation should be fully backward compatible, with no changes to the API, which should make the update easier.

You can see more information about the previous and the new implementation on this talk I presented in a Brazillian conference this year (the slides are in English).

Now, I’ll let Kasper share his words with you.

Scrubbing Rails Free of HTML-scanner

Everyone, at least one time, has already needed to use the sanitize method to scrub some pesky HTML away.

<%= sanitize @article.body %>

If you were to run this on Rails 4.1 (and before) this would take advantage of the html-scanner, a vendored library inside Rails, for the sanitization. Since the summer of 2013 I have been working to destroy that notion by wiping the traces of html-scanner throughout Rails. Before you become concerned of my mental health, I didn’t do this unwarranted. I’m one of the Google Summer of Code students working on Ruby on Rails. My project proposal was to kick html-scanner to the curb (technical term) and grab a hold of Loofah instead. Why did the old library need replacing, though?

The out washed HTML-scanner

html-scanner has been with us for a long time now. The copyright notice in the library clocks it in at 2006, when Assaf Arkin created it. This library relies on Regular Expressions to recognize HTML (and XML) elements. This made the code more brittle. It was easier to introduce errors via complex Regular Expressions, which also gave it a higher potential for security issues.

The Rails Team wanted something more robust and faster, so we picked Loofah. Loofah uses Nokogiri for parsing, which provides a Ruby interface to either a C or Java parser depending on the Ruby implementation you use. This means Loofah is fast. It’s up to 60 to 100% faster than html-scanner on larger documents and fragments.

I started by taking a look at the SanitizeHelper in Action View, which consists of four methods and some settings. The four methods of the are sanitize, sanitize_css, strip_tags and strip_links.

Let’s take a look at the sanitize method.

Comparing with the old implementation, sanitize still uses the WhiteListSanitizer class to do it’s HTML stripping. However, since Action View was pulled out of Action Pack and both needed to use this functionality, we’ve extracted this to it’s own gem.

Developers meet Rails::Html::WhiteListSanitizer

When you use sanitize, you’re really using WhiteListSanitizer‘s sanitize method. Let me show you the new version.

def sanitize(html, options = {})
  return nil unless html
  return html if html.empty?

No surprises here.

  loofah_fragment = Loofah.fragment(html)

The first trace of Loofah. A fragment is a part of a document, but without a DOCTYPE declaration and html and body tags. A piece of a document essentially. Internally Nokogiri creates a document and pulls the parsed html out of the body tag, leaving us with a fragment.

  if scrubber = options[:scrubber]
    # No duck typing, Loofah ensures subclass of Loofah::Scrubber
    loofah_fragment.scrub!(scrubber)

You can pass your own Scrubber to sanitize! Giving you the power to choose if and how elements are sanitized. As the comment alludes, any scrubber has to be either a subclass of Loofah::Scrubber or it can wrap a block. I’ll show an example later.

  elsif allowed_tags(options) || allowed_attributes(options)
    @permit_scrubber.tags = allowed_tags(options)
    @permit_scrubber.attributes = allowed_attributes(options)
    loofah_fragment.scrub!(@permit_scrubber)

We have been very keen on maintaining backwards compatibility throughout this project, so you can still supply Enumerables of tags and attributes to sanitize. That’s what the PermitScrubber used here handles. It manages these options and makes them work independently. If you pass one it’ll use the standard behavior for the other. See the documentation on what the standard behavior is.
You can also set the allowed tags and attributes on the class level. Like this:

Rails::Html::Sanitizer.allowed_tags = Set.new %w(for your health)

That’s simply what allowed_tags and allowed_attributes methods are there for. They’ll return the tags or attributes from the options and fallback to the class level setting if any.

  else
    remove_xpaths(loofah_fragment, XPATHS_TO_REMOVE)
    loofah_fragment.scrub!(:strip)
  end

The StripScrubber built in to Loofah will strip the tags but leave the contents of elements. Which is usually what we want. We use remove_xpaths to remove elements along with their subtrees in the few instances where we don’t. If you have trouble with the syntax above, they’re XPath Selectors.

  loofah_fragment.to_s
end

Lastly we’ll take the elements and extract the remaining markup with to_s. Internally Nokogiri will call either to_xml or to_html depending on the kind of document or fragment you have.

Rub, buff or clean it off, however you like

So there you have it. I could go through how the other sanitizers work, but they’re not that complex. So go code spelunking in the source.

If this was the first time you’ve seen a Loofah::Scrubber, be sure to check out the source for PermitScrubber and see an example of how to implement one. You can also subclass PermitScrubber and get the sanitization you need without worrying about the implementation details of stripping elements and scrubbing attributes. Take a look at TargetScrubber – the weird PermitScrubber – and how it uses that to get scrubbing fast.

Before I scrub off though, I promised you an example of a custom scrubber. I’ll use the option that wraps a block here, but you could easily create a subclass of Loofah::Scrubber (in a helper maybe?) and override scrub(node). So here goes:

<%= sanitize @article.body,
  scrubber: Loofah::Scrubber.new { |node| node.name = "script" } %>

The code above changes all the HTML tags included in the article body to be a tag <script>.

<sarcasm>
If you’re going to introduce bugs, why not make everything a potential risk of running arbitrary code?
</sarcasm>


We just released Simple Form 3.1.0.rc1 with support to Bootstrap 3.

To make it possible, we leveled up the Wrapper API to make it more extensible and to allow developers to directly configure it instead of relying on global state. After such improvements, it was very easy to change the Simple Form configuration to work with Bootstrap 3. I’ll talk more about this in a future post.

Integrating Bootstrap with Simple Form should be as easy as it was before. We are working on the documentation before the final release but you can find examples about how to integrate Simple Form with Bootstrap in our sample application. The app source code is available too.

Besides this main feature, Simple Form’s new 3.1.0 comes with a lot of enhancements. You can find the whole list of changes in the CHANGELOG file.

I hope you are excited about this release as much as we are. Please try out the release candidate and if you find an issue, report at the Simple Form issues tracker.

We plan to release the final version in a month or so, and we’ll write a new blog post with more details about Simple Form 3.1.

Acknowledgments

We, at Plataformatec, have worked hard to get this release out, with Lauro Caetano and Rafael França (it’s me!) working together on this final sprint. Also kudos to Gonzalo Rodríguez-Baltanás Díaz and the whole community for helping us to improve the sample application.


There is a XSS vulnerability on Simple Form’s label, hint and error options.

Versions affected: >= 1.1.1
Not affected: < 1.1.1
Fixed versions: 3.0.1, 2.1.1

Impact

When Simple Form creates a label, hint or error message it marks the text as being HTML safe, even though it may contain HTML tags. In applications where the text of these helpers can be provided by the users, malicious values can be provided and Simple Form will mark it as safe.

Releases

The 3.0.1 and 2.1.1 releases are available at the normal locations.

Workarounds

If you are unable to upgrade, you can change your code to escape the input before sending to Simple Form

f.input :name, label: html_escape(params[:label])

Patches

To aid users who aren’t able to upgrade immediately we have provided patches. They are in git-am format and consist of a single changeset.

Credits

Thank you to Paul McMahon from Doorkeeper for reporting the issue and working with us in a fix.

Setting up a machine for development is always a manual, annoying task. Here at Plataformatec we have a lot of projects that have different software dependencies, so every time one of our members has to start working on a new project, he needs to setup his machine to meet the project’s requirements, just like every team member of that project already did.

As such, we always end up with some questions: Is this project using MySQL, PostgreSQL or MongoDB? Do I need to install a JavaScript runtime? What about QT? Which Ruby version do I need?

Every team member joining an existing project gets stuck on those same questions, while all they want is to do what they do best, write software.

The problem is even worse when the person is a newcomer, as he may not have any previous knowledge about which software we use.

So, to help our team mates, the newcomers and to make easier and faster to get things done, we started to automate this setup, initially working in our own tool until the day GitHub released Boxen.

About Boxen

Boxen is a tool for automating your machine setup. It works on top of the Puppet project and has some utilities and default modules to make easier to provide and set up services and projects.

Also, Boxen gives you a way to define a personal manifest where you can put personal configurations like you favorite text editor, setting up your dotfiles, or other utility softwares.

How we use Boxen

As a consultancy company we have more than one project running at the same time. Besides that, one of our practices is team rotation, which means that every now and then people move to different teams. Boxen helped us to make those exchanges smoother.

Given that one person has already set up his project to use Boxen, the next person who joins the team will only have to run one command to get his machine ready to work.

The same occurs with problems that someone may have when setting up his machine. Once someone fixes the issue, everyone on the team will take advantage of that fix.

Right now we have some projects set up to use Boxen, and some of them are open source projects like Rails, Simple Form and Elixir.

Our Rails project configuration is open source for those who are interested to contribute to Rails.

Boxen is not only useful for tech people. Every person in the company can use it. Right now we have 10 people using Boxen, where 9 of them are developers and 1 is a project manager.

Conclusion

We have found Boxen to be very useful for our company and we have made it our default tool to set up our machines. Also, we think it is a great way to spread and perpetuate knowledge across your team about the project’s infrastructure and configuration.

I, in particular, learned a lot about systems operation and configuration in the path of making Boxen part of our toolset. Also, I’m very happy to say that I have been granted commit access to the Boxen organization and we will help the GitHub guys and the Boxen team to move this project forward.

And you, what is your team using to set up your machines?

Update 08/13/2012

Since the new deprecation policy the composed_of removal was reverted. We are still discussing the future of this feature.

The reason

A few days ago we started a discussion about the removal of the composed_of method from Active Record. I started this discussion because when I was tagging every Rails issue according to its framework, I found some of them related to composed_of, that are not only hard to fix but would add more complexity for a feature that, in my opinion, is not adding any visible value to the framework.

In this presentation, Aaron Patterson talks about three types of features: Cosmetic, Refactoring, and Course Correction. Aaron defines cosmetic features as a feature that adds dubious value and unknown debt (I highly recommend that you watch the whole presentation to see more about these types of features). This is exactly what I think about composed_of. At this time this feature is adding more debt than value to the Rails code base and applications, so the Rails team have decided to remove this method.

The plan

The removal plan is simple, deprecate in 3.2 and remove in 4.0. This means that you need to stop using this feature and implement it in another way.

The Rails team have chosen this path because this feature can be implemented using plain ruby methods for getters and setters. You will see how in the next section.

Implementation

In the simplest case, when you have only one attribute and needs to instantiate an object with the value of this attribute, you can use the serialize feature with a custom serializer:

class MoneySerializer
  def dump(money)
    money.value
  end
 
  def load(value)
    Money.new(value)
  end
end
 
class Account < ActiveRecord::Base
  serialize :balance, MoneySerializer.new
end

To use it with multiple attributes you can do the following:

class Person < ActiveRecord::Base
  def address
    @address ||= Address.new(address_street, address_city)
  end
 
  def address=(address)
    self[:address_street] = address.street
    self[:address_city]   = address.city
 
    @address = address
  end
end

Benefits for Rails developers

I already talked about what this removal can provide to Rails maintainers, but what benefits does it bring to Rails developers?

I think that the best advantages are:

  • It is easier to test the composite objects;
  • It is easier to understand the lazy methods;
  • It is easier to customize it without resorting to options like :converter, :constructor and :allow_nil.

Wrapping up

I strongly recommend that you read the whole discussion in the pull request. You will find more examples and additional information there.

Also, I want to thank @steveklabnik for working on this feature and the awesome work that he has been doing on the Rails Issues Team.

Finally, I want to invite you to help the Rails team to fix, test, and track issues. About half of the issues are related to the Active Record framework and we need to work on them. As a regular Rails contributor and Rails developer, I think there is still a lot we can do to improve the Rails code base, so join us.

The Carnival is over in Brazil but we are still partying at Plataformatec by bringing you a complete new release of SimpleForm. This time is not a small bump though, it’s a shiny new version: SimpleForm 2.0, that comes with a bunch of new features and customizations, a new wrapper API to create custom input stacks and a great integration with Twitter Bootstrap.

Wrappers API

The new wrappers API is here in place of the old components option (besides some other *_tag and *_class configs), to add more flexibility to the way you build SimpleForm inputs. Here is an example of the default wrapper config that ships with SimpleForm when you run its install generator:

config.wrappers :default, :class => :input,
  :hint_class => :field_with_hint, :error_class => :field_with_errors do |b|
  ## Extensions enabled by default
  # Any of these extensions can be disabled for a
  # given input by passing: `f.input EXTENSION_NAME => false`.
  # You can make any of these extensions optional by
  # renaming `b.use` to `b.optional`.
 
  # Determines whether to use HTML5 (:email, :url, ...)
  # and required attributes
  b.use :html5
 
  # Calculates placeholders automatically from I18n
  # You can also pass a string as f.input :placeholder => "Placeholder"
  b.use :placeholder
 
  ## Optional extensions
  # They are disabled unless you pass `f.input EXTENSION_NAME => :lookup`
  # to the input. If so, they will retrieve the values from the model
  # if any exists. If you want to enable the lookup for any of those
  # extensions by default, you can change `b.optional` to `b.use`.
 
  # Calculates maxlength from length validations for string inputs
  b.optional :maxlength
 
  # Calculates pattern from format validations for string inputs
  b.optional :pattern
 
  # Calculates min and max from length validations for numeric inputs
  b.optional :min_max
 
  # Calculates readonly automatically from readonly attributes
  b.optional :readonly
 
  ## Inputs
  b.use :label_input
  b.use :hint,  :wrap_with => { :tag => :span, :class => :hint }
  b.use :error, :wrap_with => { :tag => :span, :class => :error }
end

Wrappers are used by the form builder to generate a complete input. You can remove any component from the wrapper, change the order or even add your own to the stack.

The :default wrapper is going to be used in all forms by default. You can also select which wrapper to use per form, by naming them:

# Given you added this wrapper in your SimpleForm initializer:
config.wrappers :small do |b|
  b.use :placeholder
  b.use :label_input
end
 
# Uses the :small wrapper for all inputs in this form.
simple_form_for @user, :wrapper => :small do |f|
  f.input :name
end

Or you can just pick a different wrapper in a specific input if you want:

# Uses the default wrapper for other inputs, and :small for :name.
simple_form_for @user do |f|
  f.input :name, :wrapper => :small
end

You can see a more detailed description of the new wrappers API in the documentation.

Twitter Bootstrap

The second big change in SimpleForm 2.0 is out of the box Bootstrap integration. SimpleForm now ships with a generator option to initialize your application with a set of specific wrappers customized for Bootstrap. To get them, just run in your terminal, inside a Rails application (with SimpleForm already installed):

rails generate simple_form:install --bootstrap

This gives you the default SimpleForm initializer in config/initializers/simple_form.rb with some extra integration code added for Bootstrap. For example, here is the default wrapper:

config.wrappers :bootstrap, :tag => 'div', :class => 'control-group', 
  :error_class => 'error' do |b|
  b.use :placeholder
  b.use :label
  b.wrapper :tag => 'div', :class => 'controls' do |ba|
    ba.use :input
    ba.use :error, :wrap_with => { :tag => 'span', :class => 'help-inline' }
    ba.use :hint,  :wrap_with => { :tag => 'p', :class => 'help-block' }
  end
end

This wrapper is setup with the same structure that Bootstrap expects and is set to be the default wrapper in your application. This is the killer feature in SimpleForm 2.0: the Bootstrap integration is not inside SimpleForm but all in your application. This means that, if you want to move away or customize Bootstrap in the future, you don’t need to monkey patch SimpleForm, everything is in your app!

We’ve set up a live example application showing most of the SimpleForm inputs integrated with Twitter Bootstrap, make sure you check it out! The application code is on github.

Keep reading this blog post to find out the other changes and deprecations that gave SimpleForm all this extra flexibility, allowing it to be easily integrated with Twitter Bootstrap 2.0.

New configs

SimpleForm 2.0 comes with some new configs to ease its integration with Bootstrap and to make your daily work even more flexible:

  • default_wrapper: defines the default wrapper to be used when no one is given.
  • button_class: defines a class to add for all buttons.
  • boolean_style: change the way booleans (mainly check boxes and radio buttons) are shown: :inline (the default) uses the same structure as before, checkbox + label; :nested (generated for new apps) puts the checkbox inside the label, as label > checkbox.
  • collection_wrapper_class: class to add in all collections (check boxes / radio buttons), given collection_wrapper_tag is set.
  • item_wrapper_class: class to add to all items in a collection.
  • generate_additional_classes_for: allows you to specify whether to generate the extra css classes for inputs, labels and wrappers. By default SimpleForm always generate all classes, such as input type and required info, to all of them. You can be more selective and tell SimpleForm to just add such classes to the input or wrapper, by changing this config.

Deprecations

In order to create the new wrappers API, we had to deprecate some configs and change some helpers, so here is a basic summary of what is being deprecated:

Configs

  • translate: By making placeholder and hint optional options in the wrappers API, you can already disable the automatic translation attempt that happens for these components. labels, on the other hand, are always used in forms, so we added a special config for them: translate_labels.
  • html5: this config is now part of the wrappers API, with b.use :html5, so the config option has been deprecated.
  • error_notification_id: in favor of using error_notification_class only.
  • wrapper_tag=, wrapper_class=, wrapper_error_class=, error_tag=, error_class=, hint_tag=, hint_class=, components=: all these were moved to the wrappers API structure, and are not required anymore.

Helpers

  • :radio input type: In order to integrate with Bootstrap, we had to get rid of the :as => :radio and use :as => :radio_buttons instead. The former still works, but will give you a bunch of deprecation warnings. CSS class names changed accordingly as well
  • collection_radio: has changed to collection_radio_buttons to follow the :as => :radio_buttons change. Its label class has changed as well based on the helper name.

Wrapping up

SimpleForm 2.0 comes with a lot of new features, in special the new wrappers API, to make it flexible enough to allow you to customize inputs as much as possible in an easier way, and to bring you the integrated Bootstrap structure.

Make sure you check out the new SimpleForm README and also the CHANGELOG for a full list of changes. We’ve also created an special wiki page to help you Upgrading to SimpleForm 2.0.

If you find any trouble while migrating to 2.0, or any issue with Bootstrap integration, or any other issue, please let us know in the issues tracker. And if you have any questions, make sure to send them to the mailing list, there are a lot of people there to help you.

All our development team and an amazing number of contributors put a lot of effort into this new release and we hope you will enjoy it. SimpleForm 2.0 + Bootstrap: from us, for you, with love.

Thoughts about SimpleForm 2.0? Please let us know in the comments.