Cognito Broadcast
New Features in Ruby 2.4
Faster regular expressions with Regexp#match?
Ruby 2.4 adds a new #match? method for regular expressions. In the benchmark below it is about three times faster than the other Regexp matching methods available in Ruby 2.3:
Regexp#match?: 2630002.5 i/s
Regexp#===: 872217.5 i/s - 3.02x slower
Regexp#=~: 859713.0 i/s - 3.06x slower
Regexp#match: 539361.3 i/s - 4.88x slower
When you call Regexp#===, Regexp#=~, or Regexp#match, Ruby sets the $~ global variable with the resulting MatchData:
/^foo (\w+)$/ =~ 'foo bar' # => 0
$~ # => #<MatchData "foo bar" 1:"bar">

/^foo (\w+)$/.match('foo baz') # => #<MatchData "foo baz" 1:"baz">
$~ # => #<MatchData "foo baz" 1:"baz">

/^foo (\w+)$/ === 'foo qux' # => true
$~ # => #<MatchData "foo qux" 1:"qux">
Regexp#match? returns a boolean and avoids building a MatchData object or updating global state:
/^foo (\w+)$/.match?('foo wow') # => true
$~ # => nil
By skipping the global variable, Ruby avoids the work of allocating memory for a MatchData object.
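As a rough sketch of where this helps, consider filtering a list where you only need a yes/no answer per element. The log lines and variable names below are made up for illustration:

```ruby
# Hypothetical log lines, used only for illustration.
lines = [
  'INFO boot complete',
  'ERROR disk full',
  'ERROR network down'
]

error_pattern = /\AERROR /

# Regexp#match? only answers yes or no, so no MatchData is built per line.
error_lines = lines.select { |line| error_pattern.match?(line) }

# String#match? was added alongside Regexp#match? in Ruby 2.4.
has_boot_line = lines.any? { |line| line.match?(/boot/) }
```

If you need the captured groups afterwards, #match is still the right tool; #match? is only a win when the boolean is all you use.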
New #sum method for Enumerable
You can now call #sum on any Enumerable object:
[1, 1, 2, 3, 5, 8, 13, 21].sum # => 54
The #sum method takes an optional parameter which defaults to 0. This value is the starting value of the summation, which means that [].sum is 0.
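A few small cases may make the starting value easier to picture. This is only a sketch; the variable names are illustrative:

```ruby
# The start value is added to the total.
with_start = [1, 2, 3].sum(10)          # => 16

# An empty collection returns the start value unchanged.
empty_default = [].sum                  # => 0
empty_float   = [].sum(0.0)             # => 0.0

# Ranges and other Enumerable objects work as well.
range_total = (1..10).sum               # => 55

# A block maps each element before it is added.
letters = %w[eggs milk cheese].sum { |word| word.length } # => 14
```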
If you are calling #sum on a collection of non-numeric objects then you need to provide your own initial value:
class ShoppingList
  attr_reader :items

  def initialize(*items)
    @items = items
  end

  def +(other)
    ShoppingList.new(*items, *other.items)
  end
end
eggs = ShoppingList.new('eggs') # => #<ShoppingList:0x007f952282e7b8 @items=["eggs"]>
milk = ShoppingList.new('milks') # => #<ShoppingList:0x007f952282ce68 @items=["milks"]>
cheese = ShoppingList.new('cheese') # => #<ShoppingList:0x007f95228271e8 @items=["cheese"]>
eggs + milk + cheese # => #<ShoppingList:0x007f95228261d0 @items=["eggs", "milks", "cheese"]>
[eggs, milk, cheese].sum # => #<TypeError: ShoppingList can't be coerced into Integer>
[eggs, milk, cheese].sum(ShoppingList.new) # => #<ShoppingList:0x007f9522824cb8 @items=["eggs", "milks", "cheese"]>
On the last line, an empty shopping list (ShoppingList.new) is supplied as the initial value.
New methods for testing if directories or files are empty
In Ruby 2.4 you can test whether directories and files are empty using the File and Dir classes:
Dir.empty?('empty_directory') # => true
Dir.empty?('directory_with_files') # => false

File.empty?('contains_text.txt') # => false
File.empty?('empty.txt') # => true
The File.empty? method is equivalent to File.zero?, which is already available in all supported Ruby versions:
File.zero?('contains_text.txt') # => false
File.zero?('empty.txt') # => true
Unfortunately these methods are not available for Pathname yet.
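If you want to try these methods without creating files by hand, a temporary directory works well. The sketch below assumes Ruby 2.4 or newer; the file names and the results hash are made up for illustration:

```ruby
require 'tmpdir'

results = {}

Dir.mktmpdir do |dir|
  # Nothing has been created yet, so the directory is empty.
  results[:new_dir] = Dir.empty?(dir)

  empty_file = File.join(dir, 'empty.txt')
  notes_file = File.join(dir, 'notes.txt')
  File.write(empty_file, '')
  File.write(notes_file, 'hello')

  results[:dir_with_files] = Dir.empty?(dir)         # false, two files exist
  results[:empty_file]     = File.empty?(empty_file) # true, zero bytes
  results[:notes_file]     = File.empty?(notes_file) # false, has content
end
```

Dir.mktmpdir removes the directory when the block finishes, so the example cleans up after itself.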
Extract named captures from Regexp match results
In Ruby 2.4 you can call #named_captures on a Regexp match result and get a hash containing your named capture groups and the data they extracted:
pattern = /(?<first_name>John) (?<last_name>\w+)/
pattern.match('John Backus').named_captures # => { "first_name" => "John", "last_name" => "Backus" }
Ruby 2.4 also teaches the #values_at method on match results to accept capture names, so you can extract just the named captures which you care about:
pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
pattern.match('2016-02-01').values_at(:year, :month) # => ["2016", "02"]
The #values_at method also works for positional capture groups:
pattern = /(\d{4})-(\d{2})-(\d{2})$/
pattern.match('2016-07-18').values_at(1, 3) # => ["2016", "18"]
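One way these pieces might fit together is turning the captures into a real object. This sketch assumes the date pattern from above and builds a Date from the named groups:

```ruby
require 'date'

pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
match   = pattern.match('2016-02-01')

# Pull the captures out by name, convert them to integers, and build a Date.
year, month, day = match.values_at(:year, :month, :day).map(&:to_i)
date = Date.new(year, month, day)

date.to_s # => "2016-02-01"
```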
New Integer#digits method
If you want to access a digit in a certain position within an integer (from right to left) then you can use Integer#digits:
123.digits # => [3, 2, 1]
123.digits[0] # => 3

# Equivalent behavior in Ruby 2.3:
123.to_s.chars.map(&:to_i).reverse # => [3, 2, 1]
If you want to know positional digit information given a non-decimal base, you can pass in a different radix. For example, to look up positional digit information for a hexadecimal integer you can pass in 16:
0x7b.digits(16) # => [11, 7]
0x7b.digits(16).map { |digit| digit.to_s(16) } # => ["b", "7"]
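The right-to-left ordering is handy for checksum-style algorithms. As an illustration only, here is a sketch of the Luhn check (used for card numbers) built on #digits; the luhn_valid? helper is a made-up name, not part of Ruby:

```ruby
# Luhn check: starting from the rightmost digit, double every second digit,
# subtract 9 from any doubled value above 9, and require total % 10 == 0.
def luhn_valid?(number)
  adjusted = number.digits.each_with_index.map do |digit, index|
    if index.even?
      digit
    else
      doubled = digit * 2
      doubled > 9 ? doubled - 9 : doubled
    end
  end

  (adjusted.sum % 10).zero?
end

luhn_valid?(79927398713) # => true
luhn_valid?(79927398714) # => false
```

Because #digits already returns the least significant digit first, no reversing or string conversion is needed.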
Improvements to the Logger interface
The Logger library in Ruby 2.3 can be a bit cumbersome to set up:
require 'logger'

logger1 = Logger.new(STDOUT)
logger1.level = :info
logger1.progname = 'LOG1'

logger1.debug('This is ignored')
logger1.info('This is logged')

# >> I, [2016-07-17T23:45:30.571508 #19837] INFO -- LOG1: This is logged
Ruby 2.4 lets you pass this configuration to Logger’s constructor:
logger2 = Logger.new(STDOUT, level: :info, progname: 'LOG2')

logger2.debug('This is ignored')
logger2.info('This is logged')

# >> I, [2016-07-17T23:45:30.571556 #19837] INFO -- LOG2: This is logged
Parse CLI options into a Hash
Parsing command line flags with OptionParser often involves a lot of boilerplate in order to compile the options down into a hash:
require 'optparse'
require 'optparse/date'
require 'optparse/uri'

config = {}

cli =
  OptionParser.new do |options|
    options.define('--from=DATE', Date) do |from|
      config[:from] = from
    end

    options.define('--url=ENDPOINT', URI) do |url|
      config[:url] = url
    end

    options.define('--names=LIST', Array) do |names|
      config[:names] = names
    end
  end
Now you can provide a hash via the :into keyword argument when parsing arguments:
require 'optparse'
require 'optparse/date'
require 'optparse/uri'
cli =
  OptionParser.new do |options|
    options.define '--from=DATE', Date
    options.define '--url=ENDPOINT', URI
    options.define '--names=LIST', Array
  end

config = {}

args = %w[
  --from 2016-02-03
  --url https://blog.blockscore.com/
  --names John,Daniel,Delmer
]
cli.parse(args, into: config)
config.keys # => [:from, :url, :names]
config[:from] # => #<Date: 2016-02-03 ((2457422j,0s,0n),+0s,2299161j)>
config[:url] # => #<URI::HTTPS https://blog.blockscore.com/>
config[:names] # => ["John", "Daniel", "Delmer"]
Faster Array#min and Array#max
In Ruby 2.4 the Array class defines its own #min and #max instance methods instead of inheriting them from Enumerable. This change speeds up #min and #max on Array; in the benchmark below Array#min is about 1.6 times faster:
Array#min: 35.1 i/s
Enumerable#min: 21.8 i/s - 1.61x slower
Simplified integers
Until Ruby 2.4 you had to manage many numeric types:
# Find classes which subclass the base "Numeric" class:
numerics = ObjectSpace.each_object(Module).select { |mod| mod < Numeric }
# In Ruby 2.3:
numerics # => [Complex, Rational, Bignum, Float, Fixnum, Integer, BigDecimal]
# In Ruby 2.4:
numerics # => [Complex, Rational, Float, Integer, BigDecimal]
Now Fixnum and Bignum are implementation details that Ruby manages for you. This should help avoid subtle bugs like this:
def categorize_number(num)
  case num
  when Fixnum then 'fixed number!'
  when Float then 'floating point!'
  end
end
# In Ruby 2.3:
categorize_number(2) # => "fixed number!"
categorize_number(2.0) # => "floating point!"
categorize_number(2 ** 500) # => nil
# In Ruby 2.4:
categorize_number(2) # => "fixed number!"
categorize_number(2.0) # => "floating point!"
categorize_number(2 ** 500) # => "fixed number!"
If you have Bignum or Fixnum hardcoded in your source code, it will keep working, although Ruby 2.4 prints a deprecation warning when you reference them. These constants now point to Integer:
Fixnum # => Integer
Bignum # => Integer
Integer # => Integer
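If you are updating code like categorize_number, one option is to match on Integer directly. This is a sketch rather than a required migration; it behaves the same on Ruby 2.3 and 2.4 because Fixnum and Bignum were already subclasses of Integer:

```ruby
def categorize_number(num)
  case num
  when Integer then 'integer!'
  when Float   then 'floating point!'
  end
end

categorize_number(2)      # => "integer!"
categorize_number(2.0)    # => "floating point!"
categorize_number(2**500) # => "integer!"
```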
New arguments supported for float modifiers
#round, #ceil, #floor, and #truncate now accept a precision argument:
4.55.ceil(1) # => 4.6
4.55.floor(1) # => 4.5
4.55.truncate(1) # => 4.5
4.55.round(1) # => 4.6
These methods all work the same on Integer as well:
4.ceil(1) # => 4.0
4.floor(1) # => 4.0
4.truncate(1) # => 4.0
4.round(1) # => 4.0
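The precision argument can also be negative, in which case the digits to the left of the decimal point are affected. A small sketch with made-up values:

```ruby
price = 12.34

# Positive precision keeps that many digits after the decimal point.
price_floor = price.floor(1) # => 12.3
price_ceil  = price.ceil(1)  # => 12.4

# Negative precision works on the digits before the decimal point.
big_floor = 12345.678.floor(-2) # => 12300
big_ceil  = 12345.678.ceil(-2)  # => 12400
```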
Case sensitivity for Unicode characters
Consider the following sentence:
My name is JOHN. That is spelled J-Ο-H-N
Calling #downcase on this string in Ruby 2.3 produces this output:
my name is john. that is spelled J-Ο-H-N
This is because “J-Ο-H-N” in the string above is written with non-ASCII Unicode characters, which Ruby 2.3’s casing methods leave untouched.
Ruby’s letter casing methods now handle Unicode properly:
sentence = "\uff2a-\u039f-\uff28-\uff2e"
sentence # => "J-Ο-H-N"
sentence.downcase # => "j-ο-h-n"
sentence.downcase.capitalize # => "J-ο-h-n"
sentence.downcase.capitalize.swapcase # => "j-Ο-H-N"
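The same applies to more everyday text with accented letters. The strings below are arbitrary examples, and the results assume Ruby 2.4 or newer:

```ruby
shouted  = 'résumé'.upcase     # => "RÉSUMÉ"
whisper  = 'ÀÉÎ'.downcase      # => "àéî"
titled   = 'éclair'.capitalize # => "Éclair"

# Some characters change length when their case changes.
street   = 'straße'.upcase     # => "STRASSE"
```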
New option to specify size of a new string
When creating a string you can now define a :capacity
option which will tell Ruby how much memory it should allocate for your string. This can help performance as Ruby can avoid reallocations as you increase the size of the string in question:
With capacity: 37225.1 i/s
Without capacity: 16031.3 i/s - 2.32x slower
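A minimal sketch of what using the option might look like. The 1024-byte figure and the buffer variable are arbitrary choices for illustration:

```ruby
# Reserve room for roughly 1 KB up front; the string itself starts out empty.
buffer = String.new(capacity: 1024)

starts_empty = buffer.empty? # => true

# Appending within the reserved capacity should not need a reallocation.
100.times { |i| buffer << "line #{i}\n" }

line_count = buffer.lines.length # => 100
```

The capacity is only a hint about the initial allocation. The string can still grow past it; Ruby will simply reallocate as usual.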
Fixed matching behavior for symbols
Ruby 2.3’s Symbol#match returned the match position even though String#match returns MatchData. This inconsistency is fixed in Ruby 2.4:
# Ruby 2.3 behavior:
'foo bar'.match(/^foo (\w+)$/) # => #<MatchData "foo bar" 1:"bar">
:'foo bar'.match(/^foo (\w+)$/) # => 0

# Ruby 2.4 behavior:
'foo bar'.match(/^foo (\w+)$/) # => #<MatchData "foo bar" 1:"bar">
:'foo bar'.match(/^foo (\w+)$/) # => #<MatchData "foo bar" 1:"bar">
Multiple assignment inside of conditionals
You can now assign multiple variables within a conditional:
branch1 =
  if (foo, bar = %w[foo bar])
    'truthy'
  else
    'falsey'
  end

branch2 =
  if (foo, bar = nil)
    'truthy'
  else
    'falsey'
  end

branch1 # => "truthy"
branch2 # => "falsey"
You probably shouldn’t do that though.
Exception reporting improvements for threading
If you encounter an exception within a thread then Ruby defaults to silently swallowing up that error:
puts 'Starting some parallel work'

thread =
  Thread.new do
    sleep 1

    fail 'something very bad happened!'
  end

sleep 2

puts 'Done!'
$ ruby parallel-work.rb
Starting some parallel work
Done!
If you want to fail the entire process when an exception happens within a thread then you can use Thread.abort_on_exception = true. Adding this to the parallel-work.rb script above would change the output to:
$ ruby parallel-work.rb
Starting some parallel work
parallel-work.rb:9:in 'block in <main>': something very bad happened! (RuntimeError)
In Ruby 2.4 you now have a middle ground between errors being silently ignored and aborting your entire program. Instead of abort_on_exception you can set Thread.report_on_exception = true:
$ ruby parallel-work.rb
Starting some parallel work
#<Thread:0x007ffa628a62b8@parallel-work.rb:6 run> terminated with exception:
parallel-work.rb:9:in 'block in <main>': something very bad happened! (RuntimeError)
Done!
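Reporting and handling can be combined. The sketch below assumes you still want the main thread to react to the failure; the worker and failure_message names are made up for illustration:

```ruby
# Ask Ruby to print a report to STDERR when a thread dies from an exception.
Thread.report_on_exception = true

worker = Thread.new do
  raise 'something very bad happened!'
end

# Thread#join re-raises the worker's exception in the calling thread,
# so the main thread can still decide how to recover.
failure_message =
  begin
    worker.join
    nil
  rescue RuntimeError => error
    error.message
  end

failure_message # => "something very bad happened!"
```

The report still appears on STDERR when the worker dies, while the rescue around #join keeps the rest of the program running.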