Regexp#match?
Ruby 2.4 adds a new #match? method for regular expressions which is about three times faster than the other Regexp matching methods available in Ruby 2.3:
Regexp#match?: 2630002.5 i/s
Regexp#===: 872217.5 i/s - 3.02x slower
Regexp#=~: 859713.0 i/s - 3.06x slower
Regexp#match: 539361.3 i/s - 4.88x slower
When you call Regexp#===, Regexp#=~, or Regexp#match, Ruby sets the $~ global variable with the resulting MatchData:
/^foo (\w+)$/ =~ 'foo bar' # => 0
$~ # => #<MatchData "foo bar" 1:"bar">
/^foo (\w+)$/.match('foo baz') # => #<MatchData "foo baz" 1:"baz">
$~ # => #<MatchData "foo baz" 1:"baz">
/^foo (\w+)$/ === 'foo qux' # => true
$~ # => #<MatchData "foo qux" 1:"qux">
Regexp#match? returns a boolean and avoids building a MatchData object or updating global state:
/^foo (\w+)$/.match?('foo wow') # => true
$~ # => nil
Because it never sets the global variable, Ruby can skip allocating memory for the MatchData object.
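As a minimal sketch of where this helps, consider filtering a list when only a yes/no answer is needed per element. The `lines` data and variable names below are hypothetical; #match is still the right tool once you need the captured text.

```ruby
# Hypothetical sample data; only a boolean answer is needed per line.
lines = ['foo bar', 'nope', 'foo baz']
pattern = /^foo (\w+)$/

# match? answers yes/no without building a MatchData or touching $~.
matching = lines.select { |line| pattern.match?(line) }

# When the captured text is needed, fall back to #match.
captures = lines.map { |line| pattern.match(line) }.compact.map { |m| m[1] }

matching # => ["foo bar", "foo baz"]
captures # => ["bar", "baz"]
```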
#sum method for Enumerable
You can now call #sum on any Enumerable object:
[1, 1, 2, 3, 5, 8, 13, 21].sum # => 54
The #sum method has an optional parameter which defaults to 0. This value is the starting value of the summation, so [].sum is 0.
If you are calling #sum on an array of non-numeric objects then you need to provide your own initial value:
class ShoppingList
  attr_reader :items
  def initialize(*items)
    @items = items
  end
  def +(other)
    ShoppingList.new(*items, *other.items)
  end
end
eggs = ShoppingList.new('eggs') # => #<ShoppingList:0x007f952282e7b8 @items=["eggs"]>
milk = ShoppingList.new('milks') # => #<ShoppingList:0x007f952282ce68 @items=["milks"]>
cheese = ShoppingList.new('cheese') # => #<ShoppingList:0x007f95228271e8 @items=["cheese"]>
eggs + milk + cheese # => #<ShoppingList:0x007f95228261d0 @items=["eggs", "milks", "cheese"]>
[eggs, milk, cheese].sum # => #<TypeError: ShoppingList can't be coerced into Integer>
[eggs, milk, cheese].sum(ShoppingList.new) # => #<ShoppingList:0x007f9522824cb8 @items=["eggs", "milks", "cheese"]>
On the last line an empty shopping list (ShoppingList.new) is supplied as the initial value.
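The same idea applies to built-in types that define #+. The sketch below uses made-up sample data and also shows the block form of #sum, which sums the block's return values.

```ruby
words = %w[eggs milk cheese]

# Strings define +, so an empty string works as the initial value.
joined = words.sum('')                # => "eggsmilkcheese"

# Arrays define + as concatenation, so [] flattens one level.
flattened = [[1, 2], [3]].sum([])     # => [1, 2, 3]

# With a block, #sum adds up whatever the block returns.
doubled = [1, 2, 3].sum { |n| n * 2 } # => 12

# The default initial value of 0 makes empty collections safe.
empty_total = [].sum                  # => 0
```

For large arrays of strings, Array#join is likely the better choice; the string case here only illustrates the initial value.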
In Ruby 2.4 you can test whether directories and files are empty using the File and Dir classes:
Dir.empty?('empty_directory') # => true
Dir.empty?('directory_with_files') # => false
File.empty?('contains_text.txt') # => false
File.empty?('empty.txt') # => true
The File.empty? method is equivalent to File.zero?, which is already available in all supported Ruby versions:
File.zero?('contains_text.txt') # => false
File.zero?('empty.txt') # => true
Unfortunately these methods are not available for Pathname yet.
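To try these methods without creating files by hand, here is a small sketch that uses a temporary directory from the standard library's tmpdir. The file name is hypothetical.

```ruby
require 'tmpdir'

results = {}

Dir.mktmpdir do |dir|
  path = File.join(dir, 'notes.txt') # hypothetical file name

  results[:dir_before]  = Dir.empty?(dir)   # true: no entries yet
  File.write(path, '')                      # create a zero-byte file
  results[:dir_after]   = Dir.empty?(dir)   # false: now contains notes.txt
  results[:file_blank]  = File.empty?(path) # true: zero bytes
  File.write(path, 'hello')
  results[:file_filled] = File.empty?(path) # false: has content
end
```

The temporary directory is removed when the block exits, so the sketch leaves nothing behind.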
Regexp match results
In Ruby 2.4 you can call #named_captures on a Regexp match result and get a hash containing your named capture groups and the data they extracted:
pattern = /(?<first_name>John) (?<last_name>\w+)/
pattern.match('John Backus').named_captures # => { "first_name" => "John", "last_name" => "Backus" }
Ruby 2.4 also lets you pass capture names to #values_at to extract just the named captures you care about:
pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
pattern.match('2016-02-01').values_at(:year, :month) # => ["2016", "02"]
The #values_at method also works for positional capture groups:
pattern = /(\d{4})-(\d{2})-(\d{2})$/
pattern.match('2016-07-18').values_at(1, 3) # => ["2016", "18"]
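Putting the two methods together, here is a sketch of turning a matched string into a Date. The `parse_date` helper and `DATE_PATTERN` constant are hypothetical names for illustration; note that Regexp#match returns nil when nothing matches, so the helper guards against that.

```ruby
require 'date'

DATE_PATTERN = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/

# Hypothetical helper: returns a Date, or nil when the text does not match.
def parse_date(text)
  match = DATE_PATTERN.match(text)
  return nil unless match

  year, month, day = match.values_at(:year, :month, :day).map(&:to_i)
  Date.new(year, month, day)
end

parsed  = parse_date('2016-02-01') # a Date for 1 February 2016
missing = parse_date('not a date') # => nil
```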
Integer#digits method
If you want to access a digit at a certain position within an integer (counting from right to left) then you can use Integer#digits:
123.digits # => [3, 2, 1]
123.digits[0] # => 3
# Equivalent behavior in Ruby 2.3:
123.to_s.chars.map(&:to_i).reverse # => [3, 2, 1]
If you want positional digit information in a non-decimal base, you can pass in a different radix. For example, to look up positional digit information for a hexadecimal integer you can pass in 16:
0x7b.digits(16) # => [11, 7]
0x7b.digits(16).map { |digit| digit.to_s(16) } # => ["b", "7"]
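A couple of further uses, sketched with a hypothetical `digit_sum` helper. One caveat worth knowing: as far as I am aware, #digits only accepts non-negative integers and raises Math::DomainError otherwise, which the last lines illustrate.

```ruby
# Hypothetical helper: sum of the decimal digits of a non-negative integer.
def digit_sum(number)
  number.digits.sum
end

total  = digit_sum(2016) # => 9
binary = 10.digits(2)    # => [0, 1, 0, 1] (least significant bit first)
hex    = 255.digits(16)  # => [15, 15]

# Negative receivers are rejected rather than silently handled.
negative_error =
  begin
    (-5).digits
  rescue Math::DomainError => e
    e.class
  end
```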
Logger interface
The Logger library in Ruby 2.3 can be a bit cumbersome to set up:
require 'logger'
logger1 = Logger.new(STDOUT)
logger1.level = :info
logger1.progname = 'LOG1'
logger1.debug('This is ignored')
logger1.info('This is logged')
# >> I, [2016-07-17T23:45:30.571508 #19837] INFO -- LOG1: This is logged
Ruby 2.4 moves this configuration to Logger’s constructor:
logger2 = Logger.new(STDOUT, level: :info, progname: 'LOG2')
logger2.debug('This is ignored')
logger2.info('This is logged')
# >> I, [2016-07-17T23:45:30.571556 #19837] INFO -- LOG2: This is logged
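The constructor options work with any IO-like device, which makes the logger easy to exercise in isolation. The sketch below writes to a StringIO instead of STDOUT; the `LOG3` program name is made up for this example.

```ruby
require 'logger'
require 'stringio'

# Capture log output in memory instead of printing it.
output = StringIO.new
logger3 = Logger.new(output, level: :info, progname: 'LOG3')

logger3.debug('This is ignored') # below the :info threshold
logger3.info('This is logged')

captured = output.string
```

Here `captured` holds a single line ending in `LOG3: This is logged`, with the usual timestamp and process id prefix.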
Parsing command line flags with OptionParser often involves a lot of boilerplate in order to compile the options down into a hash:
require 'optparse'
require 'optparse/date'
require 'optparse/uri'
config = {}
cli =
  OptionParser.new do |options|
    options.define('--from=DATE', Date) do |from|
      config[:from] = from
    end
    options.define('--url=ENDPOINT', URI) do |url|
      config[:url] = url
    end
    options.define('--names=LIST', Array) do |names|
      config[:names] = names
    end
  end
Now you can provide a hash via the :into keyword argument when parsing arguments:
require 'optparse'
require 'optparse/date'
require 'optparse/uri'
cli =
  OptionParser.new do |options|
    options.define '--from=DATE', Date
    options.define '--url=ENDPOINT', URI
    options.define '--names=LIST', Array
  end
config = {}
args = %w[
  --from 2016-02-03
  --url https://blog.blockscore.com/
  --names John,Daniel,Delmer
]
cli.parse(args, into: config)
config.keys # => [:from, :url, :names]
config[:from] # => #<Date: 2016-02-03 ((2457422j,0s,0n),+0s,2299161j)>
config[:url] # => #<URI::HTTPS https://blog.blockscore.com/>
config[:names] # => ["John", "Daniel", "Delmer"]
Array#min and Array#max
In Ruby 2.4 the Array class defines its own #min and #max instance methods instead of relying on the generic Enumerable versions. This change significantly speeds up the #min and #max methods on Array:
Array#min: 35.1 i/s
Enumerable#min: 21.8 i/s - 1.61x slower
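If you want to compare the two implementations on your own machine, the following is a rough sketch using the standard library's Benchmark module. It is not the benchmark that produced the numbers above; the array size and iteration count are arbitrary, and timings will vary.

```ruby
require 'benchmark'

# Arbitrary sample data: 100,000 pseudo-shuffled integers.
numbers = Array.new(100_000) { |i| (i * 7919) % 100_003 }

# Enumerable's generic implementation, bound to the same array for comparison.
enumerable_min = Enumerable.instance_method(:min).bind(numbers)

array_time      = Benchmark.realtime { 100.times { numbers.min } }
enumerable_time = Benchmark.realtime { 100.times { enumerable_min.call } }

puts format('Array#min:      %.4fs', array_time)
puts format('Enumerable#min: %.4fs', enumerable_time)
```

Both calls return the same value; only the code path differs.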
Until Ruby 2.4 you had to manage many numeric types:
# Find classes which subclass the base "Numeric" class:
numerics = ObjectSpace.each_object(Module).select { |mod| mod < Numeric }
# In Ruby 2.3:
numerics # => [Complex, Rational, Bignum, Float, Fixnum, Integer, BigDecimal]
# In Ruby 2.4:
numerics # => [Complex, Rational, Float, Integer, BigDecimal]
Now Fixnum and Bignum are implementation details that Ruby manages for you. This should help avoid subtle bugs like this:
def categorize_number(num)
  case num
  when Fixnum then 'fixed number!'
  when Float then 'floating point!'
  end
end
# In Ruby 2.3:
categorize_number(2) # => "fixed number!"
categorize_number(2.0) # => "floating point!"
categorize_number(2 ** 500) # => nil
# In Ruby 2.4:
categorize_number(2) # => "fixed number!"
categorize_number(2.0) # => "floating point!"
categorize_number(2 ** 500) # => "fixed number!"
If you have Bignum or Fixnum hardcoded in your source code that is fine. These constants now point to Integer:
Fixnum # => Integer
Bignum # => Integer
Integer # => Integer
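For new code, one way to sidestep the issue entirely is to match on Integer directly. This is a sketch of how the categorize_number example above could be written so that it behaves the same on Ruby 2.3 and 2.4; the 'integer!' label is made up.

```ruby
def categorize_number(num)
  case num
  when Integer then 'integer!'        # covers small and very large integers
  when Float   then 'floating point!'
  end
end

small = categorize_number(2)        # => "integer!"
float = categorize_number(2.0)      # => "floating point!"
large = categorize_number(2 ** 500) # => "integer!"
```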
#round, #ceil, #floor, and #truncate now accept a precision argument
4.55.ceil(1) # => 4.6
4.55.floor(1) # => 4.5
4.55.truncate(1) # => 4.5
4.55.round(1) # => 4.6
These methods all work the same on Integer as well:
4.ceil(1) # => 4.0
4.floor(1) # => 4.0
4.truncate(1) # => 4.0
4.round(1) # => 4.0
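The precision argument can also be negative, in which case digits to the left of the decimal point are rounded. A short sketch with an arbitrary integer:

```ruby
rounded   = 1234.round(-2)    # => 1200 (nearest hundred)
floored   = 1234.floor(-2)    # => 1200 (round down to a hundred)
ceiled    = 1234.ceil(-2)     # => 1300 (round up to a hundred)
truncated = 1234.truncate(-2) # => 1200 (drop the last two digits)
```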
Consider the following sentence:
My name is JOHN. That is spelled J-Ο-H-N
Calling #downcase on this string in Ruby 2.3 produces this output:
my name is john. that is spelled J-Ο-H-N
This is because “J-Ο-H-N” in the string above is written with non-ASCII Unicode characters, which the casing methods in Ruby 2.3 leave untouched.
Ruby’s letter casing methods now handle Unicode properly:
sentence = "\uff2a-\u039f-\uff28-\uff2e"
sentence # => "J-Ο-H-N"
sentence.downcase # => "j-ο-h-n"
sentence.downcase.capitalize # => "J-ο-h-n"
sentence.downcase.capitalize.swapcase # => "j-Ο-H-N"
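A few more cases, sketched with arbitrary sample strings. The casing methods in Ruby 2.4 also accept options; passing :ascii restricts the conversion to ASCII letters, which roughly matches the older behavior.

```ruby
accented = 'ÀÉÎ'.downcase        # => "àéî"

# Full case mapping can change the length of a string.
german = 'Straße'.upcase         # => "STRASSE"

# :ascii leaves non-ASCII characters alone.
ascii_only = 'ÀB'.downcase(:ascii) # => "Àb"
```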
When creating a string you can now pass a :capacity option which tells Ruby how much memory it should allocate for your string. This can help performance because Ruby can avoid reallocations as the string grows:
With capacity: 37225.1 i/s
Without capacity: 16031.3 i/s - 2.32x slower
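This is not the code behind the benchmark above, just a minimal sketch of the option in use. The capacity of 16,384 bytes is an arbitrary guess that comfortably fits the sample data; the result is identical with or without it, only the number of reallocations differs.

```ruby
# Hypothetical sample data: 1,000 short lines.
chunks = Array.new(1_000) { |i| "line #{i}\n" }

# Reserve room up front so appends do not need to grow the buffer.
buffer = String.new('', capacity: 16_384)
chunks.each { |chunk| buffer << chunk }

buffer.bytesize <= 16_384 # => true
```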
Ruby 2.3’s Symbol#match returned the match position even though String#match returns MatchData. This inconsistency is fixed in Ruby 2.4:
# Ruby 2.3 behavior:
'foo bar'.match(/^foo (\w+)$/) # => #<MatchData "foo bar" 1:"bar">
:'foo bar'.match(/^foo (\w+)$/) # => 0
# Ruby 2.4 behavior:
'foo bar'.match(/^foo (\w+)$/) # => #<MatchData "foo bar" 1:"bar">
:'foo bar'.match(/^foo (\w+)$/) # => #<MatchData "foo bar" 1:"bar">
You can now assign multiple variables within a conditional:
branch1 =
  if (foo, bar = %w[foo bar])
    'truthy'
  else
    'falsey'
  end
branch2 =
  if (foo, bar = nil)
    'truthy'
  else
    'falsey'
  end
branch1 # => "truthy"
branch2 # => "falsey"
You probably shouldn’t do that though.
If you encounter an exception within a thread then Ruby defaults to silently swallowing up that error:
puts 'Starting some parallel work'
thread =
  Thread.new do
    sleep 1
    fail 'something very bad happened!'
  end
sleep 2
puts 'Done!'
$ ruby parallel-work.rb
Starting some parallel work
Done!
If you want to fail the entire process when an exception happens within a thread then you can use Thread.abort_on_exception = true. Adding this to the parallel-work.rb script above would change the output to:
$ ruby parallel-work.rb
Starting some parallel work
parallel-work.rb:9:in 'block in <main>': something very bad happened! (RuntimeError)
In Ruby 2.4 you now have a middle ground between errors being silently ignored and aborting your entire program. Instead of abort_on_exception you can set Thread.report_on_exception = true:
$ ruby parallel-work.rb
Starting some parallel work
#<Thread:0x007ffa628a62b8@parallel-work.rb:6 run> terminated with exception:
parallel-work.rb:9:in 'block in <main>': something very bad happened! (RuntimeError)
Done!
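Reporting is separate from handling. As a sketch, the script below enables reporting and then joins the thread; Thread#join re-raises the thread's exception in the caller, so it can still be rescued there. The `worker` and `error` names are hypothetical.

```ruby
Thread.report_on_exception = true

worker =
  Thread.new do
    raise 'something very bad happened!'
  end

# The failure is reported on standard error when the thread dies.
# Joining the thread re-raises the same exception in the calling thread.
error =
  begin
    worker.join
    nil
  rescue RuntimeError => e
    e
  end

puts 'Done!'
```

After this runs, `error.message` is the string raised inside the thread and the main program continues to the final puts.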