Counting HTML files with Ruby and bash

At work, I had to count how many HTML files were located within a specific folder and its subfolders.

I did it first in Ruby, then in bash. I will show the source code and explain the solution afterwards.

First, the Ruby solution:

puts Dir.glob('**/*.html').count

I invoke the glob method of the Dir class and pass it a glob pattern. This pattern matches all HTML files within the current directory and within all subdirectories. The method returns an array of all matching paths. This array has a count method which returns the number of entries. The number is printed to the console using the puts command. Read it like this: the program puts the number to the console.

You can invoke this directly from the command line using ruby -e as shown below:

ruby -e "puts Dir.glob('**/*.html').count"

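Note that you have to be careful with how you use the " and ' signs: the outer double quotes belong to the shell, so the Ruby code inside them uses single quotes. For more information on the Dir class and its methods, please refer to the Ruby API documentation on the Dir class.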
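If the folder to search is not the current directory, the same idea can be wrapped in a small method. This is only a minimal sketch: the method name count_html_files and the throwaway directory tree are my own assumptions for illustration, not part of the original task. It also skips directories that happen to end in .html.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (name chosen for this example): counts regular
# files ending in .html below root, including root itself.
def count_html_files(root)
  pattern = File.join(root, '**', '*.html')
  Dir.glob(pattern).count { |path| File.file?(path) }
end

# Build a throwaway directory tree so the example is self-contained.
Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'docs', 'api'))
  FileUtils.touch(File.join(dir, 'index.html'))
  FileUtils.touch(File.join(dir, 'docs', 'guide.html'))
  FileUtils.touch(File.join(dir, 'docs', 'api', 'dir.html'))
  FileUtils.touch(File.join(dir, 'docs', 'notes.txt'))
  puts count_html_files(dir)
end
```

Joining the root with File.join keeps the pattern independent of the directory the script is started from.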

In bash, the solution is as follows:

find . -name "*.html" | wc -l

It consists of two parts. First, the find command finds all files whose names end with .html within the current directory and its subfolders. Each found path is written to STDOUT on its own line. Normally, this would print the paths to the console. However, using | (the pipe character) we send the output of find to the wc command, whose name stands for word count. This command can count the characters, words or lines of a file or of its input. In our case, we want it to count the lines, so we pass it the -l flag. For more information on pipes on Unix, read this.
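For comparison, the same two steps (list the paths, then count them) can be sketched in Ruby with the Find module from the standard library. The method name find_html and the temporary directory are assumptions made for this example. Like find -name, the sketch matches on the name only, so a directory called something.html would be listed as well.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: walks root recursively, like `find root`,
# and keeps every path whose name ends in .html.
def find_html(root)
  Find.find(root).select { |path| path.end_with?('.html') }
end

Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'sub'))
  FileUtils.touch(File.join(dir, 'a.html'))
  FileUtils.touch(File.join(dir, 'sub', 'b.html'))
  FileUtils.touch(File.join(dir, 'sub', 'style.css'))

  matches = find_html(dir)
  puts matches       # one path per line, like the output of find
  puts matches.size  # the number of lines wc -l would count
end
```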

Both solutions work! 🙂

Learning Ruby – for buildr

I decided to learn Ruby so that I can use buildr as the build system for my diploma thesis. So I looked for a book or guide to learn it, and I found an excellent and funny one which presents the syntax, semantics and much more of Ruby.

It can be downloaded here:

I now summarize the key features of Ruby, to be able to read and understand Ruby code as fast as possible:

  • name or name_asd or name1_a_1 is a variable
  • Name is a constant (as in The Empire State Building)
  • $varname is a global variable (everybody behaves the same way if you give him a $)
  • @varname is an instance variable (read: attribute)
  • @@varname is a class variable (read: attribute for all)
  • :name is a symbol
  • { print "test" } is a block, just like do print "test" end
  • { |x,y| x + y } is a block where x and y are block arguments, the input parameters of the block
  • (1..5) or ('a'..'z') are ranges, like an accordion (two dots include the last element, a third dot excludes it)
  • [1, 2, 3] is an array
  • {'xml' => 'eXtensible Markup Language', 'sql' => 'Structured Query Language'} is a hash (a dictionary)
  • /regex/ is a regular expression
  • nil represents emptiness
  • << is the concatenation operator, which is a method
  • puts prints something to the console
  • gets reads from user input
  • def is used to define methods
  • case is used with when to switch over the characteristics of a variable
  • initialize is the constructor, which is called by new
  • < means inheritance, or extends from the Java perspective
  • class ClassName creates a class (the name has to start with a capital letter) and has to be closed with the keyword end
  • module wraps related code together
  • attr_accessor :picks, :purchased is a shortcut for creating getters and setters for the variables picks and purchased
  • attr_reader :picks, :purchased is used for creating only getters
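To see several of these features working together, here is a small sketch. All names in it (Lottery, Ticket, BonusTicket, MAX_PICK, score) are invented for illustration and have nothing to do with buildr itself.

```ruby
# Hypothetical module and class names, used only for illustration.
module Lottery
  MAX_PICK = 25                       # constant: starts with a capital letter

  class Ticket
    attr_reader :picks                # getter only
    attr_accessor :purchased          # getter and setter

    def initialize(*picks)            # called by Ticket.new
      @picks = picks                  # instance variable
      @purchased = false
    end

    # Number of picks that also appear in the winning numbers.
    def score(winning)
      (@picks & winning).size
    end
  end

  class BonusTicket < Ticket          # < means "inherits from"
    def score(winning)
      super(winning) * 2              # reuse the parent's method
    end
  end
end

ticket = Lottery::BonusTicket.new(3, 7, 21)
ticket.purchased = true               # setter created by attr_accessor

numbers = (1..5).to_a                 # two dots: 5 is included
numbers << 6                          # << appends to the array
sum = numbers.inject { |x, y| x + y } # block with two arguments

glossary = { 'xml' => 'eXtensible Markup Language' }  # a hash

# case/when, returning symbols
verdict = case ticket.score([7, 21, 40])
          when 0 then :loser
          when 1..3 then :small_win
          else :big_win
          end

puts sum
puts verdict
puts glossary['xml']
puts glossary['sql'].nil?             # a missing key gives nil
p((1...5).to_a)                       # three dots: 5 is excluded
```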

I think this should now suffice to work with buildr. I will update this post while I dig through buildr to automate my transformation process.