Tuesday, December 29, 2009

Bit (not Byte) Manipulation in Ruby

I was recently tasked with creating a rough version of the Lempel-Ziv 77 encoder/decoder engine for use on most operating systems (e.g. Windows, Linux, Mac). The application would need to read a binary file and compress it or decompress it to another binary file. The compression format used a flag bit to say whether literal bytes or compressed data came next, and compressed data was then described by distance and length bits. Such an application would clearly involve a good deal of bit manipulation and consequently require a solid bit manipulation library.
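To make that format a little more concrete, here is a minimal sketch of a bit writer in Ruby (the language this post ends up using). The BitWriter class and the field widths (1-bit flag, 8-bit literal, 12-bit distance, 4-bit length) are assumptions made up for illustration, not the actual format from my project.

```ruby
# Hypothetical helper: collects individual bits and emits whole bytes.
class BitWriter
  def initialize
    @bits = []
  end

  # Append the lowest `width` bits of `value`, most significant bit first.
  def write(value, width)
    (width - 1).downto(0) { |i| @bits << ((value >> i) & 1) }
    self
  end

  # Pad the last byte with zero bits and return the byte values.
  def to_bytes
    padded = @bits + [0] * ((8 - @bits.size % 8) % 8)
    padded.each_slice(8).map { |byte| byte.join.to_i(2) }
  end
end

writer = BitWriter.new
# Assumed layout: flag 0, then one literal byte.
writer.write(0, 1).write("a".ord, 8)
# Assumed layout: flag 1, then 12-bit distance and 4-bit length.
writer.write(1, 1).write(5, 12).write(3, 4)

encoded = writer.to_bytes
puts encoded.inspect
```

Storing one array element per bit is slow and wasteful, but it keeps the shifting logic easy to follow and to unit test.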

The logical language of choice to me was C++ because of its proximity to the memory, inherent ease of bit manipulation, and presence on every computer since I was born. Unfortunately, I can probably barely compile a "hello, world!" application in C++ =( Next I considered Java since it's open source and present on most people's computers. However, my Java skillz have sadly dwindled since college to the point that I frustratingly discarded that project about an hour after I started. Finally, I decided upon Ruby as my language of choice -- mainly because I like coding in Ruby.

My project got off to a good start until I realized that the original research I'd done on manipulating bits in Ruby had been incomplete. Ruby inherently manages characters and bytes synonymously, but bits are another story. Based on the loose typing model of Ruby, any use of bits throughout my code was being converted to their numeric string representation behind the scenes. For example, 0xff was ending up as the string "255" when I was writing it to a file.

Finally, after much worrying, reading of documentation, online research, and irb investigation, I had an answer.
  • Bytes can be specified in Ruby bit by bit using binary literals, e.g. 255 = 0b1111_1111 (the underscore is an optional separator; I used it to group every four bits). This was important for me since I was doing a lot of shifting and didn't want to worry about the actual numerical values in my unit testing.
  • Bytes can be written explicitly to files in Ruby using the << operator along with Array#pack. The "C" directive packs each integer as an unsigned 8-bit value.
File.open("foo.txt", "wb+") { |f| f << [0xff].pack("C") }
  • Bytes can be easily read by calling each_byte on an open File object.
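  • The byte code for a given character can be accessed in Ruby 1.8 using: "a"[0]. In Ruby 1.9 and later that expression returns the string "a", so use "a".ord or "a".getbyte(0) instead.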
  • Binary file manipulation on Windows must be done using the "b" flag when opening the file. Otherwise, the file is opened in text mode, where certain bytes (notably 0x1A, Ctrl-Z) are treated as an end-of-file marker and the remainder of the file is ignored. I learned this the hard way because each_byte would just inexplicably stop reading in bytes from my file before the file was finished.
After I had all of this figured out, Ruby proved to be a very nice environment for writing the app.
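Putting those points together, a round trip looks roughly like the sketch below. The file name is made up for the example, and I'm using the Ruby 1.9+ spelling ("a".ord) for the character code.

```ruby
require "tmpdir"

# Hypothetical file name, used only for this sketch.
path = File.join(Dir.tmpdir, "bit_demo.bin")

# Binary literals with optional underscores as digit separators.
# 0b0001_1010 is 0x1A, the byte that ends a text-mode read on Windows.
bytes = [0b1111_1111, 0b0001_1010, "a".ord]

# "wb" opens the file in binary mode; pack("C*") turns each integer
# into one unsigned 8-bit byte instead of its decimal string.
File.open(path, "wb") { |f| f << bytes.pack("C*") }

# Read the raw bytes back, again in binary mode.
read_back = []
File.open(path, "rb") { |f| f.each_byte { |b| read_back << b } }

puts read_back.inspect  # prints [255, 26, 97]
File.delete(path)
```

Without the pack call, f << 0xff would write the three characters "255" to the file, which is exactly the problem I ran into above.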

New Class Templates in C#

My team finally upgraded to .NET 3.5 along with Visual Studio 2008 recently, which has been a huge source of happiness for me. However, since the upgrade, I've been getting annoyed with VS's insistence that I always include the System.Linq namespace in each of my new classes. I typically remove all of the default using directives anyway, but since I don't automatically reference the System.Core assembly that contains the LINQ definitions, I was getting a pre-compile error from R# every time I added a new item to a project. In addition, I've been growing tired of having to type "public" each time I add a new class (since the template specifies no access modifier). So, I decided to take action.

A little bit of web searching led me to the Visual Studio Template Reference. Each time a new item is created in Visual Studio, VS finds the template definition that matches the item and generates the code to match. Templates support logical control flow and variable replacement. By default the C# class templates are stored in "\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\ItemTemplates\CSharp\Code\1033", with similar templates stored nearby. The instructions are geared toward readers creating their own templates for new types of items, which is a very useful tool (though I can already do that with ReSharper), but I really just wanted to change the default. So I went in and did it: no more "using" and always public.

Before my changes would take effect, I first had to close Visual Studio and then run the following command:

devenv /installvstemplates

which will rebuild the VS templates run-time cache folder.

Now I can focus on all the fun features of 3.5!

Wednesday, December 9, 2009

VI - Macros

Abstracting repetitive steps of work into an on-the-fly macro is something I find myself doing almost daily during development or database scripting. After hammering out a pretty good one today, I was wondering to myself how to go about saving it. I did some Google searching ("save vi macro", "copy vi macro"), but came up empty. Then I realized that I already knew the answer. Whenever I record a new vi macro, I record to the q register. So I knew that my macro must be sitting as plain old text in the q register. So, I pasted the q register to my screen (with "qp) and lo and behold, there was my macro! I created a macros text file to track all my commonly used macros with descriptions of what they do. Yay!