D has had a lot of rough spots over the years, which is to be expected of a language that was, in its early days, developed by a one-man band. As more people have joined the development process, the wrinkles have been steadily ironed out. And as that has happened, I’ve been using D more and more. I love it, of course. It’s a wonderful language with a great deal of potential. But as I’ve used it more, I’ve found myself frustrated on occasion when dealing with Phobos.
The version of Phobos in D1 was heavily criticized as being subpar, resulting in the community-driven Tango project. With D2, that criticism disappeared. Personally, I’ll say one good thing about D1 Phobos, and it’s something I miss: it was intuitive.
A couple of days ago I was putting together a script to add some automation to my process of adding bindings to Derelict (yes, I’m moving into the 21st century). For a particular piece of code, I wanted to take a block of text and split it on a specific character. I know I’ve used std.string.split before. And I’m quite certain I’ve used a version that takes a parameter specifying the character to split on (it may have been D1). Well, times have changed.
The documentation for std.string contains the following:
IMPORTANT NOTE: Beginning with version 2.052, the following symbols have been generalized beyond strings and moved to different modules. This action was prompted by the fact that generalized routines belong better in other places, although they still work for strings as expected.
This notice is followed by a list of functions that have been moved to either std.algorithm or std.array. I had noticed it before when I needed to use std.string.insert, which is now std.array.insertInPlace. As it happens, the split function has also been moved to std.array.
So I go to the docs for std.array.split. And I see this:
Split the string s into an array of words, using whitespace as delimiter. Runs of whitespace are merged together (no empty words are produced).
WTF? I thought this was supposed to be a generalized function, hence the move to std.array. Yet it still operates on strings and splits on whitespace. If that’s the case, then doesn’t it make more sense to keep it in std.string? Well, whatever.
So, I look for the version that allows me to specify the character to split on. And… it doesn’t exist. Not in std.array, not in std.string. Given that I’ve already wasted enough time looking for it, I decide to take a different approach to implementing my script. No big deal.
Then today, I see this post by Chad J. in the D.learn newsgroup. My first thought is, “Good to see I’m not the only one who thinks this way.” He’s also looking for a string splitter that lets you specify the separator. simendsjo gives the answer, pointing him to std.algorithm.splitter.
Seriously? So if I want to split a string using whitespace, the split function that operates on strings using whitespace as the separator is std.array.split rather than std.string.split. And if I want to specify a separator, I need to use std.algorithm.splitter instead.
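To make the split concrete, here is a minimal sketch of the two cases side by side. It assumes a reasonably recent Phobos, and splitOn is a hypothetical helper name of my own, not something from the library.

```d
import std.algorithm : splitter;
import std.array : array, split;

// Hypothetical helper: split text on a single separator character.
string[] splitOn(string text, char sep)
{
    // splitter returns a lazy range of slices; array() collects them.
    return text.splitter(sep).array;
}

void main()
{
    // Splitting on whitespace lives in std.array.
    assert("one  two three".split() == ["one", "two", "three"]);

    // Splitting on a chosen character goes through std.algorithm.
    assert(splitOn("a,b,c", ',') == ["a", "b", "c"]);
}
```

Note that splitter is lazy, so an explicit call to array is needed whenever an actual string[] is wanted. That is one more import for what feels like a single operation.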
I can’t think of any word to describe this other than ridiculous. And this sort of thing comes up time and again when working with ranges. While working on the same script, I wanted to use std.algorithm.find. The example in the documentation shows this:
auto a = [ 1, 2, 3 ];
assert(find(a, 5).empty); // not found
assert(!find(a, 2).empty); // found
OK, easy enough. But then I get this error when compiling:
undefined identifier ‘empty’
Huh? After more head scratching and keyboard banging, I realize I’m supposed to import std.range in order to get access to the ‘empty’ property of ranges. Given that the functions in std.algorithm all operate on ranges, shouldn’t it be importing std.range publicly so that I don’t have to?
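For reference, this is roughly what the documentation example needs in order to compile, as a sketch under the assumption that importing std.range is what brings ‘empty’ into scope for arrays. The contains wrapper is a hypothetical name used only for illustration.

```d
import std.algorithm : find;
import std.range; // assumed to provide `empty` for built-in arrays

// Hypothetical wrapper: true if needle occurs anywhere in haystack.
bool contains(int[] haystack, int needle)
{
    // find returns the remainder of the range starting at the match,
    // or an empty range if there is no match.
    return !find(haystack, needle).empty;
}

void main()
{
    auto a = [1, 2, 3];
    assert(find(a, 5).empty);  // not found
    assert(!find(a, 2).empty); // found
    assert(contains(a, 2));
}
```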
Every time I want to do something simple with ranges, I have to dig through not just the documentation, but the Phobos source as well, so that I can see exactly what’s happening and try to figure out why I’m getting the errors that inevitably pop up. For this reason, I try to avoid ranges as much as I can. But they keep intruding every time I want to do something simple like split a string.
What’s more, ranges pervade Phobos. Some time ago I was putting together a build script for Derelict and needed to use std.file.dirEntries to iterate the files in a directory tree. What I really wanted was an array of files. What I got was an “InputRange”. It took a bit of trial and error (several errors) and digging around the Phobos source before I could finally do something with it.
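In hindsight, getting from the range to a plain array can be sketched like this. It assumes a reasonably recent Phobos; listFiles is a hypothetical helper name, and it is not necessarily what my build script ended up doing.

```d
import std.algorithm : filter, map;
import std.array : array;
import std.file : dirEntries, SpanMode;

// Hypothetical helper: collect the paths of all regular files under root.
string[] listFiles(string root)
{
    return dirEntries(root, SpanMode.depth) // lazy range of DirEntry
        .filter!(e => e.isFile)             // skip directories
        .map!(e => e.name)                  // keep only the path string
        .array;                             // materialize as string[]
}

void main()
{
    import std.stdio : writeln;

    // Print every file below the current directory.
    foreach (path; listFiles("."))
        writeln(path);
}
```

Once again it takes three modules (std.file, std.algorithm, and std.array) to end up with an array of file names.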
A while back, I was quite proud of myself when I finally grokked the basics of what ranges are and how they operate. But as implemented in Phobos, they aren’t intuitive to use at all. With methods spread out across several modules, I don’t see how anybody keeps everything straight. I shouldn’t have to look all over the place to split a string. Yes, strings are arrays and arrays are ranges. But, conceptually, strings are strings. IMO, std.string should be the place to look for string operations. Furthermore, I shouldn’t need to import three different modules to do one operation, as you often have to do when you find yourself suddenly dealing with ranges in a module where you didn’t expect to find them.
As I work with this stuff more, it will eventually become second nature to me. I’ll know that I need to look in std.algorithm for this, or std.array for that. But for now, the learning curve is steep. And I don’t think it’s just a matter of documentation. I think the layout of Phobos needs to be reconsidered. At the very least, std.range, std.algorithm, and std.array all need to be available with one import since they are so tightly coupled. Also, range/array operations that specialize on strings have no business in std.array or std.algorithm. They belong in std.string. That’s the only intuitive place to put them. Otherwise, why have the ‘string’ alias at all?