I don’t know how many times I’ve had to write a function to safely join URL path components. It seems so simple: “Just join the strings together with forward slashes.”
But then I think, “Oh, and don’t strip the leading forward slash, if any.” And, “Oh, also don’t strip the trailing one.” Also, “Make sure not to double up any slashes.”
It is an annoying routine to write, and to be honest, I’ve never really liked any of my implementations. They always failed on some edge case. But tonight, I conquered.
So, without further delay, here is my perfect url_join routine. It hasn’t failed any reasonable edge case I’ve thrown at it yet! The backslash replacement parts are there to make it Windows-compatible, but of course I haven’t tested those.
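import os
from functools import reduce  # reduce is not a builtin in Python 3

def url_join(*args):
    """Join any arbitrary strings into a forward-slash delimited list.
    Do not strip leading / from first element, nor trailing / from last element."""
    if len(args) == 0:
        return ""
    if len(args) == 1:
        return str(args[0])
    # Normalize backslashes so Windows-style pieces are handled like URL pieces.
    args = [str(arg).replace("\\", "/") for arg in args]
    work = [args[0]]
    for arg in args[1:]:
        # Drop one leading slash; otherwise os.path.join treats the piece as
        # absolute and discards everything joined before it.
        if arg.startswith("/"):
            work.append(arg[1:])
        else:
            work.append(arg)
    joined = reduce(os.path.join, work)
    # os.path.join inserts backslashes on Windows; convert them back.
    return joined.replace("\\", "/")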
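For comparison, here is a minimal sketch of the same three rules (keep a leading slash on the first piece, keep a trailing slash on the last piece, never double up at the seams) done with plain string operations, so the result does not depend on which operating system runs it. The name url_join_sketch is made up for this illustration, and unlike the routine above it also converts backslashes when given a single argument. It does not touch doubled slashes inside a single piece.

```python
def url_join_sketch(*parts):
    """Hypothetical variant: join pieces with "/" using only string methods."""
    # Treat backslashes as forward slashes in every piece.
    parts = [str(p).replace("\\", "/") for p in parts]
    if len(parts) < 2:
        # Zero pieces gives "", one piece is returned as-is.
        return "".join(parts)
    # Remember the outer slashes before stripping them from each piece.
    leading = "/" if parts[0].startswith("/") else ""
    trailing = "/" if parts[-1].endswith("/") else ""
    # strip("/") only removes slashes at the ends, so "http://" stays intact.
    pieces = [p.strip("/") for p in parts]
    body = "/".join(p for p in pieces if p)
    if not body:
        # Every piece was empty or only slashes.
        return leading or trailing
    return leading + body + trailing


print(url_join_sketch("http://example.com/", "/api", "v1/"))
print(url_join_sketch("/a/", "/b/", "c/"))
```

The two calls print http://example.com/api/v1/ and /a/b/c/ respectively.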
2 responses so far ↓
1 Michal Kwiatkowski // Jul 7, 2006 at 9:09 am
Don’t use os.path.join for URLs because it will break on Windows and Mac. A simple ‘/’.join(work) will do the job. And there is a function in the stdlib for path normalization, os.path.normpath, which would be a more elegant solution than the if/else.
2 Bruce // Jul 9, 2006 at 4:30 pm
I actually don’t mind the forward vs. backslash Windows/Linux os.path.join issue. os.path.join also avoids doubling up the slashes. I wanted that functionality, and can simply correct for the backslashes.
It looks like I could simply join with “/” and then run normpath, but note that I’ll still need to correct for Windows brain-damaged backslashes.
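The join-then-normalize idea discussed in these comments can be sketched as follows. This is only an illustration under stated assumptions: the name join_then_normpath is hypothetical, and it swaps in posixpath.normpath (the POSIX flavour of os.path, importable on every platform) so that the output uses forward slashes without a final replacement step. Two side effects are worth seeing before adopting it: normpath drops a trailing slash, and it collapses the double slash after a scheme such as http:.

```python
import posixpath


def join_then_normpath(*parts):
    """Hypothetical helper: join with "/" and let normpath collapse repeats."""
    # Convert backslashes first, then join everything with single slashes.
    joined = "/".join(str(p).replace("\\", "/") for p in parts)
    # posixpath.normpath collapses repeated slashes and removes a trailing one.
    return posixpath.normpath(joined)


print(join_then_normpath("/a/", "/b", "c/"))
print(join_then_normpath("http://example.com", "x"))
```

The first call prints /a/b/c (the trailing slash is gone) and the second prints http:/example.com/x, so this approach fits path fragments better than full URLs.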