Understanding "Local" Functions in Bash
I've been working a lot with Bash recently (because sadly a supercomputing
center doesn't tend to offer Fish out of the box). One issue that I've been
running into when deploying complex scientific software on HPC systems is that
relative imports (via `source`) are really finicky. The "canonical" way to do a
relative import would be to use:
source $(readlink -f $(dirname ${BASH_SOURCE[0]}))/relative/path/to/file.sh
But I find that the whole `$(readlink -f $(dirname ${BASH_SOURCE[0]}))` is a bit
cumbersome. So what I wanted to do is define a function that gives me the
location of the current script -- and call it `this`. This way the code above
becomes:
source $(this)/relative/path/to/file.sh
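For readers less familiar with the pieces, here is a small sketch of what each step of the long one-liner does. The scratch directory and the file name `where.sh` are made up for this demonstration, and `readlink -f` is assumed to be available (GNU coreutils or similar).

```shell
# Hypothetical scratch directory and script name, just for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/sub"

cat > "$demo/sub/where.sh" <<'EOF'
# BASH_SOURCE[0] is the path this file was run (or sourced) with
echo "BASH_SOURCE: ${BASH_SOURCE[0]}"
# dirname strips the file name, leaving a possibly relative directory
echo "dirname: $(dirname "${BASH_SOURCE[0]}")"
# readlink -f turns that into an absolute path with symlinks resolved
echo "readlink -f: $(readlink -f "$(dirname "${BASH_SOURCE[0]}")")"
EOF

# Run it through a relative path so the three steps visibly differ.
out=$(cd "$demo" && bash sub/where.sh)
echo "$out"
```

The first two lines print `sub/where.sh` and `sub`; only the last one is an absolute path, which is what makes it safe to build a `source` path from.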
Test Setup
I have 2 shell scripts, `test1.sh` and `test2.sh` (yea, I'm creative at naming
things) arranged as follows:
/tmp
├── test
│   └── test2.sh
└── test1.sh
`test1.sh` sources `test2.sh`:
#!/usr/bin/env bash
# definition of `this` goes here
source $(this)/test/test2.sh
# Re-check what "this" is after source
echo "calling 'this' from test1.sh: $(this)"
echo $var
And `test2.sh` redefines `this`:
#!/usr/bin/env bash
# definition of `this` (same as test1.sh) goes here
export var="Exported Var"
echo "calling 'this' from test/test2.sh: $(this)"
Failures
I ended up "wasting" something like 2 days to get this to work -- here is my litany of failures:
Define a $this Variable
My first idea was to save the output from `readlink` to a variable:
this=$(readlink -f $(dirname ${BASH_SOURCE[0]}))
But `$this` will be overwritten by any `source`'ed file, if those files define
variables of the same name. Hence we see the following output with the `this`
version above:
calling 'this' from test/test2.sh: /tmp/test
calling 'this' from test1.sh: /tmp/test
Exported Var
Note how both printouts of `this` show the same path -- that's bad.
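The clobbering is easy to reproduce in isolation. The sketch below rebuilds the two-file layout from the test setup in a throwaway directory (so the printed paths are temporary ones rather than /tmp) and uses the variable version of `this`. The extra "before source" printout is my addition, to make the overwrite visible.

```shell
# Throwaway copy of the test layout; the directory name is generated.
demo=$(mktemp -d)
mkdir -p "$demo/test"

cat > "$demo/test/test2.sh" <<'EOF'
# Same variable name as in test1.sh: this assignment overwrites the caller's
this=$(readlink -f "$(dirname "${BASH_SOURCE[0]}")")
echo "test2.sh sees: $this"
EOF

cat > "$demo/test1.sh" <<'EOF'
this=$(readlink -f "$(dirname "${BASH_SOURCE[0]}")")
echo "test1.sh before source: $this"
# A sourced file runs in this same shell, so it shares our variables
source "$this/test/test2.sh"
echo "test1.sh after source: $this"
EOF

out=$(bash "$demo/test1.sh")
echo "$out"
```

The "before" line shows the top-level directory, while the "after" line shows the `test` subdirectory, even though `test1.sh` never reassigned the variable itself.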
Define a this() Function
A reasonable lesson learned from the previous section would be to define a function, with the assumption that this will run in the "correct" local context:
this () { echo $(readlink -f $(dirname ${BASH_SOURCE[0]})); }
Unfortunately though, this won't work either! The definition of `this()` in
`test2.sh` overwrites the `test1.sh` definition -- so when `$(this)` is executed
in `test1.sh`, it is run from `test/test2.sh`. Hence we see the output:
calling 'this' from test/test2.sh: /tmp/test
calling 'this' from test1.sh: /tmp/test
Exported Var
Why this Approach was Ultimately Flawed
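The problem is that there is no such thing as a file-local variable in bash scripts: a `source`'ed file runs in the same shell as its caller, so every variable it sets is shared. There are local variables in function definitions, but that's not that useful unless we want to rewrite EVERYTHING to live inside functions. Unfortunately this design extends to function definitions also: functions are global as well, and as far as `BASH_SOURCE` is concerned they are executed from the location (i.e. the file) in which they are defined, not the one they are called from.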
This is more than just closures: you might think that the `BASH_SOURCE`
variable is captured by the function definition. If that were the only thing
going on, we could simply put an `eval` of a command string into
`this () { ... }`, but that doesn't work either -- so it's not that `BASH_SOURCE`
is copied over, but that `this` is run from the location which last defines it.
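The behaviour described above can be checked with a small sketch. The file names (`main.sh`, `lib/lib.sh`) and the helper `whereami` are hypothetical. The helper prints the first two entries of `BASH_SOURCE` from inside a function: according to the Bash manual, the array lines up with the `FUNCNAME` call stack, so entry 0 is the file in which the running function was defined and entry 1 belongs to its caller.

```shell
# Throwaway layout: main.sh calls a function that lives in lib/lib.sh.
demo=$(mktemp -d)
mkdir -p "$demo/lib"

cat > "$demo/lib/lib.sh" <<'EOF'
# Hypothetical helper: report what BASH_SOURCE looks like inside a function
whereami () {
    echo "defined in: ${BASH_SOURCE[0]}"
    echo "called from: ${BASH_SOURCE[1]}"
}
EOF

cat > "$demo/main.sh" <<'EOF'
source "$(dirname "${BASH_SOURCE[0]}")/lib/lib.sh"
# The call happens here in main.sh, yet index 0 will name lib/lib.sh
whereami
EOF

out=$(bash "$demo/main.sh")
echo "$out"
```

Even though `whereami` is called from `main.sh`, index 0 reports `lib/lib.sh`. This matches the failure above: whichever file defined `this()` last is the file its `${BASH_SOURCE[0]}` points at.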
Solution: alias this
In the end, it's clear what we want: aliases! We want `this` to map onto a
command which is executed at the time and in the place where `this` is
called. Hence we use:
shopt -s expand_aliases
alias this="readlink -f \$(dirname \${BASH_SOURCE[0]})"
`shopt -s expand_aliases` is needed because non-interactive shells (i.e. shell
scripts) do not expand aliases by default. Since the alias is the same --
regardless of where it is defined -- it doesn't matter that downstream
redefinitions keep overwriting it with the same command.
This solution works like a charm:
calling 'this' from test/test2.sh: /tmp/test
calling 'this' from test1.sh: /tmp
Exported Var
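For anyone who wants to try this end to end, here is a sketch that recreates the test setup in a throwaway directory and runs it with the alias definition in both files. The directory comes from `mktemp`, so the printed paths will be temporary ones rather than /tmp. The alias is left unquoted exactly as above, which assumes the paths contain no spaces.

```shell
# Throwaway copy of the test layout; the directory name is generated.
demo=$(mktemp -d)
mkdir -p "$demo/test"

cat > "$demo/test/test2.sh" <<'EOF'
#!/usr/bin/env bash
shopt -s expand_aliases
alias this="readlink -f \$(dirname \${BASH_SOURCE[0]})"
export var="Exported Var"
echo "calling 'this' from test/test2.sh: $(this)"
EOF

cat > "$demo/test1.sh" <<'EOF'
#!/usr/bin/env bash
shopt -s expand_aliases
alias this="readlink -f \$(dirname \${BASH_SOURCE[0]})"
source $(this)/test/test2.sh
# Re-check what "this" is after source
echo "calling 'this' from test1.sh: $(this)"
echo $var
EOF

out=$(bash "$demo/test1.sh")
echo "$out"
```

Note that the alias has to be defined on an earlier line than its first use, since Bash expands aliases when it reads a command, not when it runs it.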