- open FILEHANDLE,EXPR
- open FILEHANDLE,MODE,EXPR
- open FILEHANDLE,MODE,EXPR,LIST
- open FILEHANDLE,MODE,REFERENCE
- open FILEHANDLE
Opens the file whose filename is given by EXPR, and associates it with FILEHANDLE.
(The following is a comprehensive reference to open(): for a gentler introduction you may consider perlopentut.)
If FILEHANDLE is an undefined scalar variable (or array or hash element), the variable is assigned a reference to a new anonymous filehandle. Otherwise, if FILEHANDLE is an expression, its value is used as the name of the real filehandle wanted. (This is considered a symbolic reference, so use strict 'refs' should not be in effect.)
If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE contains the filename. (Note that lexical variables--those declared with my--will not work for this purpose; so if you're using my, specify EXPR in your call to open.)
If three or more arguments are specified then the mode of opening and the file name are separate. If MODE is '<' or nothing, the file is opened for input. If MODE is '>', the file is truncated and opened for output, being created if necessary. If MODE is '>>', the file is opened for appending, again being created if necessary.
You can put a '+' in front of the '>' or '<' to indicate that you want both read and write access to the file; thus '+<' is almost always preferred for read/write updates--the '+>' mode would clobber the file first. You can't usually use either read-write mode for updating textfiles, since they have variable length records. See the -i switch in perlrun for a better approach. The file is created with permissions of 0666 modified by the process' umask value.
These various prefixes correspond to the fopen(3) modes of 'r', 'r+', 'w', 'w+', 'a', and 'a+'.
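As a rough illustration of the modes described above, the following sketch exercises '>', '>>', '<', '+<' and '+>' on a scratch file. The temporary directory and the file name modes.txt are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch file inside a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/modes.txt";

# '>' creates the file (or truncates an existing one) and opens it for output.
open(my $out, '>', $path) or die "Can't write $path: $!";
print $out "first\n";
close($out) or die "Can't close $path: $!";

# '>>' opens for appending, so the existing line is kept.
open(my $app, '>>', $path) or die "Can't append to $path: $!";
print $app "second\n";
close($app) or die "Can't close $path: $!";

# '<' opens for input only.
open(my $in, '<', $path) or die "Can't read $path: $!";
my @lines = <$in>;
close($in);

# '+<' gives read/write access without clobbering the contents.
open(my $rw, '+<', $path) or die "Can't update $path: $!";
my $head = <$rw>;                      # read the first line back
close($rw);

# '+>' also gives read/write access, but truncates the file first.
open(my $clobber, '+>', $path) or die "Can't reopen $path: $!";
close($clobber);
my $is_empty = (-z $path) ? 1 : 0;     # true once '+>' has truncated it
```

The last step shows why '+<' is usually the safer choice for updates: merely opening with '+>' discards whatever the file held.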
In the 2-argument (and 1-argument) form of the call, the mode and filename should be concatenated (in this order), possibly separated by spaces. It is possible to omit the mode in these forms if the mode is '<'.
If the filename begins with '|', the filename is interpreted as a command to which output is to be piped, and if the filename ends with a '|', the filename is interpreted as a command which pipes output to us. See "Using open() for IPC" at perlipc for more examples of this. (You are not allowed to open to a command that pipes both in and out, but see IPC::Open2, IPC::Open3, and "Bidirectional Communication with Another Process" at perlipc for alternatives.)
For three or more arguments, if MODE is '|-', the filename is interpreted as a command to which output is to be piped, and if MODE is '-|', the filename is interpreted as a command which pipes output to us. In the 2-argument (and 1-argument) form one should replace dash ('-') with the command. See "Using open() for IPC" at perlipc for more examples of this. (You are not allowed to open to a command that pipes both in and out, but see IPC::Open2, IPC::Open3, and "Bidirectional Communication" at perlipc for alternatives.)
In the three-or-more argument form of pipe opens, if LIST is specified (extra arguments after the command name) then LIST becomes arguments to the command invoked if the platform supports it. The meaning of open with more than three arguments for non-pipe modes is not yet specified. Experimental "layers" may give extra LIST arguments meaning.
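A minimal sketch of a list-form pipe open follows, assuming a platform that supports it. Using $^X (the running perl binary) as the command is a choice made here only so that the example does not depend on any other program; the arguments 'a b' and '*.txt' are arbitrary.

```perl
use strict;
use warnings;

# $^X is the path of the running perl binary; it stands in for an
# arbitrary external command in this sketch.
my @cmd = ($^X, '-e', 'print "arg: $_\n" for @ARGV', 'a b', '*.txt');

# With LIST present no shell is involved, so the space and the wildcard
# reach the command exactly as written.
my $pid = open(my $from_cmd, '-|', @cmd)
    or die "Can't start $cmd[0]: $!";
my @output = <$from_cmd>;
close($from_cmd) or warn "command exited with status $?";
```

Because the arguments are passed as a list, there is no need to quote them for a shell.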
In the 2-argument (and 1-argument) form, opening '-' opens STDIN and opening '>-' opens STDOUT.
You may use the three-argument form of open to specify IO "layers" (sometimes also referred to as "disciplines") to be applied to the handle that affect how the input and output are processed (see open and PerlIO for more details). For example
open(FH, "<:utf8", "file")
will open the UTF-8 encoded file containing Unicode characters; see perluniintro. (Note that if layers are specified in the three-arg form then default layers set by the open pragma are ignored.)
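The effect of a layer can be seen by writing a string through ":utf8" and comparing its length in characters with the size of the file in bytes. This is only a sketch; the file name unicode.txt and the sample string are assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/unicode.txt";          # hypothetical file name

my $text = "caf\x{e9} \x{263A}";        # 6 characters, two of them non-ASCII

# Write through the :utf8 layer so the characters are encoded on output.
open(my $out, '>:utf8', $path) or die "Can't write $path: $!";
print $out $text;
close($out) or die "Can't close $path: $!";

# On disk the string occupies more bytes than it has characters.
my $bytes = -s $path;

# Read through the same layer to get characters back rather than bytes.
open(my $in, '<:utf8', $path) or die "Can't read $path: $!";
my $back = <$in>;
close($in);
```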
Open returns nonzero upon success, the undefined value otherwise. If the open involved a pipe, the return value happens to be the pid of the subprocess.
If you're running Perl on a system that distinguishes between text files and binary files, then you should check out "binmode" for tips for dealing with this. The key distinction between systems that need binmode and those that don't is their text file formats. Systems like Unix, Mac OS, and Plan 9, which delimit lines with a single character, and which encode that character in C as "\n", do not need binmode. The rest need it.
When opening a file, it's usually a bad idea to continue normal execution if the request failed, so open is frequently used in connection with die. Even if die won't do what you want (say, in a CGI script, where you want to produce a nicely formatted error message, although there are modules that can help with that problem), you should always check the return value from opening a file. The infrequent exception is when working with an unopened filehandle is actually what you want to do.
As a special case the 3 arg form with a read/write mode and the third argument being undef:
open(TMP, "+>", undef) or die ...
opens a filehandle to an anonymous temporary file. Also using "+<" works for symmetry, but you really should consider writing something to the temporary file first. You will need to seek() to do the reading.
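A short sketch of the anonymous temporary file in use: write first, then seek() back to the start before reading. The lexical handle $tmp and the sample lines are assumptions made for the example.

```perl
use strict;
use warnings;

# An undef third argument with '+>' asks for an anonymous temporary file.
open(my $tmp, '+>', undef) or die "Can't create temporary file: $!";
print $tmp "line 1\nline 2\n";

# Rewind before reading; otherwise the read would start at end of file.
seek($tmp, 0, 0) or die "Can't rewind temporary file: $!";
my @lines = <$tmp>;
close($tmp);
```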
Since v5.8.0, Perl has been built with PerlIO by default. Unless you've changed this (i.e., Configure -Uuseperlio), you can open file handles to "in memory" files held in Perl scalars via:
open($fh, '>', \$variable) || ..
Though if you try to re-open STDOUT or STDERR as an "in memory" file, you have to close it first:
    close STDOUT;
    open STDOUT, '>', \$variable or die "Can't open STDOUT: $!";
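In-memory files can be read as well as written. The following sketch, which assumes a Perl built with PerlIO, prints into a scalar and then reads the same scalar back line by line; $buffer is a name chosen for the example.

```perl
use strict;
use warnings;

my $buffer = '';

# Output printed to $out accumulates in $buffer instead of a file.
open(my $out, '>', \$buffer) or die "Can't open in-memory file: $!";
print $out "alpha\n", "beta\n";
close($out);

# The same scalar can then serve as the source of an input handle.
open(my $in, '<', \$buffer) or die "Can't read in-memory file: $!";
my @lines = <$in>;
close($in);
```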
Examples:
    $ARTICLE = 100;
    open ARTICLE or die "Can't find article $ARTICLE: $!\n";
    while (<ARTICLE>) {...

    open(LOG, '>>/usr/spool/news/twitlog');     # (log is reserved)
    # if the open fails, output is discarded

    open(DBASE, '+<', 'dbase.mine')             # open for update
        or die "Can't open 'dbase.mine' for update: $!";

    open(DBASE, '+<dbase.mine')                 # ditto
        or die "Can't open 'dbase.mine' for update: $!";

    open(ARTICLE, '-|', "caesar <$article")     # decrypt article
        or die "Can't start caesar: $!";

    open(ARTICLE, "caesar <$article |")         # ditto
        or die "Can't start caesar: $!";

    open(EXTRACT, "|sort >Tmp$$")               # $$ is our process id
        or die "Can't start sort: $!";

    # in memory files
    open(MEMORY,'>', \$var)
        or die "Can't open memory file: $!";
    print MEMORY "foo!\n";                      # output will end up in $var

    # process argument list of files along with any includes

    foreach $file (@ARGV) {
        process($file, 'fh00');
    }

    sub process {
        my($filename, $input) = @_;
        $input++;               # this is a string increment
        unless (open($input, $filename)) {
            print STDERR "Can't open $filename: $!\n";
            return;
        }

        local $_;
        while (<$input>) {              # note use of indirection
            if (/^#include "(.*)"/) {
                process($1, $input);
                next;
            }
            #...                # whatever
        }
    }
See perliol for detailed info on PerlIO.
You may also, in the Bourne shell tradition, specify an EXPR beginning with '>&', in which case the rest of the string is interpreted as the name of a filehandle (or file descriptor, if numeric) to be duped (as dup(2)) and opened. You may use & after >, >>, <, +>, +>>, and +<. The mode you specify should match the mode of the original filehandle. (Duping a filehandle does not take into account any existing contents of IO buffers.) If you use the 3 arg form then you can pass either a number, the name of a filehandle or the normal "reference to a glob".
Here is a script that saves, redirects, and restores STDOUT and STDERR using various methods:
    #!/usr/bin/perl
    open my $oldout, ">&STDOUT"     or die "Can't dup STDOUT: $!";
    open OLDERR,     ">&", \*STDERR or die "Can't dup STDERR: $!";

    open STDOUT, '>', "foo.out" or die "Can't redirect STDOUT: $!";
    open STDERR, ">&STDOUT"     or die "Can't dup STDOUT: $!";

    select STDERR; $| = 1;      # make unbuffered
    select STDOUT; $| = 1;      # make unbuffered

    print STDOUT "stdout 1\n";  # this works for
    print STDERR "stderr 1\n";  # subprocesses too

    open STDOUT, ">&", $oldout or die "Can't dup \$oldout: $!";
    open STDERR, ">&OLDERR"    or die "Can't dup OLDERR: $!";

    print STDOUT "stdout 2\n";
    print STDERR "stderr 2\n";
If you specify '<&=X', where X is a file descriptor number or a filehandle, then Perl will do an equivalent of C's fdopen of that file descriptor (and not call dup(2)); this is more parsimonious of file descriptors. For example:
    # open for input, reusing the fileno of $fd
    open(FILEHANDLE, "<&=$fd")
or
    open(FILEHANDLE, "<&=", $fd)
or
    # open for append, using the fileno of OLDFH
    open(FH, ">>&=", OLDFH)
or
    open(FH, ">>&=OLDFH")
Being parsimonious with file descriptors is also useful when something depends on the descriptor itself, such as locking with flock(). If you do just open(A, '>>&B'), the filehandle A will not have the same file descriptor as B, and therefore flock(A) will not flock(B), and vice versa. But with open(A, '>>&=B') the filehandles will share the same file descriptor.
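The difference between the two forms can be observed with fileno(). This is a sketch under the assumption of a Unix-like system; the file name shared.log and the handle names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/shared.log";           # hypothetical file name

open(my $orig, '>>', $path) or die "Can't append to $path: $!";

# '>>&' calls dup(2): the copy gets a file descriptor of its own.
open(my $dup, '>>&', $orig) or die "Can't dup handle: $!";

# '>>&=' behaves like fdopen: the alias reuses the original descriptor.
open(my $alias, '>>&=', $orig) or die "Can't alias handle: $!";

my $orig_fd  = fileno($orig);
my $dup_fd   = fileno($dup);
my $alias_fd = fileno($alias);
```

Comparing $dup_fd and $alias_fd against $orig_fd shows which handle would share a lock taken with flock() on the original.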
Note that if you are using Perls older than 5.8.0, Perl will be using the standard C library's fdopen() to implement the "=" functionality. On many UNIX systems fdopen() fails when file descriptors exceed a certain value, typically 255. For Perls 5.8.0 and later, PerlIO is most often the default.
You can see whether Perl has been compiled with PerlIO or not by running perl -V and looking for the useperlio= line. If useperlio is define, you have PerlIO; otherwise you don't.
If you open a pipe on the command '-', i.e., either '|-' or '-|' with the 2-argument (or 1-argument) form of open(), then there is an implicit fork done, and the return value of open is the pid of the child within the parent process, and 0 within the child process. (Use defined($pid) to determine whether the open was successful.)
The filehandle behaves normally for the parent, but i/o to that
filehandle is piped from/to the STDOUT/STDIN of the child process.
In the child process the filehandle isn't opened--i/o happens from/to
the new STDOUT or STDIN. Typically this is used like the normal
piped open when you want to exercise more control over just how the
pipe command gets executed, such as when you are running setuid, and
don't want to have to scan shell commands for metacharacters.
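The forking open can be sketched as follows, assuming a platform with a true fork(). The child execs $^X (the running perl binary) with a list of arguments, which stands in for any external command; the arguments 'a;b' and '$HOME' are chosen only to show that no shell interprets them.

```perl
use strict;
use warnings;

# Forking open: returns the child's pid in the parent, 0 in the child,
# and undef if the fork failed.
my $pid = open(my $from_kid, '-|');
die "Can't fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: STDOUT now feeds the parent's handle. exec with a list
    # bypasses the shell, so metacharacters need no escaping.
    exec($^X, '-e', 'print "child got: @ARGV\n"', 'a;b', '$HOME')
        or die "Can't exec $^X: $!";
}

# Parent: read whatever the child wrote to its STDOUT.
my @lines = <$from_kid>;
close($from_kid) or warn "child exited with status $?";
```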
Within each of the following two blocks, the statements are more or less equivalent:
    open(FOO, "|tr '[a-z]' '[A-Z]'");
    open(FOO, '|-', "tr '[a-z]' '[A-Z]'");
    open(FOO, '|-') || exec 'tr', '[a-z]', '[A-Z]';
    open(FOO, '|-', "tr", '[a-z]', '[A-Z]');

    open(FOO, "cat -n '$file'|");
    open(FOO, '-|', "cat -n '$file'");
    open(FOO, '-|') || exec 'cat', '-n', $file;
    open(FOO, '-|', "cat", '-n', $file);
The last example in each block shows the pipe as "list form", which is not yet supported on all platforms. A good rule of thumb is that if your platform has true fork() (in other words, if your platform is UNIX) you can use the list form.
See "Safe Pipe Opens" at perlipc for more examples of this.
Beginning with v5.6.0, Perl will attempt to flush all files opened for output before any operation that may do a fork, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles.
On systems that support a close-on-exec flag on files, the flag will be set for the newly opened file descriptor as determined by the value of $^F. See "$^F" at perlvar.
Closing any piped filehandle causes the parent process to wait for the child to finish, and returns the status value in $?.
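A small sketch of picking up the child's status after close(), assuming list-form pipe opens are available. The child here is $^X (the running perl binary) and deliberately exits with status 3; both choices are made only for the example.

```perl
use strict;
use warnings;

# Hypothetical child: prints one line, then exits with status 3.
open(my $pipe, '-|', $^X, '-e', 'print "working\n"; exit 3')
    or die "Can't start child: $!";
my @output = <$pipe>;

# close() waits for the child. A non-zero exit makes close() return
# false and leaves the wait status in $?.
my $closed_ok   = close($pipe) ? 1 : 0;
my $exit_status = $? >> 8;             # high byte holds the exit code
```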
The filename passed to the 2-argument (or 1-argument) form of open() will have leading and trailing whitespace deleted, and the normal redirection characters honored. This property, known as "magic open", can often be used to good effect. A user could specify a filename of "rsh cat file |", or you could change certain filenames as needed:
    $filename =~ s/(.*\.gz)\s*$/gzip -dc < $1|/;
    open(FH, $filename) or die "Can't open $filename: $!";
Use 3-argument form to open a file with arbitrary weird characters in it,
open(FOO, '<', $file);
otherwise it's necessary to protect any leading and trailing whitespace:
    $file =~ s#^(\s)#./$1#;
    open(FOO, "< $file\0");
(this may not work on some bizarre filesystems). One should conscientiously choose between the magic and 3-arguments form of open():
open IN, $ARGV[0];
will allow the user to specify an argument of the form "rsh cat file |", but will not work on a filename which happens to have a trailing space, while
open IN, '<', $ARGV[0];
will have exactly the opposite restrictions.
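The trade-off can be seen with a file whose name ends in a space. This is a sketch assuming a filesystem that permits such names; the name "notes " is made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/notes ";              # hypothetical name ending in a space

open(my $out, '>', $path) or die "Can't create '$path': $!";
print $out "kept\n";
close($out) or die "Can't close '$path': $!";

# 2-argument form: the trailing space is stripped, so Perl looks for
# "$dir/notes", which does not exist.
my $magic_ok = open(my $magic, $path) ? 1 : 0;

# 3-argument form: the name is taken literally.
my $plain_ok = open(my $plain, '<', $path) ? 1 : 0;
my $line     = $plain_ok ? <$plain> : undef;
```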
If you want a "real" C open (see open(2) on your system), then you should use the sysopen function, which involves no such magic (but may use subtly different filemodes than Perl open(), which is mapped to C fopen()). This is another way to protect your filenames from interpretation. For example:
    use Fcntl;          # exports the O_* constants used below
    use IO::Handle;
    sysopen(HANDLE, $path, O_RDWR|O_CREAT|O_EXCL)
        or die "sysopen $path: $!";
    $oldfh = select(HANDLE); $| = 1; select($oldfh);
    print HANDLE "stuff $$\n";
    seek(HANDLE, 0, 0);
    print "File contains: ", <HANDLE>;
Using the constructor from the IO::Handle package (or one of its subclasses, such as IO::File or IO::Socket), you can generate anonymous filehandles that have the scope of whatever variables hold references to them, and automatically close whenever and however you leave that scope:
    use IO::File;
    #...
    sub read_myfile_munged {
        my $ALL = shift;
        my $handle = new IO::File;
        open($handle, "myfile") or die "myfile: $!";
        $first = <$handle>
            or return ();                       # Automatically closed here.
        mung $first or die "mung failed";       # Or here.
        return $first, <$handle> if $ALL;       # Or here.
        $first;                                 # Or here.
    }
See "seek" for some details about mixing reading and writing.