
PHP and Shell Scripting: Using Pipes

By Peter Shaw
on February 12, 2009

Following Darrell Brogdon’s previous article on using PHP as a shell scripting language, this article covers some more advanced uses of the language and shows some tricks that make PHP extremely useful to work with.
Unlike Darrell’s article, however, this one also covers the Windows version of PHP as well as the Linux/Unix one. First, there are a couple of differences you need to be aware of.
So what are the differences?
The differences between the two environments are mostly due to operating system differences as opposed to differences in PHP itself. The big one has to be case sensitivity in file names: Linux/Unix file systems treat MyScript.php and myscript.php as two different files, while Windows does not. This has nailed me on many an occasion. A very good habit to get into, to avoid this, is to always use lowercase filenames no matter what. How you do it is up to you, but lowercase filenames have always worked for me. The other thing you need to be very aware of is the difference in the file system.
Windows uses a system of drive letters and separate disks, whereas most POSIX-based systems (POSIX being the standard that Linux and Unix variants broadly follow) treat all file systems and disks as one large directory tree. If you’re only working in one folder this should not present a problem, but if you’re working with globs and other file system functions, you need to be aware of this.
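As a rough illustration of keeping file system code portable, the sketch below builds paths with PHP’s DIRECTORY_SEPARATOR constant and lists files with glob(). The helper names joinPath and listMatching are made up for this example and are not part of PHP or of the original article; the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the original article): joins path segments
// using the directory separator of the operating system PHP is running on.
function joinPath(string ...$parts): string
{
    return implode(DIRECTORY_SEPARATOR, $parts);
}

// Hypothetical helper: returns the base names of the entries in $dir that
// match a glob pattern such as "*.php". glob() may return false on error,
// which is treated here as "no matches".
function listMatching(string $dir, string $pattern): array
{
    $matches = glob(joinPath($dir, $pattern)) ?: [];
    return array_map('basename', $matches);
}

// List the PHP scripts that sit next to this one.
foreach (listMatching(__DIR__, '*.php') as $name) {
    echo $name, PHP_EOL;
}
```

Because the pattern is matched by the underlying file system rules, "*.php" will not find a file named Test.PHP on Linux, which is one more reason to stick to lowercase filenames.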
The third thing is binary versus text mode when writing files. Under Linux/Unix, all files are handled identically, as a byte stream, whereas under Windows the situation is a little different: Windows treats text files as a separate type from binary files. Thankfully, in the latest builds of PHP this is no longer an issue to be concerned with, as fopen calls now default to binary mode.
The last thing to be aware of is line endings. Under Windows these take the form \r\n as opposed to just \n on Linux/Unix.
So to sum up, watch out for position in the file system, check your filenames and be cautious of line endings.
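One way to take the sting out of the line ending difference is to strip or normalize line endings as soon as text is read. The following is a minimal sketch, assuming PHP 7 or later; the function names stripLineEnding and normalizeLineEndings are invented for this example and do not come from PHP or from the original article.

```php
<?php
// Hypothetical helper (not from the original article): removes any trailing
// \r and \n characters from a line, so Windows (\r\n) and Linux/Unix (\n)
// input end up looking the same.
function stripLineEnding(string $line): string
{
    return rtrim($line, "\r\n");
}

// Hypothetical helper: converts every \r\n pair and every lone \r in a
// block of text to a single \n.
function normalizeLineEndings(string $text): string
{
    // The search strings are applied in order, so \r\n pairs are handled
    // before any remaining lone \r characters.
    return str_replace(["\r\n", "\r"], "\n", $text);
}

// Example: the same word read from a Windows and a Unix text file.
var_dump(stripLineEnding("hello\r\n") === stripLineEnding("hello\n"));
```

Calling stripLineEnding on every line before comparing or splitting it means the rest of a script does not need to care which platform produced the text.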
Why are these differences important in a Shell environment?
First and foremost, the case situation will manifest itself when running scripts. A script named ‘MyScript.php’ will happily run from the command prompt using ‘PHP -q myscript.php’ on Windows, but will fail miserably on Linux, where the name has to match exactly. File system position will show up when using glob functions and other directory scanning functions. Lastly, the line ending issue is most likely to surprise you if you’re parsing files, text streams or the output from commands. From this point on, I’ll not mention these points again, but if something is not working as you expect, then keep them in mind.
So what neat things can we do with PHP in the shell?
Well, the answer to that is pretty much anything. I use PHP for a huge number of different tasks, from dumping files to chucking together quick test models or proof of concept ideas. Over the years, however, one of the great things I’ve found with PHP is its ability to hack together quick pipe filters.
For those of you who are not up to speed on pipe filters, a quick explanation is in order. A pipe (the | character on the command line) connects the output of one program to the input of the next, and a pipe filter is a program that reads its input, transforms it and writes the result to its output. When the predecessors of today’s modern operating systems were developed back in the 1960s, the idea of one tool, many uses was a very common one. Shell programming spawned a whole generation of people who spent insane amounts of time writing ridiculously long command lines to perform some quite involved tasks. It’s for this very reason that today most operating systems (Windows included) have a very rich set of separate programs that each do a small task efficiently.
Consider this example (running under Linux):

<code>ls -al | awk '{print $8,$5,$6}'</code>

This will take a standard long Linux file listing and re-order it so you get the file name, the number of bytes and the date of last modification (the exact column numbers depend on the date format your ls uses):

test 4096 2006-11-20
test1.php 0 2007-08-05
test.txt.gz 480 2006-04-09
tomcat 4096 2007-02-27
ukcode.php?number=07971899759 455 2006-12-04
unser.php 205 2007-01-14
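
For comparison, the awk step in that command line could be sketched as a small PHP pipe filter. This is only an illustration under a few assumptions: the function names pickColumns and filterStream are invented here, the script file name columns.php used below is hypothetical, and the code needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not from the original article): picks the given
// 1-based columns out of a whitespace-separated line, in the order given,
// much as awk's {print $8,$5,$6} does. Missing columns become empty strings.
function pickColumns(string $line, array $columns): string
{
    // Split on runs of whitespace; this also discards a trailing \n or \r\n.
    $fields = preg_split('/\s+/', $line, -1, PREG_SPLIT_NO_EMPTY);

    $picked = [];
    foreach ($columns as $n) {
        $picked[] = $fields[$n - 1] ?? '';
    }
    return implode(' ', $picked);
}

// Reads lines from the stream $in until end of input and writes the
// reordered columns to the stream $out, one line per input line.
function filterStream($in, $out, array $columns): void
{
    while (($line = fgets($in)) !== false) {
        fwrite($out, pickColumns($line, $columns) . PHP_EOL);
    }
}

// When run from the command line, act as a filter on standard input.
if (PHP_SAPI === 'cli') {
    filterStream(STDIN, STDOUT, [8, 5, 6]);
}
```

Saved as columns.php, it would be used as ‘ls -al | php -q columns.php’. Like the awk version, it splits on whitespace, so file names that contain spaces will be cut short.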

If you install the Gawk package from the win32 GNU utilities page at http://gnuwin32.sourceforge.net/ then you can also do the following:

<code>dir | gawk "{print $4,$3,$1}"</code>

Which will give:

netset.txt 1,037 29/11/2007
ntent_a.xml 6,440 16/09/2008
ntent_ie.xml 1,654 16/09/2008
ntent_m.xml 5,862 16/09/2008
ntent_y.xml 5,816 16/09/2008
ntuser.dat 15,728,640 29/01/2009
persistent_state 16 04/08/2008
phone 47 22/05/2008


Download: shell.zip