Linux command line tricks


It’s been a little while since I posted something ultra geeky and ultra useful, so after being spurred on by a post that I answered earlier today on the LinkedIn Open Source group, I thought I’d do a little snippet on some useful tricks and impart some Linux/Unix goodness to you all.

A word of warning: what I’m about to show you is übergeek command line and I.T. ninja stuff, so if you’re not comfortable typing in strange strings of commands, and prefer the cuddly warmth of a gooey clicky rodent, then this is probably not for you. On the other hand, if you’re curious and want a little taste of what you can do at a CLI or in a console window, then please do continue reading.

The Linux command line is not really something to be scared of, and for a long time it was the only way to issue commands, not only to *nix but to Windows-based machines as well. Granted, it’s not for the faint-hearted, and it does take a little bit of serious learning (and a good memory), but once you start mastering your way round, it’s not as scary as it would first appear, and you’ll very likely find that a lot of things can be done a whole lot quicker than with a GUI.

Command lines are built round one simple philosophy: have lots of little tools that each do one little thing, but do it very well. Take for example the ‘ls’ command.

ls is used to list files. Typing ls on its own and pressing return will likely result in something that looks like this:


root@poweredge:/wwwroot/relativememories.digital-solutions.local/htdocs# ls
images  index.html  js  phpinfo.php  styles

As you can see, all the files in the current folder are listed on one single line (or multiple lines, depending on the number of files and the terminal size). However, with one simple change, we can get a whole load more information.

try:


ls -al

and press return


root@poweredge:/wwwroot/relativememories.digital-solutions.local/htdocs# ls -al
total 32
drwxrwxr-x 5 www-data www-data 4096 2009-08-03 17:26 .
drwxrwxr-x 5 www-data www-data 4096 2006-05-12 19:27 ..
drwxrwxr-x 2 www-data www-data 4096 2009-08-03 02:07 images
-rw-r--r-- 1 root     root     4949 2009-08-03 02:30 index.html
drwxrwxr-x 2 www-data www-data 4096 2009-08-02 15:55 js
-rw-rw-r-- 1 www-data www-data   26 2006-05-12 17:42 phpinfo.php
drwxrwxr-x 2 www-data www-data 4096 2009-08-02 15:52 styles

As you can see, we have a huge amount more information: permissions, dates, times, sizes, owner names and so on.

Now, like me, I’m sure a lot of you have over the years had to report to a manager or project leader of some sort, and we all know that these people just love reports, right? How about if I showed you a great way to turn that file listing into an XML file? Would that be kewl?

For this we need another command line tool, called ‘awk’. Like ‘ls’ it’s pretty standard, so you should find that it’s already there. Type awk and press return, and you should get a whole screen of gibberish (it’s actually just awk’s usage help).

Now, what we’re going to do here is join 2 commands, ‘ls’ and ‘awk’, together using something called a pipe.

The pipe character is that | symbol that’s usually either next to your enter key or in the lower left near the shift key; on many keyboards it’s drawn as a broken vertical bar, a bit like an elongated colon. This is your magic pipe character that joins commands together. To use it you simply type it as you would a normal character, between two commands, and the output from the first command will be magically transferred to the input of the next.
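If you’d like to see a pipe in action before we go any further, here’s a minimal sketch. The three file names are just made-up sample text fed in by printf, and ‘wc -l’ (which counts lines) is standing in as the second command; neither is part of the report we’re building.

```shell
# printf writes three lines of sample text; the pipe hands them to wc,
# which counts the lines arriving on its standard input.
count=$(printf 'images\nindex.html\njs\n' | wc -l)

# $((...)) strips any padding some versions of wc put around the number.
echo "lines counted: $((count))"
```

Run as written, this should print ‘lines counted: 3’. wc never sees a file at all, only what printf sent down the pipe.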

OK, so now let’s turn our attention to ‘awk’. awk is a text processing language and its speciality is dealing with tabular data. If we look at the output from ‘ls -al’ we can easily see that the data is tabular. Try typing the following:

ls -al | awk '{print $8}'

If things worked as expected, then you should have a list of files, one file entry per line, something like this:


.
..
images
index.html
js
phpinfo.php
styles

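If you examine the output from ‘ls -al’ you’ll see it produces 8 columns of text, and that column number 8 ($8 in awk speak) is the file name. $1 is the permissions, $2 is the number of hard links, $3 is the owner, $4 is the group, $5 is the size in bytes, $6 is the date and $7 is the time. Strictly speaking, ls shows when the file was last modified rather than when it was created, but I’ll keep calling them the creation date and time here. (The column count depends on how your ls prints dates; if yours shows something like ‘Aug  3 17:26’ instead, the file name will be in $9.) Now our manager wants a list of files and their owners, and the date and time of creation.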
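To check which column is which without worrying about what’s in your own folder, here’s a small sketch that feeds awk one line copied from the listing above. The variable names (line, summary) are just my own labels, and it assumes the same 8-column layout as my listing; GNU ls has a ‘--time-style=long-iso’ option that should give you that layout if yours prints dates differently.

```shell
# One sample line in the same layout as the listing above (not live output).
line='-rw-r--r-- 1 root     root     4949 2009-08-03 02:30 index.html'

# awk splits on runs of spaces, so the padding between columns
# does not create any empty fields.
summary=$(printf '%s\n' "$line" | awk '{print "name=" $8, "owner=" $3, "bytes=" $5}')
echo "$summary"
```

This should print ‘name=index.html owner=root bytes=4949’. The commas inside the print statement are what put the single spaces between the three parts.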

awk script blocks can begin with a pattern, so that only lines matching that pattern are processed. There are 2 special patterns, BEGIN and END; a block beginning with one of these is run only once, before the first line of input and after the last line respectively. Try typing and executing the following:


ls -al | awk 'BEGIN{print "<filereport>"} {print $8} END{print "</filereport>"}'

You should get something like this:


<filereport>
.
..
images
index.html
js
phpinfo.php
styles
</filereport>
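The example above only uses BEGIN and END, so here’s a rough sketch of an ordinary pattern in front of a block as well. It’s not part of the manager’s report; the listing variable is sample text in the same layout as my folder, and the pattern /^d/ simply picks out lines that start with a ‘d’ (the directories).

```shell
# Sample listing text in the same layout as above (not live output).
listing='drwxrwxr-x 2 www-data www-data 4096 2009-08-03 02:07 images
-rw-r--r-- 1 root     root     4949 2009-08-03 02:30 index.html
drwxrwxr-x 2 www-data www-data 4096 2009-08-02 15:55 js'

# BEGIN runs once before any input, the /^d/ block runs only for lines
# starting with "d", and END runs once after the last line.
report=$(printf '%s\n' "$listing" | awk '
  BEGIN { print "directories:" }
  /^d/  { print "  " $8; count++ }
  END   { print "total: " count }
')
echo "$report"
```

With this sample text, only images and js get printed between the heading and the ‘total: 2’ line; index.html never reaches the middle block.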

Now let’s play with that middle print statement a little. Try typing and executing the following:

ls -al | awk 'BEGIN{print "<filereport>"} {print "  <fileentry>\n    <name>"$8"</name>\n  </fileentry>"} END{print "</filereport>"}'

This should produce:


<filereport>
  <fileentry>
    <name></name>
  </fileentry>
  <fileentry>
    <name>.</name>
  </fileentry>
  <fileentry>
    <name>..</name>
  </fileentry>
  <fileentry>
    <name>images</name>
  </fileentry>
  <fileentry>
    <name>index.html</name>
  </fileentry>
  <fileentry>
    <name>js</name>
  </fileentry>
  <fileentry>
    <name>phpinfo.php</name>
  </fileentry>
  <fileentry>
    <name>styles</name>
  </fileentry>
</filereport>

Which is halfway to what we want. Clever, eh 🙂

Now let’s continue to expand the middle print statement to include the other information our manager wants:


ls -al | awk 'BEGIN{print "<filereport>"} {print "  <fileentry>\n    <name>"$8"</name>\n    <owner>"$3"</owner>\n    <createdate>"$6"</createdate>\n    <createtime>"$7"</createtime>\n  </fileentry>"} END{print "</filereport>"}'

This should produce:


<filereport>
  <fileentry>
    <name></name>
    <owner></owner>
    <createdate></createdate>
    <createtime></createtime>
  </fileentry>
  <fileentry>
    <name>.</name>
    <owner>www-data</owner>
    <createdate>2009-08-03</createdate>
    <createtime>17:26</createtime>
  </fileentry>
  <fileentry>
    <name>..</name>
    <owner>www-data</owner>
    <createdate>2006-05-12</createdate>
    <createtime>19:27</createtime>
  </fileentry>
  <fileentry>
    <name>images</name>
    <owner>www-data</owner>
    <createdate>2009-08-03</createdate>
    <createtime>02:07</createtime>
  </fileentry>
  <fileentry>
    <name>index.html</name>
    <owner>root</owner>
    <createdate>2009-08-03</createdate>
    <createtime>02:30</createtime>
  </fileentry>
  <fileentry>
    <name>js</name>
    <owner>www-data</owner>
    <createdate>2009-08-02</createdate>
    <createtime>15:55</createtime>
  </fileentry>
  <fileentry>
    <name>phpinfo.php</name>
    <owner>www-data</owner>
    <createdate>2006-05-12</createdate>
    <createtime>17:42</createtime>
  </fileentry>
  <fileentry>
    <name>styles</name>
    <owner>www-data</owner>
    <createdate>2009-08-02</createdate>
    <createtime>15:52</createtime>
  </fileentry>
</filereport>

Only one more thing is needed, and that’s to filter out the lines we don’t want (the empty entry at the top comes from the ‘total 32’ line, which has no 8th column). We can do that by prefixing the middle print statement with an if clause:


ls -al | awk 'BEGIN{print "<filereport>"} {if (($8 != "..") && ($8 != ".") && ($8 != "")) print "  <fileentry>\n    <name>"$8"</name>\n    <owner>"$3"</owner>\n    <createdate>"$6"</createdate>\n    <createtime>"$7"</createtime>\n  </fileentry>"} END{print "</filereport>"}'

Remember $8 is the file name, so what we are saying here is: if the file name is not equal to ".." and it is not equal to "." and it is not an empty string "", then and only then do we process the line. The final result is:


<filereport>
  <fileentry>
    <name>images</name>
    <owner>www-data</owner>
    <createdate>2009-08-03</createdate>
    <createtime>02:07</createtime>
  </fileentry>
  <fileentry>
    <name>index.html</name>
    <owner>root</owner>
    <createdate>2009-08-03</createdate>
    <createtime>02:30</createtime>
  </fileentry>
  <fileentry>
    <name>js</name>
    <owner>www-data</owner>
    <createdate>2009-08-02</createdate>
    <createtime>15:55</createtime>
  </fileentry>
  <fileentry>
    <name>phpinfo.php</name>
    <owner>www-data</owner>
    <createdate>2006-05-12</createdate>
    <createtime>17:42</createtime>
  </fileentry>
  <fileentry>
    <name>styles</name>
    <owner>www-data</owner>
    <createdate>2009-08-02</createdate>
    <createtime>15:52</createtime>
  </fileentry>
</filereport>
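Since awk blocks can start with a pattern, the same filter can also be written without the if clause. What follows is only a sketch of that alternative, not a replacement for the one-liner: the listing variable is sample text rather than live ls output, I’ve left out the date and time tags to keep it short, and the awk program is spread over several lines purely for readability.

```shell
# Sample listing text in the same layout as above (not live output).
listing='total 32
drwxrwxr-x 5 www-data www-data 4096 2009-08-03 17:26 .
drwxrwxr-x 5 www-data www-data 4096 2006-05-12 19:27 ..
-rw-r--r-- 1 root     root     4949 2009-08-03 02:30 index.html'

xml=$(printf '%s\n' "$listing" | awk '
  BEGIN { print "<filereport>" }
  # The condition in front of the block does the job of the if clause.
  $8 != "" && $8 != "." && $8 != ".." {
    print "  <fileentry>"
    print "    <name>" $8 "</name>"
    print "    <owner>" $3 "</owner>"
    print "  </fileentry>"
  }
  END { print "</filereport>" }
')
echo "$xml"
```

With this sample text only index.html makes it into the report. To keep the result, you could redirect the output of the pipeline into a file with > followed by a file name of your choosing.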

awk can do a huge amount more, but we’re just going to briefly cover a couple of other tools before I wrap this one up.

A close cousin to awk is sed. In fact, you will often see sed and awk used hand in hand in the same scripts. sed stands for ‘Stream Editor’ and, for want of a better description, it’s a text editor, but rather than giving you a full screen to move around with the arrow keys doing replaces and other such stuff, it relies on regular expression commands to do its work.
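To give a feel for what ‘regular expression commands’ means in practice, here’s a tiny sketch on a made-up line of text. The expression [0-9][0-9]* matches any run of one or more digits, and the text variable and the letter N are just my own choices for the example.

```shell
# A made-up line of text containing two numbers.
text='index.html 4949 bytes, phpinfo.php 26 bytes'

# s/[0-9][0-9]*/N/g replaces every run of digits with the letter N.
masked=$(printf '%s\n' "$text" | sed 's/[0-9][0-9]*/N/g')
echo "$masked"
```

This should print ‘index.html N bytes, phpinfo.php N bytes’. The text is edited as it streams past, with no file being opened or saved.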

In its simplest form, it can be used to change one character (or a sequence of characters) to another, on the fly, as follows:

Let’s suppose we wanted to translate our output from ls -al into a CSV file rather than an XML file. We could write an awk script that accomplished that quite easily, or we could just use sed to replace all the spaces with a comma:


ls -al | sed 's/ /,/g'

The ‘s’ in front of the first / means substitute, and the ‘g’ after the last / means global, so what this sed command says is: substitute every occurrence of a space (/ /) in the stream with a comma (/,/), and do it for every match on each line rather than just the first one. (There is a space between the first two / chars, trust me.) The output we get should look something like the following:


total,32
drwxrwxr-x,5,www-data,www-data,4096,2009-08-03,17:26,.
drwxrwxr-x,5,www-data,www-data,4096,2006-05-12,19:27,..
drwxrwxr-x,2,www-data,www-data,4096,2009-08-03,02:07,images
-rw-r--r--,1,root,,,,,root,,,,,4949,2009-08-03,02:30,index.html
drwxrwxr-x,2,www-data,www-data,4096,2009-08-02,15:55,js
-rw-rw-r--,1,www-data,www-data,,,26,2006-05-12,17:42,phpinfo.php
drwxrwxr-x,2,www-data,www-data,4096,2009-08-02,15:52,styles

You’ll notice, if you look at the line for phpinfo.php, that there are 3 commas in a row. This is because the output from ls -al is padded with spaces to line the columns up. Fortunately, this is not a problem for sed; remember I said that its commands were based on regular expressions?


ls -al | sed 's/^ *//;s/ *$//;s/ \{1,\}/ /g' | sed 's/ /,/g'

The first sed uses regular expressions to strip any leading and trailing spaces and to collapse each run of spaces down to one space, then the second sed converts each of those single spaces to a single comma. Now if we combine this with part of what we did using awk previously to filter out the lines we don’t want:


ls -al | awk '{if (($8 != "..") && ($8 != ".") && ($8 != "")) print $0}' | sed 's/^ *//;s/ *$//;s/ \{1,\}/ /g' | sed 's/ /,/g'

We end up with:


drwxrwxr-x,2,www-data,www-data,4096,2009-08-03,02:07,images
-rw-r--r--,1,root,root,4949,2009-08-03,02:30,index.html
drwxrwxr-x,2,www-data,www-data,4096,2009-08-02,15:55,js
-rw-rw-r--,1,www-data,www-data,26,2006-05-12,17:42,phpinfo.php
drwxrwxr-x,2,www-data,www-data,4096,2009-08-02,15:52,styles

Which, as you can see, leaves us with only our files and directories in CSV format. Now, if you really wanted to, you could go one step further and use grep or another awk command to keep only the lines not beginning with ‘d’, and that would remove the directories too. Or what about using some of the information that’s stored in the proc file system?


root@poweredge:/wwwroot/relativememories.digital-solutions.local/htdocs# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 3
cpu MHz         : 732.901
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1465.80
clflush size    : 32
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 3
cpu MHz         : 732.901
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1466.02
clflush size    : 32
power management:

Imagine how much fun you could have with the files in there. You could (and many people have) build a complete server reporting system at the shell, or take any piece of info you desire and feed it into a system such as MRTG or Nagios.
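As a small taste of that, here’s a rough sketch that boils the cpuinfo text down to a one-line summary. So that it behaves the same on any machine, it works on a few lines copied from my output above rather than reading /proc/cpuinfo itself, and the variable names (cpuinfo, summary, cpus, model) are just my own.

```shell
# A few lines copied from the sample /proc/cpuinfo output above,
# so the sketch does not depend on the machine it runs on.
cpuinfo='processor       : 0
model name      : Pentium III (Coppermine)
cpu MHz         : 732.901
processor       : 1
model name      : Pentium III (Coppermine)
cpu MHz         : 732.901'

# -F sets the field separator to a colon followed by a space,
# so $2 holds the value part of each line.
summary=$(printf '%s\n' "$cpuinfo" | awk -F': ' '
  /^model name/ { model = $2; cpus++ }
  END           { print cpus " x " model }
')
echo "$summary"
```

With this sample text it should print ‘2 x Pentium III (Coppermine)’. On a real Linux box you could point the same awk program at /proc/cpuinfo instead of the sample text.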

There are many other commands too, such as ‘wc’, ‘cut’ and ‘join’, and then there are the language interpreters such as ‘perl’, ‘python’ and ‘php’ that you can use to write new pipe filters in.
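To show a couple of those in a pipeline, here’s a minimal sketch using grep, cut and wc on three lines of the CSV from earlier. The csv variable is sample text, not live output, and the choice of field 8 assumes the same column layout as my listing.

```shell
# Three sample CSV lines in the same layout as the output earlier on.
csv='drwxrwxr-x,2,www-data,www-data,4096,2009-08-03,02:07,images
-rw-r--r--,1,root,root,4949,2009-08-03,02:30,index.html
-rw-rw-r--,1,www-data,www-data,26,2006-05-12,17:42,phpinfo.php'

# grep -v drops the lines starting with "d" (the directories);
# cut keeps only the 8th comma-separated field, the file name.
names=$(printf '%s\n' "$csv" | grep -v '^d' | cut -d',' -f8)
echo "$names"

# wc -l counts how many names are left.
count=$(printf '%s\n' "$names" | wc -l)
echo "files: $((count))"
```

This should list index.html and phpinfo.php, followed by ‘files: 2’. Each tool does one small job and the pipes do the rest.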

The only limit really is how much you’re willing to put in to learn these tricks and how big your imagination is. Hopefully I’ve given you a good start. If you want a taste of what awk can do, you can find some good material at:

And if you want to explore writing your own pipe filters in php, then I’ve previously written an article on the very subject that’s been published at phpbuilder.com; you can read the article by clicking here

Remember though, if you learn any good tricks using these methods, share the knowledge, or even come back here and add a comment 🙂

Have fun.

Shawty
