Vim tips and tricks: HTML table re-ordering

Suppose you have an HTML table on a static website, written by hand rather than generated dynamically. It shows a grid of thumbnail photos, one per table cell. The code looks something like the following:

<table>
 <tr>
 <td><a href="photo/photo01.html"><img src="photo/photo01.jpg" alt="" /></a></td>
 <td><a href="photo/photo02.html"><img src="photo/photo02.jpg" alt="" /></a></td>
 <td><a href="photo/photo03.html"><img src="photo/photo03.jpg" alt="" /></a></td>
 <td><a href="photo/photo04.html"><img src="photo/photo04.jpg" alt="" /></a></td>
 <td><a href="photo/photo05.html"><img src="photo/photo05.jpg" alt="" /></a></td>
 </tr>
 <tr>
 <td><a href="photo/photo06.html"><img src="photo/photo06.jpg" alt="" /></a></td>
 <td><a href="photo/photo07.html"><img src="photo/photo07.jpg" alt="" /></a></td>
 <td><a href="photo/photo08.html"><img src="photo/photo08.jpg" alt="" /></a></td>
 <td><a href="photo/photo09.html"><img src="photo/photo09.jpg" alt="" /></a></td>
 <td><a href="photo/photo10.html"><img src="photo/photo10.jpg" alt="" /></a></td>
 </tr>
 <!-- etc... -->
</table>

Yes, ouch!

There are hundreds of photos, spread over several pages in several HTML files, and you want to put them all on one page. You also want to change the number of columns in the thumbnail grid. How do you do this as efficiently as possible, without converting the page to dynamically generated code? Read on to see how, using grep and vim!

grep

What you want to do first is collect all the lines with <td> tags that hold photos, and put them in one file. In this case, the HTML files are conveniently named photo_xxx.html, where xxx is a zero-padded page number such as 001, 002, 003.

$ grep -h "href=\"photo/" photo_* > list.html

This collects every line containing href="photo/, the pattern common to all the photo cells, and writes them to the file list.html. The -h option suppresses the name of the file in which each match was found; you don’t need that in list.html. Because the shell expands photo_* in sorted order, the zero-padded page numbers keep the photos in sequence. Now you have a file with hundreds of <td> tags, each neatly on its own line.
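Before moving on, it can be worth checking that the collection step really produced one cell per line. The sketch below rebuilds a miniature version of the setup in a scratch directory. The two page files and their contents are made up for illustration; only the grep command is taken from above.

```shell
#!/bin/sh
# Scratch directory with two hypothetical pages (stand-ins for photo_001.html, ...).
dir=$(mktemp -d)
cd "$dir" || exit 1

cat > photo_001.html <<'EOF'
<table>
 <tr>
 <td><a href="photo/photo01.html"><img src="photo/photo01.jpg" alt="" /></a></td>
 <td><a href="photo/photo02.html"><img src="photo/photo02.jpg" alt="" /></a></td>
 </tr>
</table>
EOF

cat > photo_002.html <<'EOF'
<table>
 <tr>
 <td><a href="photo/photo03.html"><img src="photo/photo03.jpg" alt="" /></a></td>
 </tr>
</table>
EOF

# Same command as above: -h drops the "filename:" prefix from each match.
grep -h "href=\"photo/" photo_* > list.html

# Sanity checks: total number of lines, and lines that are NOT a single <td>...</td> cell.
total=$(grep -c "" list.html)
bad=$(grep -vc '^ *<td>.*</td> *$' list.html)
echo "cells collected: $total, malformed lines: $bad"
```

If the second number is not zero, some cells probably span more than one line in the source pages, and those need fixing by hand before the vim step.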

vim

The next challenge is to wrap every four <td> tags in one <tr> row, so that it looks like:

<tr>
 <td><a href="photo/photo01.html"><img src="photo/photo01.jpg" alt="" /></a></td>
 <td><a href="photo/photo02.html"><img src="photo/photo02.jpg" alt="" /></a></td>
 <td><a href="photo/photo03.html"><img src="photo/photo03.jpg" alt="" /></a></td>
 <td><a href="photo/photo04.html"><img src="photo/photo04.jpg" alt="" /></a></td>
</tr>
<tr>
 <td><a href="photo/photo05.html"><img src="photo/photo05.jpg" alt="" /></a></td>
 <td><a href="photo/photo06.html"><img src="photo/photo06.jpg" alt="" /></a></td>
 <td><a href="photo/photo07.html"><img src="photo/photo07.jpg" alt="" /></a></td>
 <td><a href="photo/photo08.html"><img src="photo/photo08.jpg" alt="" /></a></td>
</tr>
<tr>
 <td><a href="photo/photo09.html"><img src="photo/photo09.jpg" alt="" /></a></td>
 <td><a href="photo/photo10.html"><img src="photo/photo10.jpg" alt="" /></a></td>
 <!-- etc... -->

We start by inserting the two lines </tr> and <tr> after every fourth line. There is a neat way to do this in vim, courtesy of Stack Overflow. Open list.html in vim and run the following:

:%s/.*\n.*\n.*\n.*\n/\0<\/tr>\r<tr>\r/g

The “line” pattern .*\n appears four times, so it matches four consecutive lines. The replacement puts those four lines back unchanged, which is what \0 (the whole match) is for, followed by the tag that ends a row and the tag that starts the next one; each \r inserts a line break. The % range applies the substitution to the whole file. After that, it is a matter of adding the first <tr> at the top and the last </tr> at the bottom, after padding the last row with <td>&nbsp;</td> cells if it is incomplete. If the number of photos is an exact multiple of four, the file instead ends with a surplus <tr>, which you simply delete. Your table template is then ready to be copied into the table on the page you wanted to update.
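The manual finishing steps (opening <tr>, closing </tr>, padding the last row) can also be scripted. The following is a rough sketch using awk rather than vim, so it is an alternative to the substitution above and not a description of it. The variable name cols, the output file name table.html, and the six-line sample list.html are arbitrary choices for this example.

```shell
#!/bin/sh
# Work in a scratch directory; list.html here is a 6-line stand-in for the real file.
dir=$(mktemp -d)
cd "$dir" || exit 1

i=1
while [ "$i" -le 6 ]; do
    printf ' <td><a href="photo/photo%02d.html"><img src="photo/photo%02d.jpg" alt="" /></a></td>\n' "$i" "$i"
    i=$((i + 1))
done > list.html

# Group the cells into rows of "cols" cells each (cols is a name chosen for this sketch).
awk -v cols=4 '
    (NR - 1) % cols == 0 { print "<tr>" }      # first cell of a row: open the row
    { print }                                   # the cell itself, unchanged
    NR % cols == 0       { print "</tr>" }     # last cell of a row: close the row
    END {
        rest = NR % cols
        if (rest != 0) {                        # incomplete final row: pad and close it
            for (n = rest; n < cols; n++) print " <td>&nbsp;</td>"
            print "</tr>"
        }
    }
' list.html > table.html

cat table.html
```

With six cells and four columns, this should give one full row followed by a second row holding two photos and two padding cells. Changing cols is all it takes to try a different grid width.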

What are your thoughts?