March 17, 2007

word wrap for Firefox bookmarklet

Firefox displays each paragraph in a plain text file (txt format) as a single line, so a long paragraph runs off the right side of the window. That doesn't seem optimally useful.

I asked for help on it at MozillaZine forums and got the advice to right click on the page and click View Page Source, which has a word wrap option. That works, but it seems unnecessarily complicated. I wanted the text displayed in a better format in the same tab it was opened in, in one click or less.

At that forum, I also found a JavaScript bookmarklet that was recommended for dealing with a similar problem, text without spaces that needs to have potential breaks inserted so it can be word wrapped:

javascript:(function() { var D = document; F(D.body); function F(n) { var u, r, c, x; if (n.nodeType == 3) { u = n.data.search(/\S{45}/); if (u >= 0) { r = n.splitText(u + 45); n.parentNode.insertBefore(D.createElement('wbr'), r); } } else if ((n.tagName != 'STYLE') && (n.tagName != 'SCRIPT')) { for (c = 0; x = n.childNodes[c]; ++c) { F(x); } } } D.body.innerHTML += ' '; })();
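As far as I can tell, the bookmarklet above walks the text nodes of the page, looks for runs of 45 non-space characters, and inserts a wbr element (a break opportunity) after each run. Here is a minimal string-based sketch of the same idea, not the bookmarklet itself. The function name insertBreakHints and its parameters are made up for illustration, and unlike the DOM version, this sketch skips the hint when nothing but whitespace or the end of the text follows the run.

```javascript
// Sketch only: works on a plain string instead of DOM text nodes.
// insertBreakHints, limit and hint are illustrative names, not part of the original bookmarklet.
function insertBreakHints(text, limit = 45, hint = '<wbr>') {
  // Match `limit` non-space characters that are followed by yet another non-space character.
  const run = new RegExp('\\S{' + limit + '}(?=\\S)', 'g');
  // Append a break hint after each full run so the browser may wrap there.
  return text.replace(run, match => match + hint);
}

// Example: a 100-character "word" gets a hint after character 45 and after character 90.
const sample = insertBreakHints('a'.repeat(100));
console.log(sample.split('<wbr>').map(piece => piece.length));
```

Ordinary text with normal-length words passes through unchanged, since no run of 45 non-space characters is ever found.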

The raw code intrigued me. I already knew about bookmarklets but I'd never written a new program in JavaScript. I thought I could figure out what was going on and write my own program to fix it anyway. I ended up spending all night learning a little JavaScript by experimenting and reading some definitions of some elements of JavaScript at DevGuru and came up with this:

txt to html v6

javascript:{var x, c; x = document.body.innerHTML; if (x.substr(0, 5) + x.substr(x.length - 6, 6) == '<pre></pre>') {document.write('<HTML><BODY>\n'); x = x.substr(5, x.length - 11); var textline = x.split('\n'); for (c = 0; c < textline.length; c++) {document.write(textline[c] + '<br>\n')}; document.write('</BODY></HTML>'); document.close();}}

Just copy and paste the above line of code into the "Location:" field of a bookmark "Properties" window, open a txt file, click on the bookmarklet and watch it do its magic.

The only problem I've found using it is that when it's used on ordinary web pages, it strips most of the formatting, and when used extra times, it adds extra line breaks. That is, it's not foolproof. But you can just hit the "back" button to fix any problem.
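For anyone who wants to follow the string handling in "txt to html" without a browser, here is a rough sketch of the same steps as a plain function: check that the body is a single pre block, strip the pre tags, split on newlines, and put a br after each line. The name preToHtml is made up for illustration, and returning null for pages that don't match is my assumption about a sensible way to signal "do nothing"; the bookmarklet itself writes straight into the document instead of returning a string.

```javascript
// Sketch only: mirrors the string steps of the "txt to html" bookmarklet.
// preToHtml is an illustrative name; the real bookmarklet uses document.write.
function preToHtml(bodyHtml) {
  const open = '<pre>';
  const close = '</pre>';
  // Only handle pages whose body is exactly one <pre>...</pre> block.
  if (!bodyHtml.startsWith(open) || !bodyHtml.endsWith(close)) {
    return null; // assumption: signal "not a plain text page" with null
  }
  // Strip the wrapping tags, leaving the raw text.
  const text = bodyHtml.slice(open.length, bodyHtml.length - close.length);
  // One <br> per original line, the same as the loop in the bookmarklet.
  const lines = text.split('\n').map(line => line + '<br>\n');
  return '<HTML><BODY>\n' + lines.join('') + '</BODY></HTML>';
}

console.log(preToHtml('<pre>first line\nsecond line</pre>'));
```

The sketch only looks at the string it is given, so it says nothing about how Firefox actually represents a given page in document.body.innerHTML.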

Regarding this as an example of the increasing complexity of civilization: I find a lot of humor in it, how complicated it gets to deal with the simplest formatting issue within the complexity of modern communications. If it keeps going this way, pretty soon many common formats will be almost useless, like scrambled cable, when not viewed on matched devices and software, and too tricky for anyone to crack, not because they will be designed for copy-protection, but inadvertently because of rampant complexification.
___________________
Updated May 16, 2012: More efficient word wrap bookmarklets were provided by Mardeg at the Bugzilla@Mozilla forum in 2009. Use either of these if you just want to temporarily switch an unformatted page displayed in Firefox to a word wrapped view, without changing the font or other page data:

shorter "Wrap" bookmarklet
javascript:void(document.getElementsByTagName('pre')[0].style.whiteSpace='pre-wrap')

even shorter "Wrap" bookmarklet
javascript:void(document.body.firstChild.style.whiteSpace='pre-wrap')

Mardeg also provided an undo which works for either of the two bookmarklets above, since the back button doesn't undo them:

"refresh" bookmarklet
javascript:history.go(0)
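An alternative to reloading would be a single bookmarklet that toggles the wrap on and off. Here is a small sketch of that idea; it is not something Mardeg provided. The name toggleWrap is made up, and it assumes that clearing the inline whiteSpace style is enough to get the page's original behavior back.

```javascript
// Sketch only: flips the inline white-space style between 'pre-wrap' and "unset".
// toggleWrap is an illustrative name; el is any object with a style.whiteSpace property.
function toggleWrap(el) {
  const wrapped = el.style.whiteSpace === 'pre-wrap';
  // An empty string removes the inline override, so the default styling applies again.
  el.style.whiteSpace = wrapped ? '' : 'pre-wrap';
  return el.style.whiteSpace;
}

// Stand-in for a <pre> element so the sketch can run outside a browser.
const fakePre = { style: { whiteSpace: '' } };
console.log(toggleWrap(fakePre)); // first click turns wrapping on
console.log(toggleWrap(fakePre)); // second click turns it off again
```

In a browser, the element to pass in would be document.getElementsByTagName('pre')[0], as in the shorter "Wrap" bookmarklet above. I haven't tested this as an actual bookmarklet.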

By contrast, the "txt to html" bookmarklet I wrote mungs an unformatted page, such as a txt file of plain text, into a quirky minimalist html page, which is then displayed word wrapped by default in your browser's default html font, and which you may then save as an html file. The "txt to html" bookmarklet doesn't seem to require a "refresh" bookmarklet, because Firefox seems to treat rendering the newly munged page as navigating to a new page, so the back button works as undo. I use the word "mung" because, rather than converting the page data in a standard way, it's a work-around that strips the page data using the ill-advised "=document.body.innerHTML" trick and crude string manipulation. I say "quirky minimalist" because, rather than following the w3c standards for complete html formatting, I merely used the minimum of tags that seemed to work for the purpose.

So if you use the "txt to html" bookmarklet and save a page you're viewing that way, be advised that the saved page will contain these additions: the tags <html> <head> <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"> [or some such according to whatever format and character encoding Firefox sets as the default] </head> <body>; a <br> for every blank line or end of line in the original; added line breaks that divide resulting lines over 132 columns into lines of at most 72 columns [only visible in the html source, but done by Firefox when saving as html, maybe for backward compatibility with older text editors]; and the close tags </body> </html>.

A particular advantage of the newer "word wrap" bookmarklets is that if you also need to apply the "word wrap for text without spaces" bookmarklet in order to word wrap really long sequences of characters without spaces, you can apply both in either order, though you won't see the reallylongwords wrap until you click both. With "txt to html" you have to apply the "word wrap for text without spaces" bookmarklet after converting to html, because the "word wrap for text without spaces" bookmarklet that this all started with does a sort of html formatting when used on an unformatted page, which makes it no longer a plain text page that "txt to html" can reformat. (An extra element that makes this subject confusing is that Firefox has various levels of telling what html is actually in the source of the page and what html is taken as read or is imputed to the page because of JavaScript. The "View Page Source" option seems to show fairly accurately the source that the page was generated from, as of the start of a navigation step or JavaScript action that can only be undone by the back button. The "Inspect Element" option followed by the "HTML" button seems to give something closer to the html that's currently taken as read. Neither one gives you exactly what you'll get if you save the page.)