Category Archives: internet

Firefox not firing onReadyStateChange with two AJAX requests and a confirm dialogue

There are many discussions like this on the internet, but I am confident I am now examining a very specific problem of Firefox, with a very specific AJAX routine.

Symptoms: there are other AJAX requests in the background, and there is a request initiated by the user, which requires confirmation (your average “are you sure you want to delete?” dialogue). The AJAX request to delete the entry never fired. Every AJAX request was in the default, asynchronous mode.

No matter what, I couldn’t for the life of me make that request work, while everything else did. Only the “delete” functions didn’t (in Firefox, that is; Chrome and Internet Explorer worked no problem).

I used console.log messages as crumbs along the way, to follow the flow, and it stopped right at onReadyStateChange, as anything after it wouldn’t show zip in the console. Which meant the onReadyStateChange wasn’t being triggered.

I even thought it was something off with the function name (maybe the delete inside it borked something, silly me thought).

I also tried switching the incriminated requests to synchronous mode… and welp, they worked! For a short while I thought about leaving them synchronous, but I didn’t want those pesky warnings in the console about synchronous requests ruining the user experience.

Only by chance an epiphany happened, and the obvious came forth (I KNEW there had to be a reasonable explanation which I had yet to grasp).

Well, my configuration went like this (it’s an internal web application we are talking about): an AJAX request fired every second to a PHP script to check the user’s credentials and inactivity time, and an AJAX request launched by user intervention to delete an entry… after a confirm dialogue requested, well, user confirmation for the delete (duh).

Everything worked flawlessly before, the problem started appearing after I added the background AJAX request each second (which I didn’t want to get rid of anyway).

As you may or may not know, the window.confirm dialogue is a synchronous structure, which means the page execution is frozen until the user presses either OK or CANCEL in the dialogue (same thing happens with the window.alert function).

So, picture the background AJAX request queued up behind the confirm dialogue, obviously taking precedence over anything that had to come after it, and the new AJAX request being created right after the press of the OK button of the confirm dialogue, to delete the entry the confirmation was for.

Well, these two requests were launched at the very same time, in such a way that Firefox (not the other browsers, though) probably overlapped the two onReadyStateChange handlers and never triggered the one for the second AJAX request.

In fact getting rid of the confirmation dialogue solved the problem.

Either that, or I will have to come up with an asynchronous method of requesting user confirmation.

Add custom social media share icon buttons in phpBB

Self-reference article: how to add responsive social media share buttons to every page of your forum.

HTML to add to /styles/stylename/template/over_header.html, right under <div id="page-header" class="page-width">

 <div id="socialshare" data-expendable="yes">
 <div>
 <a class="twittersharebutton" href="http://twitter.com/share?url={SOCIAL_URL}" target="_blank">t</a>
 <a class="facebooksharebutton" href="http://www.facebook.com/sharer/sharer.php?u={SOCIAL_URL}" target="_blank">f</a>
 <a class="googlesharebutton" href="https://plus.google.com/share?url={SOCIAL_URL}" target="_blank">g</a>
 </div>
 </div>

CSS to add at the bottom of <head> section of same file:

<style>
 #socialshare {
 z-index:20;
}
a.twittersharebutton, a.facebooksharebutton, a.googlesharebutton {
 color:white;
 text-align:center;
 text-decoration: none;
 font-family:tahoma;
 vertical-align:bottom;
}
a.twittersharebutton {
 background-color:#55acee;
}
a.facebooksharebutton {
 background-color:#3b5998;
}
a.googlesharebutton {
 font-family:georgia;
 background-color:#dd4b39;
 line-height:60% !important;
}
#socialshare a:hover {
 color:white;
 filter:saturate(3);
}
@media (min-width: 1100px) {
 #socialshare {
 position:absolute;
 text-align:center;
 left:-55px;
 }
#socialshare>div {
 width: 50px;
 position:fixed;
 top:10px;
 }
a.twittersharebutton, a.facebooksharebutton, a.googlesharebutton {
 display:block;
 margin-bottom:10px;
 width:50px;
 height:50px;
 font-size:50px;
 line-height:51px;
 border-radius:6px;
 }
 
 #page-header {position:relative;}
}
@media (max-width:1099px) {
 #socialshare {
 bottom:2px;
 left:2px;
 position:fixed;
 }
a.twittersharebutton, a.facebooksharebutton, a.googlesharebutton {
 display:inline-block;
 margin-right:10px;
 width:30px;
 height:30px;
 font-size:30px;
 line-height:31px;
 border-radius:4px;
 }
}
</style>

Code to add in /includes/functions.php, in the template variable assignment (look for LAST_VISIT_DATE)

'SOCIAL_URL' => urlencode(generate_board_url() . '/' . $user->page['page']),
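For context, that line goes inside the existing template variable assignment; here is a minimal sketch of where it lands (assuming phpBB 3.x, where that assignment lives in the page_header() function — the surrounding entries are only placeholders for what is already there):

$template->assign_vars(array(
    // ... existing assignments (LAST_VISIT_DATE and friends) stay as they are ...
    'SOCIAL_URL' => urlencode(generate_board_url() . '/' . $user->page['page']),
));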

Reload, all done!

AFWall+ and Linux Deploy, no internet access unless firewall is disabled

This is a personal reminder and also an easier-to-find heads-up for those looking for a solution: if you installed Linux on Android via Linux Deploy and find that, no matter how you set the rules in AFWall+, the mounted Linux image never gets internet access unless you disable the firewall altogether (not recommended, since you installed a firewall in the first place), then here is the solution provided in this thread (it’s all due to DNS calls being blocked, with no way to let them through in vanilla AFWall+).

In the AFWall+ contextual menu, open the custom script editor and insert these lines:

 

$IPTABLES -A afwall-wifi -m owner --uid-owner root -p udp --sport=67 --dport=68 -j RETURN
$IPTABLES -A afwall-wifi -m owner --uid-owner nobody -p udp --sport=67 --dport=68 -j RETURN
$IPTABLES -A afwall-wifi -m owner --uid-owner root -p udp --sport=53 -j RETURN
$IPTABLES -A afwall-wifi -m owner --uid-owner nobody -p udp --sport=53 -j RETURN
$IPTABLES -A afwall-wifi -m owner --uid-owner root -p tcp --sport=53 -j RETURN
$IPTABLES -A afwall-wifi -m owner --uid-owner nobody -p tcp --sport=53 -j RETURN
$IPTABLES -A afwall-3g -m owner --uid-owner root -p udp --dport=53 -j RETURN
$IPTABLES -A afwall-3g -m owner --uid-owner nobody -p udp --dport=53 -j RETURN
$IPTABLES -A afwall-3g -m owner --uid-owner root -p tcp --dport=53 -j RETURN
$IPTABLES -A afwall-3g -m owner --uid-owner nobody -p tcp --dport=53 -j RETURN

Make sure you preserve the line break after each RETURN, since pasting directly into AFWall+’s tiny textbox may lose the carriage returns.

BAM, you will have internet from your Android Linux without having to disable the firewall. Naturally, you will also have to enable internet access for “Applications running as root”.

Update: as per Peter’s suggestion in the comments (thank you Peter!) if you still get errors with this approach you may need to add a couple more lines, like so:

$IPTABLES -A afwall-wifi-wan -m owner --uid-owner 5000 -j RETURN
$IPTABLES -A afwall-wifi-lan -m owner --uid-owner 5000 -j RETURN

where “5000” is a UID you have to customize to your needs; you can get it either from AFWall+’s error logs, or by checking the /etc/passwd file for the current user’s entry.

Cache PHP to gzipped static html pages using htaccess redirect

No matter the server performance, the fastest kind of website for a visitor is one made of static HTML pages: the server just has to send existing data to the browser instead of starting the PHP interpreter, opening a connection to MySQL, fetching data, formatting the page, and only then sending it. Less CPU, less memory, less processing time, more users served in a smaller amount of time.

Avoiding PHP and MySQL execution altogether is the key, and my project was to create something similar to WPSuperCache to be implemented in a generic PHP website, so my thanks go to said plugin’s developers for the source code that gave me precious tips.

DISCLAIMER: this guide presumes you are “fluent” in PHP and .htaccess coding, and is only meant to give directions as to how to obtain a certain result. This guide will not give a pre-cooked solution to just copy/paste into your website; you need to change/add code to adapt it to your needs. I am not the person who can help you if you need a tailored solution, simply because I have no time to give away free personalized help!

Here’s our logical scheme:

  1. Visitor comes to website and asks for page X;
  2. Webserver checks via .htaccess (avoiding PHP execution) if there is a static cached version of page X already, and in this case either serves the gzipped file (if the client supports it), or falls back to the uncompressed HTML version… at which point our scheme ends; otherwise, if no cached version exists, continue to next step;
  3. PHP builds the page and serves it to the visitor as “fresh”, but at the same time saves the HTML output as both gzipped and uncompressed version so the next visitor will directly download that one.

This is a very bare procedure, which needs to be perfected by the following conditions:

  • Not all pages need to be cached (those that by their very nature change very often, like real-time statistics, a poll, whatever)… in fact, some pages must NOT be cached (those that only moderators can view, both for security reasons and for consistency); PHP must avoid caching those pages, so that .htaccess will always serve the “fresh” PHP page.
  • Pages that do get cached still need to be refreshed, because sometimes they can change (articles or posts that are modified/updated with time, or where comments get posted).
  • Some pages, even if a corresponding cached version exists, must NOT be served from cache, for instance when POST data is submitted (sometimes also with GET; you as the webmaster should know if this is the case), which requires PHP to handle the request.

Define “caching behaviour” in PHP

This section is where you set the variables that the PHP caching routine will check against to know when and how to do its job.

$cachepath=$_SERVER["DOCUMENT_ROOT"]."/cache/";
$cachesuffix="-static";
$nocachelist=array(
    "search"=>1,
    "statistics"=>1,
    "secretcodes"=>1,
    "..."=>1,
 );

And explained:

  • $cachepath is the folder where you want the static (HTML and gzip) files to be stored (the website I use this caching routine on, has all the pages accessible from root folder with rewrite URLs, in other words the only slash in the URL is the one after the domain name, and I have no need to reproduce any folder structure; if you do, you’re on your own);
  • $cachesuffix is a string I need to add to the URL string (for example if address is domainname.com/pagename then in this case the cachefile will be named pagename-static), this is useful if you want to cache the homepage which is not named index.php but is just domainname.com/ because in that case, the cachefile name would be empty and .htaccess won’t find it;
  • $nocachelist is an associative array where you have to add as many keys (pointing to a value of 1) as the pages you don’t want (for whatever reason) to cache; in the key name you have to put the string the user would write in the URL bar after the domain name slash to get to the page, for example if you don’t want to cache domainname.com/statistics you would be using “statistics”=>1 in there, as already is.
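To see how these settings translate into actual file names, here is a rough sketch (the way $url is derived here is my own assumption; the CMS in the article defines it elsewhere):

// Hypothetical sketch: derive the cache key from the request URI.
// For domainname.com/pagename this yields "pagename"; for the homepage it yields "".
$url = trim(parse_url($_SERVER["REQUEST_URI"], PHP_URL_PATH), "/");
$cachefile  = $cachepath . $url . $cachesuffix . ".html";   // e.g. .../cache/pagename-static.html
$cachefilez = $cachefile . ".gz";                           // its gzipped twin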

Have PHP actually save to disk the cached pages

     if (
        !isset($nocachelist[$_GET["page"]]) &&
        !$_SESSION["admin"] &&
        !count($_POST) &&
        !$_SERVER["QUERY_STRING"]
    ) {
        //build the uncompressed cache
        if (!file_exists($cachepath.$url.$cachesuffix.".html")) {
            $cached=str_replace("~creationtime","cached",$html);
            $fp=fopen($cachepath.$url.$cachesuffix.".html","w");
            fwrite($fp,$cached);
            fclose($fp);
        }
        //build the compressed cache
        if (!file_exists($cachepath.$url.$cachesuffix.".html.gz")) {
            $cachedz=str_replace("~creationtime","cached&gzipped",$html);
            $cachedz=gzencode($cachedz,9);
            $cachedzsize=strlen($cachedz);
            $fp=fopen($cachepath.$url.$cachesuffix.".html.gz","w");
            fwrite($fp,$cachedz);
            fclose($fp);
        }
    }

Pretty much self explanatory, isn’t it.
No seriously, do you really want me to elaborate?

Ok, you got it.

  • The $_GET[“page”] is a GET value I set early in the code to know where we are in the website, you can use any variable here as long as you can check it against the $nocachelist array;
  • The other conditions in the first if should be clear, they avoid building the page’s cache if the CMS’s admin is logged (security) or if POST data is submitted or if there is a query string appended to the URL (consistency/stability);
  • $url is a variable that I define early in the code, and contains the string after the domain name slash and before the query string question mark, basically the kind of string you fill the $nocachelist array with (if you were paying attention, you may now think I have a redundant variable since $url and $_GET[“page”] should be the same, but this is not the case for other reasons);
  • $html is the string variable that, across the whole CMS, holds the raw HTML code to echo at the end of the PHP execution; you can either do like I do and build such a string, or use an output buffer to obtain the HTML if you instead print it directly to screen during PHP execution (a minimal example follows this list);
  • ~creationtime is a “hotkey” I use in my template to plug in the time in seconds that was needed to create the page in PHP; since I am creating a cached version now, the creation time of the page is zero, because the page is already there to be downloaded by the browser instead of having to be built by the server, so in there I print either “cached&gzipped” for clients that support gzip, or just “cached” when the browser doesn’t; you can safely strip this part out, as it is more of an eyecandy/nerdy/debug thing.
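For those who print HTML directly, a minimal output-buffering sketch (hypothetical, not the article’s actual CMS code) would look like this:

// Hypothetical sketch: capture directly-printed output into $html so the
// caching block above can reuse it unchanged.
ob_start();                 // start buffering before any HTML is printed
// ... the CMS echoes its page here ...
$html = ob_get_contents();  // grab everything that was printed so far
ob_end_flush();             // still send the page to the visitor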

Let .htaccess send the cached files before starting the PHP compiler

AddEncoding x-gzip .gz
<FilesMatch "\.html\.gz$">
    ForceType text/html
</FilesMatch>

#GZIP CMS
RewriteCond %{REQUEST_METHOD} !POST
RewriteCond %{REQUEST_URI} !/forum/
RewriteCond %{QUERY_STRING} !.*=.*
RewriteCond %{HTTP:Cookie} !^.*(isadmin|nocache).*$
RewriteCond %{HTTP:X-Wap-Profile} !^[a-z0-9\"]+ [NC]
RewriteCond %{HTTP:Profile} !^[a-z0-9\"]+ [NC]
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond /full/path/to/your/htdocs/cache/$1-static.html.gz -f
RewriteRule ^(.*) /cache/$1-static.html.gz [L,T=text/html]

#UNCOMPRESSED CMS
RewriteCond %{REQUEST_METHOD} !POST
RewriteCond %{REQUEST_URI} !/forum/
RewriteCond %{QUERY_STRING} !.*=.*
RewriteCond %{HTTP:Cookie} !^.*(isadmin|nocache).*$
RewriteCond %{HTTP:X-Wap-Profile} !^[a-z0-9\"]+ [NC]
RewriteCond %{HTTP:Profile} !^[a-z0-9\"]+ [NC]
RewriteCond /full/path/to/your/htdocs/cache/$1-static.html -f
RewriteRule ^(.*) /cache/$1-static.html [L]

This is the code you need to plug into .htaccess, preferably after everything else but before defining the custom error pages; anyway, since you should be .htaccess-fluent, you shouldn’t need to be told where this fits best.

Some detailing for the curious:

  • First bit is needed (at least I needed it) to serve the gzipped files in a way that the browser knows to handle, otherwise I just got gibberish (the gzip was being sent to output without being uncompressed by the browser first);
  • if the cookies “isadmin” or “nocache” exist, the cached version of the page will not be served even if it exists; easy explanation: if an admin is logged in, and there is special content on a page that only admins can see, you don’t want them to get the “vanilla” cached version of the page instead; so it’s your duty to set an “isadmin” cookie when an admin logs in, and remove it when the admin logs out (a minimal example follows this list);
  • choosing the correct full path on the webserver was a bit tricky in my case, I can’t quite remember where the issue was, but depending on the method I used to get it I had different paths; I kind of remember I had to choose between $_SERVER[“DOCUMENT_ROOT”] and dirname(__FILE__) because only one was working with .htaccess;
  • you don’t really have to change much in this code snippet unless you have particular needs; the /forum/ path exclusion could be irrelevant in your case;
  • thanks go to WPSuperCache developers for their .htaccess code that I stole to build this snippet!
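A minimal sketch of the cookie handling mentioned above (the cookie name matches the .htaccess condition; where exactly you call this depends on your CMS’s login/logout code):

// On admin login: set the cookie .htaccess looks for, so cached pages are skipped.
setcookie("isadmin", "1", 0, "/");
// On admin logout: expire it again, so the admin gets cached pages like everyone else.
setcookie("isadmin", "", time() - 3600, "/");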

Last but not least

The cache is there to help you, not to handicap your website. You decide what the cache performance should be, and how many times a cached page should get served in place of the dynamic one before you consider it paid off, but you have to clear the cache from time to time to make sure you’re not serving outdated stuff to your visitors.

In my case I use an external cronjob provider (setcronjob.com) to trigger a PHP routine every night which includes the following:

$handle=opendir("cache");
while (($file = readdir($handle))!==false) @unlink("cache/".$file);
closedir($handle);

so that every day the website starts off with a fresh cache. No less important, you should clear the cache of a single page as soon as you know that page changed, unless you’re OK with the changes becoming visible only the next day, after all the cache is cleared anyway. Example: you edit a page while logged in as admin, or a user posts a comment, or anything happens under your control that alters the page’s HTML in any way: simply use unlink() to delete both the gzipped and uncompressed caches, and the website will recreate them with the updated content.
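For example, a small helper along these lines (a sketch using the variables defined earlier; the trigger is whatever event edits the page):

// Hypothetical helper: drop both cached versions of a single page so they are
// rebuilt with fresh content on the next visit.
function clear_page_cache($url) {
    global $cachepath, $cachesuffix;
    @unlink($cachepath . $url . $cachesuffix . ".html");
    @unlink($cachepath . $url . $cachesuffix . ".html.gz");
}

// e.g. right after a comment is posted on domainname.com/pagename
clear_page_cache("pagename");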

 

Have fun with pimping up this draft!

Redirect mobile and desktop users to the correct CSS via .htaccess file

The winning solution for a mobile-compliant website is to have a single HTML structure for every visitor, and just choose the appropriate CSS to render the page correctly for both worlds; so I’m not talking here about a /mobile/ folder under your website root, nor any ?mobile query string, but about the exact same URL, which has all the candy when viewed by a desktop browser, while being super-compact and sleek if the browser is running on a smartphone.

How do you do that?

First, get yourself two stylesheets suited for desktop and mobile, or rather, like I did, prepare a set of CSS files where there is a “common.css” that is always used, and a “desktop.css” and “mobile.css” that get loaded when needed. Also, get yourself Minify in order to define a “desk” and a “mobi” group of files to load with a simple URL (I use this procedure on another website).

Then add this to your .htaccess file, fairly early after the start of the RewriteEngine:

RewriteCond %{HTTP_USER_AGENT} !^.*(2.0\ MMP|240x320|400X240|AvantGo|BlackBerry|Blazer|Cellphone|Danger|DoCoMo|Elaine/3.0|EudoraWeb|Googlebot-Mobile|hiptop|IEMobile|KYOCERA/WX310K|LG/U990|MIDP-2.|MMEF20|MOT-V|NetFront|Newt|Nintendo\ Wii|Nitro|Nokia|Opera\ Mini|Palm|PlayStation\ Portable|portalmmm|Proxinet|ProxiNet|SHARP-TQ-GX10|SHG-i900|Small|SonyEricsson|Symbian\ OS|SymbianOS|TS21i-10|UP.Browser|UP.Link|webOS|Windows\ CE|WinWAP|YahooSeeker/M1A1-R2D2|iPhone|iPod|Android|BlackBerry9530|LG-TU915\ Obigo|LGE\ VX|webOS|Nokia5800).* [NC]
RewriteCond %{HTTP_user_agent} !^(w3c\ |w3c-|acs-|alav|alca|amoi|audi|avan|benq|bird|blac|blaz|brew|cell|cldc|cmd-|dang|doco|eric|hipt|htc_|inno|ipaq|ipod|jigs|kddi|keji|leno|lg-c|lg-d|lg-g|lge-|lg/u|maui|maxo|midp|mits|mmef|mobi|mot-|moto|mwbp|nec-|newt|noki|palm|pana|pant|phil|play|port|prox|qwap|sage|sams|sany|sch-|sec-|send|seri|sgh-|shar|sie-|siem|smal|smar|sony|sph-|symb|t-mo|teli|tim-|tosh|tsm-|upg1|upsi|vk-v|voda|wap-|wapa|wapi|wapp|wapr|webc|winw|winw|xda\ |xda-).* [NC]
RewriteRule ^stylesheets\.css$ /min/index.php?g=cssdesk [L]
RewriteRule ^stylesheets\.css$ /min/index.php?g=cssmobi [L]

First of all, these directives do not consider the iPad as a mobile device, since it has plenty of screen space in my opinion to just work as a desktop browser; if you do not agree, just insert an “ipad” string in there together with the rest.

Some explaining is in order: credit for the RewriteCond’s goes to the creators of WP Supercache, which has been a model to write my own caching system on another website of mine (next article is coming about the caching routine itself).

Those RewriteCond lines tie the next RewriteRule directive to desktop browsers only (since they exclude all user agents associated with mobile devices), so the first RewriteRule redirects to the /min/index.php?g=cssdesk minified CSS set tailored for desktops (change it as needed). The [L] flag right after it tells mod_rewrite to stop processing further rules once this one matches. When the RewriteConds are not met, a mobile device is visiting your site, so the first RewriteRule is skipped and the second one kicks in, redirecting to the mobile CSS set instead.

You see that I redirect requests for a file named “stylesheets.css”. Guess what, this file doesn’t exist on my website; it is just referenced from the template’s HTML, and it’s .htaccess that deals with giving the browser the correct data. In this case all you have to do is reference the stylesheet in your HTML code like this:

<link rel="stylesheet" href="/stylesheets.css" type="text/css" media="screen" />

The advantages? This is faster than the same operation done in PHP, since a webserver processes .htaccess directives faster than it compiles PHP code, plus you may be saving an additional include of an external PHP file. Also, an approach of this type is absolutely needed if you want to implement a no-PHP-execution caching system, because then there would be no PHP code left to point browsers to the desktop or mobile CSS, would there.
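For comparison, the PHP counterpart alluded to above would look roughly like this: a hypothetical stylesheets.php (which the HTML would reference instead of the rewritten stylesheets.css), with a drastically simplified user-agent check rather than the full lists used in the .htaccess rules. It costs an extra PHP startup per stylesheet request, which is exactly the overhead being avoided:

<?php
// Hypothetical stylesheets.php: pick the Minify group from the user agent
// and hand the request off to /min/. A real check would mirror the long
// .htaccess patterns above; this only illustrates the idea.
$ua = isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : "";
$ismobile = preg_match("/Mobile|Android|iPhone|Opera Mini|BlackBerry|Nokia/i", $ua);
header("Location: /min/index.php?g=" . ($ismobile ? "cssmobi" : "cssdesk"));
exit;
?>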

Zip MySQL database backup with PHP and send as email attachment

I was a happy user of MySQLDumper until a while ago, when my shared hosting upgraded to a cloud architecture and the latest PHP version; with that, some Perl modules went missing and I was left without a viable backup solution (MySQLDumper’s PHP approach uses a JavaScript reloading routine which is super slow and is compatible with neither wget nor remote cronjobs like SetCronJob.com).

So I found a couple of tutorials on the net (this, and this), mixed them together, especially improving David Walsh’s routine, blended them with some healthy bzip2 compression, and shipped to my webspace.

I stripped the part where you can select which tables to backup; I figured that any backup deserving this name should be “all-inclusive”; also I think that anyone with the special need to backup only selective tables should also be smart enough to change the code to suit his/her needs.

UPDATE 9/9/11: I just added a snippet of code to blacklist some tables from the backup, since I needed to trim the size or it gave a fatal error about the memory limit being exhausted; just fill the array with table names as you see fit.

This is the result:

<?php
$creationstart=strtok(microtime()," ")+strtok(" ");

$dbhost="your.sql.host";
$dbname="yourdbname";
$dbuser="yourmysqlusername";
$dbpass="yourmysqlpassword";

$mailto="mail.me@mymailbox.com";
$subject="Backup DB";
$from_name="Your trustworthy website";
$from_mail="noreply@yourwebsite.com";
$message="MySQL database backup attached."; // body text of the notification email

mysql_connect($dbhost, $dbuser, $dbpass);
mysql_select_db($dbname);

$tablesblocklist=array(
    "tablename1"=>1,
    "tablename2"=>1,
    "tablename3"=>1,
);
$tables = array();
$result = mysql_query("SHOW TABLES");
while ($row = mysql_fetch_row($result))
    $tables[] = $row[0];

$return = "";
foreach ($tables as $table) {
    if (!isset($tablesblocklist[$table])) {
        $result = mysql_query("SELECT * FROM $table");
        $return .= "DROP TABLE IF EXISTS $table;";
        $row2 = mysql_fetch_row(mysql_query("SHOW CREATE TABLE $table"));
        $return .= "\n\n".$row2[1].";\n\n";
        while ($row = mysql_fetch_row($result)) {
            $return .= "INSERT INTO $table VALUES(";
            $fields = array();
            foreach ($row as $field)
                $fields[] = "'".mysql_real_escape_string($field)."'";
            $return .= implode(",", $fields).");\n";
        }
        $return .= "\n\n\n";
    }
}
$filename = 'db-backup-'.date("Y-m-d H.i.s").'.sql.bz2';

$content=chunk_split(base64_encode(bzcompress($return,9)));
$uid=md5(uniqid(time()));
$header=
"From: ".$from_name." <".$from_mail.">\r\n".
"Reply-To: ".$from_mail."\r\n".
"MIME-Version: 1.0\r\n".
"Content-Type: multipart/mixed; boundary=\"".$uid."\"\r\n\r\n".
"This is a multi-part message in MIME format.\r\n".
"--".$uid."\r\n".
"Content-type:text/plain; charset=iso-8859-1\r\n".
"Content-Transfer-Encoding: 7bit\r\n\r\n".
$message."\r\n\r\n".
"--".$uid."\r\n".
"Content-Type: application/octet-stream; name=\"".$filename."\"\r\n".
"Content-Transfer-Encoding: base64\r\n".
"Content-Disposition: attachment; filename=\"".$filename."\"\r\n\r\n".
$content."\r\n\r\n".
"--".$uid."--";
mail($mailto,$subject,"",$header);

$creationend=strtok(microtime()," ")+strtok(" ");
$creationtime=number_format($creationend-$creationstart,4);
echo "Database backup compressed to bz2 and sent by email in $creationtime seconds";
?>


The whole routine runs in roughly 7 seconds on my not-so-bright host’s server, including the bzip2 compression and the mailing, while the original text-only output of David’s code took roughly 17 seconds on the same server; the difference is probably due to the removal of several redundant loops, and to using the built-in mysql_real_escape_string() and implode() functions instead of David’s workarounds.

My sincere thanks go to the authors of the two guides, without whom I wouldn’t be writing this today 😉

reCAPTCHA fails to deliver, bypassed or cracked, spam still comes through

I like(d) the idea behind reCAPTCHA: blocking spam while helping to scan books is cool, and let’s be honest, reCAPTCHA images are more readable to the human eye than the random deformed alphanumeric combinations the usual antispam codes produce.

Anyway, even though I have the Akismet plugin installed, with the comments captcha’d I expected almost no spam at all, certainly not from bots, yet it’s been a constant for me; lately, a huge wave of almost identical spam messages hit the website; this also happened on the punbb forum of my professional website, where the reCAPTCHA plugin was active, and you understand that finding a spam post containing plenty of links with obscene thumbnails attached is not good for my image.

What’s the reason behind all this? Possibly reCAPTCHA pollution by a special kind of keyword spamming that makes certain known words always check positive, or advanced OCR techniques, or your average next-door guy paid a few millicents for each captcha he deciphers. I don’t really care; what I know is that spam is getting through.

So I decided to drop reCAPTCHA altogether and switch to good old on-site captcha generation; let’s see what happens. The SI CAPTCHA plugin is now installed (I suppose SI stands for SecureImage, I recognize the pattern).

  This article has been Digiproved

PHP templating, preg_match, file_get_contents and strpos, the fastest

While working on the website of reecycle.it, the Italian freecycling site for giving away and recycling objects, I am also spending my resources on reducing as much as possible the load on the Apache server of my Italian shared hosting, Tophost, so I decided to run some benchmarks to see how fast I could load templates using different methods (a template is a “model” of a page, or part of one, written in HTML, into which the dynamic values produced by the website engine are inserted).

All the templates used by reecycle.it were initially created as several .html files (one for each template), with several %tags% substituted by the str_replace() function before the output, but I thought there might be different, faster ways to obtain the same result. A page can contain 5 or more different templates (general layout, login panel, simple search widget, search results table, and a single row of the search results table), so each page could make the server access and load 5 different files from the hard disk (ignoring the disk cache for simplicity); maybe reading a single, bigger file containing all the templates, loading it into memory, and then extracting the needed parts is faster? This path can be taken in two ways: either by writing two “clean” lines of code using preg_match() and a regular expression, or by using a rather less “elegant” but strictly more performant combo of strpos() and substr(), which instead needs several more lines of code.

In other words, I needed to know which one of the three (separate template files, single big template with regexp extraction, and single big template with strpos/substr extraction) was faster. I already knew preg_match was heaps slower than strpos/substr, yet I included it in the test for the sake of completeness.

This is the routine I used:

<?php
$creationstart=strtok(microtime()," ")+strtok(" ");

for ($i=0;$i<1000;$i++) {
 $text=file_get_contents("full.html");
 for ($n=1;$n<=8;$n++) {
 preg_match("/<!--{$n}\(-->(.*)<!--\){$n}-->/s",$text,$matches);
 $html[$n]=$matches[1];
 }
}

$creationend=strtok(microtime()," ")+strtok(" ");
$creationtime=number_format($creationend-$creationstart,4);
echo "preg_match: ".$creationtime."<br />";

/////////////////
$creationstart=strtok(microtime()," ")+strtok(" ");

for ($i=0;$i<1000;$i++) {
 $text=file_get_contents("full.html");
 for ($n=1;$n<=8;$n++) {
 $start=strpos($text,"<!--$n(-->")+strlen("<!--$n(-->");
 $ending=strpos($text,"<!--)$n-->");
 $html[$n]=substr($text,$start,($ending-$start));
 }
}

$creationend=strtok(microtime()," ")+strtok(" ");
$creationtime=number_format($creationend-$creationstart,4);
echo "strpos/substr: ".$creationtime."<br />";

////////////////////
$creationstart=strtok(microtime()," ")+strtok(" ");

for ($i=0;$i<1000;$i++) {
 for ($n=1;$n<=8;$n++) {
 $html[$n]=file_get_contents($n.".html");
 }
}

$creationend=strtok(microtime()," ")+strtok(" ");
$creationtime=number_format($creationend-$creationstart,4);
echo "file_get_contents: ".$creationtime."<br />";

where full.html is the single HTML file containing all the templates (a total of 8, consisting of lorem ipsum-style paragraphs of different lengths), each identified by <!--templatenumber(--> and <!--)templatenumber--> delimiters between which sits the code to extract, while the single template files were named from 1.html to 8.html.

What the code does is, for each method, repeat 1000 iterations in which every single template is loaded, from 1 to 8, and measure the time needed to complete. This usage is not very realistic, as real template code is several lines of HTML rather than a few long lines of text, and templates are never loaded all together, only those actually needed to draw the page; anyway, better performance in this test means better performance in real-life use (WRONG! check the bottom for more details).

So, this was the result:

preg_match: 1.8984
strpos/substr: 0.0681
file_get_contents: 0.1352

Final times were obviously different at each page refresh, from a minimum (for preg_match) of 1.4s up to a maximum of 3s, but the relationship between them remained the same: the strpos/substr combination was twice as fast as file_get_contents called for each file. What surprised me is that the preg_match method is almost 30 times slower than strpos/substr, and hence 15 times slower than asking the server to read several different files (I suppose the disk cache helped there).

On a side note, my hosting provider’s tech support, when asked about this, suggested I drop reading separate files in favour of reading a single file and using preg_match… go figure.

UPDATE:

I just went and tested this benchmark with real templates off reecycle.it… oh, how wrong I was.
I made a function on reecycle.it to fetch the requested template: when the template is missing from the big file, it loads the single-file template, returns it, and appends the missing template to the big single file, so after a while I got a 37 kB supertemplate.tpl containing all of the website’s templates. I just changed the routines in the example above to use the real template files… and behold, the results were inverted! Using file_get_contents() on several files was twice as fast as using strpos/substr on a single big file. No matter how I changed it, several separate small files were still much faster than a single big file.
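For the record, a minimal sketch of that self-building fetcher (hypothetical names and simplifications; the real function on reecycle.it differs):

// Hypothetical sketch of the self-building template fetcher described above.
function get_template($name, $archive = "supertemplate.tpl") {
    $open  = "<!--" . $name . "(-->";
    $close = "<!--)" . $name . "-->";
    $all   = file_exists($archive) ? file_get_contents($archive) : "";
    $start = strpos($all, $open);
    if ($start !== false) {
        // Found in the big file: extract it with strpos/substr.
        $start += strlen($open);
        $end    = strpos($all, $close, $start);
        return substr($all, $start, $end - $start);
    }
    // Missing: fall back to the single file and append it to the archive.
    $tpl = file_get_contents($name . ".html");
    file_put_contents($archive, $open . $tpl . $close, FILE_APPEND);
    return $tpl;
}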

I blame it on two things: the file is actually big, so the string functions have to process a large chunk of data, and especially the tag format, since the template delimiters, practically in HTML comment form, begin with the “less than” symbol, which is bound to be repeated lots of times inside HTML code, giving strpos plenty of false starts to check.

In fact, in the templates archive file I modified the delimiting tags so they looked like {tag(} and {)tag} instead of using the HTML comment tags <!-- … -->, and the results of the benchmark went back to normal: the strpos/substr combo on the single archive file was faster than file_get_contents on several separate files, and the more templates requested, the faster the strpos/substr method compared to file_get_contents… see the results in my next post.


  This article has been Digiproved

Protect your WordPress contents from plagiarism with a digital timestamp

Since a while ago, I decided to release all the contents I produce under a Creative Commons license. Why? Well, once I publish my stuff online, anyone with an internet connection can read it; so, if they like it and want to make it more widely available, a stricter copyright policy would compel them to ask me for permission and blah blah; if people like the contents I produce and distribute them around, I’m happy about it, as long as it’s made clear that I am the original author and no one else tries to take credit for my work (because that would obviously piss me off big time). The CC license I chose is the NC-SA, which also forbids any commercial use of my contents (if someone’s going to earn money out of my work, then I want my fair share), and allows re-use of the contents as long as they are released under the same license (I wouldn’t want someone to take my work, reassemble it to his own liking, and then lock it down even if it’s not really theirs).

Well, all’s good until now, but what if someone tries to “steal” your contents? The things you write are legally yours from the moment they are written (as long as they are original, naturally), yet you need to prove they’re yours in case of a dispute. If someone copies one or more articles off your website, he could very well decide to claim that those contents are his, and that you copied them off him; your word against his.

The only way to “prove” that a written text is yours is to prove that it was in your possession before anyone else got it; “timestamping” is the keyword, a timestamp which is legally valid (meaning that no one would be able to prove it’s fake) and that binds the digital work to your identity; it’s best to get a digital timestamp before releasing your work, because if an “attacker” has the means to download the contents off your website and certify them before you do, he would effectively become the new technical owner of your contents.

There are several ways to get a timestamp, free and paid, more and less solid. A free digital timestamp service is provided by Timemarker, but it’s only available for manual timestamping, that is you have to go to their site and provide them with your contents, then download their timestamp PGP signature. It’s good if you need to timestamp images, or archives.

On the other hand, if the contents you want to timestamp are posts on your WordPress blog (and this is all this article is about), the only really good service available to occasional bloggers like me (those who do not make a living from their website) is the free Copyright Proof WordPress plugin (the wordpress.org plugin repository has the most up-to-date changelog), which gets you the best for its price. As soon as you register a free account on Digiprove (they also offer paid timestamping for professional, non-WordPress users), you can insert your own API key in the plugin, and it will take care of everything: as soon as you hit the publish button in the WordPress editor, before actually publishing the article its written contents will be submitted to the Digiprove server for secure timestamping, and once the timestamp information is obtained, the article goes online along with its certificate (you can control how the Digiprove timestamp appears both on your website and on their servers, and you can also see the badge at the bottom of this post). The same thing happens when you edit the article: the certificate will be updated to reflect the changes, and the older certificates, even if no longer linked, will still be valid for the previous revisions of the article.

This service, again, is free, and does its job perfectly. Cian Kinsella, Digiprove’s CEO, told me they’re going to improve the way the plugin works, to make it dynamically create the certificate HTML code inside the post (it’s currently hardcoded, with CSS styling done inside the tags), and he also anticipated that, after this change has been made, they will probably be adding a feature I requested, a sort of “older certificates drawer” thanks to which the reader can access the links to the older timestamps of the previous revisions of your page, which reside on their servers and are currently lost (the links, not the certificates); this will greatly improve the functionality of the plugin for those who prefer updating their older posts with newer info instead of opening new posts.
By the way, even if Cian is CEO of two different companies, and I don’t know his daily schedule but I suppose it should be pretty busy, he still managed to personally reply to my mails to Digiprove’s support, which is good!


  This article has been Digiproved

Resize, enlarge or scale an html image map with a PHP script

I am creating a portal for an Italian website which will sport a nice region selector with an imagemap and JavaScript region highlighting. I found a free and detailed image map of Italy along with a combined PNG, but it was too small to be really usable, so I needed to use a bigger image; with that, I also needed to alter the imagemap coordinates so they matched the enlarged image… no way I was going to do that by hand!

So, after searching for a pre-made solution (which I obviously didn’t find), I devised a very simple PHP script to do exactly that. The script puts the entire code for the image map inside a string variable, and then processes that string with a regular expression search and replace to change the values according to my needs. I only needed to make the image two times bigger, maintaining the aspect ratio, but since I was going to publish this in a guide, I said to myself “why not make it so you can change the aspect ratio as well”. So, if you want, you can make the imagemap two times wider and 1.5 times taller.

Here’s the sample script (the $html variable is defined with a sample imagemap for the region of Lazio alone, for space reasons, but you can put whatever you want inside; I used it with the whole map of Italy, together with the <map> tags and line breaks/indentation). Be careful and ESCAPE the double quotes inside the HTML before pasting the code into the string. In other words, simply put a backslash (the \ character) before each occurrence of a double quote (the " character) inside the HTML; I used the search and replace function of Notepad++.

<?php
$html="<area href=\"#\" alt=\"state\" title=\"lazio\" shape=\"poly\" coords=\"74.513,86.938,75.667,87.365,75.667,88.007,74.744,89.077,75.436,90.467,76.359,90.039,77.857,90.039,78.319,90.039,79.127,90.788,79.588,91.857,79.588,92.606,80.049,93.034,80.51,93.034,81.317,94.103,81.779,94.852,82.24,94.959,83.74,94.852,84.201,94.959,85.123,94.959,86.392,94.103,87.43,93.141,88.122,93.141,89.39,93.141,89.967,92.713,91.351,90.895,91.813,90.895,92.274,91.216,93.196,90.895,94.349,90.788,94.926,90.467,96.31,89.825,96.886,90.467,96.656,90.895,95.849,91.323,95.387,92.072,94.234,92.072,92.965,92.713,92.505,93.676,92.505,94.317,92.734,94.959,91.928,95.28,91.813,95.922,91.467,96.778,92.505,98.382,92.505,99.023,92.505,99.986,91.928,101.804,91.928,103.194,92.734,103.837,94.234,103.623,96.31,104.264,97.579,105.013,99.309,106.51,102.191,108.543,103.229,108.543,104.728,109.077,106.113,110.361,106.574,111.965,106.804,113.035,106.574,113.783,106.574,114.425,105.882,114.853,105.305,115.067,104.844,115.067,104.728,116.029,104.728,117.099,104.152,118.061,103.46,118.703,102.999,119.345,102.999,120.093,101.961,120.308,100.23,120.735,99.539,120.308,98.271,119.345,96.656,118.489,95.156,118.275,92.965,118.489,91.005,118.703,89.39,116.885,89.506,116.029,88.122,114.639,85.931,113.997,83.97,112.607,81.548,110.574,78.55,107.687,77.627,105.869,76.128,104.692,74.975,102.874,73.706,101.056,71.745,99.023,70.131,97.098,67.594,94.959,69.093,93.676,70.131,92.606,70.592,91.216,70.592,90.039,71.745,89.611,72.553,88.649,73.014,88.221,72.553,86.938,73.245,86.189,74.513,86.938\" />";
echo preg_replace("/([0-9.]{2,}),([0-9.]{2,})/e","round(\\1*2,3).','.round(\\2*1.5,3)",htmlentities($html));
?>

All you need to do is change the two multipliers inside the replacement expression (2 and 1.5 in this example); the first is for the horizontal proportion, the second for the vertical. In this example the imagemap will be resized to 2 times its width and 1.5 times its height; if you want to make the size three times bigger, change it to

echo  preg_replace("/([0-9.]{2,}),([0-9.]{2,})/e","round(\\1*3,3).','.round(\\2*3,3)",htmlentities($html));

Then, copy the PHP script to your server, open it in a web browser, and copy/paste the result. Note: according to the way the imagemap is originally formatted, you may need to edit the regular expression to accommodate spaces, tabs, line breaks or whatever; in this case, since the coordinates were listed with just commas in between, it was not needed. If you are stuck, write an excerpt of your imagemap code in the comments and I’ll try to help you.
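As a side note, the /e modifier used above was removed in PHP 7; on newer PHP versions the same scaling can be done with preg_replace_callback. A sketch, assuming the same $html variable and the 2 / 1.5 factors:

// Same idea without the /e modifier: scale each x,y pair via a callback.
echo preg_replace_callback(
    "/([0-9.]{2,}),([0-9.]{2,})/",
    function ($m) {
        return round($m[1] * 2, 3) . "," . round($m[2] * 1.5, 3);
    },
    htmlentities($html)
);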

  This article has been Digiproved