Twitter caching using XML

Several times recently I’ve been asked to build custom Twitter feeds into sites. That’s not too tricky in itself. However, Twitter’s rate limit of 150 calls per hour means tweets need to be cached in some way, as a direct call to the API will fail once the limit is exceeded. Furthermore, in one recent case, tweets weren’t posted very often. As a result, the API was only returning the most recent tweets rather than, say, all of the last 20 regardless of when they were posted. So the cache needed to be updated with any new tweets while ensuring older ones were retained.

I’m no PHP expert so there’s undoubtedly room for improvement, but this is one simple approach that works. You could obviously store the feeds in a database but I’ve used XML documents on the server.

Firstly, we check a text file on the server for the time that the caching process last occurred. If the cache hasn’t been refreshed within a set period, we go ahead with the call to Twitter.

$timeFile = "tweetCaching/time.txt";
$ft = fopen($timeFile, 'r') or die("can't open file");
$lastCacheTime = (int) fread($ft, filesize($timeFile)); //Unix timestamp of the last refresh
fclose($ft);

if(time() > $lastCacheTime + 86400) //e.g. 24hrs, set as required
{
  //rest of code in here
}
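The freshness check above could also be wrapped in a small helper. What follows is a minimal sketch rather than the code used on the site: the function name `cacheIsStale` and its `$maxAge` and `$now` parameters are hypothetical, and it assumes the time file holds a single Unix timestamp. A missing or unreadable file is treated as stale so that the first run populates the cache.

```php
<?php
// Hypothetical helper: returns true when the cache should be refreshed.
// Assumes $timeFile contains a single Unix timestamp written by time().
function cacheIsStale($timeFile, $maxAge = 86400, $now = null)
{
    if ($now === null) {
        $now = time();
    }

    // A missing or unreadable file means nothing has been cached yet.
    if (!is_readable($timeFile)) {
        return true;
    }

    $lastCacheTime = (int) trim(file_get_contents($timeFile));

    // Same comparison as the inline check: stale once $maxAge seconds have passed.
    return $now > $lastCacheTime + $maxAge;
}
```

It might be used as `if (cacheIsStale("tweetCaching/time.txt")) { /* call Twitter */ }`. Passing `$now` explicitly is only there to make the function easy to test.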

We then load the existing cached tweets and note when the most recent tweet (or ‘status’) in the cache was posted. Next, we make the call to Twitter.

$tweetsStr = file_get_contents("tweetCaching/".$username.".xml");
$tweetsXml = simplexml_load_string($tweetsStr);
$tweetTime = $tweetsXml->status[0]->created_at; //creation time of the newest cached tweet

$ch = curl_init("http://twitter.com/statuses/user_timeline/".$username.".xml");
//return the response as a string instead of printing it, so it can be parsed later
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
$tweetsData = curl_exec($ch);
$http_status = curl_getinfo($ch, CURLINFO_HTTP_CODE); //get http status code
curl_close($ch);
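Before the response is parsed, it is worth confirming that it is actually usable. The sketch below is one way to do that. `parseTimelineResponse` is a hypothetical name, and it assumes the response body and HTTP status code have already been captured as in the cURL snippet above. It returns a SimpleXMLElement for a 200 response containing well-formed XML with at least one status, and null otherwise.

```php
<?php
// Hypothetical helper: turns a raw API response into a SimpleXMLElement,
// or returns null when there is nothing usable to merge into the cache.
function parseTimelineResponse($body, $httpStatus)
{
    // Rate-limit messages, error pages and empty bodies are rejected up front.
    if ((int) $httpStatus !== 200 || !is_string($body) || trim($body) === '') {
        return null;
    }

    // Collect XML parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($body);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Malformed XML, or a document without any <status> nodes, is not usable.
    if ($xml === false || count($xml->status) === 0) {
        return null;
    }

    return $xml;
}
```

The `is_string` check also catches the case where `curl_exec` returned a boolean rather than the response body.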

Then, provided we’ve received some actual tweet data from Twitter, we loop through all the tweets in the response from the bottom up, checking the creation time of each one. If a tweet is newer than our most recent cached tweet, we concatenate its nodes to a string. Assuming there were any new tweets, we then insert the concatenated nodes at the top of the string that represents the existing cached tweets XML document (as shown below).

<statuses>
    <status>This is a new tweet from Twitter that's been inserted into this string of XML</status>
    <status>So is this.</status>
    <status>This is an existing cached tweet from yesterday</status>
    <status>This is an existing cached tweet from last week</status>
</statuses>

if($http_status == 200) //if we don't get exceeded rate limit message, fail whale etc..
{
    $tweetsDataXml = simplexml_load_string($tweetsData);
    $numOfTweets = count($tweetsDataXml->status);
    $lastCachedTweetTime = strtotime((string) $tweetTime); //newest tweet already in the cache
    $newTweets = "";

    //loop through all tweets from bottom upwards
    for($counter = $numOfTweets - 1; $counter >= 0; $counter--)
    {
        if(strtotime((string) $tweetsDataXml->status[$counter]->created_at) > $lastCachedTweetTime)
        {
            $tweetID = $tweetsDataXml->status[$counter]->id;

            //get all nodes for new tweet only
            $nodes = $tweetsDataXml->xpath('/statuses/status[id = "'.$tweetID.'"]');
            $result = '';
            foreach ( $nodes as $node )
            {
                $result .= $node->asXML()."\n";
            }

            //newer tweets are prepended so the newest ends up first
            $newTweets = $result.$newTweets;
        }
    }

    //once the loop has finished, splice any new tweets into the cached XML
    if(strlen($newTweets) > 0)
    {
        //get pos of 'statuses' tag in stored tweets
        $match = '<statuses type="array">';
        $matchPos = stripos($tweetsStr,$match);

        //get xml after statuses tag (i.e. old tweets)
        $restOfTweets = substr($tweetsStr,$matchPos + strlen($match));

        //new tweets go first; the opening statuses tag is added back when the cache is written
        $updatedTweetsXML = "\n".$newTweets.$restOfTweets;
    }
}
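As an alternative to splicing strings, the same merge can be sketched with PHP's DOM extension. This is not the approach used above, just an illustration of the idea. `mergeNewStatuses` is a hypothetical name, and the sketch assumes both documents have a `statuses` root whose `status` children each carry a `created_at` node in a format `strtotime` can parse.

```php
<?php
// Hypothetical helper: returns the cached XML with any statuses from
// $freshXml that are newer than the newest cached one inserted at the top.
function mergeNewStatuses($cachedXml, $freshXml)
{
    $cache = new DOMDocument();
    $cache->loadXML($cachedXml);
    $fresh = new DOMDocument();
    $fresh->loadXML($freshXml);

    // Creation time of the most recent cached status (0 if the cache is empty).
    $cacheXpath = new DOMXPath($cache);
    $latest = $cacheXpath->evaluate('string(/statuses/status[1]/created_at)');
    $latestTime = (int) strtotime($latest);

    // Collect the fresh statuses, then walk them from oldest to newest.
    $freshXpath = new DOMXPath($fresh);
    $statuses = iterator_to_array($freshXpath->query('/statuses/status'), false);
    $root = $cache->documentElement;

    foreach (array_reverse($statuses) as $status) {
        $created = $status->getElementsByTagName('created_at')->item(0);
        if ($created === null || strtotime($created->textContent) <= $latestTime) {
            continue; // already cached, or no usable timestamp
        }
        // Each newer status goes in front of the current first child,
        // so the newest one ends up at the top.
        $root->insertBefore($cache->importNode($status, true), $root->firstChild);
    }

    return $cache->saveXML();
}
```

Because the function only inserts statuses newer than the newest cached one, running it twice with the same response leaves the cache unchanged.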

The cached tweets XML document is then overwritten with the existing cached tweets plus any new tweets. Finally, the time is stored to ensure a call isn’t made for at least another 24hrs. The cached XML can then be used to display tweets as required.

//rewrite the cache, but only if new tweets were merged in
if(isset($updatedTweetsXML))
{
    $currentTweetCache = "tweetCaching/".$username.".xml";
    $fd = fopen($currentTweetCache, 'w') or die("can't open file");
    fwrite($fd, '<statuses type="array">'.$updatedTweetsXML);
    fclose($fd);
}

//write the time
$now = time();
$lastCachedTimeFile = "tweetCaching/time.txt";
$fe = fopen($lastCachedTimeFile, 'w') or die("can't open file");
fwrite($fe, $now);
fclose($fe);
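To illustrate how the cached XML might then be displayed, here is a rough sketch. `renderTweets` is a hypothetical name, the markup it produces is only an example, and it assumes each cached status carries `text` and `created_at` child nodes as in Twitter's XML response.

```php
<?php
// Hypothetical helper: builds a simple HTML list from the cached XML string.
function renderTweets($cacheXml, $limit = 20)
{
    $tweets = simplexml_load_string($cacheXml);
    if ($tweets === false) {
        return ''; // unreadable cache: show nothing rather than an error
    }

    $html = "<ul class=\"tweets\">\n";
    $shown = 0;

    foreach ($tweets->status as $status) {
        if ($shown >= $limit) {
            break;
        }
        // Escape the tweet text so it cannot inject markup into the page.
        $text = htmlspecialchars((string) $status->text, ENT_QUOTES, 'UTF-8');
        $date = date('j M Y', strtotime((string) $status->created_at));
        $html .= "  <li>{$text} <small>{$date}</small></li>\n";
        $shown++;
    }

    return $html . "</ul>\n";
}
```

A page could then output the feed with something like `echo renderTweets(file_get_contents("tweetCaching/".$username.".xml"));`.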