Kamisama.me

Background jobs with php and resque: part 2, queue system

As explained in part 1, a queue is needed to store the jobs. Workers then poll this queue at a defined interval and execute the jobs they find.

Queue system diagram
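The enqueue/poll cycle described above can be sketched in a few lines of PHP. This is only an illustration: the in-memory SplQueue stands in for a real persistent queue, and the job names, the work() function and its $maxPolls parameter are hypothetical and do not come from php-resque.

```php
<?php
// Hypothetical in-memory stand-in for the queue; a real setup would use a persistent store.
$queue = new SplQueue();

// Producer side: the main program only records what has to be done.
$queue->enqueue(['class' => 'SendEmail', 'args' => ['to' => 'user@example.com']]);
$queue->enqueue(['class' => 'WarmCache', 'args' => ['key' => 'homepage']]);

/**
 * Worker side: check the queue, run every waiting job, then sleep.
 * $maxPolls only exists so the example terminates; a real worker loops forever.
 */
function work(SplQueue $queue, int $intervalSeconds, int $maxPolls): array
{
    $done = [];
    for ($poll = 0; $poll < $maxPolls; $poll++) {
        while (!$queue->isEmpty()) {
            $job = $queue->dequeue();
            // A real worker would instantiate $job['class'] and run it here.
            $done[] = $job['class'];
        }
        sleep($intervalSeconds);
    }
    return $done;
}

$processed = work($queue, 0, 2);
print_r($processed);
```

The interval is set to 0 here only to keep the example fast; a real worker would wait a few seconds between polls.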


Background jobs with php and resque: part 1, introduction

Background jobs are jobs that are executed outside the main flow of your program, and usually handled by a queue system.

This first tutorial in the series introduces what a background job is and why it matters.


For a more visual DebugKit

DebugKit is a very useful tool for debugging your CakePHP pages. You can view the variables passed to the views, SQL queries, etc., but it requires opening a panel. More often than not, we’re interested in a ‘summary’ rather than the details.

What if it looked like this instead, à la GitHub?

The toolbar will not be hidden anymore, and will always stick to the top.

Each panel’s title will be replaced by a summary of its content, and panels fall into 2 categories:

The important ones

These are all the panels with stats and other information that matter to the current page.

  • The Variables panel’s title displays the number of variables set by the controller, excluding global variables such as title_for_layout, $this->validationErrors, $request->data and Loaded Helpers.
  • The Sql panel’s title shows the total number of SQL queries and the time taken
  • The Time panel’s title shows the peak memory use and the request time

The contents of the panels remain unchanged; only the panel titles change.

Note that the Cache, Jobs and NoSql panels are my own panels and are not included in DebugKit; you can find them in DebugKitEx. The Cache panel could be useful to anyone.

The others

All the other panels, deemed less important, have their title downgraded to a simple icon, since there’s not enough width. They are:

  • History panel
  • Session panel
  • Request panel
  • Logs panel
  • Environment
  • Included files

Panel importance is arbitrary and based on my taste, but a panel can easily be upgraded via the $priority variable.

Each panel will have a new $priority variable defining its importance.

  • 0 means not important: the panel is displayed as an icon on the left side
  • 1 or greater puts the panel on the right side, with a more detailed title.

Upgrading or downgrading a panel can then be done in the controller:

public $components = array(
    'DebugKit.Toolbar' => array(
        // Promote the history panel from icon-only (0) to a detailed title (1)
        'panels' => array('history' => array('priority' => 1))
    )
);

$priority could also be used to reorder the panels.
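As a rough sketch of how a $priority value could drive both the left/right split and the ordering: the panel names, priority values and the layoutPanels() function below are hypothetical and are not DebugKit code, and sorting by descending priority is only one possible ordering rule.

```php
<?php
// Hypothetical panel list: name => priority (values made up for illustration).
$panels = [
    'history'   => 0,
    'sql_log'   => 2,
    'session'   => 0,
    'timer'     => 3,
    'variables' => 1,
];

/**
 * Split panels into icon-only (priority 0) and detailed (priority >= 1),
 * ordering the detailed ones by descending priority.
 */
function layoutPanels(array $panels): array
{
    $icons = array_filter($panels, function ($priority) {
        return $priority === 0;
    });
    $detailed = array_filter($panels, function ($priority) {
        return $priority >= 1;
    });
    arsort($detailed); // highest priority first, keys preserved

    return ['left' => array_keys($icons), 'right' => array_keys($detailed)];
}

$layout = layoutPanels($panels);
print_r($layout);
```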

Current issues

  • Panel titles are set before the view is rendered. This causes a problem for the timer panel: the timer is stopped well before the end of the request, so the displayed time and memory consumption are no longer accurate.
  • Icon titles use FontAwesome and are not semantic. Each panel must also define which icon to use.

Try it

A beta is available here. I will try to discuss incorporating these changes with Mark Story, and maybe, who knows, you’ll see these features in DebugKit one day :P.

ResqueBoard

ResqueBoard is a web interface to monitor your PHP Resque activity in real time.

It uses Square’s Cube (requires MongoDB and NodeJS) to collect, compute and stream data/events/metrics in real time.

We all know that metrics are important, especially in a bigger-than-average application. I guess you don’t need resque at all if you have a blog, but when you have around 200 jobs per minute, you’ll start wondering how your workers behave, and ask some of the following questions:

  • How many jobs/min are my workers processing?
  • Are some of my queues busier than others?
  • Do I need additional workers on some queues?
  • Are the workers overloaded at certain times?
  • Why did some jobs fail?

Most of ResqueBoard’s charts are refreshed in real time, via websocket. You’ll see which job just completed or failed, etc.

Take a look at the demo (the website does not have much activity, so you will not see all the numbers blink and the other real-time stuff).

More information and screenshots are available on the ResqueBoard website.

ResqueBoard differs from the front-end dashboard shipped with the original resque, which doesn’t log anything and thus cannot compute metrics.

The project is still young, and all the focus is on features rather than code optimization, user interface, etc. It’s also my first time working with d3.js, so the JavaScript is somewhat messy.

Feedback is appreciated, since tools can behave differently depending on your workers’ load. It should not have difficulty running on a production server, since all the computations are done by Cube, which is production ready.

CakeResque 1.0

Version 1.0.0 of Cake-Resque was released a few days ago.

It is a big update that comes with a lot of bug fixes and new functions, thanks to feedback on Fresque, the sister version of Cake-Resque that is not tied to a framework.

Some of the new features:

  • Composer is used to manage all dependencies
  • Php-resque-Ex is used by default instead of php-resque
  • New logging options
  • stop can now stop individual workers
  • tail the log file you want

New features

Php-Resque-Ex

Php-Resque-Ex is a fork of php-resque that uses Monolog to handle all the logging. It also automatically uses the phpredis extension to connect to Redis when it is installed, and implements an extra method to load jobs externally, which is very important for loading and instantiating CakePHP models.

You can revert to php-resque if you wish. You’ll lose some logging functions, but it will work just fine, though you’ll have to patch the Resque Worker class yourself to bootstrap the CakePHP classes.

New logging options

For now, only CubeHandler and RotatingFileHandler are supported. More will come later. Refer to the documentation for usage.

Stop

stop will ask you to choose the worker to stop, from a list. Add --all to stop all workers.

Tail

tail will ask you to choose the log to display, from a list, because you can now create a different log for each start command.

A new website

There is a brand new, very basic website at http://cakeresque.kamisama.me, with documentation, a changelog, and install instructions.

Faster php lint

php lint (php -l) can only check a single file at a time. Sadly, you can’t pass a folder as an argument and check its content recursively, which makes validating a medium-sized project (700 files, with 95,000 lines of code) very slow.

With the default target in the jenkins-php build.xml,

<target name="lint" description="Perform syntax check of sourcecode files">
  <apply executable="php" failonerror="true">
    <arg value="-l" />

    <fileset dir="${basedir}/src">
      <include name="**/*.php" />
      <modified />
    </fileset>

    <fileset dir="${basedir}/tests">
      <include name="**/*.php" />
      <modified />
    </fileset>
  </apply>
</target>

building a project takes up to 8 minutes, with a little more than 5 of them spent on lint. That is a complete waste of time if you build very frequently, considering that only a handful of files have been edited between builds.

You can run lint much faster by running it in parallel, using some piping.
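To illustrate the idea of parallel linting without leaving PHP, here is a minimal sketch that keeps a small pool of php -l processes running at once. It is not the piping approach mentioned above but a swapped-in technique based on proc_open; the lintInParallel() function and its $workers parameter are hypothetical, and passing the command as an array assumes PHP 7.4 or later.

```php
<?php
/**
 * Lint the given files with up to $workers concurrent `php -l` processes.
 * Returns the files that failed the syntax check. Paths are assumed to be unique.
 */
function lintInParallel(array $files, int $workers = 4): array
{
    $failed = [];
    $running = []; // file path => process resource

    // Discard the linter output; only the exit code is used here.
    $nullDevice = DIRECTORY_SEPARATOR === '\\' ? 'NUL' : '/dev/null';
    $spec = [1 => ['file', $nullDevice, 'w'], 2 => ['file', $nullDevice, 'w']];

    while ($files || $running) {
        // Top up the pool until $workers processes are running.
        while ($files && count($running) < $workers) {
            $file = array_shift($files);
            $running[$file] = proc_open([PHP_BINARY, '-l', $file], $spec, $pipes);
        }

        // Collect the processes that have finished.
        foreach ($running as $file => $process) {
            $status = proc_get_status($process);
            if (!$status['running']) {
                if ($status['exitcode'] !== 0) {
                    $failed[] = $file;
                }
                proc_close($process);
                unset($running[$file]);
            }
        }

        usleep(10000); // short pause to avoid a busy loop
    }

    return $failed;
}
```

It could be called with any list of paths, for example lintInParallel(glob('src/*.php'), 8), and the build would fail when the returned array is not empty.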

Install Sphinx search engine on OS X Lion

As of September 2012, the Sphinx port available via MacPorts is still the very old version 0.9.9, released more than 2 years ago. The current stable version is already 2.0.5, released on July 28th, 2012. There is no choice but to compile it yourself if you want real-time index support and other cool things not available in 0.9.9.

Install Sphinx

To compile it yourself, you have to install all the basic libraries required by Sphinx. In this case, we assume we will need the MySQL library to connect the engine to MySQL directly, and the libstemmer library (for stemming).


CakePHP reverse routing is slow, … and what you can do

Reverse routing is a very handy feature, but also very expensive.

In my Cake application, I’ve noticed that rendering the views often takes more time than processing the controller action. It takes around 2s to render the view, while processing the action takes 0.5s. Very weird, since views are just a matter of printing variables (and maybe some include() when using elements).

A little debugging pointed out that printing the tag cloud in my footer takes ~200ms. It is a tag cloud with 50 tags, using $this->Html->link() to print each link, which costs ~5ms per link. Combine that with the other links in my views (menu, sidebar, etc.: another ~70 links, for a total of 120 links) and I ended up waiting ~2s just for processing the various $this->Html->link() calls. After looking at how reverse routing works, I found the “bottleneck”.

Cake-Resque, a CakePHP plugin to manage queue system

Cake-Resque is a CakePHP plugin allowing you to put some tasks in a background queue, and execute them later.

Update 2012-02-17: Refer to the GitHub page for updated documentation
Update 2012-09-05: Refer to the official website for the up-to-date API. CakeResque 1.1 comes with a lot of new features.

Background

Cake-Resque is based on Resque, written by defunkt and used at GitHub to process background jobs. Resque uses Redis to store and retrieve the jobs, making it very fast. Among the Redis advantages listed in the Resque presentation are:

  • Atomic, O(1) list push and pop
  • Ability to paginate over lists without mutating them
  • Queryable keyspace, high visibility
  • Fast
  • Easy to install – no dependencies
  • Reliable Ruby client library
  • Store arbitrary strings
  • Support for integer counters
  • Persistent
  • Master-slave replication
  • Network aware

Plus, the Redis database is stored in RAM. It’s a very fast database, used for tasks requiring a lot of I/O: storing sessions, incrementing counters, caching, etc. Resque uses it to implement its queue structure, since it can add data to and retrieve data from a list in O(1)!


PHP Resque with phpredis

In my quest for speed and performance, I wanted to implement a queue system in my PHP-backed website, to defer non-essential tasks such as logging, sending notifications and emails, warming up the cache, etc.

There are already some good and popular messaging systems like Gearman and RabbitMQ, and many others written for Java, Ruby, etc.

Resque, from the folks at GitHub, is a Redis-backed library that can do almost the same thing, but it’s written in Ruby.

PHPResque to the rescue

PHP-Resque is a PHP port of Resque by Chris Boulton. You just have to install Redis on your server, then use php-resque to queue your PHP jobs. Just start the workers, and they will poll a queue periodically for jobs to execute.
