Serverless JS-Webapp Pub/Sub with AWS IoT

I’m currently very interested in serverless (aka no dedicated backend required) JavaScript Web Applications … with AWS S3, Lambda & API Gateway you can actually get pretty far.
Yet there is one thing I didn’t know how to do: Pub/Sub or “Realtime Messaging”.

Realtime messaging lets you build web applications that instantly receive messages published by another application (or by the same one running in a different person’s browser). There are even cloud services that do exactly this, e.g. Realtime Messaging Platform and PubNub Data Streams.

However, having recently played with AWS Lambda and S3, I was wondering how this could be achieved on AWS … and at first it seemed like it really isn’t possible. In particular the otherwise very interesting article Receiving AWS IoT messages in your browser using websockets by @jtparreira misled me, as it states that this isn’t possible. The article was published in Nov 2015, … not so long ago. But it turns out it’s outdated anyway …

Enter AWS IoT

While reading I stumbled over AWS IoT, which allows you to connect “Internet of Things” devices to the AWS cloud and furthermore provides messaging between those devices. It has a message broker (aka Device Gateway) sitting in the middle and “things” around it that connect to it. It’s based on the MQTT protocol and there are SDKs for the Raspberry Pi (Node.js), Android & iOS … sounds interesting, but not at all like “web browsers”.

MQTT over Web Sockets

Then I found an announcement: AWS IoT Now Supports WebSockets published Jan 28, 2016.
Brand new, but sounds great :)

… so even though IoT still sounds like a strange thing to do Pub/Sub with, it looks like the way to go.

Making it work

For the proof of concept I didn’t mind shipping AWS IAM user keys with the web application (of course this is a smell to be fixed before production use). So I went to “IAM” in the AWS management console and created a new user first, attaching the pre-defined AWSIoTDataAccess policy.

The proof of concept should consist of a simple web page that establishes a connection to the broker and features a text box where a message can be typed, plus a publish button. If two browsers are connected simultaneously, both should immediately receive messages published by either of them.

Required parts: … we of course need an MQTT client and we need to do AWS-style request signing in the browser. NPM modules to the rescue:

  • aws-signature-v4 does the signature calculation
  • crypto helps it + some extra hashing we need to do
  • mqtt has an MqttClient

… all of them have browser support through webpack. So we just need some more JavaScript to string everything together. To set up the connection:

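// assumed imports (not shown in the original snippet), bundled for the browser by webpack
const v4 = require('aws-signature-v4');
const crypto = require('crypto');
const { MqttClient } = require('mqtt');
const websocket = require('websocket-stream');

// MqttClient calls this stream builder whenever it needs a (new) connection
let client = new MqttClient(() => {
    // pre-sign the WebSocket URL of the IoT endpoint (AWS Signature Version 4)
    const url = v4.createPresignedURL(
        'GET',
        AWS_IOT_ENDPOINT_HOST.toLowerCase(),  // the signature must be calculated over the lower-case hostname
        '/mqtt',
        'iotdevicegateway',
        crypto.createHash('sha256').update('', 'utf8').digest('hex'),  // hash of the empty payload
        {
            'key': AWS_ACCESS_KEY,
            'secret': AWS_SECRET_ACCESS_KEY,
            'protocol': 'wss',
            'expires': 15
        }
    );

    return websocket(url, [ 'mqttv3.1' ]);
});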

… here createPresignedURL from aws-signature-v4 first does the heavy-lifting for us. We tell it the IoT endpoint address, protocol plus AWS credentials and it provides us with the signed URL to connect to.

There was just one stumbling block for me: I had upper-case letters in the hostname (as output by the aws iot describe-endpoint command). The module however doesn’t convert these to lower case, as expected by AWS’ V4 signing process … and as a result access was denied at first.
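
To illustrate why the case of the hostname matters, here is a simplified sketch of the Signature Version 4 calculation for a pre-signed GET request, using only Node’s crypto module. It is not the code of aws-signature-v4: the function names, the example hostname and the shortened query string are made up for illustration, and the secret is the placeholder key used in the AWS documentation.

```javascript
const crypto = require('crypto');

const hmac = (key, data) => crypto.createHmac('sha256', key).update(data, 'utf8').digest();
const sha256Hex = (data) => crypto.createHash('sha256').update(data, 'utf8').digest('hex');

// SigV4 signing key: a chain of HMACs over date, region and service.
function signingKey(secret, date, region, service) {
    const kDate = hmac('AWS4' + secret, date);
    const kRegion = hmac(kDate, region);
    const kService = hmac(kRegion, service);
    return hmac(kService, 'aws4_request');
}

// Signature of a pre-signed GET request that signs only the "host" header.
function presignSignature({ host, path, query, secret, amzDate, region, service }) {
    const canonicalRequest = [
        'GET',
        path,
        query,                  // canonical query string (assumed to be sorted and encoded already)
        'host:' + host + '\n',  // canonical headers: the host is hashed exactly as passed in
        'host',                 // signed headers
        sha256Hex('')           // hash of the empty payload
    ].join('\n');

    const date = amzDate.slice(0, 8);
    const scope = [ date, region, service, 'aws4_request' ].join('/');
    const stringToSign = [ 'AWS4-HMAC-SHA256', amzDate, scope, sha256Hex(canonicalRequest) ].join('\n');

    return crypto.createHmac('sha256', signingKey(secret, date, region, service))
        .update(stringToSign, 'utf8')
        .digest('hex');
}

// Hypothetical example values
const base = {
    path: '/mqtt',
    query: 'X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-SignedHeaders=host',
    secret: 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
    amzDate: '20160201T120000Z',
    region: 'eu-west-1',
    service: 'iotdevicegateway'
};
const mixedCaseHost = 'A1B2C3EXAMPLE.iot.eu-west-1.amazonaws.com';

const sigMixed = presignSignature({ ...base, host: mixedCaseHost });
const sigLower = presignSignature({ ...base, host: mixedCaseHost.toLowerCase() });
console.log(sigMixed === sigLower); // false: a different host string yields a different signature
```

Since the host is part of the hashed canonical request, a signature calculated over the mixed-case name cannot match one calculated over the lower-case name, which is presumably what the server side works with.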

Having the signed URL we simply pass it on to a websocket-stream and create a new MqttClient instance around it.

Connection established … time to subscribe to a topic. Turns out to be simple:

client.on('connect', () => client.subscribe(MQTT_TOPIC));

Handling incoming messages … also easy:

client.on('message', (topic, message) => console.log(message.toString()));

… and last but not least publishing messages … trivial again:

client.publish(MQTT_TOPIC, message);

… that’s it :-)
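
To play with the subscribe/publish flow without an AWS account, a tiny in-process stand-in for the broker can help. The sketch below is neither MQTT nor part of the Gist; InMemoryBroker is a hypothetical class that only mimics the three client calls used above (subscribe, publish and the message event), delivering messages as Buffers so that message.toString() works the same way.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the message broker: topics are plain event names.
class InMemoryBroker {
    constructor() {
        this.bus = new EventEmitter();
    }

    // Returns a client object offering subscribe(), publish() and the 'message' event.
    connect() {
        const bus = this.bus;
        const client = new EventEmitter();

        client.subscribe = (topic) => {
            bus.on(topic, (message) => client.emit('message', topic, message));
        };
        client.publish = (topic, message) => {
            bus.emit(topic, Buffer.from(message));
        };

        return client;
    }
}

// Two "browsers" connected to the same broker
const broker = new InMemoryBroker();
const alice = broker.connect();
const bob = broker.connect();

alice.subscribe('demo/topic');
bob.subscribe('demo/topic');

alice.on('message', (topic, message) => console.log('alice got:', message.toString()));
bob.on('message', (topic, message) => console.log('bob got:', message.toString()));

// Both clients receive the message, including the one that published it.
alice.publish('demo/topic', 'hello');
```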

My proof of concept

here’s what it looks like:

screenshot of demo web page

… the last incoming message was published from another browser running the exact same application.

I’ve published my source code as a Gist on Github, feel free to re-use it.

To try it yourself:

  • clone the Gist
  • adjust the constants declared at the top of main.js as needed
    • create a user in IAM first, see above
    • for the endpoint host run aws iot describe-endpoint CLI command
  • run npm install
  • run ./node_modules/.bin/webpack-dev-server --colors

Next steps

This was just the first (big) part. There’s more stuff left to be done:

  • hard-coding AWS credentials into the application source is not the way to go, and neither is publishing the secret key at all
  • … one possible approach would be to use API Gateway + Lambda to create pre-signed URLs
  • … this could be further limited by using IAM roles and identity federation with temporary credentials (through the AWS Security Token Service, STS)
  • there’s no user authentication yet, this should be achievable with AWS Cognito
  • … with that publishing/subscribing could be limited to identity-related topics (depends on the use case)

Heroku custom platform repo for V8Js

Yesterday @dzuelke poked me to migrate the old PHP buildpack adjusted for V8Js to the new custom platform repo infrastructure. The advantage is that the custom platform repo only contains the v8js extension packages now, the rest (i.e. Apache and PHP itself) are pulled from the lang-php bucket, aka normal php buildpack.

As I already had that on my TODO list, I just immediately did that :-)

… so here’s the new heroku-v8js GitHub repository that has all the build formulas. Besides that there now is an S3 bucket heroku-v8js that stores the pre-compiled V8Js extensions for PHP 5.5, 5.6 and 7.0, along with the packages.json file.

To use with Heroku, just run

$ heroku config:set HEROKU_PHP_PLATFORM_REPOSITORIES="https://heroku-v8js.s3.amazonaws.com/dist-cedar-14-stable/packages.json"

with Dokku:

$ dokku config:set YOUR_APPNAME HEROKU_PHP_PLATFORM_REPOSITORIES="https://heroku-v8js.s3.amazonaws.com/dist-cedar-14-stable/packages.json"

replacing Huginn with λ

I used to self-host the Ruby application Huginn, which is some kind of IFTTT on steroids. That is, it allows you to configure so-called agents that perform certain tasks online, automatically. One of those tasks was to regularly scrape the Firefox website for the latest Firefox version number (which happens to be a data attribute on the html element by the way), take only the major version number, compare it to the most recent known value (from the last crawl cycle) and send an e-mail notification if it changed. I wanted to have that notification so I could test, update & release Geierlein.

The thing is, that worked really well (I’ve had it running for almost a year now) … nevertheless I decided to cut down on (many) self-hosted projects (saving the time spent on hosting, constantly updating, etc., to have more time for honing my software development skills). But I still needed those notifications so I had to find an alternative … and I found it in AWS Lambda.

(actually I’ve been interested in Lambda since they had it in private beta, I even applied for the beta program, … but never really used it as I had no idea what to do with it back then)

So my all-AWS approach involves

  • a CloudWatch scheduler event that triggers AWS Lambda
  • AWS Lambda doing the web scraping & flow control
  • S3 to persist the last known major version number
  • SES (Simple Email Service) to send the e-mail notification

I’ve used S3 and configured stuff with IAM before, and SES is really straightforward, so actually only Lambda was new to me. With that background the learning curve is okayish, as the AWS documentation points in the right direction and Google + StackOverflow help with the rest. If you’ve never used AWS services before, the learning curve might be a bit steeper (mainly because of IAM) …

All in all I got it working within two hours or maybe three … and it just works now :)
… with nothing left for me to host
… and actually everything for free (as Lambda & SES stay within free usage quota and the single S3 object’s cost is negligible)

In case you want to follow along, here’s my …

step by step guide

under IAM service …

  • create AWS user with API keys to do local development (using AWS root account is undesirable)
  • grant that user the necessary permissions
    • managed policy AWSLambdaFullAccess (that includes full access to logs & S3)
    • yet it doesn’t include the right to send e-mails via SES, therefore create a user policy like
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1459031930000",
            "Effect": "Allow",
            "Action": [
                "ses:SendEmail"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}

under S3 service …

  • create a new Bucket to be used with Lambda, I picked lambdabrain (so pick something else)

again under IAM service …

  • create an AWS role, to be used by our lambda function later on
  • choose AWS Lambda from AWS Service Roles in Step 2 of the assistant, then attach AWSLambdaBasicExecutionRole policy
  • do not attach the AWSLambdaExecute managed policy as it includes read/write access to all objects of all your S3 buckets
  • last but not least add a custom Role Policy to grant rights on the newly created S3 Bucket + ses:SendEmail with
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::lambdabrain"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::lambdabrain/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "ses:SendEmail"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}

… turns out that s3:ListBucket is actually needed to initially create the persistence object.

under AWS SES

  • verify your mail domain (so you can send mails to yourself)
  • if you would like to send mails to other domains, you also need to request a limit increase

After setting up the AWS CLI it’s finally time to (locally) create a Node.js application (the Lambda function to be).

  • create a new folder
  • … and an initial package.json file like this:
{
  "name": "firefox-version-notifier",
  "version": "0.0.1",
  "description": "firefox version checker & notifier",
  "main": "index.js",
  "dependencies": {
    "promise": "^7.1.1",
    "scrape": "^0.2.3"
  },
  "devDependencies": {
    "node-lambda": "^0.7.1",
    "aws-sdk": "^2.2.47"
  },
  "author": "Stefan Siegl <stesie@brokenpipe.de>",
  "license": "MIT"
}

I used promises throughout my code, and scrape to do the web scraping.

  • aws-sdk is actually needed in production as well, still I declared it under devDependencies as it is available globally on AWS Lambda and hence need not be included in the ZIP archive upload later on.
  • node-lambda is a neat tool to assist development for AWS Lambda

  • run npm install and ./node_modules/.bin/node-lambda setup
  • configure node-lambda through the newly created .env file as needed
    • AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY of the IAM user from above
    • AWS_ROLE_ARN is the full role ARN (from above)
    • AWS_HANDLER=index.handler (index because of the index.js file name, handler will be the exported name in there)

Here’s my straight-forward code, … definitely deserves some more love, yet it’s just a better shell script …
Adapt the name of the S3 bucket and the e-mail addresses (sender and receiver) of course.

var Promise = require('promise');
var AWS = require('aws-sdk');
var scrape = Promise.denodeify(require('scrape').request);

var brain = new AWS.S3({ params: { Bucket: 'lambdabrain' }});
var ses = new AWS.SES();

function getCurrentFirefoxVersion() {
	return scrape('https://www.mozilla.org/en-US/firefox/new/')
		.then(function($) {
			var currentFirefoxVersion = $('html')[0].attribs['data-latest-firefox'].split(/\./)[0];
			console.log('current firefox version: ', currentFirefoxVersion);
			return currentFirefoxVersion;
		});
}

function getBrainValue(key) {
	return new Promise(function(resolve, reject) {
		brain.getObject({ Key: key })
		.on('success', function(response) {
			resolve(response.data.Body.toString());
		})
		.on('error', function(error, response) {
			if(response.error.code === 'NoSuchKey') {
				resolve(undefined);
			} else {
				reject(error);
			}
		})
		.send();
	});
}

function setBrainValue(key, value) {
	return new Promise(function(resolve, reject) {
		brain.putObject({ Key: key, Body: value })
		.on('success', function(response) {
			resolve(response.requestId);
		})
		.on('error', function(error) {
			reject(error);
		})
		.send();
	});
}

function sendNotification(subject, message) {
	return new Promise(function(resolve, reject) {
		ses.sendEmail({
			Source: 'stesie@brokenpipe.de',
			Destination: { ToAddresses: [ 'stesie@brokenpipe.de' ] },
			Message: {
				Subject: { Data: subject },
				Body: {
					Text: { Data: message }
				}
			}
		})
		.on('success', function(response) {
			resolve(response);
		})
		.on('error', function(error, response) {
			console.log(error, response);
			reject(error);
		})
		.send();
	});
}

exports.handler = function(event, context) {
	Promise.all([
		getCurrentFirefoxVersion(),
		getBrainValue('last-notified-firefox')
	])
	.then(function(results) {
		if(results[0] === results[1]) {
			console.log('Firefox versions remain unchanged');
		} else {
			return sendNotification('New Firefox version!', 'Version: ' + results[0])
				.then(function() {
					return setBrainValue('last-notified-firefox', results[0]);
				});
		}
	})
	.then(function(results) {
		context.succeed("finished");
	})
	.catch(function(error) {
		context.fail(error);
	});
};

  • exports.handler function initially creates an all-promise that (in parallel)
    • scrapes the Firefox website
    • fetches the S3 object
  • then compares the two and (if different) …
    • creates another promise to send a notification
    • … (if successful) then updates the S3 object
  • and finally marks the lambda function as successful (via context.succeed)

I really like how the promises make it easy to parallelize stuff as well as to make things depend on one another (S3:PutObject on SES:SendEmail).
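
The control flow can be tried without any AWS access by injecting stubs. The following is a minimal sketch with native promises; majorVersion, checkAndNotify and the four injected functions are hypothetical names standing in for the scraper, the S3 reads/writes and the SES call above.

```javascript
// Extracts the major version, e.g. '46.0.1' becomes '46'.
function majorVersion(version) {
    return version.split('.')[0];
}

// Fetches the current and the last known version in parallel, then notifies
// and persists (in that order) only if the two differ.
function checkAndNotify({ fetchVersion, readLast, notify, writeLast }) {
    return Promise.all([ fetchVersion(), readLast() ])
        .then(([ current, last ]) => {
            if (current === last) {
                return 'unchanged';
            }
            return notify(current)
                .then(() => writeLast(current)) // only runs if the notification succeeded
                .then(() => 'notified');
        });
}

// Stubbed usage: pretend the website reports 46.0.1 while 45 is stored.
let stored = '45';
checkAndNotify({
    fetchVersion: () => Promise.resolve(majorVersion('46.0.1')),
    readLast: () => Promise.resolve(stored),
    notify: (version) => Promise.resolve(console.log('would send mail for version', version)),
    writeLast: (version) => Promise.resolve(stored = version)
}).then((outcome) => console.log(outcome, stored));
```

If notify rejects, writeLast is never called, so the next run would try to send the notification again.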

Run ./node_modules/.bin/node-lambda run to test the script locally. If it works run ./node_modules/.bin/node-lambda deploy to upload.

Back in the AWS console, now under “Lambda”

  • you should see the new function, click it and hit “Test” to try it on AWS.
  • if it works, choose “Publish new version” from the “Actions”.
  • under “Event sources” add a new event source, choose “CloudWatch Events - Schedule” and choose an interval (I picked daily)

V8Js: improved fluent setter performance

After fixing V8Js’ behaviour of not retaining the object identity of passed back V8Object instances (i.e. always re-wrapping them, instead of re-using the already existing object) I checked how V8Js handles fluent setters (those that return $this at the end).

Unfortunately they weren’t handled well, that is V8Js always wrapped the same object again and again (in both directions). Functionality-wise that doesn’t make a big difference since the underlying object is the same, hence further setters can still be called.

But still the wrapping code takes some time – with simple “just store that” setters it is approximately half of the time. Here is a performance comparison of calling 200000 minimalist fluent setters one after another:

performance comparison of old & new handling

Besides the performance gain it also keeps object identity intact. However, I assume no one ever stores the result of such a setter in a variable and compares it against another object, so that isn’t a big deal by itself.
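
The general idea of re-using an existing wrapper can be sketched in plain JavaScript with a WeakMap. This is only an analogy and not how V8Js implements it internally (that code is C++ and may work quite differently); Wrapper, wrapAlways, wrapOnce and chain are names made up for this sketch.

```javascript
// Stand-in for the wrapper object created when a value crosses the language boundary.
class Wrapper {
    constructor(target) {
        this.target = target;
    }
}

// Old behaviour: a fresh wrapper every time.
function wrapAlways(target) {
    return new Wrapper(target);
}

// New behaviour: remember the wrapper per target and hand it out again.
const cache = new WeakMap();
function wrapOnce(target) {
    let wrapper = cache.get(target);
    if (wrapper === undefined) {
        wrapper = new Wrapper(target);
        cache.set(target, wrapper);
    }
    return wrapper;
}

class Person {
    setName(name) {
        this.name = name;
        return this; // fluent setter
    }
}

// Calls the fluent setter `count` times; every returned `this` gets wrapped.
// Returns the number of distinct wrapper objects that were created.
function chain(wrap, count) {
    const wrappers = new Set();
    let current = wrap(new Person());
    wrappers.add(current);
    for (let i = 0; i < count; i++) {
        current = wrap(current.target.setName('name ' + i));
        wrappers.add(current);
    }
    return wrappers.size;
}

console.log(chain(wrapAlways, 1000)); // 1001
console.log(chain(wrapOnce, 1000));   // 1
```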

The behaviour is changed with pull requests #220 and #221.

V8PromiseFactory

V8 has support for ES6 Promises and they make a clean JS-side API. So why not create promises from PHP, (later on) being resolved by PHP?

V8Js doesn’t allow direct creation of JS objects from PHP code, so a little JS-side helper is needed. One possibility is this:

class V8PromiseFactory
{
    private $v8;

    public function __construct(V8Js $v8)
    {
        $this->v8 = $v8;
    }

    public function __invoke($executor)
    {
        $trampoline = $this->v8->executeString(
            '(function(executor) { return new Promise(executor); })');
        return $trampoline($executor);
    }
}

… it can be used to construct an API method that returns a Promise like this:

$v8 = new V8Js();
$promiseFactory = new V8PromiseFactory($v8);

$v8->theApiCall = function() use ($promiseFactory) {
    return $promiseFactory(function($resolve, $reject) {
        // do something (maybe async) here, finally call $resolve or $reject
        $resolve(42);
    });
};

$v8->executeString("
    const p = PHP.theApiCall();
    p.then(function(result) {
        var_dump(result);
    });
");

this code

  • initializes V8, V8Js and the V8PromiseFactory first
  • then attaches an API call named theApiCall, that uses $promiseFactory and passes it an executor that immediately resolves to the integer 42.
  • then executes some JavaScript code that calls theApiCall and attaches a then callback that simply dumps the value (42)

V8PromiseFactory::__invoke should cache $trampoline if it is used to create a lot of promises.

This code requires V8Js with pull request #219 applied to function properly.
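
Seen from the JavaScript side alone, the trampoline is an ordinary function that hands its argument to the Promise constructor. The sketch below runs in plain Node.js without V8Js and uses a JavaScript executor in place of the PHP closure; keeping the resolve function around to call it later is an assumption about how a deferred resolution could look, the names are made up.

```javascript
// Same function the PHP class evaluates via executeString()
const trampoline = function (executor) {
    return new Promise(executor);
};

// The executor runs immediately; it may resolve right away or keep
// the resolve function around to settle the promise later on.
let resolveLater;
const pending = trampoline(function (resolve, reject) {
    resolveLater = resolve;
});

pending.then(function (result) {
    console.log('resolved with', result);
});

// ... some time later
resolveLater(42);
```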

thoughts on phpspec

As I’ve recently been asked whether I had used phpspec and had to say no, today I finally gave it a try (doing the Bowling Kata) …

phpspec has some class and method templating built into it. If for example a test fails due to a missing function, it asks whether it should create one (that does nothing at all). This is nice but IMHO breaks the workflow a bit as you have to move the cursor to the terminal window and answer the question. You don’t just hit Shift+F10, see “red” in the panel and then hit Alt+Enter in PhpStorm and choose to create the method (which is my way of working with phpunit).

I like the well readable test code that can be written with it like

$this->getScore()->shouldReturn(150)

… yet that code also shows what I hate about it. Since $this actually is the test class, having to call the method under test on it feels strange (or even wrong), and PhpStorm has no support for that either … so no auto-completion here.

Calling methods of the SUT directly on $this gets even more messy once you add test helper methods like

function it_grants_spare_bonus()
{
    $this->rollSpare();
    $this->roll(5);
    $this->rollMany(17, 0);
    
    $this->getScore()->shouldBe(20);
}

… here only roll is a method of the SUT, rollSpare and rollMany are just helper methods.

After all I’m still torn, I like the readability, but the rest still feels strange and I miss native support in PhpStorm.

happy & lucky numbers

The other day I paired with the guys from @solutiondrive and @niklas_heer, we had a fun evening learning about happy numbers, shared PhpStorm knowledge, tried Codeception etc. Actually we didn’t even finish the “Happy Numbers” Kata, since we only wrote the classifying routine, not the loop generating the output.
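
For reference, a number is happy if repeatedly replacing it by the sum of the squares of its digits eventually reaches 1; unhappy numbers end up in a cycle instead. Below is a small sketch of both the classifying routine and the missing output loop. It is written in JavaScript and is not the code from that evening, the function names are made up.

```javascript
// Sum of the squares of the decimal digits, e.g. 19 -> 1*1 + 9*9 = 82
function digitSquareSum(n) {
    let sum = 0;
    while (n > 0) {
        const digit = n % 10;
        sum += digit * digit;
        n = Math.floor(n / 10);
    }
    return sum;
}

// Follows the chain until it hits 1 (happy) or a value seen before (cycle, unhappy).
function isHappy(n) {
    const seen = new Set();
    while (n !== 1 && !seen.has(n)) {
        seen.add(n);
        n = digitSquareSum(n);
    }
    return n === 1;
}

// The loop generating the output: all happy numbers from 1 to limit.
function happyNumbersUpTo(limit) {
    const result = [];
    for (let n = 1; n <= limit; n++) {
        if (isHappy(n)) {
            result.push(n);
        }
    }
    return result;
}

console.log(happyNumbersUpTo(32)); // [ 1, 7, 10, 13, 19, 23, 28, 31, 32 ]
```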

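On my way home I kept googling and also found out about Lucky Numbers. Lucky numbers are natural numbers, recursively filtered by a sieve that eliminates numbers based on their position: the first pass drops every second number, and from then on the next surviving number tells the elimination offset of the following pass (the second number is 3, so every third remaining number is dropped, and so on).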

So I immediately came up with another Kata: generating those numbers.
My constraint: no upper limit, i.e. use PHP’s Generator instead

… so I came up with the idea to implement the sieve itself as a Generator that reads from an injected Generator, filters as needed and yields the result. The first “sieve generator” is fed from another generator that simply yields all natural numbers. The second one is fed from the first one and so on. The generator-into-generator injection is handled by yet another generator … turns out: it works, but doesn’t look so nice.
The outer generator cannot simply inject generators endlessly (since they are actually instantiated), so injection has to be deferred. That however dilutes the self-contained sieve generator :-(
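
The chained-generator idea can be sketched with JavaScript generators, which behave much like PHP’s. This is not my PHP solution from that evening, just a minimal illustration of the approach; naturals, sieve, luckyNumbers and take are names made up for it. The outer generator defers the injection by wrapping the stream in one more sieve only after it has yielded the number that defines that sieve’s step, and it passes the current position along so the new sieve keeps counting correctly.

```javascript
// Yields 1, 2, 3, ... without an upper limit.
function* naturals() {
    let n = 1;
    while (true) {
        yield n++;
    }
}

// Passes through everything from `source`, except items whose 1-based position
// is a multiple of `step`. `position` is the position of the next item of `source`.
function* sieve(source, step, position) {
    for (const value of source) {
        if (position % step !== 0) {
            yield value;
        }
        position++;
    }
}

// Outer generator: yields lucky numbers and stacks one more sieve per number.
function* luckyNumbers() {
    let stream = sieve(naturals(), 2, 1); // first pass: drop every second number
    yield stream.next().value;            // 1
    let position = 2;

    while (true) {
        // The item at `position` can no longer be removed, since all later
        // sieves use a larger step; it is lucky and defines the next step.
        const lucky = stream.next().value;
        yield lucky;
        position++;
        stream = sieve(stream, lucky, position);
    }
}

// Helper: pull the first `count` values from an iterator.
function take(iterator, count) {
    const values = [];
    for (let i = 0; i < count; i++) {
        values.push(iterator.next().value);
    }
    return values;
}

console.log(take(luckyNumbers(), 10)); // [ 1, 3, 7, 9, 13, 15, 21, 25, 31, 33 ]
```

Every yielded number adds one more generator layer, so producing many numbers gets slower over time; the sketch favours the self-contained sieve over speed.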

Anyway, it was a good exercise on PHP’s generators. I think I’ll give it another try soon, again with generators but with a different approach.