Secure Real-Time WebSockets With RabbitMQ

I wanted to add real-time notification support to my website.   Well not just my website but all my other native applications (Desktop/iPhone/Android).  I also needed to secure it with user authentication and authorization.

I started with a single Ruby on Rails application.  Rails has convenient built-in WebSocket support via Action Cable.  One problem with Action Cable is that it doesn’t scale.  There is a good article describing the pros and cons of Action Cable here:  https://www.ably.io/blog/rails-actioncable-the-good-and-the-bad

At LayerKeep we are using a micro-service architecture where we have different services written in different languages. Currently our inter-service communication is done via RabbitMQ.

I think RabbitMQ is awesome and when it comes to routing messages, it’s the best.  It’s also built on top of Erlang so it’s very fast. It has a plugin architecture that allows you to enable many different features.  And it’s very simple to use.  Since we are already using RabbitMQ for our inter-service communication I thought it would be nice to be able to use the same backend for handling our WebSocket connections.   

Rabbit has a nice plugin just for that using Web STOMP  (https://www.rabbitmq.com/web-stomp.html)

(* You can also use MQTT protocol instead of STOMP using https://www.rabbitmq.com/web-mqtt.html) .

Since this plugin ships in the core distribution, simply enable it with:

rabbitmq-plugins enable rabbitmq_web_stomp

Depending on how you configure stomp you could connect directly to it.  I personally like to proxy my requests through nginx.  If you are using nginx then something as simple as a location path like this should work.

location /ws {
  proxy_set_header Host $http_host;
  proxy_http_version 1.1;
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection "upgrade";
  proxy_pass     http://your_rabbitmq_host:15674;
}

Then for you website you can find a nice javascript client to connect with stomp.  I personally use https://www.npmjs.com/package/webstomp-client.  Seems to work pretty well.  I’m also using React so I wrapped it in another component to handle all the connection and subscription events.   The nice thing about having the proxy is I don’t have to worry about changing the url or port.

import webstomp from 'webstomp-client';
setupWebSocket() {
   var protocol = document.location.protocol == "https:" ? 'wss' : 'ws';
   var wspath = `${protocol}://${document.location.hostname}/ws`
   var client = webstomp.client(wspath, {debug: false});   
   client.ws.onopen = this.onOpen
   this.setState( { client: client })
}

If you just want to receive transient notifications while connected, you can connect using a temporary queue.  As soon as the connection is disconnected the queue will be removed.  This means that if a notification is sent when a user refreshes the page the message wont be sent to the user.

You can also make the queue permanent so if messages come in while the user isn’t connected they will be persisted and as soon as the user connects they will receive all the previous messages.

What about Authentication/Authorization?

I really liked the idea of using Rabbit but needed to figure out how to handle authentication and authorization.  

RabbitMQ has a default internal user/password authentication that uses its own internal user database.  We don’t want to have to manage all ours users in multiple places just for queues so we need something dynamic.  Luckily there is a RabbitMQ plugin for handling auth using http (https://github.com/rabbitmq/rabbitmq-auth-backend-http).

When you try to connect to Rabbit with a username/password it will make a request to your backend server to authenticate that username/password.  This is great but I don’t want the user to have to enter their username and password again.  I figured authenticating using an OAuth token could work well.

RabbitMQ has auth_backends that it will try in order.  For example:

auth_backends.1.authn = internal
auth_backends.1.authz = rabbit_auth_backend_ip_range
auth_backends.2       = rabbit_auth_backend_http
auth_http.http_method   = post
auth_http.user_path     = http://auth_api/auth/user
auth_http.vhost_path    = http://auth_api/auth/vhost
auth_http.resource_path = http://auth_api/auth/resource
auth_http.topic_path    = http://auth_api/auth/topic
This configuration says that when a user tries to connect it will lookup the username and password in its internal database first.  If the user exists and is authenticated it will then check the IP address to see if the request is coming from an authorized IP address.  (That part isn’t needed but you can find it here: https://github.com/gotthardp/rabbitmq-auth-backend-ip-range)
If the user isn’t authed via auth_backends.1,  it will go to auth_backends.2.
By only specifying auth_backends.2, that means it will go to http backend for both authentication and authorization.
Now that we have the configured our backend, we need to go add those routes.
Depending on how complicated your authorization logic is, you might want a separate route for all them or you can add a single route.  In Rails it would be something like this:
post '/auth/:kind', to: 'rabbit#auth'
Then add your handler which should always return a 200 status with the permission (“allow” or “deny”)
def auth
  // Get the user

  user = User.find_by(username: params["username"])
  render status: :ok, json: "deny" and return unless user

  // First call will be to /auth/user

  if params["kind"] == "user"
    token = user.access_tokens.where(token: params["password"]).first
    if token.nil? || !token.accessible?
      permission = "deny"
    end
  else
    /***
     *** Handle all the other authorizations your app
     *** requires something like
     ***/

    if params["resource"] == "exchange" and params["permission"] != "read"
      permission = "deny"
    end
  end
  render status: :ok, json: permission and return
end

Look at the http backend plugin for all the other parameters.

You can find a docker image for RabbitMQ with http auth config here:  https://github.com/frenzylabs/rabbitmq.

Setting Up a Ceph Filesystem with Kubernetes on DigitalOcean

Recently my company created an application for managing 3D printing projects, profiles, and slices. Check it out at layerkeep.com.

We wanted users to be able to keep track of all their file revisions and also be able to manage the files without having to go through the browser. To accomplish this, we decided to use Git which meant we needed a scalable filesystem.

The first thing we did is setup a Kubernetes cluster on DigitalOcean.

Currently DigitalOcean only provides Volumes that are ReadWriteOnce. Since we have multiple services that need access to the files (api, nginx, slicers), we needed to be able to mount the same volume with ReadWriteMany.

I decided to try s3fs with DigitalOcean Spaces since they are S3-compatible object stores.  I setup the CSI from https://github.com/ctrox/csi-s3. I tried both the s3fs and goofys mounter. Both worked and both were way too slow.  Most of our APIs require accessing the filesystem multiple times and each access took between 3-15 seconds so I moved on to Ceph.

Ceph Preparation:

There is a great storage manager called Rook (https://rook.github.io/) that can be used to deploy many different storage providers to Kubernetes.
** Kubernetes on DigitalOcean doesn’t support FlexVolumes so you need to use CSI instead.

Hardware Requirements.
You can check the ceph docs to see what you might need. http://docs.ceph.com/docs/jewel/start/hardware-recommendations/#minimum-hardware-recommendations

Create the Kubernetes Cluster

Follow the directions here to create the cluster: https://www.digitalocean.com/docs/kubernetes/how-to/create-clusters/

** Initially I tried a 3 node pool each with 1 CPU and 2GB of memory but it wasn’t enough. It needed more CPU on startup. I changed each node to have 2 CPUs and 2 GBs of memory which worked.

We’ll make sure to keep all ceph services constrained to this pool by naming it “storage-pool” (or whatever name you want) and adding a node affinity using that name later.

Cluster Access

Make sure you followed DO directions to accessing the Cluster with kubectl. (https://www.digitalocean.com/docs/kubernetes/how-to/connect-to-cluster/)

You also might want to add a Kubernetes Dashboard. (https://github.com/kubernetes/dashboard)

SSH:
Right now it doesn’t look like you can ssh into the droplets that DigitalOcean creates when you create a node pool. I wanted to have access just in case so I went to the droplets section and reset the root password for each of them. I was then able to add my ssh key and disable root login. I recommend doing this before adding any services.

Create Volumes
Go to the Volumes section in DigitalOcean dashboard. We want to create a volume for each node in the node pool we just created. Don’t format it. Then attach it to the correct droplet. Remember that volumes can only be increased in size (not decreased) without having to create a new one.

Create the Ceph Cluster

Clone down the Rook repository or just copy down the ceph directory from: https://github.com/rook/rook/tree/release-1.0/cluster/examples/kubernetes/ceph

cd cluster/examples/kubernetes/ceph

Modify the cluster.yaml file.

This is where we’ll add the node affinity to run the ceph cluster only on nodes with the “storage-pool” name.

placement:
  all:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
        - key: doks.digitalocean.com/node-pool
          operator: In
          values:
          - storage-pool
    podAffinity:
    podAntiAffinity:
    tolerations:
    - key: storage-pool
      operator: Exists

 

There are also other configs that are commented out that you might need to change. For example, if your disks are smaller than 100 GB you’ll need to uncomment the ‘databaseSizeMB: “1024”‘.

Modify the filesystem.yaml file if you want. (Filesystem Design)
Once you’re done configuring you can run:


kubectl apply -f ceph/common.yaml
kubectl apply -f ceph/csi/rbac/cephfs/
kubectl apply -f ceph/filesystem.yaml
kubectl apply -f ceph/operator-with-csi.yaml
kubectl apply -f ceph/cluster.yaml

If you want the ceph dashboard you can run:
kubectl apply -f ceph/dashboard-external-https.yaml

Your operator should create your cluster. You should see 3 managers, 3 monitors, and 3 osds. Check here for issues: https://rook.github.io/docs/rook/master/ceph-common-issues.html

Deploy the CSI

https://rook.github.io/docs/rook/master/ceph-csi-drivers.html

We need to create a secret to give the provisioner permission to create the volumes.

To get the adminKey we need to exec into the operator pod. We can print it out in one line with:

POD_NAME=$(kubectl get pods -n rook-ceph | grep rook-ceph-operator | awk '{print $1;}'); kubectl exec -it $POD_NAME -n rook-ceph ceph auth get-key client.admin

Create a secret.yaml file:

apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: default
data:
  # Required if provisionVolume is set to true
  adminID: admin
  adminKey: {{ PUT THE RESULT FROM LAST COMMAND }}

Create the CephFS StorageClass.

We’ll need to modify the example storageclass in ceph/csi/example/cephfs/storageclass.yaml.

The storageclass.yaml file should look like:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs
provisioner: cephfs.csi.ceph.com
parameters:
  # Comma separated list of Ceph monitors
  # if using FQDN, make sure csi plugin's dns policy is appropriate.
  monitors: rook-ceph-mon-a.rook-ceph:6789,rook-ceph-mon-b.rook-ceph:6789,rook-ceph-mon-c.rook-ceph:6789

  # For provisionVolume: "true":
  # A new volume will be created along with a new Ceph user.
  # Requires admin credentials (adminID, adminKey).
  # For provisionVolume: "false":
  # It is assumed the volume already exists and the user is expected
  # to provide path to that volume (rootPath) and user credentials (userID, userKey).
  provisionVolume: "true"

  # Ceph pool into which the volume shall be created
  # Required for provisionVolume: "true"
  pool: myfs-data0

  # The secrets have to contain user and/or Ceph admin credentials.
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default

reclaimPolicy: Retain
allowVolumeExpansion: true

Change the storage class name to whatever you want.

*If you changed metadata.name in filesystem.yaml to something other than “myfs” then make sure you update the pool name here.

Create the PVC:

Remember that Persistent Volume Claims are accessible only from within the same namespace.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-pv-claim
spec:
  storageClassName: csi-cephfs
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi

Use the Storage

Now you can mount your volume using the persistent volume claim you just created in your Kubernetes resource.  An example Deployment is:

apiVersion: apps/v1
kind: Deployment
metadata:
 name: webserver
 namespace: default
 labels:
   k8s-app: webserver
spec:
 replicas: 2
 selector:
   matchLabels:
     k8s-app: webserver
 template:
   metadata:
     labels:
       k8s-app: webserver
   spec:
     containers:
     - name: web-server
       image: nginx
       volumeMounts:
       - name: my-persistent-storage
         mountPath: /var/www/assets
     volumes:
     - name: my-persistent-storage
       persistentVolumeClaim:
         claimName: my-pv-claim

 

Both deployment replicas will have access to the same data inside /var/www/assets.

ADDITIONAL TOOLS

You can also test and debug the filesystem using the Rook toolbox.  (https://rook.io/docs/rook/v1.0/ceph-toolbox.html).

First start the toolbox with:  kubectl apply -f ceph/toolbox.yaml

Shell into the pod.

TOOL_POD=$(kubectl get pods -n rook-ceph | grep tools | head -n 1 | awk '{print $1;}'); kubectl exec -it $TOOL_POD -n rook-ceph /bin/bash

Run Ceph commands:  http://docs.ceph.com/docs/giant/rados/operations/control/

Validate the filesystem is working by mounting it directly into the toolbox pod.
From: https://rook.io/docs/rook/v1.0/direct-tools.html

 

# Create the directory
mkdir /tmp/registry

# Detect the mon endpoints and the user secret for the connection
mon_endpoints=$(grep mon_host /etc/ceph/ceph.conf | awk '{print $3}')
my_secret=$(grep key /etc/ceph/keyring | awk '{print $3}')

# Mount the file system
mount -t ceph -o mds_namespace=myfs,name=admin,secret=$my_secret $mon_endpoints:/ /tmp/registry

# See your mounted file system
df -h

Try writing and reading a file to the shared file system.

echo "Hello Rook" > /tmp/registry/hello
cat /tmp/registry/hello

# delete the file when you're done
rm -f /tmp/registry/hello

Unmount the Filesystem

To unmount the shared file system from the toolbox pod:

umount /tmp/registry
rmdir /tmp/registry

No data will be deleted by unmounting the file system.

Monitoring

Now that everything is working you should add monitoring and alerts.

You can add the Ceph dashboard and/or Prometheus/Grafana to monitor your filesystem.
http://docs.ceph.com/docs/master/mgr/dashboard/
https://github.com/rook/rook/blob/master/Documentation/ceph-monitoring.md

Include System Libraries Using Swift Package Manager Or CocoaPods

I’m currently using swift package manager to build a framework for an iOS project.  Why? Because I like the clean, modular approach, no need to have an Xcode project and I find it faster for CI testing.

Unfortunately swift package manager doesn’t work for the iOS project.  And since my team is already familiar with CocoaPods that is what the iOS project is using.

Now I will explain how I included a system library inside a Swift framework that is then used in an iOS project with CocoaPods.

I’ll show you how I setup CommonCrypto to use Swift Package Manager and CocoaPods.

For Swift Package Manager:

  • A. Create a git repo for the swift package wrapper around the system library.   
    1. Add a module.modulemap file to the repo and add the system header you are wrapping.
      module CCommonCrypto [system] {
      header "/usr/include/CommonCrypto/CommonCrypto.h"
      export *
      }
    2.  Add Package.swift file to repo
      import PackageDescription
      let package = Package(
      name: "CCommonCrypto"
      )
    3.  Commit the files and add a tag to them (Swift Packages need a tag to be used)
      git tag 0.0.1
      git push origin 0.0.1
      (you can overwrite previous tag by using the "-f" flag on both of those commands)
  • B. Create another repo for the Swift Package that will use the system package wrapper.
    1. Create the Package.swift file and add the repo created in step 1 as a dependency.

      import PackageDescription
      let package = Package(
      name: "CommonCrypto",
      targets: [
      Target(name: "CommonCrypto"),
      ],
      dependencies: [
      .Package(url: "https://github.com/kmussel/ccommoncrypto.git", "0.0.1"),
      ]
      )
    2. Add the “Sources” folder and a subfolder named the same thing you named your target in the Package.swift file
      Sources -> CommonCrypto
    3. Inside the subfolder (CommonCrypto in this case) add all your swift files that you want to use.
      1.  In each file that you want to use the dependency package import that package using the name from its Package.swift file
        import CCommonCrypto

Now you can create another Swift Package and include Repo 2 as a dependency and when you run “swift build” it will download and include the dependencies

You can also run “swift package generate-xcodeproj” if you want to use the Xcode project.

Getting the Swift Package to Work with CocoaPods:

  • C.  Repo 1 is not needed.  We just need to update Repo 2 with necessary CocoaPod files.
    1.  Add another subfolder under the Subfolder we created in Step B.2. The name of the subfolder should match the name of the Package from step A.2.  This is so CocoaPods will recognize the import statement from step B.3.1.
    2. Copy the module.modulemap file from step A.1. into this subfolder.
    3. Create another subfolder in the root directory.  It can be named anything you want but I called it “CocoaPods” since it is only used for that.

    4. Inside the “CocoaPods” directory you will create multiple subfolders for each target you want this package to target.
      ie. iphoneos, iphonesimulator, macosx, etc

    5. Create the files module.map under each of the subfolders you just created pointing to the correct system headers
      CocoaPods -> iphoneos -> module.map
      module CCommonCrypto [system] {
      header "/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS10.1.sdk/usr/include/CommonCrypto/CommonCrypto.h"
      export *
      }
    6. Create the Podspec file for it
      1. Our source_files will just be the .swift files inside the Sources directory.  These files are added to the Xcode project.
        s.source_files = "Sources/**/*.swift"
      2. Normally any file not in the source_files will be removed but we need the project to be able to access our “CocoaPods” directory to know which files to include.  To keep the “CocoaPods” directory without adding it to the project we use the “preserve_paths” command to keep the “CocoaPods” directory.
        s.preserve_paths = 'CocoaPods/**/*'
      3. We then tell Xcode where the include paths are for each sdk.  CocoaPods installs it in the PODS_ROOT directory and under the subdirectory named the same as the name of this Podspec.
        s.pod_target_xcconfig = {
        'SWIFT_INCLUDE_PATHS[sdk=iphoneos*]' => '$(PODS_ROOT)/CommonCrypto/CocoaPods/iphoneos',
        'SWIFT_INCLUDE_PATHS[sdk=iphonesimulator*]' => '$(PODS_ROOT)/CommonCrypto/CocoaPods/iphonesimulator',
        'SWIFT_INCLUDE_PATHS[sdk=macosx*]' => '$(PODS_ROOT)/CommonCrypto/CocoaPods/macosx'
        }
      4. The next thing is that the header file in each of the module.map files probably won’t be the same for each user.  We need to change it for the user when they install via pod install.  We create a script to do this and execute it using the prepare_command in cocoa pods.  For example, my path is “/Applications/Xcode-beta.app/Contents/Developer/” so this script replaces the default to that.  I grabbed the script from this site. But should probably should modify the script to handle sdk versions too.
        s.prepare_command = <<-CMD
        ./CocoaPods/injectXcodePath.sh
        CMD

 Now you can include Repo 2 in the Podfile of your iOS project.

The entire Podspec:

You can see the entire CommonCrypto example here:

https://github.com/kmussel/commoncrypto
https://github.com/kmussel/ccommoncrypto

Google Universal Analytics and Tag Manager with Enhanced Ecommerce

Google Analytics! It used to be a simple add this snippet of code to your page and you’re finished. Now depending on the options you want there could be a lot more work to do. You could be using classic google analytics or universal analytics.  You could be using the ecommerce plugin or enhanced ecommerce or none.  You could be using the data layer or macros. You could be using any combination of those with google tag manager.  And depending on which combination you use you will have to code it differently.

I’ll show you how I setup analytics using Universal Analytics and Tag Manger with Enhanced Ecommerce and the Data Layer.

Finding the correct documentation for the analytics combination I’m using was frustrating.  Here is a list of the docs that were helpful to me.

First off if you haven’t used Google Tag Manager I would read:
Getting Started and
How It Works

Basically once the javascript snippet is deployed to the site, it allows non-developers to manage what data they want to collect from the site without involving the developers.

For example.   Let’s say a website has a button to login.  This button has an ID of “login-btn”.  Now a user can use the tag manager to add a tag called “Log In” with a rule that when a user clicks on an element with an ID of “login-btn” it will fire the tag.  The rule would look like this: {{element id}} equals login-btn.  Depending on the type of tag you use you might also need to add to the rule {{event}} equals gtm.click.

Now your site will start collecting data every time a user clicks on the login button without your developer having to make any changes to the code.

** Once you cross over to the Ecommerce world, a developer is going to be required.

The Basic Steps:

Google Tag Manager

The data that is used by the Tag Manager and then sent to google analytics is retrieved from the Data Layer or Macros.   The recommended approach is to use the Data Layer.

I recommend going over the development docs if you haven’t already. https://developers.google.com/tag-manager/devguide

In their docs they mention two ways to populate the data layer and fire the tags:

  1. Declare all needed information in the data layer above the container snippet
  2. Use HTML Event Handlers

If you have the traditional Multi Page Application you most likely will use option 1 .  If you have a Single Page Application (SPA) you will need to use option 2.   Personally I’ve been using this on a SPA with Angular so I never used option 1.

If you are using option 1, you would create a tag in Google Tag Manager with these attributes:

Tag type : Universal Analytics
Track type : Pageview
Enable Enhanced Ecommerce Features: true
Use Data Layer: true
Basic Settings – Document Path: {{url path}}
Firing Rule: {{event}} equals gtm.js

In this case you would make sure the all the data was in the dataLayer before the snippet.

dataLayer = [{
 'ecommerce': {
  'impressions': [{
   'name': productObj.name,                      // Name or ID is required.
   'id': productObj.id,
   'price': productObj.price,
   'brand': productObj.brand,
   'category': productObj.cat,
   'variant': productObj.variant,
   'list': 'Search Results',
   'position': 1
  }]
 }
}];

For option 2 you would create a tag like this:

Tag type : Universal Analytics
Track type : Event
Event Category: Ecommerce
Event Action: Product Click
Enable Enhanced Ecommerce Features: true
Use Data Layer: true
Basic Settings – Document Path: {{url path}}
Firing Rule: {{event}} equals productClick

Then in your javascript you would send the data by pushing it to the dataLayer object like this:

dataLayer.push({
'event': 'productClick',
'ecommerce': {
  'click': {
    'actionField': {'list': 'Search Results'},      // Optional list property.
    'products': [{
      'name': productObj.name,                      // Name or ID is required.
      'id': productObj.id,
      'price': productObj.price,
      'brand': productObj.brand,
      'category': productObj.cat,
      'variant': productObj.variant
    }]
  }
}
});


I recommend using a generic event to help keep things manageable. I’m using the angular javascript framework with angulartics (http://luisfarzati.github.io/angulartics/).  The readme here (https://github.com/luisfarzati/angulartics#for-google-tag-manager)  explains what tags, rules and macros need to be setup.  Even if you aren’t using angular the setup is the same for a generic event.

Also for both of those tags, make sure you check both “Enable Enhanced Ecommerce Features”  and  “Use Data Layer”.

** If you are using angulartics you’ll need to make sure it handles ecommerce. You just need to make sure one of the top level keys is ‘ecommerce’.   I just over wrote the module and did this:

$analyticsProvider.registerEventTrack(function(action, properties){
  var dataLayer = window.dataLayer = window.dataLayer || [];
  var data = {
    'event': 'interaction',
    'target': properties.category,
    'action': action,
    'target-properties': properties.label,
    'value': properties.value,
    'interaction-type': properties.noninteraction
  };
  if(properties.ecommerce)
    data['ecommerce'] = properties.ecommerce;

  dataLayer.push(data);
});

Make sure you look at the docs for google tag enhanced ecommerce.  It will show you what attributes to use when you push your data to the data layer.  https://developers.google.com/tag-manager/enhanced-ecommerce

Viewing the data you sent in google analytics.

Go to Reporting and go to the Conversions section.  Under there you’ll see the Ecommerce reports.

* When you push your products to the dataLayer the id field maps to the Product SKU.

Dynamic Remarketing

In Google Tag Manager for each tag you want to use this feature for check the “Enable Display Advertising Features” box.
In Google Analytics, go to the Admin section. Select the property you want and then click on “Dynamic Attributes”. Then for Step 2 for Product ID select “Product SKU”. https://support.google.com/analytics/answer/6002231

Debugging Locally

Click on each tag you created to edit it. Under the “More settings” click on “Cookie Configuration”. For Cookie Domain type “none”.
You can also setup another view in Google Analytics and in Tag Manager you can create a macro for the Tag’s Tracking Code like here.

A couple useful articles about Google Tag Manager Macros:
http://www.simoahava.com/analytics/macro-guide-google-tag-manager/
http://www.simoahava.com/analytics/macro-magic-google-tag-manager/

I know this can be confusing and there is a lot of questions that I didn’t answer. The biggest thing for me is to know where I can find the answers so hopefully this article along with the links I posted will help you forge ahead through the world of google analytics.

Inter-Service Communication using Client Certificate Authentication

I love the Service Oriented Architecture. But like all things, security is needed. In this case to make sure that one service has permission to talk to another service. There are a few different ways to obtain this security but I really like using SSL certificates. It’s very simple to add other services and your webserver (apache, nginx) will handle the validation for you.

I wrote an article on getting this setup here: http://www.sitepoint.com/inter-service-communication-using-client-certificate-authentication/

AngularJS Navigation Menu

Once again I’m in need of creating a navigation menu with drop downs. This time I’ve been working with Angular and using Foundation 4 for styling.

The Final Output

Because who doesn’t like dessert first

Here is the fiddle. http://jsfiddle.net/kmussel/evXFZ/

The Styling

The good news is Foundation basically has all the styles already done for you under the class of “top-bar” and “top-bar-section”. But since I needed to have this nav under the top bar I just copied most of the top-bar styles and put it under the “nav-menu” class. You can see some of the styles I copied over at the bottom of the css panel in the fiddle.

You can also test out using only foundation’s top-bar styling by uncommenting out the html in the fiddle and commenting out the other “nav” element. And then change the directive name from “navMenu” to “anavMenu”. Make sure the result panel is wide enough though cause foundation changes the css on small screen sizes when it comes to the top-bar.
Basically if all you do is include foundation’s css the only thing you have to make sure you do is change your html to have a “top-bar” class and then a nested “top-bar-section” class.

The AngularJS Goodness

If you’ve never worked with angular I definitely recommend it. But it can be frustrating especially dealing with directives (which is what we will be doing here). I think this guy is spot on with how learning angular will make you feel.

So I basically knew what I wanted to be able to do and that is in html write this:

<nav menu-data="menu"></nav>

and it spit out my entire navigation.

And so I could resuse the directive with other controllers I added an attribute to it that would reference the scope variable where the menu data would be which led me to have this:

<nav menu-data="menu" menu-data="menu"></nav>

The first directive I create then is the navMenu one.


app.directive('navMenu', ['$parse', '$compile', function($parse, $compile)
{
    return {
        restrict: 'E', //Element
        scope:true,
        link: function (scope, element, attrs)
        {
            scope.$watch( attrs.menuData, function(val)
            {
                var template = angular.element('<ul id="parentTreeNavigation"><li ng-repeat="node in ' + attrs.menuData + '" ng-class="{active:node.active && node.active==true, \'has-dropdown\': !!node.children && node.children.length}"><a ng-href="{{node.href}}" ng-click="{{node.click}}" target="{{node.target}}" >{{node.text}}</a><sub-navigation-tree></sub-navigation-tree></li></ul>');
               var linkFunction = $compile(template);
               linkFunction(scope);
               element.html(null).append( template );
            }, true );
      }

}]);

In here i create the initial template. The template contains 2 other directives. Angular’s built-in “ng-repeat” and my other “sub-navigation-tree”. This is put inside a watch statement so if the menu changes it will be updated. The template is then compiled and linked to current scope and then appended to the current element.
This then compiles and links all the directives within that.

The Inner Directive handles the actual drop down part.



.directive('subNavigationTree', ['$compile', function($compile)
{
    return {
        restrict: 'E', //Element
        scope:true,
        link: function (scope, element, attrs)
        {
            scope.tree = scope.node;

            if(scope.tree.children && scope.tree.children.length )
            {
                var template = angular.element('<ul class="dropdown "><li ng-repeat="node in tree.children" node-id={{node.' + attrs.nodeId + '}}  ng-class="{active:node.active && node.active==true, \'has-dropdown\': !!node.children && node.children.length}"><a ng-href="{{node.href}}" ng-click="{{node.click}}" target="{{node.target}}" ng-bind-html-unsafe="node.text"></a><sub-navigation-tree tree="node"></sub-navigation-tree></li></ul>');

                var linkFunction = $compile(template);
                linkFunction(scope);
                element.replaceWith( template );
            }
            else
            {
                element.remove();
            }
        }
     };
}]);

Here you see the same thing with the template but it’s referencing itself now. The template is being linked to the current scope. This means that the directives within the template will be run using that scope. So ng-repeat=”node in tree.children” will be looping over scope.tree.children.
The ng-repeat then creates new scopes for each item and gives that scope access to the object through whatever you called it in your ng-repeat. In our case “node”. The scope that is created in the ng-repeat is the one that is passed to our directive.
Then if the current scope has children it does it all again with linking the template up with the current scope. It then replaces the current element with the template or removes it all together if there are no children.

Asynchronous Request within UITableViewCell

The Problem:
I have an image that needs to be loaded for each table cell. Unfortunately this causes the UI to become unresponsive for a couple seconds when you click to go to that table view.

The Solution:
Make the call to get the image data asynchronous.

First in our tableview delegate method, cellForRowAtIndexPath, we’ll use NSURLConnection to create the connection.


NSURLRequest *req = [[NSURLRequest alloc] initWithURL:[NSURL URLWithString:@"http://www.test.com/image_url"]];
NSURLConnection *imgCon = [NSURLConnection alloc];
..... /** Code added later  **/
[imgCon initWithRequest:req delegate:self startImmediately:YES];

There are 2 delegate methods that you need to have to make this work. (There are others that you’ll probably want to implement as well. Apple Docs)


-(void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data;
-(void)connectionDidFinishLoading:(NSURLConnection *)connection;

The didReceiveData method will probably be called multiple times. So we need to create a variable to keep on appending the data to it. Since we are creating multiple connections we need to map each connection to the data. But we also need to know which table cell we need to update once we have all the data.
So we create 2 NSMutableDictionary objects. One to map the table cells indexPath to the data and one to map the connection to the indexPath.


NSMutableDictionary *indexPathImgData;
NSMutableDictionary *connectionIndexPath;

The problem that arrises from trying to map the connection to the indexPath is that NSURLConnection doesn’t conform to the NSCopying Protocol. To get around this we need to use CFMutableDictionary. When adding values to a CFMutableDictionary “the keys and values are not copied—they are retained” (Apple Docs on CFMutableDictionary)

So to add the values to our NSMutableDictionary we will use the CFDictionaryAddValue function.

Now our cellForRowAtIndexPath method will look like this:


NSURLRequest *req = [[NSURLRequest alloc] initWithURL:[NSURL URLWithString:@"http://www.test.com/image_url"]];
NSURLConnection *imageCon = [NSURLConnection alloc];

CFDictionaryAddValue((CFMutableDictionaryRef)self.connectionIndexPath, imageCon, indexPath);
NSMutableData *imageData = [NSMutableData dataWithCapacity:0];
CFDictionaryAddValue((CFMutableDictionaryRef)self.indexPathImgData, indexPath, imageData);

[imageCon initWithRequest:req delegate:self startImmediately:YES];

And our NSURLConnection Delegate methods will look something like this:


- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data {

    [[self.indexPathImgData objectForKey:[self.connectionIndexPath objectForKey:connection]] appendData:data];
}

-(void)connectionDidFinishLoading:(NSURLConnection *)connection
{
    NSIndexPath *temp = [self.connectionIndexPath objectForKey:connection];
    [self.tableView reloadRowsAtIndexPaths:[NSArray arrayWithObject:temp] withRowAnimation:UITableViewRowAnimationNone];

}

Now in your cellForRowAtIndexPath method you’ll want to do a check if the dictionary object has data for that indexPath and if so display it otherwise you’ll want to create a new connection.

iPhone Push Notifications

iPhone Push Notification Testing:

Testing out the interaction after receiving a remote push notifications on the iphone can be very annoying. Apple does not make it easy to setup remote push notifications. Plus development time is much slower if you try to build and test this interaction while waiting for remote notifications. And unfortunately Apple doesn’t provide a way to test out remote notifications on the simulator.

So instead of using a remote notification I used a local notification for testing this interaction.
In your UIApplication delegate instead of using this delegate method:


- (void)application:(UIApplication *)application 
     didReceiveRemoteNotification:(NSDictionary *)userInfo

I used this:


- (void)application:(UIApplication *)application 
        didReceiveLocalNotification: (UILocalNotification *)notification

You can then schedule a local notification (Apple Docs). This will then simulate receiving a notification when you are in the app.

To simulate the notification when the app isn’t in the foreground I created a local notification inside the delegate method:


-(void)applicationDidEnterBackground:(UIApplication *)application.  

I set it up so as soon as you closed the app the notification would fire showing the default alert box.

Here is the code I used to create the local notification inside the DidEnterBackground method.


NSDate *nowDate = [NSDate date];
NSTimeInterval seconds = 0;
NSDate *newDate = [nowDate dateByAddingTimeInterval:seconds];

UILocalNotification *localNotif = [[UILocalNotification alloc] init];
if (localNotif != nil)
{
  localNotif.fireDate = newDate;
  localNotif.timeZone = [NSTimeZone defaultTimeZone];

  localNotif.alertBody = @"You have new notifications";
  localNotif.alertAction = NSLocalizedString(@"View", nil);

  localNotif.soundName = UILocalNotificationDefaultSoundName;
  localNotif.applicationIconBadgeNumber = 2;

  NSDictionary *infoDict = [NSDictionary dictionaryWithObject:@"2" 
       forKey:@"NewNotifications"];
  localNotif.userInfo = infoDict;

  [[UIApplication sharedApplication] scheduleLocalNotification:localNotif];
  [localNotif release];
}


When you click the “View” button on the alert box, the application will come to the foreground and the didReceiveLocalNotification will be called. You can then do what you need to do based on what data you passed to it via the NSDictionary object in the local notification.

So the main difference in your code between the local and remote notification is that in the local notification you’ll have to access the data you sent to it with [notification userInfo] and in the remote notification, the dictionary object is what is passed to it.

CoreJS

Server-side Javascript using Google’s V8 Engine.

Recently I started working on an open source project with my good friend, Wess Cope, who is more passionate about javascript than anyone I’ve ever met. It was this that drove the want to be able to use javascript for everything. So we built CoreJS which you can find at https://github.com/frenzylabs/CoreJS.

We used http://wiki.commonjs.org/wiki/CommonJS#Low-level_APIs for a reference on what should be implemented in CoreJS.

Aren’t there already server-side javascript frameworks?
The main one out there is nodejs. Overall it’s pretty good but the main issue we had with it is that it forces everything to be asynchronous. Having it be asynchronous is great but we wanted to have the flexibility to be both. There are definitely times when you don’t want something to be asynchronous.
So what we ended up doing is setting up our functions to be asynchronous when there was a callback and synchronous otherwise.

For example our HTTP post request:


//Synchronous
var data = Http.post("/path/to/file", {arg:"arg1", arg2:"arg2"});
//Asynchronously
Http.post("/path/to/file", {arg:"arg1", arg2:"arg2"}, function(data){  
                     print(data); 
           });

CoreJS also utilizes the event based model, using LibEvent, and threading.

It’s definitely new and not complete yet but check it out and give us some feedback. We tried to add some decent documentation on how it all works cause we know it was very annoying to even try and build this with the lack of documentation. https://github.com/frenzylabs/CoreJS/wiki

Crawling Web Pages and Creating Sitemaps

Creating a Sitemap Based on all the Links within a Website

I built this web crawler because I wanted a way to create a sitemap of this website I was building. I know there are a few websites out there that will do this for you but I didn’t want to rely on someone else and I wanted to change a few things. So in order to do this I used php and cURL.
I started out creating a class for the crawler. When I create a new crawler class I pass in the url of the website I want to start with. This also uses cURL to access the webpage and get the content and headers. Inside this class are also methods to get all the links of a page, the page title, the entire content, just the body content, and the headers. But you could easily add more to say grab all the images on a page.

The Crawler Class

  

class Crawler {
  protected $markup='';
  protected $httpinfo='';

  public function __construct($uri, $justheaders=0){
    $output = $this->getMarkup($uri, $justheaders);
    $this->markup = $output['output'];
    $this->httpinfo = $output['code'];
  }

  public function getMarkup($uri, $justheaders) {
    $ch = curl_init($uri);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    if($justheaders){
      curl_setopt($ch, CURLOPT_NOBODY, 1);
      curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
      curl_setopt($ch, CURLOPT_FAILONERROR, 1);
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);

    $output['output'] = curl_exec($ch);
    $output['code'] = curl_getinfo($ch);
    curl_close($ch);
    return $output;
  }

  public function get($type){
    $method = "_get_{$type}";
    if (method_exists($this, $method)){
      return call_user_method($method, $this);
    }
  }

  protected function _get_info(){
    return $this->httpinfo;
  }

  protected function _get_links(){
    if(!empty($this->markup)){
      preg_match_all('/<a(?:.*?)href=(["|\'].*?["|\'])(.*?)>(.*?)\<\/a\>/i',
                               $this->markup, $links);
      return !empty($links[1]) ? array_flip(array_flip($links[1])) : FALSE;
    }
  }

  protected function _get_body(){
    if(!empty($this->markup)){
      preg_match('/\<body\>(.*?)\<\/body\>/msU', $this->markup, $body);
      return $body[1];
    }
  }
  protected function _get_content(){
    if(!empty($this->markup)){
      return $this->markup;
    }
  }

  protected function _get_pagetitle() {
    if (!empty($this->markup)){
     preg_match_all('/<title>(.*?)\<\/title\>/si', $this->markup, $pagetitles);
     return !empty($pagetitles[1]) ? $pagetitles[1] : FALSE;
    }
  }
}


After this I create a recursive function that will follow each of the links. Each time I call this function I create a new instance of the Crawler class. If the url isn’t valid I just return. If the url is redirected curl has an option to follow links, the CURLOPT_FOLLOWLOCATION option. Since this is set to on, you need to get the actual url which is contained in the header information. After this I call the get links function. This will return all the unique links on a page. (Calling the array_flip twice makes them unique).

I then get the title tag for each page. This is used when creating the sitemap. I remove all the script tags and all the html tags. The next thing I do is get a base url. Since I’m creating a sitemap of just one website I don’t need any links to external pages. So if any of the links don’t contain this base url I wont follow it.

Now I begin looping through all the links on the page. If its an external link I return. If it is an absolute link and it contains a “/” I replace the whole thing with a slash. If it doesn’t have the slash but is an absolute link then the new val is “”. I do this because I am prepending the base url that we got earlier to it.

I then explode on the “/”. If the first element is empty then I put the base url in there. Otherwise I prepend the url to it. This is done to get the correct link no matter if it is a relative link, root relative, or absolute.
The complete link is formed and checked to see if it already exist in the globallinkarr. If it doesn’t I add it and then begin getting the different levels. Basically everytime there is a “/” in the url then that is a different level. This is used when I am creating the html sitemap. Also to create this array of levels, I have to call array_merge_recursive. Well the regular php function didn’t quite work. If a url has numbers as one of its levels for example blog/2009/12/post then that function would turn the 2009 to its own key. So I needed it to keep the keys the same so I just got another function off of php.net.

The Function to Get all the Links


$dontfollow = array('pdf', 'jpg', 'png', 'jpeg','zip', 'gz', 'tar', 'txt');

function findAllLinks($url){
  global $globallinkarr;
  global $depthlinks;
  global $dontfollow;
  global $pagetitles;
  global $contentarr;

  $crawl = new Crawler($url);
  $info = $crawl->get('info');

  $validcodes = array(200,301,302);
  if(!in_array($info['http_code'], $validcodes))
    return;
  $url = $info['url'];
  $links = $crawl->get('links');
  $title = $crawl->get('pagetitle');
  $title = $title[0];
  $body = $crawl->get('body');
  $count++;

  $content = strip_tags(preg_replace('//msU', '', $body));

  if(!array_key_exists($url, $contentarr))
    $contentarr[$url] = array('title'=>"$title", 'pagecontent'=>"$content");

  if(!count($links) || !is_array($links)) return;
  else{
    if(preg_match('/http(?:s)?:\/\/(.*?)\/(.*)/', $url, $pattern)){
      $baseurl = $pattern[1];
    }else{
      $baseurl = $url;
    }

    foreach($links as $val){
      if(preg_match('/.*?javascript:void\(0\)/', $val) || ereg('#', $val)){
        continue;
      }
      if(!preg_match('/[0-9a-zA-Z]/', $val)) continue;
      $val = trim($val, '"\'');

      /**
       * CHECK IF LINK IS GOING TO ANOTHER DOMAIN.  IF SO DONT FOLLOW IT.
      */

      if(preg_match('/^http(s)?:\/\//', $val) &&
               !strpos($val, preg_replace('/http(s)?:\/\//', '', $baseurl))){
        continue;
      }

      if(ereg('http', $val) && preg_match('/^http(s)?:\/\/.*?\//', $val)){
        $val = preg_replace('/^http(s)?:\/\/.*?\//', '/', $val);
      }else if(ereg('http', $val)){
        $val = '';
      }

      $sl = explode('/', $val);

      if(!preg_match('/[0-9a-zA-Z]/', $sl[0])){
        $sl[0] = preg_replace('/^http(s)?:\/\//', '', $baseurl);
        $complink = implode('/', $sl);
        $sl = explode('/', $complink);

      }else{
        $prepend = explode('/', preg_replace('/^http(s)?:\/\//', '', $url));
        if(count($prepend)>1){
          array_pop($prepend);
          $prep = implode('/', $prepend);
        }else $prep = $prepend[0];
        $sl[0] = $prep.'/'.$sl[0];
        $complink = implode('/', $sl);

        $sl = explode('/', $complink);

      }
      if(!end($sl)) array_pop($sl);

      if(!in_array($complink, $globallinkarr)){
        $globallinkarr[] = $complink;
        $pagetitles[$complink] = $title;

        $depth = count($sl);
        $templinks = array();
        $newlinks = array();
        if($depth > 1){
           if(!$sl[$depth-1]) $sl[$depth-1] = 'index';
           $templinks[$sl[$depth-2]][] = $sl[$depth-1];

           if($depth > 2){
	     for($i=$depth-2; $i>0; $i--){
               $hold = $templinks;
               $templinks = array();
               $templinks[$sl[$i-1]] = $hold;

             }
          }

          $temp = $templinks[$sl[0]];
          $newlinks[$sl[0]] = $temp;

        }
        $depthlinks = array_merge_recursive2($newlinks,$depthlinks);
        $end = strtolower(end(explode(".", $complink)));

        if(!preg_match('/^http(s)?:\/\//', $complink))
          $complink = 'http://'.$complink;
          if(!in_array($end, $dontfollow) && !ereg("sitemap", $complink)){
            findAllLinks($complink);
          }
       }
    }
    return;
  }
}

The Array Merge Recursive Function I Used From php.net

function array_merge_recursive2($array1, $array2){
  $arrays = func_get_args();
  $narrays = count($arrays);

  // check arguments
  // comment out if more performance is necessary
  //   (in this case the foreach loop will trigger a warning if the argument is not an array)
  for ($i = 0; $i < $narrays; $i ++) {
   if (!is_array($arrays[$i])) {
   // also array_merge_recursive returns nothing in this case
     trigger_error('Argument #' . ($i+1) . ' is not an array - trying to merge array with scalar! Returning null!', E_USER_WARNING);
     return;
    }
  }

    // the first array is in the output set in every case
  $ret = $arrays[0];

  // merege $ret with the remaining arrays
  for ($i = 1; $i < $narrays; $i ++) {
    foreach ($arrays[$i] as $key => $value) {
     /***  KEEP THIS COMMENTED OUT TO KEEP THE ORIGINAL KEYS
     //if (((string) $key) === ((string) intval($key))) { // integer or string as integer key - append
     //   $ret[] = $value;
    // }
    // else { // string key - merge
      if (is_array($value) && isset($ret[$key])) {
        // if $ret[$key] is not an array you try to merge an scalar
        // value with an array - the result is not defined (incompatible arrays)
        // in this case the call will trigger an E_USER_WARNING and the $ret[$key] will be null.
        $ret[$key] = array_merge_recursive2($ret[$key], $value);
      }
      else {
        $ret[$key] = $value;
      }
           // }
    }
  }
  return $ret;
}

So after I created the arrays with all the links I create the sitemaps. The first one here is an xml sitemap used for the robots.txt file.
Its really simple and used the globallinkarr array.

XML Sitemap

function createXMLSiteMap($globallinkarr){
  $xml = '<?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
   if(count($globallinkarr)){
   foreach($globallinkarr as $val){
     if(!preg_match('/http(s)?:\/\//', $val)){
       $val = 'http://'.$val;
     }
     $xml .= '
       <url>
          <loc>'.str_replace('&', '&amp',$val).'</loc>
       </url>';
    }
  }
  $xml .= '</urlset>';
  return $xml;
}

The html sitemap is created using a recursive function that goes through depthlinkarr. Also it checks if the link is valid if it isn’t in the globallinkarr since all those are already checked. When it does check it only needs the headers so to speed things up I set the curl option CURLOPT_NOBODY to true. The page was timing out on me a lot but with this option set and checking to see if it already existed in the globallinkarr helped stop it from timing out. But if you have a whole lot of links there is a good chance this will cause your page to timeout.

The HTML Sitemap

function createSiteMap($depthlinks, $before = ''){
  global $globallinkarr;
  global $pagetitles;
  $validcodes = array(200,301,302);
  if(count($depthlinks)){
  $sitetree = '<ul style="padding:5px; margin:5px;">';
  foreach($depthlinks as $key=>$val){
    if(is_array($val)){
      if($before) $newbefore = $before.'/';
      $newbefore .= $key;

      $newkey = preg_replace('/^http(s)?:\/\//', '', $newbefore);

    $title = ($pagetitles[$newkey] != "") ? $pagetitles[$newkey] : $newbefore;
      if(!preg_match('/^http(s)?:\/\//', $newbefore))
         $newbefore = 'http://'.$newbefore;
      $exist = 0;
      if(in_array($newbefore, $globallinkarr)){
        $exist = 1;
      }else{
        $test = new Crawler($newbefore, 1);
        $info = $test->get('info');
        if(in_array($info['http_code'], $validcodes))
          $exist = 1;
        }
        if($exist){
          $sitetree .= '
           <li><a style="display:block;" href="'.$newbefore.'"
           target="_blank" title="'.$title.'" />'.$title.'</a></li>';
        }else{
          $sitetree .= '
             <li>'.$title;
        }
        $temp = createSiteMap($val, $newbefore);
        if($temp){
          $sitetree .= $temp;
          $sitetree .= '</li>';
        }
      }else{
        if($before != '') $newval = $before.'/'.$val;
        else $newval = $val;
        $newkey = preg_replace('/^http(s)?:\/\//', '',$newval);
        $title = $pagetitles[$newkey] ? $pagetitles[$newkey] : $newval;
        if(!preg_match('/^http(s)?:\/\//', $newval))
           $newval = 'http://'.$newval;

        $exist = 0;
        if(in_array($newval, $globallinkarr)){
            $exist = 1;
        }else{
          $test = new Crawler($newval, 1);
          $info = $test->get('info');
          if(in_array($info['http_code'], $validcodes))
            $exist = 1;
        }

        if($exist){
          $sitetree .= '
            <li><a href="'.$newval.'" title="'.$title.'"
                target="_blank">'.$title.'</a></li>';
        }
      }
    }
  $sitetree .= '</ul>';
  return $sitetree;
  }
}