Include System Libraries Using Swift Package Manager Or CocoaPods

I’m currently using Swift Package Manager to build a framework for an iOS project.  Why? Because I like the clean, modular approach, there’s no need for an Xcode project, and I find it faster for CI testing.

Unfortunately, Swift Package Manager doesn’t work for the iOS project itself.  And since my team is already familiar with CocoaPods, that is what the iOS project uses.

Now I’ll explain how I included a system library inside a Swift framework that is then used in an iOS project via CocoaPods.

I’ll show you how I set up CommonCrypto to work with both Swift Package Manager and CocoaPods.

For Swift Package Manager:

  • A. Create a git repo for the Swift package wrapper around the system library.
    1. Add a module.modulemap file to the repo that declares the system header you are wrapping.
      module CCommonCrypto [system] {
        header "/usr/include/CommonCrypto/CommonCrypto.h"
        export *
      }
    2. Add a Package.swift file to the repo.
      import PackageDescription
      let package = Package(
        name: "CCommonCrypto"
      )
    3. Commit the files and tag them (Swift packages need a tag to be used as a dependency).
      git tag 0.0.1
      git push origin 0.0.1
      (you can overwrite a previous tag by using the "-f" flag on both of those commands)
  • B. Create another repo for the Swift package that will use the system package wrapper.
    1. Create the Package.swift file and add the repo created in part A as a dependency.

      import PackageDescription
      let package = Package(
        name: "CommonCrypto",
        targets: [
          Target(name: "CommonCrypto"),
        ],
        dependencies: [
          .Package(url: "https://github.com/kmussel/ccommoncrypto.git", "0.0.1"),
        ]
      )
    2. Add a “Sources” folder with a subfolder named the same as the target you declared in the Package.swift file
      Sources -> CommonCrypto
    3. Inside the subfolder (CommonCrypto in this case) add all the Swift files you want to use.
      1. In each file that uses the dependency package, import it using the name from its Package.swift file
        import CCommonCrypto

Now you can create another Swift package that includes Repo 2 as a dependency; when you run “swift build” it will download and build the dependencies.
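
For example, a consuming package’s manifest might look roughly like this (a sketch: the “MyApp” name is just a placeholder, and the tag should be whatever you pushed to Repo 2):

import PackageDescription

let package = Package(
    name: "MyApp",                     // placeholder name for the consuming package
    targets: [
        Target(name: "MyApp"),
    ],
    dependencies: [
        // Repo 2 from this post
        .Package(url: "https://github.com/kmussel/commoncrypto.git", "0.0.1"),
    ]
)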

You can also run “swift package generate-xcodeproj” if you want to work from an Xcode project.

Getting the Swift Package to Work with CocoaPods:

  • C. Repo 1 is not needed.  We just need to update Repo 2 with the necessary CocoaPods files.
    1. Add another subfolder under the subfolder we created in step B.2. The name of this subfolder should match the name of the package from step A.2, so that CocoaPods will recognize the import statement from step B.3.1.
    2. Copy the module.modulemap file from step A.1. into this subfolder.
    3. Create another subfolder in the root directory.  It can be named anything you want, but I called it “CocoaPods” since it is only used for that.

    4. Inside the “CocoaPods” directory, create a subfolder for each SDK you want this package to support,
      e.g. iphoneos, iphonesimulator, macosx

    5. Create a module.map file under each of the subfolders you just created, pointing to the correct system header for that SDK
      CocoaPods -> iphoneos -> module.map
      module CCommonCrypto [system] {
        header "/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS10.1.sdk/usr/include/CommonCrypto/CommonCrypto.h"
        export *
      }
    6. Create the Podspec file for it
      1. Our source_files will just be the .swift files inside the Sources directory.  These files are added to the Xcode project.
        s.source_files = "Sources/**/*.swift"
      2. Normally any file not matched by source_files is removed, but the project needs access to our “CocoaPods” directory to know which module map to include.  To keep the “CocoaPods” directory around without adding it to the project, we use the “preserve_paths” attribute.
        s.preserve_paths = 'CocoaPods/**/*'
      3. We then tell Xcode where the include paths are for each SDK.  CocoaPods installs the pod in the PODS_ROOT directory, under a subdirectory named after this Podspec.
        s.pod_target_xcconfig = {
        'SWIFT_INCLUDE_PATHS[sdk=iphoneos*]' => '$(PODS_ROOT)/CommonCrypto/CocoaPods/iphoneos',
        'SWIFT_INCLUDE_PATHS[sdk=iphonesimulator*]' => '$(PODS_ROOT)/CommonCrypto/CocoaPods/iphonesimulator',
        'SWIFT_INCLUDE_PATHS[sdk=macosx*]' => '$(PODS_ROOT)/CommonCrypto/CocoaPods/macosx'
        }
      4. The next issue is that the header path in each of the module.map files probably won’t be the same for every user, so we need to change it when they run pod install.  We create a script to do this and execute it via the prepare_command in the Podspec.  For example, my path is “/Applications/Xcode-beta.app/Contents/Developer/”, so the script replaces the default path with that one.  I grabbed the script from another site, but you should probably modify it to handle SDK versions too.  A rough sketch of what such a script might do follows the prepare_command below.
        s.prepare_command = <<-CMD
        ./CocoaPods/injectXcodePath.sh
        CMD
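
Here is a rough, hypothetical sketch of what such a script could do (this is not the actual injectXcodePath.sh; it just swaps the default Xcode path in every module.map for the path reported by xcode-select):

  #!/bin/bash
  # Hypothetical sketch: rewrite the hard-coded Xcode path in each module.map
  # to whatever Xcode the current machine is actually using.
  CURRENT_PATH=$(xcode-select -p)      # e.g. /Applications/Xcode-beta.app/Contents/Developer
  DEFAULT_PATH="/Applications/Xcode.app/Contents/Developer"

  find ./CocoaPods -name module.map | while read -r MAP; do
    sed -i '' "s|$DEFAULT_PATH|$CURRENT_PATH|g" "$MAP"
  done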

 Now you can include Repo 2 in the Podfile of your iOS project.

The entire Podspec:
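
Pieced together from the snippets above, it looks roughly like this (the name, version, summary, and source values here are illustrative; the real file lives in the repo linked below):

Pod::Spec.new do |s|
  s.name         = 'CommonCrypto'        # illustrative; must match the pod name used in SWIFT_INCLUDE_PATHS above
  s.version      = '0.0.1'               # illustrative
  s.summary      = 'Swift wrapper around the CommonCrypto system library.'
  s.homepage     = 'https://github.com/kmussel/commoncrypto'
  s.source       = { :git => 'https://github.com/kmussel/commoncrypto.git', :tag => s.version.to_s }

  # Only the Swift sources are added to the Xcode project.
  s.source_files = 'Sources/**/*.swift'

  # Keep the CocoaPods directory around so the per-SDK module maps are available.
  s.preserve_paths = 'CocoaPods/**/*'

  # Point Swift at the right module map for each SDK.
  s.pod_target_xcconfig = {
    'SWIFT_INCLUDE_PATHS[sdk=iphoneos*]'        => '$(PODS_ROOT)/CommonCrypto/CocoaPods/iphoneos',
    'SWIFT_INCLUDE_PATHS[sdk=iphonesimulator*]' => '$(PODS_ROOT)/CommonCrypto/CocoaPods/iphonesimulator',
    'SWIFT_INCLUDE_PATHS[sdk=macosx*]'          => '$(PODS_ROOT)/CommonCrypto/CocoaPods/macosx'
  }

  # Rewrite the hard-coded Xcode path in the module maps at pod install time.
  s.prepare_command = <<-CMD
    ./CocoaPods/injectXcodePath.sh
  CMD
end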

You can see the entire CommonCrypto example here:

https://github.com/kmussel/commoncrypto
https://github.com/kmussel/ccommoncrypto

Google Universal Analytics and Tag Manager with Enhanced Ecommerce

Google Analytics! It used to be as simple as adding a snippet of code to your page and you were finished. Now, depending on the options you want, there can be a lot more work to do. You could be using classic Google Analytics or Universal Analytics.  You could be using the ecommerce plugin, enhanced ecommerce, or neither.  You could be using the data layer or macros. You could be using any combination of those with Google Tag Manager.  And depending on which combination you use, you will have to code it differently.

I’ll show you how I set up analytics using Universal Analytics and Tag Manager with Enhanced Ecommerce and the Data Layer.

Finding the correct documentation for the analytics combination I was using was frustrating.  Here is a list of the docs that were helpful to me.

First off, if you haven’t used Google Tag Manager, I would read:
Getting Started and
How It Works

Basically, once the JavaScript snippet is deployed to the site, it allows non-developers to manage what data they want to collect from the site without involving the developers.

For example, let’s say a website has a login button with an ID of “login-btn”.  A user can now use Tag Manager to add a tag called “Log In” with a rule that fires the tag when a user clicks on an element with the ID “login-btn”.  The rule would look like this: {{element id}} equals login-btn.  Depending on the type of tag you use, you might also need to add {{event}} equals gtm.click to the rule.

Now your site will start collecting data every time a user clicks on the login button without your developer having to make any changes to the code.

** Once you cross over to the Ecommerce world, a developer is going to be required.

The Basic Steps:

Google Tag Manager

The data that is used by Tag Manager and then sent to Google Analytics is retrieved from the Data Layer or from Macros.   The recommended approach is to use the Data Layer.

I recommend going over the development docs if you haven’t already. https://developers.google.com/tag-manager/devguide

In their docs they mention two ways to populate the data layer and fire the tags:

  1. Declare all needed information in the data layer above the container snippet
  2. Use HTML Event Handlers

If you have a traditional Multi Page Application you will most likely use option 1.  If you have a Single Page Application (SPA) you will need to use option 2.   Personally, I’ve been using this on a SPA with Angular, so I never used option 1.

If you are using option 1, you would create a tag in Google Tag Manager with these attributes:

Tag type : Universal Analytics
Track type : Pageview
Enable Enhanced Ecommerce Features: true
Use Data Layer: true
Basic Settings – Document Path: {{url path}}
Firing Rule: {{event}} equals gtm.js

In this case you would make sure all the data was in the dataLayer before the snippet:

dataLayer = [{
 'ecommerce': {
  'impressions': [{
   'name': productObj.name,                      // Name or ID is required.
   'id': productObj.id,
   'price': productObj.price,
   'brand': productObj.brand,
   'category': productObj.cat,
   'variant': productObj.variant,
   'list': 'Search Results',
   'position': 1
  }]
 }
}];

For option 2 you would create a tag like this:

Tag type : Universal Analytics
Track type : Event
Event Category: Ecommerce
Event Action: Product Click
Enable Enhanced Ecommerce Features: true
Use Data Layer: true
Basic Settings – Document Path: {{url path}}
Firing Rule: {{event}} equals productClick

Then in your javascript you would send the data by pushing it to the dataLayer object like this:

dataLayer.push({
'event': 'productClick',
'ecommerce': {
  'click': {
    'actionField': {'list': 'Search Results'},      // Optional list property.
    'products': [{
      'name': productObj.name,                      // Name or ID is required.
      'id': productObj.id,
      'price': productObj.price,
      'brand': productObj.brand,
      'category': productObj.cat,
      'variant': productObj.variant
    }]
  }
}
});


I recommend using a generic event to help keep things manageable. I’m using the Angular JavaScript framework with angulartics (http://luisfarzati.github.io/angulartics/).  The readme (https://github.com/luisfarzati/angulartics#for-google-tag-manager) explains what tags, rules, and macros need to be set up.  Even if you aren’t using Angular, the setup is the same for a generic event.

Also for both of those tags, make sure you check both “Enable Enhanced Ecommerce Features”  and  “Use Data Layer”.

** If you are using angulartics you’ll need to make sure it handles ecommerce; you just need to make sure one of the top-level keys you push is ‘ecommerce’.   I just overrode the module and did this:

$analyticsProvider.registerEventTrack(function(action, properties){
  var dataLayer = window.dataLayer = window.dataLayer || [];
  var data = {
    'event': 'interaction',
    'target': properties.category,
    'action': action,
    'target-properties': properties.label,
    'value': properties.value,
    'interaction-type': properties.noninteraction
  };
  if(properties.ecommerce)
    data['ecommerce'] = properties.ecommerce;

  dataLayer.push(data);
});
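
With that override in place, sending an ecommerce click from a controller might look something like this (a rough sketch using angulartics’ $analytics.eventTrack, assuming $analytics is injected; the product fields mirror the earlier dataLayer example):

$analytics.eventTrack('Product Click', {
  category: 'Ecommerce',
  label: 'Search Results',
  ecommerce: {
    click: {
      actionField: {'list': 'Search Results'},
      products: [{
        'name': productObj.name,
        'id': productObj.id,
        'price': productObj.price
      }]
    }
  }
});
// The override above turns this into a dataLayer.push with event 'interaction'
// and the 'ecommerce' object attached.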

Make sure you look at the docs for Google Tag Manager Enhanced Ecommerce.  They show which attributes to use when you push your data to the data layer.  https://developers.google.com/tag-manager/enhanced-ecommerce

Viewing the data you sent in Google Analytics

Go to Reporting and then to the Conversions section.  Under it you’ll see the Ecommerce reports.

* When you push your products to the dataLayer the id field maps to the Product SKU.

Dynamic Remarketing

In Google Tag Manager, check the “Enable Display Advertising Features” box for each tag you want to use this feature with.
In Google Analytics, go to the Admin section, select the property you want, and then click on “Dynamic Attributes”. Then, for Step 2, select “Product SKU” as the Product ID. https://support.google.com/analytics/answer/6002231

Debugging Locally

Click on each tag you created to edit it. Under “More settings”, click on “Cookie Configuration” and for Cookie Domain type “none”.
You can also set up another view in Google Analytics, and in Tag Manager you can create a macro for the Tag’s Tracking Code like here.

A couple useful articles about Google Tag Manager Macros:
http://www.simoahava.com/analytics/macro-guide-google-tag-manager/
http://www.simoahava.com/analytics/macro-magic-google-tag-manager/

I know this can be confusing, and there are a lot of questions I didn’t answer. The biggest thing for me is knowing where I can find the answers, so hopefully this article, along with the links I posted, will help you forge ahead through the world of Google Analytics.

Inter-Service Communication using Client Certificate Authentication

I love service-oriented architecture. But like all things, it needs security, in this case to make sure that one service has permission to talk to another service. There are a few different ways to get this security, but I really like using SSL certificates. It’s very simple to add other services, and your web server (Apache, Nginx) will handle the validation for you.

I wrote an article on getting this setup here: http://www.sitepoint.com/inter-service-communication-using-client-certificate-authentication/

AngularJS Navigation Menu

Once again I’m in need of creating a navigation menu with drop downs. This time I’ve been working with Angular and using Foundation 4 for styling.

The Final Output

Because who doesn’t like dessert first

Here is the fiddle. http://jsfiddle.net/kmussel/evXFZ/

The Styling

The good news is that Foundation basically has all the styles already done for you under the “top-bar” and “top-bar-section” classes. But since I needed to have this nav under the top bar, I just copied most of the top-bar styles and put them under the “nav-menu” class. You can see some of the styles I copied over at the bottom of the CSS panel in the fiddle.

You can also test out using only Foundation’s top-bar styling by uncommenting the HTML in the fiddle, commenting out the other “nav” element, and changing the directive name from “navMenu” to “anavMenu”. Make sure the result panel is wide enough, though, because Foundation changes the top-bar CSS on small screen sizes.
Basically, if all you do is include Foundation’s CSS, the only thing you have to make sure of is that your HTML has a “top-bar” class and then a nested “top-bar-section” class.

The AngularJS Goodness

If you’ve never worked with Angular I definitely recommend it. But it can be frustrating, especially dealing with directives (which is what we will be doing here). I think this guy is spot on about how learning Angular will make you feel.

So I basically knew what I wanted to be able to do, and that is to write this in HTML:

<nav-menu></nav-menu>

and have it spit out my entire navigation.

And so I could reuse the directive with other controllers, I added an attribute to it that references the scope variable holding the menu data, which led me to this:

<nav-menu menu-data="menu"></nav-menu>

The first directive I create then is the navMenu one.


app.directive('navMenu', ['$parse', '$compile', function($parse, $compile)
{
    return {
        restrict: 'E', //Element
        scope:true,
        link: function (scope, element, attrs)
        {
            scope.$watch( attrs.menuData, function(val)
            {
                var template = angular.element('<ul id="parentTreeNavigation"><li ng-repeat="node in ' + attrs.menuData + '" ng-class="{active:node.active && node.active==true, \'has-dropdown\': !!node.children && node.children.length}"><a ng-href="{{node.href}}" ng-click="{{node.click}}" target="{{node.target}}" >{{node.text}}</a><sub-navigation-tree></sub-navigation-tree></li></ul>');
               var linkFunction = $compile(template);
               linkFunction(scope);
               element.html(null).append( template );
            }, true );
        }
    };
}]);

Here I create the initial template. The template contains two other directives: Angular’s built-in “ng-repeat” and my “sub-navigation-tree”. It is wrapped in a watch statement so that if the menu changes, the template is updated. The template is then compiled, linked to the current scope, and appended to the current element.
This then compiles and links all the directives within it.

The Inner Directive handles the actual drop down part.



.directive('subNavigationTree', ['$compile', function($compile)
{
    return {
        restrict: 'E', //Element
        scope:true,
        link: function (scope, element, attrs)
        {
            scope.tree = scope.node;

            if(scope.tree.children && scope.tree.children.length )
            {
                var template = angular.element('<ul class="dropdown "><li ng-repeat="node in tree.children" node-id={{node.' + attrs.nodeId + '}}  ng-class="{active:node.active && node.active==true, \'has-dropdown\': !!node.children && node.children.length}"><a ng-href="{{node.href}}" ng-click="{{node.click}}" target="{{node.target}}" ng-bind-html-unsafe="node.text"></a><sub-navigation-tree tree="node"></sub-navigation-tree></li></ul>');

                var linkFunction = $compile(template);
                linkFunction(scope);
                element.replaceWith( template );
            }
            else
            {
                element.remove();
            }
        }
     };
}]);

Here you see the same thing with the template, but it’s referencing itself now. The template is linked to the current scope, which means the directives within the template run against that scope, so ng-repeat="node in tree.children" loops over scope.tree.children.
ng-repeat then creates a new scope for each item and gives that scope access to the item through whatever you called it in your ng-repeat, in our case "node". The scope created by ng-repeat is the one passed to our directive.
Then, if the current scope has children, it links the template to the current scope again and replaces the current element with the template, or removes the element altogether if there are no children.

Asynchronous Request within UITableViewCell

The Problem:
I have an image that needs to be loaded for each table cell. Unfortunately this causes the UI to become unresponsive for a couple of seconds when you navigate to that table view.

The Solution:
Make the call to get the image data asynchronous.

First, in our table view data source method, cellForRowAtIndexPath, we’ll use NSURLConnection to create the connection.


NSURLRequest *req = [[NSURLRequest alloc] initWithURL:[NSURL URLWithString:@"http://www.test.com/image_url"]];
NSURLConnection *imgCon = [NSURLConnection alloc];
..... /** Code added later  **/
[imgCon initWithRequest:req delegate:self startImmediately:YES];

There are two delegate methods you need to implement to make this work. (There are others that you’ll probably want to implement as well; see the Apple Docs.)


-(void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data;
-(void)connectionDidFinishLoading:(NSURLConnection *)connection;

The didReceiveData method will probably be called multiple times, so we need a variable to keep appending the data to. Since we are creating multiple connections, we need to map each connection to its data, and we also need to know which table cell to update once we have all the data.
So we create two NSMutableDictionary objects: one to map the table cell’s indexPath to the data, and one to map the connection to the indexPath.


NSMutableDictionary *indexPathImgData;
NSMutableDictionary *connectionIndexPath;

The problem that arises from trying to map the connection to the indexPath is that NSURLConnection doesn’t conform to the NSCopying protocol, which dictionary keys normally must. To get around this we use CFMutableDictionary: when adding values to a CFMutableDictionary, “the keys and values are not copied—they are retained” (Apple Docs on CFMutableDictionary).

So to add the values to our NSMutableDictionary we will use the CFDictionaryAddValue function.

Now our cellForRowAtIndexPath method will look like this:


NSURLRequest *req = [[NSURLRequest alloc] initWithURL:[NSURL URLWithString:@"http://www.test.com/image_url"]];
NSURLConnection *imageCon = [NSURLConnection alloc];

CFDictionaryAddValue((CFMutableDictionaryRef)self.connectionIndexPath, imageCon, indexPath);
NSMutableData *imageData = [NSMutableData dataWithCapacity:0];
CFDictionaryAddValue((CFMutableDictionaryRef)self.indexPathImgData, indexPath, imageData);

[imageCon initWithRequest:req delegate:self startImmediately:YES];

And our NSURLConnection Delegate methods will look something like this:


- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data {

    [[self.indexPathImgData objectForKey:[self.connectionIndexPath objectForKey:connection]] appendData:data];
}

-(void)connectionDidFinishLoading:(NSURLConnection *)connection
{
    NSIndexPath *temp = [self.connectionIndexPath objectForKey:connection];
    [self.tableView reloadRowsAtIndexPaths:[NSArray arrayWithObject:temp] withRowAnimation:UITableViewRowAnimationNone];

}

Now, in your cellForRowAtIndexPath method, you’ll want to check whether the dictionary already has data for that indexPath; if so, display it, otherwise create a new connection.
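
Here’s a minimal sketch of what that check might look like, assuming a plain UITableViewCell whose imageView displays the downloaded image (the cell identifier and URL are placeholders, and manual reference counting is assumed, as in the rest of this post):

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath
{
    UITableViewCell *cell = [tableView dequeueReusableCellWithIdentifier:@"Cell"];
    if (cell == nil) {
        cell = [[[UITableViewCell alloc] initWithStyle:UITableViewCellStyleDefault reuseIdentifier:@"Cell"] autorelease];
    }

    // Clear out any recycled image first.
    cell.imageView.image = nil;

    NSMutableData *imageData = [self.indexPathImgData objectForKey:indexPath];
    if (imageData != nil && [imageData length] > 0) {
        // We already have data for this row, so display it.
        // (A real implementation might also track whether the download finished.)
        cell.imageView.image = [UIImage imageWithData:imageData];
    } else if (imageData == nil) {
        // No request in flight for this row yet, so start one,
        // using the same alloc/init split as above so the connection
        // can be registered in the dictionaries before it starts.
        NSURLRequest *req = [NSURLRequest requestWithURL:[NSURL URLWithString:@"http://www.test.com/image_url"]];
        NSURLConnection *imageCon = [NSURLConnection alloc];

        CFDictionaryAddValue((CFMutableDictionaryRef)self.connectionIndexPath, imageCon, indexPath);
        CFDictionaryAddValue((CFMutableDictionaryRef)self.indexPathImgData, indexPath, [NSMutableData dataWithCapacity:0]);

        [imageCon initWithRequest:req delegate:self startImmediately:YES];
    }
    return cell;
}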

iPhone Push Notifications

iPhone Push Notification Testing:

Testing the interaction that happens after receiving a remote push notification on the iPhone can be very annoying. Apple does not make it easy to set up remote push notifications, and development is much slower if you try to build and test this interaction while waiting for remote notifications. Unfortunately, Apple also doesn’t provide a way to test remote notifications on the simulator.

So instead of using a remote notification, I used a local notification to test this interaction.
In your UIApplication delegate, instead of using this delegate method:


- (void)application:(UIApplication *)application 
     didReceiveRemoteNotification:(NSDictionary *)userInfo

I used this:


- (void)application:(UIApplication *)application 
        didReceiveLocalNotification: (UILocalNotification *)notification

You can then schedule a local notification (see the Apple Docs). This simulates receiving a notification while you are in the app.

To simulate a notification when the app isn’t in the foreground, I created a local notification inside the delegate method:


-(void)applicationDidEnterBackground:(UIApplication *)application

I set it up so that as soon as you closed the app, the notification would fire, showing the default alert box.

Here is the code I used to create the local notification inside the DidEnterBackground method.


NSDate *nowDate = [NSDate date];
NSTimeInterval seconds = 0;
NSDate *newDate = [nowDate dateByAddingTimeInterval:seconds];

UILocalNotification *localNotif = [[UILocalNotification alloc] init];
if (localNotif != nil)
{
  localNotif.fireDate = newDate;
  localNotif.timeZone = [NSTimeZone defaultTimeZone];

  localNotif.alertBody = @"You have new notifications";
  localNotif.alertAction = NSLocalizedString(@"View", nil);

  localNotif.soundName = UILocalNotificationDefaultSoundName;
  localNotif.applicationIconBadgeNumber = 2;

  NSDictionary *infoDict = [NSDictionary dictionaryWithObject:@"2" 
       forKey:@"NewNotifications"];
  localNotif.userInfo = infoDict;

  [[UIApplication sharedApplication] scheduleLocalNotification:localNotif];
  [localNotif release];
}


When you tap the “View” button on the alert box, the application comes to the foreground and didReceiveLocalNotification is called. You can then do what you need to do based on the data you passed to it via the NSDictionary object in the local notification.

So the main difference in your code between the local and the remote notification is that for the local notification you access the data you sent with [notification userInfo], while for the remote notification the dictionary object is passed to you directly.
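
As a quick sketch of that difference, assuming your server sends the same custom "NewNotifications" key at the top level of the remote payload:

// Local notification: the payload lives in the notification's userInfo.
- (void)application:(UIApplication *)application didReceiveLocalNotification:(UILocalNotification *)notification
{
    NSString *count = [[notification userInfo] objectForKey:@"NewNotifications"];
    NSLog(@"New notifications: %@", count);
}

// Remote notification: the payload dictionary is handed to you directly.
- (void)application:(UIApplication *)application didReceiveRemoteNotification:(NSDictionary *)userInfo
{
    NSString *count = [userInfo objectForKey:@"NewNotifications"];
    NSLog(@"New notifications: %@", count);
}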

CoreJS

Server-side Javascript using Google’s V8 Engine.

Recently I started working on an open source project with my good friend Wess Cope, who is more passionate about JavaScript than anyone I’ve ever met. That passion drove the desire to be able to use JavaScript for everything, so we built CoreJS, which you can find at https://github.com/frenzylabs/CoreJS.

We used http://wiki.commonjs.org/wiki/CommonJS#Low-level_APIs for a reference on what should be implemented in CoreJS.

Aren’t there already server-side JavaScript frameworks?
The main one out there is Node.js. Overall it’s pretty good, but the main issue we had with it is that it forces everything to be asynchronous. Asynchronous is great, but we wanted the flexibility to have both; there are definitely times when you don’t want something to be asynchronous.
So what we ended up doing is making our functions asynchronous when a callback is passed and synchronous otherwise.

For example our HTTP post request:


//Synchronous
var data = Http.post("/path/to/file", {arg:"arg1", arg2:"arg2"});

//Asynchronously
Http.post("/path/to/file", {arg:"arg1", arg2:"arg2"}, function(data){
    print(data);
});

CoreJS also utilizes the event based model, using LibEvent, and threading.

It’s definitely new and not complete yet, but check it out and give us some feedback. We tried to add some decent documentation on how it all works, because we know how annoying it is to try to build something like this without documentation. https://github.com/frenzylabs/CoreJS/wiki

Crawling Web Pages and Creating Sitemaps

Creating a Sitemap Based on all the Links within a Website

I built this web crawler because I wanted a way to create a sitemap of a website I was building. I know there are a few websites out there that will do this for you, but I didn’t want to rely on someone else and I wanted to change a few things, so I did it myself with PHP and cURL.
I started out by creating a class for the crawler. When I create a new Crawler object I pass in the URL of the website I want to start with; it uses cURL to fetch the page’s content and headers. The class also has methods to get all the links on a page, the page title, the entire content, just the body content, and the headers, but you could easily add more to, say, grab all the images on a page.

The Crawler Class

  

class Crawler {
  protected $markup='';
  protected $httpinfo='';

  public function __construct($uri, $justheaders=0){
    $output = $this->getMarkup($uri, $justheaders);
    $this->markup = $output['output'];
    $this->httpinfo = $output['code'];
  }

  public function getMarkup($uri, $justheaders) {
    $ch = curl_init($uri);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    if($justheaders){
      curl_setopt($ch, CURLOPT_NOBODY, 1);
      curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
      curl_setopt($ch, CURLOPT_FAILONERROR, 1);
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);

    $output['output'] = curl_exec($ch);
    $output['code'] = curl_getinfo($ch);
    curl_close($ch);
    return $output;
  }

  public function get($type){
    $method = "_get_{$type}";
    if (method_exists($this, $method)){
      return $this->$method();   // call_user_method() was removed in PHP 7
    }
  }

  protected function _get_info(){
    return $this->httpinfo;
  }

  protected function _get_links(){
    if(!empty($this->markup)){
      preg_match_all('/<a(?:.*?)href=(["|\'].*?["|\'])(.*?)>(.*?)\<\/a\>/i',
                               $this->markup, $links);
      return !empty($links[1]) ? array_flip(array_flip($links[1])) : FALSE;
    }
  }

  protected function _get_body(){
    if(!empty($this->markup)){
      preg_match('/\<body[^>]*\>(.*?)\<\/body\>/msU', $this->markup, $body);
      return $body[1];
    }
  }
  protected function _get_content(){
    if(!empty($this->markup)){
      return $this->markup;
    }
  }

  protected function _get_pagetitle() {
    if (!empty($this->markup)){
     preg_match_all('/<title>(.*?)\<\/title\>/si', $this->markup, $pagetitles);
     return !empty($pagetitles[1]) ? $pagetitles[1] : FALSE;
    }
  }
}
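
A quick usage sketch (the URL is just an example):

$crawler = new Crawler('http://www.example.com/');

$info  = $crawler->get('info');       // the curl_getinfo() array (http_code, url, ...)
$links = $crawler->get('links');      // unique href values, still wrapped in their quotes
$title = $crawler->get('pagetitle');  // array of <title> contents
$body  = $crawler->get('body');       // everything between <body> and </body>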


After this I create a recursive function that follows each of the links. Each time I call this function I create a new instance of the Crawler class. If the URL isn’t valid I just return. If the URL was redirected, cURL has an option to follow it, CURLOPT_FOLLOWLOCATION; since that option is turned on, you need to get the actual URL, which is contained in the header information. After this I call the get links method, which returns all the unique links on the page (calling array_flip twice makes them unique).

I then get the title tag for each page, which is used when creating the sitemap, and I strip out all the script tags and HTML tags. The next thing I do is get a base URL. Since I’m creating a sitemap of just one website I don’t need any links to external pages, so if a link doesn’t contain this base URL I won’t follow it.

Now I begin looping through all the links on the page. If it’s an external link I skip it. If it is an absolute link and it contains a “/” after the domain, I replace the scheme and domain with a slash; if it is an absolute link without that slash, the new value is “”. I do this because I am prepending the base URL we got earlier.

I then explode on “/”. If the first element is empty I put the base URL in there; otherwise I prepend the current page’s URL to it. This is done to get the correct link whether it is relative, root-relative, or absolute.
The complete link is formed and checked against globallinkarr. If it isn’t there yet, I add it and then build the different levels: every “/” in the URL is another level, and these levels are used when creating the HTML sitemap. To build this array of levels I have to merge arrays recursively, but the regular PHP array_merge_recursive function didn’t quite work: if a URL has a number as one of its levels, for example blog/2009/12/post, that function would turn the 2009 into a new numeric key. I needed the keys kept the same, so I grabbed another function off of php.net.

The Function to Get all the Links


$dontfollow = array('pdf', 'jpg', 'png', 'jpeg','zip', 'gz', 'tar', 'txt');

function findAllLinks($url){
  global $globallinkarr;
  global $depthlinks;
  global $dontfollow;
  global $pagetitles;
  global $contentarr;

  $crawl = new Crawler($url);
  $info = $crawl->get('info');

  $validcodes = array(200,301,302);
  if(!in_array($info['http_code'], $validcodes))
    return;
  $url = $info['url'];
  $links = $crawl->get('links');
  $title = $crawl->get('pagetitle');
  $title = $title[0];
  $body = $crawl->get('body');
  $count++;

  $content = strip_tags(preg_replace('/<script.*<\/script>/msU', '', $body));  // drop script blocks, then all tags

  if(!array_key_exists($url, $contentarr))
    $contentarr[$url] = array('title'=>"$title", 'pagecontent'=>"$content");

  if(!count($links) || !is_array($links)) return;
  else{
    if(preg_match('/http(?:s)?:\/\/(.*?)\/(.*)/', $url, $pattern)){
      $baseurl = $pattern[1];
    }else{
      $baseurl = $url;
    }

    foreach($links as $val){
      if(preg_match('/.*?javascript:void\(0\)/', $val) || strpos($val, '#') !== false){
        continue;
      }
      if(!preg_match('/[0-9a-zA-Z]/', $val)) continue;
      $val = trim($val, '"\'');

      /**
       * CHECK IF LINK IS GOING TO ANOTHER DOMAIN.  IF SO DONT FOLLOW IT.
      */

      if(preg_match('/^http(s)?:\/\//', $val) &&
               !strpos($val, preg_replace('/http(s)?:\/\//', '', $baseurl))){
        continue;
      }

      if(strpos($val, 'http') !== false && preg_match('/^http(s)?:\/\/.*?\//', $val)){
        $val = preg_replace('/^http(s)?:\/\/.*?\//', '/', $val);
      }else if(strpos($val, 'http') !== false){
        $val = '';
      }

      $sl = explode('/', $val);

      if(!preg_match('/[0-9a-zA-Z]/', $sl[0])){
        $sl[0] = preg_replace('/^http(s)?:\/\//', '', $baseurl);
        $complink = implode('/', $sl);
        $sl = explode('/', $complink);

      }else{
        $prepend = explode('/', preg_replace('/^http(s)?:\/\//', '', $url));
        if(count($prepend)>1){
          array_pop($prepend);
          $prep = implode('/', $prepend);
        }else $prep = $prepend[0];
        $sl[0] = $prep.'/'.$sl[0];
        $complink = implode('/', $sl);

        $sl = explode('/', $complink);

      }
      if(!end($sl)) array_pop($sl);

      if(!in_array($complink, $globallinkarr)){
        $globallinkarr[] = $complink;
        $pagetitles[$complink] = $title;

        $depth = count($sl);
        $templinks = array();
        $newlinks = array();
        if($depth > 1){
           if(!$sl[$depth-1]) $sl[$depth-1] = 'index';
           $templinks[$sl[$depth-2]][] = $sl[$depth-1];

           if($depth > 2){
	     for($i=$depth-2; $i>0; $i--){
               $hold = $templinks;
               $templinks = array();
               $templinks[$sl[$i-1]] = $hold;

             }
          }

          $temp = $templinks[$sl[0]];
          $newlinks[$sl[0]] = $temp;

        }
        $depthlinks = array_merge_recursive2($newlinks,$depthlinks);
        $parts = explode(".", $complink);
        $end = strtolower(end($parts));

        if(!preg_match('/^http(s)?:\/\//', $complink))
          $complink = 'http://'.$complink;

        if(!in_array($end, $dontfollow) && strpos($complink, "sitemap") === false){
          findAllLinks($complink);
        }
       }
    }
    return;
  }
}

The Array Merge Recursive Function I Used From php.net

function array_merge_recursive2($array1, $array2){
  $arrays = func_get_args();
  $narrays = count($arrays);

  // check arguments
  // comment out if more performance is necessary
  //   (in this case the foreach loop will trigger a warning if the argument is not an array)
  for ($i = 0; $i < $narrays; $i ++) {
   if (!is_array($arrays[$i])) {
   // also array_merge_recursive returns nothing in this case
     trigger_error('Argument #' . ($i+1) . ' is not an array - trying to merge array with scalar! Returning null!', E_USER_WARNING);
     return;
    }
  }

    // the first array is in the output set in every case
  $ret = $arrays[0];

  // merge $ret with the remaining arrays
  for ($i = 1; $i < $narrays; $i ++) {
    foreach ($arrays[$i] as $key => $value) {
     /***  KEEP THIS COMMENTED OUT TO KEEP THE ORIGINAL KEYS
     //if (((string) $key) === ((string) intval($key))) { // integer or string as integer key - append
     //   $ret[] = $value;
    // }
    // else { // string key - merge
      if (is_array($value) && isset($ret[$key])) {
        // if $ret[$key] is not an array you try to merge an scalar
        // value with an array - the result is not defined (incompatible arrays)
        // in this case the call will trigger an E_USER_WARNING and the $ret[$key] will be null.
        $ret[$key] = array_merge_recursive2($ret[$key], $value);
      }
      else {
        $ret[$key] = $value;
      }
           // }
    }
  }
  return $ret;
}

So after I created the arrays with all the links, I create the sitemaps. The first one here is an XML sitemap, which gets referenced from the robots.txt file.
It’s really simple and uses the globallinkarr array.

XML Sitemap

function createXMLSiteMap($globallinkarr){
  $xml = '<?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
   if(count($globallinkarr)){
   foreach($globallinkarr as $val){
     if(!preg_match('/http(s)?:\/\//', $val)){
       $val = 'http://'.$val;
     }
     $xml .= '
       <url>
          <loc>'.str_replace('&', '&amp;', $val).'</loc>
       </url>';
    }
  }
  $xml .= '</urlset>';
  return $xml;
}

The HTML sitemap is created using a recursive function that goes through depthlinks. It also checks whether each link is valid, but only if it isn’t already in globallinkarr, since all of those have already been checked. When it does check, it only needs the headers, so to speed things up I set the cURL option CURLOPT_NOBODY to true. The page was timing out on me a lot, but setting this option and skipping links already in globallinkarr helped stop that. If you have a whole lot of links, though, there is still a good chance the page will time out.

The HTML Sitemap

function createSiteMap($depthlinks, $before = ''){
  global $globallinkarr;
  global $pagetitles;
  $validcodes = array(200,301,302);
  if(count($depthlinks)){
  $sitetree = '<ul style="padding:5px; margin:5px;">';
  foreach($depthlinks as $key=>$val){
    if(is_array($val)){
      $newbefore = $before ? $before.'/' : '';
      $newbefore .= $key;

      $newkey = preg_replace('/^http(s)?:\/\//', '', $newbefore);

    $title = ($pagetitles[$newkey] != "") ? $pagetitles[$newkey] : $newbefore;
      if(!preg_match('/^http(s)?:\/\//', $newbefore))
         $newbefore = 'http://'.$newbefore;
      $exist = 0;
      if(in_array($newbefore, $globallinkarr)){
        $exist = 1;
      }else{
        $test = new Crawler($newbefore, 1);
        $info = $test->get('info');
        if(in_array($info['http_code'], $validcodes))
          $exist = 1;
        }
        if($exist){
          $sitetree .= '
           <li><a style="display:block;" href="'.$newbefore.'"
           target="_blank" title="'.$title.'">'.$title.'</a></li>';
        }else{
          $sitetree .= '
             <li>'.$title;
        }
        $temp = createSiteMap($val, $newbefore);
        if($temp){
          $sitetree .= $temp;
          $sitetree .= '</li>';
        }
      }else{
        if($before != '') $newval = $before.'/'.$val;
        else $newval = $val;
        $newkey = preg_replace('/^http(s)?:\/\//', '',$newval);
        $title = $pagetitles[$newkey] ? $pagetitles[$newkey] : $newval;
        if(!preg_match('/^http(s)?:\/\//', $newval))
           $newval = 'http://'.$newval;

        $exist = 0;
        if(in_array($newval, $globallinkarr)){
            $exist = 1;
        }else{
          $test = new Crawler($newval, 1);
          $info = $test->get('info');
          if(in_array($info['http_code'], $validcodes))
            $exist = 1;
        }

        if($exist){
          $sitetree .= '
            <li><a href="'.$newval.'" title="'.$title.'"
                target="_blank">'.$title.'</a></li>';
        }
      }
    }
  $sitetree .= '</ul>';
  return $sitetree;
  }
}

Execute Code After Browser Finishes Loading

How to execute a script after the browser stops loading

Recently I needed to execute a bit of code, but the browser was timing out. I was using the PHP exec function to tar up some files, and the browser kept showing a timeout page. So, in order to make the browser think it was done loading and then execute the script, I put everything in the output buffer with ob_start(). When I’m ready to stop the browser, I get the size of the buffer, set the Content-Length header to that size, and flush out the buffer. Now the browser no longer shows that it’s loading, since it has received all the content it is going to receive, and you can execute any script you want.

Here is the code


<?php
ob_end_clean();
header("Connection: close");
ignore_user_abort();  // optional
ob_start();

echo "some content";

$size = ob_get_length();
 header("Content-Length: $size");
 ob_end_flush();
 flush();
//Browser done loading

//put your script here

?>

Round Robin Algorithm

Generating a Schedule Automatically

I needed a way for every team to play each other across the rounds of games. I first started out creating a chart of who each team would play. After I had a chart that worked, I built an algorithm to generate that chart.

The Game Schedule

The Teams:  Team 1   Team 2   Team 3   Team 4   Team 5   Team 6
Round 1:    Team 6   Team 5   Team 4   Team 3   Team 2   Team 1
Round 2:    Team 4   Team 6   Team 5   Team 1   Team 3   Team 2
Round 3:    Team 2   Team 1   Team 6   Team 5   Team 4   Team 3
Round 4:    Team 5   Team 3   Team 2   Team 6   Team 1   Team 4
Round 5:    Team 3   Team 4   Team 1   Team 2   Team 6   Team 5

Now you just need to figure out the pattern and we can code it up. I was told there is a name for this pattern but I didn’t know it, so if you know it let me know.

The Variables

$gamearr AND $tempgamearr are both arrays of all the teams. The array looks like this: $gamearr[0] = “firstteam”, $gamearr[1] = “secondteam”, and so on.

$numteamspadded is the number of teams. If the number was odd, I add a “byweek” placeholder team to the end of $gamearr and $tempgamearr.

$gamenumperteam is how many games you want each team to play.

The Logic

As you can see, at the beginning I set the variable $firstteam. You’ll see why I did this if you figured out the pattern above: I have the array of teams playing the array of teams in reverse order, and after each round I rotate the reverse-order array, moving the last element to the front. So we have 654321, then 165432, then 216543. This would cause teams to play themselves, though, which is where $firstteam comes in: every time a team would play itself, we switch it out with the first team.

I start out by looping through the number of games, and then through just half of the teams, since I’m setting both the home and away team in each iteration. If I looped through all the teams I would have 1 vs 6 and then 6 vs 1, but I just want the unique games.

I then loop through all the teams and set $gamearr to the values it would have had I just been decrementing the array and moving the last item to the beginning, so after each round $gamearr will look like 234561, 345612, 456123, etc. I use $tempgamearr because I need to know the original order of the teams, since we have to switch the first team out when a team would play itself.
After this I remove the first team from the array. I then loop through the round to find whether one of the teams would be playing itself and get its value and position. This is one of the reasons I remove the first team from the array: if I left it in, team 1 would play itself and overwrite the fact that another team was playing itself.
I then switch the team that is playing itself with $firstteam.

After this you have an array of the games for each round.
I then build an array of all the games, checking whether the “byweek” team is playing; if it is, I don’t add that game, because we don’t want a byweek taking up a game slot.

The Round Robin Algorithm


$bottom = $numteamspadded-1;
$half = $numteamspadded/2;

$firstteam = $gamearr[0];
for($i=0; $i<$gamenumperteam; $i++){
    for($j=0; $j<$half; $j++){
         if($numteamspadded==4)
           $rrarr[$i][$j]['home'] = $tempgamearr[$j];
        else $rrarr[$i][$j]['home'] = $gamearr[$j];
        $rrarr[$i][$j]['away'] = $gamearr[$bottom-$j];
    }

    for($j=1; $j<=$bottom+1; $j++){
        $start = ($i+$j)%$numteamspadded;
        $gamearr[$j-1] = $tempgamearr[$start];
    }

    array_splice($gamearr, array_search($firstteam, $gamearr), 1);
    foreach($gamearr as $key=>$val){
        if($tempgamearr[$bottom-$key] == $val){
              $TempGameValue = $val;
              $switch = $key;
          }
    }

    $gamearr[$bottom] = $TempGameValue;
    $gamearr[$switch] = $firstteam;
}
foreach($rrarr as $key=>$val){
  foreach($val as $gkey=>$games){
    if($games['home'] != 'byweek' && $games['away'] != 'byweek'){
      $allgames[] = $games;
    }
  }
}

UPDATE: I came across another good implementation of the round robin algorithm here: http://phpbuilder.com/board/showthread.php?p=10935200