2010 Mac Pro now obsolete for Apple

heise online Newsticker - 9. November 2017 - 10:30
The second-to-last generation of the Mac Pro in the classic “cheese grater” case will no longer be repaired by the manufacturer – Apple no longer offers official spare parts.

Google Pixel 2: Update mitigates display problems

heise online Newsticker - 9. November 2017 - 10:30
Google recently promised to counteract the display artifacts seen mainly on the new Pixel 2 XL with a software update. Now the patch is reaching the first devices.

MNT Reform: DIY laptop from Berlin

heise online Newsticker - 9. November 2017 - 10:30
A laptop you can assemble, modify and repair yourself is Lukas Hartmann's goal. The MNT Reform is his first prototype. Open hardware fans may recognize parts of the hardware.

EU Commission wants to cut car emissions by 30 percent

heise online Newsticker - 9. November 2017 - 10:00
The reduction is only possible if as many electric cars and other environmentally friendly models as possible reach the roads.

SUSE builds on Cloud Foundry and Kubernetes

heise online Newsticker - 9. November 2017 - 9:30
With its new Cloud Application Platform, SUSE aims to accelerate DevOps and IT transformation. To that end, the Linux distributor combines a containerized version of Cloud Foundry with the orchestration capabilities of Kubernetes.

Dropsolid: James & Jenny, our toolbox for faster Drupal development

Planet Drupal - 9. November 2017 - 9:30
By Nick Vanpraet (tags: Tech, Drupal 8)

Be aware: this is a long read with extensive value. Only read on if you are ready to uncover our Dropsolid team's exciting dev tool and platform secrets!

 

James & Jenny might sound more like a comedy double act or the protagonists of a long-forgotten tale, but they are in fact very much alive and kicking. They are the names we gave to the platforms that we developed in-house to spin up environments faster and get work done more efficiently. How? Read on!

 

In practice

Whenever we want to spin up a new server, start a new project or even create a new testing environment, we still rely on our infrastructure team. A while ago we managed to automate our build pipeline with some smart configuration of Jenkins, an open source piece of software. Combined with a permission system, we are already able to let technical clients or consultants participate in the development process of a site by triggering a build of an environment. We decided to call this home-coded piece of software James, our in-house Drupal Cloud Butler. However, this UI was very cluttered and it was easy to break the chain. Maintenance-wise, it wasn’t the friendliest system either. James 0.1 was very helpful, but needed polishing.

Behind the scenes we started building a proper platform that was designed to supersede this existing system and take over the creation of new servers, projects and environments by adding a layer on top of this - a layer that could talk to Jenkins and would be able to execute Ansible playbooks through a managed system via RabbitMQ. You could see this as James 0.2. This version of James only has one account and isn’t built with a great many permissions in mind. Its purpose is very simple: get stuff done. This means we still can’t let clients or internal staff create new environments on James directly or set up new projects. But we’d really like to.

This is why we’re currently also investing heavily in the further development of Jenny, the site-spinning machine. Jenny’s aim is to be a user-friendly layer on top of James, and it consists of two parts: a loosely decoupled Angular application consuming a Drupal 8 backend exposed through a REST API, which in turn talks to James through its REST API. Because Jenny makes sure only calls that are allowed go through to James, James can stay focused on functionality without having to add a ton of logic to make sure the request is valid. If the person who wants that new environment isn’t allowed to request one, Jenny won’t ask James to set it up in the first place.

 

How it works

 

A Jenny user will be able to create a new organization, and within that organization create new projects or clone existing ones. These projects can be housed on our servers or on external hosting (with or without VPN, Firewalls or anything else that’s required). They’ll be able to create new environments, archive entire projects or just a single environment, build, back up, restore, sync across environments, log in to an environment’s site, etc. It will even contain information about the health of the servers and also provide analytics about the sites themselves.

Now, because single-person organizations are practically non-existent, that user will be able to add other users to their organization and give them different permissions based on their actual role within the company. A marketeer doesn’t need to know the health of a feature-testing environment, and a developer has little use for analytics about the live environment.

The goal of this permission system is to provide the client enough options that they can restrict a developer from archiving live but allow them to create a new testing environment and get all needed information and access for that environment. On a side note: these aren’t standard Drupal permissions, because these permissions apply to members within an organization, and a single user can be a part of many organizations and have different permissions for each one.

 

End-to-end

But all these layers have to be able to talk to each other before any of that can happen. JennyA(ngular) has to talk to JennyB(ackend), and JennyB then has to make sure the request is valid and talk to James. And whatever information James returns has to be checked by JennyB, stored in the database if needed, and then transformed into a message that JennyA can do something with.

To make sure we can actually pull this off, we created the following test case:

How do we trigger a build of an environment in Jenkins from JennyA, and how do we show the build log from Jenkins in JennyA?

JennyA: build the page, get project and environment info from JennyB, create a button and send the request to the API. How exactly this process happens will be explained in a different post.

JennyB

For this REST resource we need two entities: Project and Environment.
We create some new permissions (defined as options in an OrgRole entity) for our Environment entity type:

  • Create environment
  • Edit environment
  • Delete environment
  • Archive environment
  • View environment
  • View archived environment
  • Build environment

In addition, we build a custom EntityAccessControlHandler that checks these custom permissions. An AccessControlHandler must have two methods: checkAccess() and checkCreateAccess(). In both we want to make sure Drupal’s normal permissions (which for this entity we reduce to simply ‘administer project environment entities’) still rule supreme, so superadmins can debug everything. That is why both access checks start with a normal, bog-standard $account->hasPermission() check.

if ($account->hasPermission('administer project environment entities')) {
  return AccessResult::allowed();
}
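
For orientation, a minimal skeleton of such a handler could look like the sketch below; the class name and namespace are ours for illustration, and the organization checks from the following paragraphs slot into checkAccess():

namespace Drupal\my_module;

use Drupal\Core\Access\AccessResult;
use Drupal\Core\Entity\EntityAccessControlHandler;
use Drupal\Core\Entity\EntityInterface;
use Drupal\Core\Session\AccountInterface;

/**
 * Access control handler for the Project Environment entity.
 */
class ProjectEnvironmentAccessControlHandler extends EntityAccessControlHandler {

  /**
   * {@inheritdoc}
   */
  protected function checkAccess(EntityInterface $entity, $operation, AccountInterface $account) {
    // Superadmins bypass the custom organization checks entirely.
    if ($account->hasPermission('administer project environment entities')) {
      return AccessResult::allowed();
    }
    // ... organization and OrgRole checks go here (see below).
    return AccessResult::neutral();
  }

  /**
   * {@inheritdoc}
   */
  protected function checkCreateAccess(AccountInterface $account, array $context, $entity_bundle = NULL) {
    if ($account->hasPermission('administer project environment entities')) {
      return AccessResult::allowed();
    }
    // Creation is further guarded by a field-level constraint (see below).
    return AccessResult::neutral();
  }

}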

But then we have to add some extra logic to make sure the user is allowed to do whatever it is they’re attempting to do. For that we grab that user’s currently active Membership. A Membership is a simple entity that combines a user, an organization, and an OrgRole entity which says what permissions the user has within that organization. For non-Create access we first check if this user is even a part of the same organization as the entity they’re trying to save.

// Get the organization for this project environment.
$organization = $entity->getProject()->getOrganization();
// Check that the active membership and the attached organization match.
$accessResult = Membership::checkIfAccountIsPartOfCorrectOrganization($organization, $account);
if ($accessResult->isForbidden()) {
  return $accessResult;
}

For brevity’s sake, I won’t explain how exactly checkIfAccountIsPartOfCorrectOrganization does its checks. But it returns an AccessResultInterface object and does exactly what it says on the tin. It also includes a reason for forbidding access, so we can more easily debug problems. You can just add a string to the creation of an AccessResult or use $accessResult->setReason() and you can then grab it using $accessResult->getReason(). Take note: only forbidden and neutral implement that method. Make sure the result implements the AccessResultReasonInterface before calling either method.

if ($accessResult instanceof AccessResultReasonInterface) {
  $accessResult->getReason();
}

We use this extensively with our unit testing, so we know exactly why something fails.
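
As a quick illustration of what such an assertion could look like in a test (the entity and account variables here are hypothetical test fixtures, not our actual suite):

// Assert both the access verdict and the reason it carries.
$accessResult = $environment->access('build', $someAccount, TRUE);
$this->assertTrue($accessResult->isForbidden());
if ($accessResult instanceof AccessResultReasonInterface) {
  $this->assertEquals('member does not have "build project environment" permission', $accessResult->getReason());
}
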
Assuming our test passes, we can finally check if this user has the correct permissions.

$entityOrganizationMembership = User::load($account->id())->getActiveMembership();
switch ($operation) {
  case 'view':
    if (!$entity->isActive()) {
      return $this->allowedIf($entityOrganizationMembership->hasPermission('view archived project environment'), 'member does not have "view archived project environment" permission');
    }
    return $this->allowedIf($entityOrganizationMembership->hasPermission('view project environment'), 'member does not have "view project environment" permission');

  case 'update':
  case 'delete':
  case 'archive':
  case 'build':
    return $this->allowedIf($entityOrganizationMembership->hasPermission($operation . ' project environment'), 'member does not have "' . $operation . ' project environment" permission');
}
// Unknown operation, no opinion.
return AccessResult::neutral('No operation matches found for operation: ' . $operation);

As you might have noticed, normally when you load a User you don’t get a getActiveMembership() method. But we extended the base Drupal User class and added it there. We also set that new class as the default class for the User entity, which is actually very easy:

function hook_entity_type_build(&$entity_types) {
  if (isset($entity_types['user'])) {
    $entity_types['user']->setClass('Drupal\my_module\Entity\User');
  }
}

Now loading a user returns an instance of our own class.
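
A stripped-down sketch of that extended class might look like this; the 'membership' storage lookup is an assumption made for illustration, and the real implementation also takes the currently selected organization into account:

namespace Drupal\my_module\Entity;

use Drupal\user\Entity\User as BaseUser;

/**
 * Extends the core User entity with organization membership helpers.
 */
class User extends BaseUser {

  /**
   * Returns the membership that is currently active for this user.
   *
   * @return \Drupal\Core\Entity\EntityInterface|null
   *   The active Membership entity, or NULL if the user has none.
   */
  public function getActiveMembership() {
    // Simplified: load a membership that references this user.
    $memberships = \Drupal::entityTypeManager()
      ->getStorage('membership')
      ->loadByProperties(['user_id' => $this->id()]);
    return $memberships ? reset($memberships) : NULL;
  }

}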

For createAccess() things get trickier, because at that point the entity doesn’t exist yet. This makes it impossible to check if it’s part of the correct organization (or in this case, the correct project, which is in turn part of an organization). So here we’ll also have to implement a field-level Constraint on the related project field. This article explains how to create a field-level Constraint.

In this Constraint we can do our Membership::checkIfAccountIsPartOfCorrectOrganization check and be sure nobody will be able to save an environment to a project for an organization they are not a part of, regardless of whether they are creating one or updating one (somehow having bypassed our access check). To make doubly sure, we also set the $validationRequired property on our Environment class to TRUE. This way entities will always demand to be validated first. If they are not, or if they have errors, an exception will be thrown.
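
A minimal sketch of what such a constraint pair could look like, assuming the Environment entity has a 'project' entity reference field and that each class lives in its own file, might be:

namespace Drupal\my_module\Plugin\Validation\Constraint;

use Drupal\my_module\Entity\Membership;
use Symfony\Component\Validator\Constraint;
use Symfony\Component\Validator\ConstraintValidator;

/**
 * Checks that the referenced project belongs to one of the user's organizations.
 *
 * @Constraint(
 *   id = "ProjectInUserOrganization",
 *   label = @Translation("Project belongs to the user's organization")
 * )
 */
class ProjectInUserOrganizationConstraint extends Constraint {
  public $message = 'You are not a member of the organization this project belongs to.';
}

/**
 * Validates the ProjectInUserOrganization constraint on the project field.
 */
class ProjectInUserOrganizationValidator extends ConstraintValidator {

  /**
   * {@inheritdoc}
   */
  public function validate($items, Constraint $constraint) {
    // $items is the field item list of the 'project' entity reference field.
    $project = $items->entity;
    if (!$project) {
      return;
    }
    $organization = $project->getOrganization();
    $result = Membership::checkIfAccountIsPartOfCorrectOrganization($organization, \Drupal::currentUser());
    if ($result->isForbidden()) {
      $this->context->addViolation($constraint->message);
    }
  }

}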

Now we can finally build our REST resource. Since a Jenkins build doesn’t exist as a custom entity within JennyB (yet), we create a custom REST resource. We use Drupal Console for this and set the canonical path to “/api/project_environment/{project_environment}/build/{id}” and the “create” path to “/api/project_environment/{project_environment}/build”. We then create another resource and set that one’s canonical path to “/api/project_environment/{project_environment}/build”, the same as our first resource’s “create” path. This way, when you POST to that path you trigger a new build, and when you GET you receive a list of all builds for that environment. We have to split this into two resources, because each resource can only use each HTTP method once.
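
In plugin terms, that split might look roughly like the sketch below; the plugin IDs and class names are illustrative, and each class would live in its own file:

namespace Drupal\my_module\Plugin\rest\resource;

use Drupal\rest\Plugin\ResourceBase;

/**
 * Triggers builds for a project environment (POST) and exposes one build (GET).
 *
 * @RestResource(
 *   id = "project_environment_build",
 *   label = @Translation("Project environment build"),
 *   uri_paths = {
 *     "canonical" = "/api/project_environment/{project_environment}/build/{id}",
 *     "create" = "/api/project_environment/{project_environment}/build"
 *   }
 * )
 */
class ProjectEnvironmentBuildResource extends ResourceBase {
  // post() triggers a new build, get() returns a single build by ID.
}

/**
 * Lists all builds of a project environment (GET on the first resource's "create" path).
 *
 * @RestResource(
 *   id = "project_environment_build_list",
 *   label = @Translation("Project environment build list"),
 *   uri_paths = {
 *     "canonical" = "/api/project_environment/{project_environment}/build"
 *   }
 * )
 */
class ProjectEnvironmentBuildListResource extends ResourceBase {
  // get() returns the list of all builds for the environment.
}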


We generate these resources using Drupal Console. But before we can begin with our logic proper, we have to make sure the ProjectEnvironment entity gets loaded automatically. For this we need to override the routes() method from the parent class.

public function routes() {
  $collection = parent::routes();
  // Add our param converter to all routes in the collection.
  // If we wanted to add options to only a few routes, we would have
  // to loop over $collection->all() and add them to specific ones.
  // Internally, that is exactly what the addOptions() method does anyway.
  $options['parameters']['project_environment'] = [
    'type' => 'entity:project_environment',
    'converter' => 'paramconverter.entity',
  ];
  $collection->addOptions($options);
  return $collection;
}

In the routes method you can add or remove options and requirements to your heart’s content. Whatever you can normally do in a routes.yml file, you can also do here. We've explained this in more detail in this blog post.

Let’s take a closer look at our create path. First we’ll need to make sure the user is allowed to build. Luckily, thanks to our custom access handler, this is very easy.

// Check if the user is allowed to build.
$entity_access = $projectEnvironment->access('build', NULL, TRUE);
if (!$entity_access->isAllowed()) {
  // If it’s not allowed, we know it’s a forbidden or neutral response,
  // which implements the Reason interface.
  throw new AccessDeniedHttpException($entity_access->getReason());
}

Now we can ask James to trigger the build.

// Talk to James.
$data['key'] = self::VALIDATION_KEY;
$url = self::API_URL . '/project/' . $projectEnvironment->getProject()->getRemoteProjectID() . '/environment/' . $projectEnvironment->getRemoteEnvironmentID() . '/build';
$response = $this->httpClient->request('POST', $url, array('json' => $data));
$responseData = json_decode($response->getBody()->getContents(), TRUE);

For this test we use a simple key that James uses for authentication and build the URL in our REST resource. Eventually this part will be moved to a library and the code might look something like this:

$remoteProjectID = $projectEnvironment->getProject()->getRemoteProjectID();
$remoteEnvironmentID = $projectEnvironment->getRemoteEnvironmentID();
$response = $this->jamesConnection->triggerNewBuild($remoteProjectID, $remoteEnvironmentID, $data);
$responseData = json_decode($response->getBody()->getContents(), TRUE);

We check the data we get back and if everything has gone well, we can update our local ProjectEnvironment entity with the new currently deployed branch.

if ($response->getStatusCode() == 200 && $data['branch'] !== $projectEnvironment->getCurrentlyDeployedBranch()) {
  // Everything went fine, so also update the $projectEnvironment to reflect
  // what the currently deployed branch is.
  $projectEnvironment->setCurrentlyDeployedBranch($data['branch']);
  // Validate the entity.
  $violations = $projectEnvironment->validate();
  foreach ($violations as $violation) {
    $errors[] = $violation->getMessage();
  }
  if (isset($errors)) {
    throw new BadRequestHttpException("Entity save validation errors: " . implode("\n", $errors));
  }
  // Save it.
  $projectEnvironment->save();
}

Running validate is necessary, because we set the $validationRequired property to TRUE for our entity type. If something goes wrong, including our custom Constraints, we throw a Bad Request exception and output the validation errors.

Then we simply return what James gave us.

return new ResourceResponse($responseData, $response->getStatusCode());

On James’ end, it’s mostly the same but instead of checking custom access handlers, we (for now) just validate the key. And James in turn calls Jenkins’ API. This will also change, and James will hand off the build trigger to RabbitMQ. But for the purpose of this test, we communicate with Jenkins directly.

James then returns the ID of the newly triggered build to JennyB, who returns it to JennyA. JennyA then uses that ID to call JennyB’s canonical Build route with the given ID until success or failure has occurred.

 

Curious to read more interesting Drupal-related tidbits? Check out the rest of our blog. Or simply stay up to date every three months and subscribe to our newsletter!

Bastei Lübbe flops with digital reading platform Oolipo

heise online Newsticker - 9. November 2017 - 9:00
Bastei Lübbe wanted to open up a new line of business with multimedia-enriched serialized novels – but the Oolipo platform is a flop.

Paradise Papers: New tax avoidance allegations put Apple on the defensive

heise online Newsticker - 9. November 2017 - 8:30
Apple pays every dollar of tax it owes, the company stressed, publishing details of its tax practices. According to the revelations, the iPhone maker sought a corporate domicile for its massive overseas profits where no taxes are due in the first place.

Siemens considers relocating jobs to eastern Germany

heise online Newsticker - 9. November 2017 - 8:30
Siemens is holding out the prospect of a compromise to its employees in the planned job cuts: by relocating certain work, plants above all in eastern Germany could be preserved.

In-memory databases: Software AG catches up with HANA

heise online Newsticker - 9. November 2017 - 8:00
Darmstadt-based Software AG wants to take on SAP's flagship product HANA with a new version of its Terracotta DB. In-memory databases may be developing into the last alternative to the cloud.

myDropWizard.com: Using lots of different tools? Do it all in Drupal instead!

Planet Drupal - 9. November 2017 - 7:43

You need a website. You need to send an e-mail newsletter. You need to track (potential) volunteers, donors, or customers. You could use Drupal, Mailchimp and HubSpot. Or you could do it all in Drupal.

We've been using the tools above in our own organization, and we continue to use them. Yet, we've been toying with the idea of moving more of our daily usage to a more Drupal-based solution. I'll try to outline some of the pros and cons of each approach. I think you'll see that for many organizations the Drupal solution could end up on the winning side of the decision!

The Heavyweight Single Purpose Tools

We've used a number of web-based services at myDropWizard to help keep sales, projects, and customer communication on track.

I'll outline just a few very popular ones that we use and that would make for a good comparison with a Drupal solution.

MailChimp

Currently, we use MailChimp for newsletters. I think MailChimp is a champion product with low prices and great features. MailChimp is probably the most used email newsletter platform, so its strengths are well known.

Study: Only few parents use parental control software

heise online Newsticker - 9. November 2017 - 7:30
A quarter of parents and guardians have installed a technical filter meant to protect their children from unsuitable content on the net. 90 percent consider youth protection more important than easy access to online services.

Qualcomm Atheros: Android November update closes critical Wi-Fi driver vulnerabilities

heise online Newsticker - 9. November 2017 - 7:00
The Linux driver for Qualcomm Atheros Wi-Fi chipsets contains security holes through which an attacker can compromise a device using crafted Wi-Fi packets. Among others, Android devices of the Nexus and Pixel series are affected.

HPE server with supercomputer technology for 32 CPUs and 48 TB of RAM

heise online Newsticker - 9. November 2017 - 7:00
Hewlett Packard Enterprise unveils a scalable server with Intel CPUs designed primarily to accelerate big data applications. The shared-memory technology behind it comes from SGI.

Elevated Third: Marketing Automation, Meet Drupal

Planet Drupal - 8. November 2017 - 22:06
By Andy Mead, Wed, 11/08/2017 - 13:06

Oh, hi there. I’d be lying if I said I wasn’t expecting you. This is a blog after all. And supposedly people read these things, which is, supposedly, why you’re here. So pull up a seat (if you’re not already sitting) and I’ll tell you why Drupal is a great partner for Marketing Automation.

Ah, Marketing Automation. (Hereafter MA, because why read 7 syllables when you can read 2?) It’s arguably the most hyped business technology of the last decade or so, spoken about in hushed tones, as though simply subscribing to a platform will print money for you. Sadly that’s not the truth. But when used properly with digital strategy, it’s pretty good at what it does: capturing latent demand and turning it into sales. The tricky part is the modifying clause that opened the last sentence, “when used properly.”

What to expect from Marketing Automation?

Marketing Automation tools and platforms these days come loaded with bells and whistles: from custom reporting engines to fancy drag-n-drop campaign UIs, and WYSIWYGs that let marketers build digital assets like landing pages and emails. And yet, despite all that fanciness, it’s still really hard to do Marketing Automation right. Why? Well, leaving aside strategic questions (a massive topic on its own), my own experience with MA always left me wanting two things - expressibility and scalability.

Drupal + Marketing Automation

While publishing workflows in Marketing Automation tools have improved over the years, they still can’t compete with a CMS, particularly one as powerful as Drupal. Drupal empowers users to express content in terms that go far beyond simple landing pages.

In fact, Drupal is used today for just about anything you can imagine, from powering Fortune 500 marketing websites to running weather.com and acting as the backbone of custom web applications. What’s possible with Drupal is really up to you. Just ask the guy who built it.

So, fine. Drupal is great and everything. But how does it help your marketing? Well, because Drupal is so flexible, you can integrate it with almost anything: Google Analytics, Pardot, Marketo, Eloqua, Salesforce, and on, and on, and on. In a quickly changing technology landscape that’s an incredible strength, because Drupal can act as the nervous system for your marketing technology stack.

“Marketing technology stack?” Yeah, I don’t like business jargon, either. But, it’s a helpful way to think about digital marketing tools. Because they are just that: tools with strengths and weaknesses. You probably wouldn’t use a screwdriver to drive a nail into the wall. Sure, you could, but there’s a better tool for the job: a hammer. Likewise, your MA platform could power all your digital assets, but there’s a better tool for that job, too: Drupal.

The right tools for the job

In my experience, organizing these tools around their strengths brings better results. And here at Elevated Third, we’ve done that by connecting Drupal to Marketing Automation platforms like Pardot, Marketo, and SharpSpring; using it as the front end for services that are powering marketing programs. And moreover, MA is only a piece of that puzzle. Want to use something like HotJar? Drupal is happy to.

Open source means flexibility 

So where does this flexibility come from? Drupal is Open Source Software and there’s a massive developer community that improves it daily. Probably the greatest strength of open source software is its flexibility.

You don’t like the way something works? Easy. Let’s change it.

Is something broken? No problem, let’s fix it.

Got a new problem that off-the-shelf solutions don’t solve? Well then, let’s build a solution for it.

Is Drupal the right tool for every job? I’d be lying (again) if I said it was. But it’s the right tool for jobs that require unique, flexible solutions. And it could be the right tool for your job, too. If you are curious, let's talk.

Cheeky Monkey Media: The Drupal Checklist Every Developer Needs

Planet Drupal - 8. November 2017 - 21:49
By cody, Wed, 11/08/2017 - 19:49

Are you almost finished setting up your Drupal website? At a glance, everything might look ready to go.

But, before you hit "publish," you need to make sure you haven't made any mistakes.

A writer proofreads before they post an article. Similarly, a developer should double check their work.

The last thing you want is to go live with your site and have something go wrong. Finding problems before you launch can save some headaches and embarrassment.

We've compiled a pre-launch, Drupal checklist. When it's complete, you'll rest easy knowing that your website is ready to go.

Security

Security is the first on this Drupal checklist because it's so important. Of course you want to rest easy knowing that your site is secure when it launches. You also want your users to have peace of mind knowing that their information is safe.

Double checking your site's security will ensure that there's nothing you've missed that could make you vulnerable to hackers.

Evolving Web: Profiling and Optimizing Drupal Migrations with Blackfire

Planet Drupal - 8. November 2017 - 21:34

A few weeks ago, we at Evolving Web finished migrating the Princeton University Press website to Drupal 8. The project was over 70% migrations. In this article, we will see how Blackfire helped us optimize our migrations by changing around two lines of code.

Before we start
  • This article is mainly for PHP / Drupal 8 back-end developers.
  • It is assumed that you know about the Drupal 8 Migrate API.
  • Code performance is analyzed with a tool named Blackfire.
  • Front-end performance analysis is not in the scope of this article.
The Problem

Here are some of the project requirements related to the problem. These should help you get a better picture of what's going on:

  • A PowerShell script exports a bunch of data into CSV files on the client's server.
  • A custom migration plugin PUPCSV uses the CSV files via SFTP.
  • Using hook_cron() in Drupal 8, we check hashes for each CSV file.
  • If a file's MD5 hash changes, the migration is queued for import using the Drupal 8 Queue API (a simplified sketch of this cron check follows after this list).
  • The CSV files usually have 2 types of changes:
    • Certain records are updated here and there.
    • Certain records are added to the end of the file.
  • When a migration is executed, migrate API goes line-by-line, doing the following things for every record:
    • Read a record from the data source.
    • Merge data related to the record from other CSV files (kind of an inner join between CSVs).
    • Compute hash of the record and compare it with the hash stored in the database.
    • If a hash is not found in the database, the record is created.
    • If a hash is found and it has changed, the record is updated.
    • If a hash is unchanged, no action is taken.
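
To make the cron piece more concrete, here is a minimal sketch of such a hash check. It assumes a hypothetical pup_migrate module, a queue named pup_migrate_imports, locally mirrored CSV paths and the State API as the hash store; the real implementation fetches the files over SFTP and stores the hashes elsewhere.

/**
 * Implements hook_cron().
 */
function pup_migrate_cron() {
  $queue = \Drupal::queue('pup_migrate_imports');

  // Map of migration IDs to their exported CSV files (illustrative paths).
  $sources = [
    'pup_subjects' => 'private://exports/subjects.csv',
    'pup_books' => 'private://exports/books.csv',
  ];

  foreach ($sources as $migration_id => $uri) {
    $hash = md5_file($uri);
    // The stored hash is written when an import finishes; for this sketch
    // we assume it lives in the State API.
    $previous = \Drupal::state()->get('pup_migrate.hash.' . $migration_id);
    if ($hash !== FALSE && $hash !== $previous) {
      // The file changed since the last completed import: queue the migration.
      $queue->createItem(['migration' => $migration_id]);
    }
  }
}

A queue worker then picks these items up and runs the corresponding migration import.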

While running migrations, we figured out that it was taking too much time for migrations to simply go through the CSV files and check for changes in row hashes. For big migrations with over 40,000 records, migrate was taking several minutes to reach the end of the file even on a high-end server. Since we were running migrations during cron (with Queue Workers), we had to ensure that any individual migration could be processed within the 3-minute PHP maximum execution time limit available on the server.

Analyzing migrations with Blackfire

At Evolving Web, we usually analyze performance with Blackfire before any major site is launched. Usually, we run Blackfire with the Blackfire Companion, which is currently available for Google Chrome and Firefox. However, since migrations are executed using drush, which is a command-line tool, we had to use the Blackfire CLI tool, like this:

$ blackfire run /opt/vendor/bin/drush.launcher migrate-import pup_subjects
Processed 0 items (0 created, 0 updated, 0 failed, 0 ignored) - done with 'pup_subjects'
Blackfire Run completed

Upon analyzing the Blackfire reports, we found some 50 unexpected SQL queries being triggered from somewhere within the PUPCSV::fetchNextRow() method. Quite surprising! PUPCSV refers to a migrate source plugin we wrote for fetching CSV files over FTP / SFTP. This plugin also tracks a hash of the CSV files and thereby allows us to skip a migration completely if the source files have not changed. If the source hash changes, the migration updates all rows, and when the last row has been migrated, we store the file's hash in the database from PUPCSV::fetchNextRow(). As a matter of fact, we are preparing another article about creating custom migrate source plugins, so stay tuned.

We found one database query per row even though no record was being created or updated. Didn't seem to be very harmful until we saw the Blackfire report.

Code before Blackfire

Taking a closer look at the PUPCSV::fetchNextRow() method, we found a call to MigrateSourceBase::count() that was taking 40% of the processing time! This is because it was being called for every row in the CSV. Since the source/cache_counts parameter was not set to TRUE in the migration YAML files, the count() method was iterating over all items to get a fresh count for each call! Thus, for a migration with 40,000 records, we were going through 40,000 x 40,000 records, and the PHP maximum execution time was being reached even before migrate could get to the last row! Here's a look at the code.

protected function fetchNextRow() {
  // If the migration is being imported...
  if (MigrationInterface::STATUS_IMPORTING === $this->migration->getStatus()) {
    // If we are at the last row in the CSV...
    if ($this->getIterator()->key() === $this->count()) {
      // Store source hash to remember the file as "imported".
      $this->saveCachedFileHash();
    }
  }
  return parent::fetchNextRow();
}

Code after Blackfire

We could have added the cache_counts parameter in our migration YAML files, but any change in the source configuration of the migrations would have made migrate API update all records in all migrations. This is because a row's hash is computed as something like hash($row + $source). We did not want migrate to update all records because we had certain migrations which sometimes took around 7 hours to complete. Hence, we decided to statically cache the total record count to get things back on track:

protected function fetchNextRow() {
  // If the migration is being imported...
  if (MigrationInterface::STATUS_IMPORTING === $this->migration->getStatus()) {
    // Get total source record count and cache it statically.
    static $count;
    if (is_null($count)) {
      $count = $this->doCount();
    }
    // If we are at the last row in the CSV...
    if ($this->getIterator()->key() === $count) {
      // Store source hash to remember the file as "imported".
      $this->saveCachedFileHash();
    }
  }
  return parent::fetchNextRow();
}

Problem Solved. Merci Blackfire!

After the changes, we ran Blackfire again and found things to be 52% faster for a small migration with 50 records.

For a bigger migration with 4,359 records, the import time went down from 1m 47s to only 12s, which works out to roughly an 89% improvement. Wondering why we didn't include a screenshot for the bigger migration? We did not (or rather, could not) generate a report for the big migration, for two reasons:

  • While working, Blackfire stores function call and other information to memory. Running a huge migration with Blackfire might be a bit slow. Besides, our objective was to find the problem and we could do that more easily while looking at smaller figures.
  • When running a migration with thousands of rows, the migration functions are called thousands of times! Blackfire collects data for each of these function calls; hence, the collected data sometimes becomes too heavy and Blackfire rejects the huge data payload with an error message like this:
The Blackfire API answered with a 413 HTTP error ()
Error detected during upload: The Blackfire API rejected your payload because it's too big.

Which makes a lot of sense. As a matter of fact, for the other case study given below, we used the --limit=1 parameter to profile code performance for a single row.

A quick brag about another 50% Improvement?

Apart from this jackpot, we also found room for another 50% improvement (from 7h to 3h 32m) for one of our migrations which was using the Touki FTP library. This migration was doing the following:

  • Going through around 11,000 records in a CSV file.
  • Downloading the files over FTP when required.

A Blackfire analysis of this migration revealed something strange. For every row, the following was happening behind the scenes:

  • If a file download was required, we were doing FTP::findFileByName($name).
  • To get the file, Touki was:
    • Getting a list of all files in the directory;
    • Creating File objects for every file;
    • Creating various permission, owner and other objects for every File object;
    • Passing all the files through a callback to see if its name was $name;
    • If the name matched, returning that file and discarding all other File objects.

Hence, for downloading every file, Touki FTP was creating 11,000 File objects of which it was only using one! To resolve this, we decided to use a lower-level FTP::get($source, $destination) method which helped us bypass all those 50,000 or more objects which were being created per record (approximately, 11,000 * 50,000 or more for all records). This almost halved the import time for that migration when working with all 11,000 records! Here's a screenshot of Blackfire's report for a single row.
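
In code terms, the change boiled down to roughly the following shape. This is a heavily simplified sketch that only uses the two calls named above; connection setup, error handling and the actual download of the found File object are omitted:

// Before: find the file by name, which enumerated the whole directory and
// built a File object (plus permission and owner objects) for every entry.
$file = $ftp->findFileByName($name);
// ...then download the returned File object (details omitted).

// After: fetch the one file we need directly, skipping the enumeration.
$ftp->get($source, $destination);
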

So the next time you think something fishy is going on with code you wrote, don't forget to use Blackfire! And don't forget to leave your feedback, questions and even article suggestions in the comments section below.

More about Blackfire

Blackfire is a code profiling tool for PHP which gives you nice-looking reports about your code's performance. With the help of these reports, you can analyze the memory, time and other resources consumed by various functions and optimize your code where necessary. If you are new to Blackfire, you can try these links:

Apart from all this, the paid version of Blackfire lets you set up automated tests and gives you various recommendations for not only Drupal but various other PHP frameworks.

Next Steps
  • Try Blackfire for free on a sample project of your choice to see what you can find.
  • Watch video tutorials on Blackfire's YouTube channel.
  • Read the tutorial on creating custom migration source plugins written by my colleague (coming soon).

Zahlen, bitte! Complex numbers – a marketing disaster in mathematics

heise online Newsticker - 8. November 2017 - 19:30
If an advertising agency had to market new numbers that solve far more problems than the existing ones, it would hardly use off-putting words like "complex" or "imaginary" ... but in the 17th century there were no copywriters.

Web Summit 2017: VW and Google jointly research quantum computers

heise online Newsticker - 8. November 2017 - 19:00
The German carmaker and the US giant want to develop applications for quantum computers together, among other things to cope with the flood of data from future mobility applications.

Electromobility: Hanover to get Germany's highest density of charging stations

heise online Newsticker - 8. November 2017 - 19:00
The Hanover municipal utilities plan to install 600 charging stations over the next three years, making the metropolitan area the front-runner in Germany.