Lullabot: Decoupled Drupal Hard Problems: Schemas

Planet Drupal - 3. November 2017 - 17:59

The Schemata module is our best approach so far to providing schemas for our API resources. Unfortunately, this solution is often not good enough. That is because the serialization component in Drupal is so flexible that we can’t anticipate the final form our API responses will take, meaning the schema that our consumers depend on might be inaccurate. How can we improve this situation?

This article is part of the Decoupled hard problems series. In past articles we talked about request aggregation solutions for performance reasons, and how to leverage image styles in decoupled architectures.

TL;DR
  • Schemas are key for an API's self-generated documentation.
  • Schemas are key for the maintainability of the consumer’s data model.
  • Schemas are generated from Typed Data definitions using the Schemata module. They are expressed in the JSON Schema format.
  • Schemas are statically generated but normalizers are determined at runtime.

Why Do We Need Schemas?

A database schema is a description of the data a particular table can hold. Similarly an API resource schema is a description of the data a particular resource can hold. In other words, a schema describes the shape of a resource and the datatype of each particular property.

Consumers of data need schemas in order to set their expectations. For instance, the schema tells the consumer that the body property is a JSON object that contains a value that is a string. A schema also tells us that the mail property in the user resource is a string in the e-mail format. This knowledge empowers consumers to add client-side form validation for the mail property. In general, a schema will help consumers to have prior understanding of the data they will be fetching from the API, and what data objects they can write to the API.
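
To make that concrete, here is a minimal sketch of what such a schema could look like and how a consumer might check data against it. The schema fragment, the `validate` helper, and the deliberately loose e-mail pattern are all assumptions for illustration; this is not the output of Schemata nor a full JSON Schema validator.

```python
import re

# Hypothetical schema fragment in JSON Schema style, written as a Python dict.
USER_SCHEMA = {
    "type": "object",
    "properties": {
        "mail": {"type": "string", "format": "email"},
        "body": {
            "type": "object",
            "properties": {"value": {"type": "string"}},
        },
    },
}

# Deliberately loose e-mail pattern, for illustration only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate(value, schema, path="$"):
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    expected = schema.get("type")
    if expected == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        # Only properties that are present are checked in this sketch.
        for name, sub_schema in schema.get("properties", {}).items():
            if name in value:
                errors += validate(value[name], sub_schema, f"{path}.{name}")
    elif expected == "string":
        if not isinstance(value, str):
            errors.append(f"{path}: expected string")
        elif schema.get("format") == "email" and not EMAIL_PATTERN.match(value):
            errors.append(f"{path}: not an e-mail address")
    return errors
```

A client-side form could call `validate` on the `mail` value before submitting and show the returned messages next to the field.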

We use the resource schemas in the Docson and Open API modules to generate documentation automatically. When you enable JSON API and Open API, you get a fully functional and accurately documented HTTP API for your data model. Whenever you make changes to a content type, they are reflected in the HTTP API and the documentation automatically. All thanks to the schemas.

A consumer could fetch the schemas for all the resources it needs at compile time, or fetch them once and cache them for a long time. With that information, the consumer can generate its models automatically without developer intervention. That means the model generation only has to be implemented once, and all of our consumers’ models stay up to date from then on. There is probably already a library for our consumer’s framework that does this.

More interestingly, since our schema comes with type information, our models can be type safe. That is important in many languages like Swift, Java, TypeScript, Flow, Elm, etc. Moreover, if the model in the consumer is auto-generated from the schema (one model per resource), then minor updates to the resource are automatically reflected in the model. We can start to use the new model properties in Angular, iOS, Android, etc.
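
As a rough sketch of the idea, a consumer could derive its model classes from the schema instead of writing them by hand. The `model_from_schema` helper, the type mapping, and the sample schema below are hypothetical; a real consumer would more likely rely on an existing code generator for its framework.

```python
from dataclasses import fields, make_dataclass

# Assumed mapping from JSON Schema primitive types to Python types.
JSON_TO_PYTHON = {"string": str, "integer": int, "number": float, "boolean": bool}


def model_from_schema(name, schema):
    """Build a dataclass with one typed field per schema property."""
    spec = [
        (prop, JSON_TO_PYTHON.get(definition.get("type"), object))
        for prop, definition in sorted(schema["properties"].items())
    ]
    return make_dataclass(name, spec)


# Hypothetical resource schema; adding a property here adds a model field.
ARTICLE_SCHEMA = {
    "type": "object",
    "properties": {
        "nid": {"type": "integer"},
        "title": {"type": "string"},
    },
}

Article = model_from_schema("Article", ARTICLE_SCHEMA)
article = Article(nid=4, title="Hello")
```

Python dataclasses only record the types and do not enforce them at runtime, so this shows the generation step rather than the compile-time safety a language like Swift or TypeScript would add.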

In summary, having schemas for our resources is a huge improvement for the developer experience. This is because they provide auto-generated documentation of the API, and auto-generated models for the consumer application.

How Are We Generating Schemas in Drupal?

One of Drupal 8's API improvements was the introduction of the Typed Data API. We use this API to declare the data types for a particular content structure. For instance, there is a data type for a Timestamp that extends an Integer. The Entity and Field APIs combine these into more complex structures, like a Node.

JSON API and REST in core can expose entity types as resources out of the box. When these modules expose an entity type they do it based on typed data and field API. Since the process to expose entities is known, we can anticipate schemas for those resources.

In fact, assuming resources are a serialization of field API and typed data is the only thing we can do. The base for JSON API and REST in core is Symfony's serialization component. This component is broken into normalizers, as explained in my previous series. These normalizers transform Drupal's inner data structures into other simpler structures. After this transformation, all knowledge of the data type, or structure is lost. This happens because the normalizer classes do not return the new types and new shapes the typed data has been transformed to. This loss of information is where the big problem lies with the current state of schemas.

The Schemata module provides schemas for JSON API and core REST. It does it by serializing the entity and typed data. It is only able to do this because it knows about the implementation details of these two modules. It knows that the nid property is an integer and it has to be nested under data.attributes in JSON API, but not for core REST. If we were to support another format in Schemata we would need to add an ad-hoc implementation for it.
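
The per-format knowledge described above can be pictured with a small sketch. The function name, the format labels, and the flat shape used for the core REST branch are simplifying assumptions; real responses from either module are more involved than this.

```python
def schema_for_format(attribute_schemas, fmt):
    """Wrap the same property schemas in the envelope a given format uses."""
    if fmt == "api_json":
        # JSON API: properties are nested under data.attributes.
        return {
            "type": "object",
            "properties": {
                "data": {
                    "type": "object",
                    "properties": {
                        "attributes": {
                            "type": "object",
                            "properties": dict(attribute_schemas),
                        }
                    },
                }
            },
        }
    if fmt == "json":
        # Core REST (simplified assumption): no data.attributes wrapper.
        return {"type": "object", "properties": dict(attribute_schemas)}
    # Any other format needs its own ad-hoc branch.
    raise ValueError(f"no schema implementation for format {fmt!r}")


NODE_ATTRIBUTES = {"nid": {"type": "integer"}}  # hypothetical sample
```

Every new format means another branch in a function like this, which is the ad-hoc cost the paragraph above refers to.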

The big problem is that schemas are static information. That means that they can't change during the execution of the program. However, the serialization process (which transforms the Drupal entities into JSON objects) is a runtime operation. It is possible to write a normalizer that turns the number four into 4 or "four" depending on whether the minute of execution is even or odd. Even though this example is bizarre, it shows that determining the schema upfront without other considerations can lead to errors. Unfortunately, we can’t assume anything about the data after it’s serialized.
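
The bizarre normalizer is easy to sketch. The snippet below is an illustration in Python rather than a real Drupal normalizer; the function names and the static schema are assumptions, and the clock is passed in so the behaviour can be reproduced.

```python
from datetime import datetime

# The schema is produced once, ahead of time.
STATIC_SCHEMA = {"type": "integer"}


def bizarre_normalizer(value, now=None):
    """Return the integer on even minutes and its spelled-out name otherwise."""
    now = now or datetime.now()
    if now.minute % 2 == 0:
        return value
    return {4: "four"}.get(value, str(value))


def matches_static_schema(output):
    """Check the normalized output against STATIC_SCHEMA (integer only)."""
    return isinstance(output, int) and not isinstance(output, bool)


even_minute = datetime(2017, 11, 3, 17, 58)
odd_minute = datetime(2017, 11, 3, 17, 59)
```

With `even_minute` the output still satisfies the schema; with `odd_minute` the same input yields a string and the pre-generated schema is silently wrong.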

We can either make normalization less flexible—forcing data types to stay true to the pre-generated schemas—or we can allow the schemas to change during runtime. The second option clearly defeats the purpose of setting expectations, because it would allow a resource to potentially differ from the original data type specified by the schema.

The GraphQL community is opinionated on this and drives the web service from their schema. Thus, they ensure that the web service and schema are always in sync.

How Do We Go Forward From Here?

Happily, we are already trying to come up with a better way to normalize our data and infer the schema transformations along the way. Nevertheless, whenever a normalizer is injected by a third party contrib module or because of improved normalizations with backwards compatibility the Schemata module cannot anticipate it. Schemata will potentially provide the wrong schema in those scenarios. If we are to base the consumer models on our schemas, then they need to be reliable. At the moment they are reliable in JSON API, but only at the cost of losing flexibility with third party normalizers.

One of the attempts to support data transformations, and the impact they have on the schemas, is Field Enhancers in JSON API Extras. They represent simple transformations via plugins. Each plugin defines how the data is transformed and how the schema is affected. This happens in both directions: when the data goes out, and when consumers write back to the API and the transformation needs to be reversed. Whenever we need a custom transformation for a field, we can write a field enhancer instead of a normalizer. That way schemas will remain correct even if the data change implies a change in the schema.
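
The shape of such a plugin can be sketched as a class that bundles the outgoing transformation, its reverse, and the resulting schema. The class and method names below are hypothetical and do not mirror the actual JSON API Extras plugin interface; the example turns a Unix timestamp into an ISO 8601 string.

```python
from datetime import datetime, timezone


class DateTimeEnhancer:
    """Illustrative enhancer: integer timestamp <-> ISO 8601 string."""

    def transform(self, value):
        # Outgoing direction: stored timestamp to the string consumers see.
        return datetime.fromtimestamp(value, tz=timezone.utc).isoformat()

    def undo_transform(self, value):
        # Incoming direction: string written by the consumer back to a timestamp.
        return int(datetime.fromisoformat(value).timestamp())

    def output_schema(self, original_schema):
        # The enhancer declares the shape it produces, so the schema stays correct.
        return {"type": "string", "format": "date-time"}


enhancer = DateTimeEnhancer()
```

Because the schema change lives next to the data change, a schema generator can ask the enhancer for `output_schema` instead of guessing what a normalizer did.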

We are very close to being able to validate responses in JSON API against schemas when Schemata is present. It will only happen in development environments (where PHP’s asserts are enabled). Site owners will be able to validate that schemas are correct for their site, with all their custom normalizers. That way, when a site owner builds an API or makes changes they'll be able to validate the normalized resource against the purported schema. If there is any misalignment, a log message will be recorded.
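
A development-only check of this kind might look like the sketch below. It is a Python illustration, not the JSON API implementation: the logger name, the helper names, and the flat type check are assumptions, and Python's `__debug__` flag stands in for PHP's assert setting.

```python
import logging

logger = logging.getLogger("schema_check")  # hypothetical logger name

# Assumed mapping from JSON Schema types to Python types.
JSON_TYPES = {"string": str, "integer": int, "object": dict}


def find_mismatches(resource, schema):
    """Compare top-level properties of a normalized resource with the schema."""
    mismatches = []
    for name, definition in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(definition.get("type"))
        if name not in resource or expected is None:
            continue
        value = resource[name]
        # bool is a subclass of int in Python, so reject it for integers.
        is_bool_as_int = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or is_bool_as_int:
            mismatches.append(
                f"{name}: expected {definition['type']}, got {type(value).__name__}"
            )
    return mismatches


def check_response(resource, schema, enabled=__debug__):
    """Log every mismatch; skip the work entirely when checks are disabled."""
    if not enabled:
        return True
    mismatches = find_mismatches(resource, schema)
    for message in mismatches:
        logger.warning("Schema mismatch: %s", message)
    return not mismatches


NODE_SCHEMA = {"type": "object", "properties": {"nid": {"type": "integer"}}}
```

Logging instead of raising keeps the response flowing while still leaving a trace the site owner can act on.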

Ideally, we want the certainty that schemas are correct all the time. While the community works toward agreement on the best solution, these intermediate measures give reasonable certainty that your schemas are in sync with your responses.

Join the discussion in the #contenta Slack channel or come to the next API-First Meeting and show your interest there!

Hero photo by Oliver Thomas Klein on Unsplash.

Formula E: Season finale in Montreal was no triumph

heise online Newsticker - 3. November 2017 - 17:00
The Formula E season finale in Montreal was no crowd-puller. Even pushing free tickets on residents at their front doors did not fill the grandstands. Only 24% want the race to return.

European Patent Office: Union settles accounts with the office's leadership in 95 theses

heise online Newsticker - 3. November 2017 - 16:30
The staff union SUEPO has, loosely following Luther, circulated a manifesto on the social unrest at the European Patent Office. The accusations against management range from deafness toward the workforce to nepotism.

First look at Xbox One X: Powerful and quiet

heise online Newsticker - 3. November 2017 - 16:00
Microsoft shows off powerful graphics, yet has its cooling system well under control. With Dolby Atmos, the games sound truly impressive. A bottleneck looms elsewhere, however.

Council of Europe: Fact checks ineffective against "information pollution"

heise online Newsticker - 3. November 2017 - 16:00
In the fight against disinformation on the internet, the emotional and ritual aspects of how communication spreads must be given greater consideration, according to a Council of Europe report on the growing "information disorder".

c't wissen Smart Home available to order online

heise online Newsticker - 3. November 2017 - 16:00
Controlling the smart home conveniently by voice or remotely keeps getting easier, but it also remains risky for your own network. The special issue c’t wissen Smart Home explains how to enjoy the convenience while keeping uninvited guests out.

First look at Xbox One X: Quieter than PS4 Pro, but storage space problems

heise online Newsticker - 3. November 2017 - 15:30
Microsoft shows off powerful graphics, yet has its cooling system well under control. A bottleneck exists elsewhere, however, as our first measurements show.

Free upgrade to Windows 10 for users of "assistive technologies" ends at the end of the year

heise online Newsticker - 3. November 2017 - 15:00
Anyone who relies on "assistive technologies" for easier operation can still switch to Windows 10 free of charge. But the offer ends on December 31, possibly with consequences for everyone.

Fast Pair makes Bluetooth connections easier on Android

heise online Newsticker - 3. November 2017 - 15:00
The new Android feature Fast Pair is meant to make pairing Bluetooth devices on Android simpler and more secure.

iPhone X tried out: A look at the next ten years of the iPhone

heise online Newsticker - 3. November 2017 - 15:00
For the tenth anniversary, Apple promised something big: the iPhone X is meant to set the direction in which smartphone development moves over the coming decade. A reality check.

InternetDevels: Responsive images in Drupal 8: beautiful on every device!

Planet Drupal - 3. November 2017 - 14:48

When does “smaller” mean “bigger”? When your images grow smaller to perfectly adjust themselves to various devices, while your user satisfaction, audience coverage, website’s speed, and profits grow bigger. A nice formula, isn’t it? This magic ability of images to adjust themselves to screens is how responsive web design works. And it works especially well in the latest Drupal version, Drupal 8, which has built-in support for responsive images.

Hint of a multi-planet system: Dust belt around Proxima Centauri, the star nearest the Sun

heise online Newsticker - 3. November 2017 - 14:30
Proxima Centauri is the star closest to our Sun, and since 2016 we have known that at least one exoplanet orbits it. Astronomers have now found evidence suggesting that there could be more planets there.

Memory prices remain high: Samsung wants to increase DRAM production

heise online Newsticker - 3. November 2017 - 14:30
Demand for memory remains high, and prices accordingly stay at a high level. Meanwhile, Samsung has announced that it will ramp up production of DRAM chips.

Apache Software Foundation releases Apache Kafka 1.0.0

heise online Newsticker - 3. November 2017 - 14:30
Numerous detailed optimizations are meant to help the message broker process streaming data faster and secure the streams.

Scientific publisher Springer Nature censors its offering in China

heise online Newsticker - 3. November 2017 - 14:00
Under pressure from the Chinese government, Springer Nature will in future censor parts of its online offering. Otherwise there would be a risk "that all of our content would be blocked", according to the publisher.

Agiledrop.com Blog: AGILEDROP: Why should agencies focus on building ambitious websites

Planet Drupal - 3. November 2017 - 13:35
Dries Buytaert, the founder of Drupal, gave a great session this year at DrupalCon Vienna. Watch the part where he talks about who Drupal is for. Instead of focusing on big and small websites, or SME and enterprise clients, Dries describes the type of website Drupal is made for as ambitious.

What is not an ambitious website

A business that used to have a simple brochure website is now better off being served by SaaS (software as a service) solutions like Wix and Squarespace. Facebook, Google, and Amazon are providing services that not only cover what a good old website did in the past, but…

Ikea's Trådfri lighting system supports Apple's HomeKit and Alexa

heise online Newsticker - 3. November 2017 - 13:00
The furniture retailer's connected lamps can now be controlled by voice via the assistants Siri and Alexa. HomeKit support makes it possible to control them with the Home app that Apple builds into iOS, even when away from home.

Microsoft presentation: Edge crashes? Then just install Chrome.

heise online Newsticker - 3. November 2017 - 13:00
A Microsoft employee had actually intended to demonstrate the benefits of Microsoft's Azure platform at Ignite. But because the Edge browser stopped working in the middle of the presentation, he simply downloaded Chrome.

Virgin Hyperloop One wants to begin building its first route in 2019

heise online Newsticker - 3. November 2017 - 12:30
Construction of the first commercially used Hyperloop route could begin in as little as two years, Virgin Hyperloop One is convinced. But it is not yet even clear on which continent that will happen.

Apple vs. Qualcomm: iPhone soon without Qualcomm chips?

heise online Newsticker - 3. November 2017 - 12:00
Because of a bitter patent dispute, Apple is, according to a report, considering doing entirely without Qualcomm cellular chips for the first time in the next generation of iPhones and iPads.