GraphQL Glance

This is a quick glance at the source document (listed in the references); treat it as a set of brief notes. Hope it's helpful to you.

At its core, GraphQL enables declarative data fetching where a client can specify exactly what data it needs from an API. GraphQL is a query language for APIs - not databases.

REST vs GraphQL

  • Data Fetching: multiple endpoints vs. a single query
  • Over-fetching and Under-fetching (n+1): fixed data structures vs. exactly the data requested
  • Rapid Product Iterations on the Frontend: endpoints must be adjusted whenever data needs change vs. flexible queries
  • Insightful Analytics on the Backend: fine-grained insights into which data is actually requested
  • Benefits of a Schema & Type System: the type system defines the schema, so frontend and backend teams can work without further communication

Core Concepts

The Schema Definition Language (SDL)

type Person {
  name: String!
  age: Int! # "!" marks the field as required (non-null)
}

# Add a relation: a Post has an author, and a Person has many posts
type Post {
  title: String!
  author: Person!
}

type Person {
  name: String!
  age: Int!
  posts: [Post!]!
}

Fetching Data with Queries

GraphQL APIs typically expose only a single endpoint; the query itself describes which data to return.

##### Basic query #####

# Query with fields: name and age
{
  allPersons {
    name
    age
  }
}

# Nested query
{
  allPersons {
    name
    age
    posts {
      title
    }
  }
}
# ==> response
{
  "data": {
    "allPersons": [
      {
        "name": "Johnny",
        "age": 23,
        "posts": [
          {
            "title": "GraphQL is awesome"
          },
          {
            "title": "Relay is a powerful GraphQL Client"
          }
        ]
      },
      ...
    ]
  }
}

# Query with arguments
{
  allPersons(last: 2) {
    name
  }
}
# ==> response
{
  "data": {
    "allPersons": [
      {
        "name": "Sarah"
      },
      {
        "name": "Alice"
      }
    ]
  }
}
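
Since everything goes through one endpoint, a client usually just packages the query text into a JSON body and POSTs it. Here is a minimal sketch of that packaging step using only the Python standard library. The URL and the helper name `build_graphql_request` are assumptions made up for illustration, and the request is only built, not sent.

```python
import json
from urllib.request import Request

# Hypothetical endpoint: a real server defines its own URL.
GRAPHQL_URL = "https://example.com/graphql"


def build_graphql_request(query, variables=None):
    """Package a query as a JSON POST aimed at the single GraphQL endpoint."""
    payload = {"query": query, "variables": variables or {}}
    return Request(
        GRAPHQL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Every query, whatever its shape, targets the same URL.
request = build_graphql_request("{ allPersons(last: 2) { name } }")
```

Passing `request` to `urllib.request.urlopen` would send it; the response body would then be JSON shaped like the results shown above.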

Writing Data with Mutations

Three kinds of mutations:

  • creating new data
  • updating existing data
  • deleting existing data

# A mutation returns the newly written data in the same round trip.
mutation {
  createPerson(name: "Bob", age: 36) {
    name
    age
  }
}

# Extend the type with an id field
type Person {
  id: ID!
  name: String!
  age: Int!
}

# The mutation can then ask for the id of the new person
mutation {
  createPerson(name: "Alice", age: 36) {
    id
  }
}

Realtime Updates with Subscriptions

When a client subscribes to an event, it will initiate and hold a steady connection to the server. Whenever that particular event then actually happens, the server pushes the corresponding data to the client.

Unlike queries and mutations that follow a typical “request-response-cycle”, subscriptions represent a stream of data sent over to the client.

subscription {
  newPerson {
    name
    age
  }
}
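
The push model can be illustrated with a tiny in-process publish/subscribe sketch. This is only an analogy under stated assumptions: the `PubSub` class is made up for illustration, and a real GraphQL server would hold a network connection (commonly a WebSocket) to each subscriber instead of calling a local callback.

```python
from collections import defaultdict


class PubSub:
    """Hypothetical in-process event hub standing in for the server's push channel."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event, callback):
        # The client registers interest once and then just waits.
        self._subscribers[event].append(callback)

    def publish(self, event, payload):
        # When the event happens, the server pushes data to every subscriber.
        for callback in self._subscribers[event]:
            callback(payload)


received = []
pubsub = PubSub()
pubsub.subscribe("newPerson", received.append)

# Something on the server creates a person, which triggers the event.
pubsub.publish("newPerson", {"name": "Bob", "age": 36})
```

After the `publish` call, `received` holds the pushed payload without the client ever sending a second request.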

Defining a Schema

The schema is often seen as a contract between the server and the client.

type Query {
  allPersons(last: Int): [Person!]!
}

type Mutation {
  createPerson(name: String!, age: Int!): Person!
}

type Subscription {
  newPerson: Person!
}

type Person {
  name: String!
  age: Int!
  posts: [Post!]!
}

type Post {
  title: String!
  author: Person!
}

Architecture

  • Transport-layer agnostic: TCP, WebSockets, etc.
  • Doesn't care which database is used
  • Doesn't care which data source is used

The sole purpose of a resolver function is to fetch the data for its field.

When fetching data from a REST API:

  • construct and send HTTP request (e.g. with fetch in JavaScript)
  • receive and parse server response
  • store data locally (either simply in memory or persistent)
  • display data in the UI

With the ideal declarative data fetching approach:

  • describe data requirements
  • display data in UI

All the lower-level networking tasks as well as storing the data should be abstracted away and the declaration of data dependencies should be the dominant part.

Clients

  • Directly Sending Queries and Mutations: let the system take care of sending the request and handling the response
  • View Layer Integrations & UI updates
  • Caching Query Results: Concepts and Strategies
    • naive approach: put the results of GraphQL queries into the store
    • normalize the data beforehand: query result gets flattened and the store will only contain individual records that can be referenced with a globally unique ID
  • Build-time Schema Validation & Optimizations
  • Colocating Views and Data Dependencies: allows you to have UI code and data requirements side-by-side
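
The normalization strategy above can be sketched in a few lines. This is a minimal illustration, not how any particular client library does it: the function name `normalize`, the `{"ref": ...}` reference format, and the assumption that every record carries a globally unique `id` field are all made up for the example.

```python
def normalize(record, store):
    """Flatten a nested query result into `store` and return a reference to it."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict) and "id" in value:
            # Nested record: store it separately, keep only a reference.
            flat[key] = normalize(value, store)
        elif isinstance(value, list):
            flat[key] = [
                normalize(item, store)
                if isinstance(item, dict) and "id" in item
                else item
                for item in value
            ]
        else:
            flat[key] = value
    # Merge into any existing entry so repeated queries update one record.
    store.setdefault(record["id"], {}).update(flat)
    return {"ref": record["id"]}


store = {}
result = {
    "id": "person-1",
    "name": "Johnny",
    "posts": [{"id": "post-1", "title": "GraphQL is awesome"}],
}
root = normalize(result, store)
```

The store now contains two flat records, `person-1` and `post-1`, and the person's `posts` field holds references instead of copies, so updating a post in one place updates it for every query that mentions it.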

Server

GraphQL execution

The query is traversed field by field, executing a "resolver" for each field:

  • First, every field in the query can be associated with a type.
  • Then, the resolver for every field is run. Execution starts at the query type and goes breadth-first.

# schema
type Query {
  author(id: ID!): Author
}

type Author {
  posts: [Post]
}

type Post {
  title: String
  content: String
}

# query
query {
  author(id: "abc") {
    posts {
      title
      content
    }
  }
}
# ==> resolver calls
Query.author(root, { id: 'abc' }, context) -> author
Author.posts(author, null, context) -> posts
for each post in posts
  Post.title(post, null, context) -> title
  Post.content(post, null, context) -> content
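
The resolver calls traced above can be reproduced with a toy executor. This is a rough sketch under several assumptions: the query is pre-parsed by hand into nested dicts (a real server parses the query text into an AST), the `AUTHORS` data, `RESOLVERS` table and `execute` function are hypothetical, and the recursion goes depth-first for brevity, although the same resolvers get called.

```python
# Hypothetical in-memory data standing in for a database.
AUTHORS = {"abc": {"posts": [{"title": "Hello", "content": "World"}]}}

# One resolver per (type, field); each takes (parent, args, context).
RESOLVERS = {
    ("Query", "author"): lambda root, args, ctx: AUTHORS.get(args["id"]),
    ("Author", "posts"): lambda author, args, ctx: author["posts"],
    ("Post", "title"): lambda post, args, ctx: post["title"],
    ("Post", "content"): lambda post, args, ctx: post["content"],
}

# Return type of each object-valued field, as declared in the schema.
FIELD_TYPES = {("Query", "author"): "Author", ("Author", "posts"): "Post"}


def execute(type_name, parent, selection, context=None):
    """Run the resolver of each selected field, recursing into sub-selections."""
    result = {}
    for field, (args, sub_selection) in selection.items():
        value = RESOLVERS[(type_name, field)](parent, args, context)
        if sub_selection is None or value is None:
            result[field] = value  # scalar leaf, or a nullable field that is null
            continue
        child_type = FIELD_TYPES[(type_name, field)]
        if isinstance(value, list):
            result[field] = [
                execute(child_type, item, sub_selection, context) for item in value
            ]
        else:
            result[field] = execute(child_type, value, sub_selection, context)
    return result


# The query above, written as {field: (args, sub_selection)}.
query = {
    "author": (
        {"id": "abc"},
        {"posts": ({}, {"title": ({}, None), "content": ({}, None)})},
    )
}
data = execute("Query", None, query)
```

Each resolver only knows how to fetch its own field; the shape of `data` comes entirely from the shape of the query.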

Batched Resolving

If a resolver fetches from a backend API or database, that backend might get called many times during the execution of one query. We can wrap our fetching function in a utility that will wait for all of the resolvers to run, then make sure to only fetch each item once.

# query
query {
  posts {
    title
    author {
      name
      avatar
    }
  }
}
# Naive resolving may hit the backend repeatedly, for example:
fetch('/authors/1')
fetch('/authors/2')
fetch('/authors/1')
fetch('/authors/2')
fetch('/authors/1')
fetch('/authors/2')
# Wrap the fetching in a loader
authorLoader = new AuthorLoader()
# Queue up a bunch of fetches
authorLoader.load(1);
authorLoader.load(2);
authorLoader.load(1);
authorLoader.load(2);
# The loader then does only the minimal amount of work
fetch('/authors/1');
fetch('/authors/2');
# Even better, if the backend supports fetching several ids at once
fetch('/authors?ids=1,2')
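
A loader like the one in the pseudocode can be sketched as follows. This is a simplified, synchronous illustration with an explicit `dispatch` step; the class body, the `batch_fetch` callback and the fake `fetch_authors` backend are assumptions for the example, and real loader utilities typically dispatch automatically once the current round of resolvers has run.

```python
class AuthorLoader:
    """Collects requested ids, then fetches each distinct id once, in one batch."""

    def __init__(self, batch_fetch):
        self._batch_fetch = batch_fetch  # callable: list of ids -> {id: author}
        self._queue = []
        self._cache = {}

    def load(self, author_id):
        # Duplicate and already-fetched ids are not queued again.
        if author_id not in self._cache and author_id not in self._queue:
            self._queue.append(author_id)

    def dispatch(self):
        # One backend call for everything queued so far.
        if self._queue:
            self._cache.update(self._batch_fetch(self._queue))
            self._queue = []
        return self._cache


calls = []


def fetch_authors(ids):
    """Fake backend, standing in for something like /authors?ids=1,2."""
    calls.append(list(ids))
    return {i: {"id": i, "name": f"Author {i}"} for i in ids}


loader = AuthorLoader(fetch_authors)
for author_id in (1, 2, 1, 2):
    loader.load(author_id)
authors = loader.dispatch()
```

Four `load` calls end up as a single backend call for ids 1 and 2, which is the "even better" case from the pseudocode.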

More Concepts

Enhancing Reusability with Fragments

type User {
  name: String!
  age: Int!
  email: String!
  street: String!
  zipcode: String!
  city: String!
}

# Group the address-related fields into a fragment
fragment addressDetails on User {
  name
  street
  zipcode
  city
}

# Query using the fragment
{
  allUsers {
    ... addressDetails
  }
}

# Which is equivalent to
{
  allUsers {
    name
    street
    zipcode
    city
  }
}

Parameterizing Fields with Arguments

type Query {
  allUsers: [User!]!
}

type User {
  name: String!
  age: Int!
}

# Add an argument with a default value to the field
type Query {
  allUsers(olderThan: Int = -1): [User!]!
}

# Use the `olderThan` argument in a query
{
  allUsers(olderThan: 30) {
    name
    age
  }
}
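
On the server side, the default value simply ends up as the argument the resolver receives when the client omits it. A minimal sketch, assuming a hypothetical resolver function `resolve_all_users` and made-up sample data:

```python
# Hypothetical sample data standing in for a real data source.
USERS = [{"name": "Sarah", "age": 41}, {"name": "Alice", "age": 25}]


def resolve_all_users(older_than=-1):
    """Mirror of `allUsers(olderThan: Int = -1)`: the default of -1 matches everyone."""
    return [user for user in USERS if user["age"] > older_than]


everyone = resolve_all_users()
over_thirty = resolve_all_users(older_than=30)
```

With the default, `everyone` contains both users; with `older_than=30`, only Sarah remains.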

Named Query Results with Aliases

Aliases make it possible to request the same field several times, with different arguments, in a single request.

# Error: the same field is requested twice with different arguments.
{
  User(id: "1") {
    name
  }
  User(id: "2") {
    name
  }
}

# Use aliases to name each result
{
  first: User(id: "1") {
    name
  }
  second: User(id: "2") {
    name
  }
}

Advanced SDL

  • Object & Scalar Types
    • Scalar types represent concrete units of data. The GraphQL spec has five predefined scalars: String, Int, Float, Boolean, and ID.
    • Object types have fields that express the properties of that type and are composable. Examples of object types are the User or Post types we saw in the previous section.
  • Enums
    • express the semantics of a type that has a fixed set of values.
    • technically enums are special kinds of scalar types.
  • Interface: used to describe a type in an abstract way
  • Union Types: express that a type should be either of a collection of other types.

# enum
enum Weekday {
  MONDAY
  TUESDAY
  WEDNESDAY
  THURSDAY
  FRIDAY
  SATURDAY
  SUNDAY
}

# interface
interface Node {
  id: ID!
}

type User implements Node {
  id: ID!
  name: String!
  age: Int!
}

# union
type Adult {
  name: String!
  work: String!
}

type Child {
  name: String!
  school: String!
}

union Person = Adult | Child

# Retrieve information with *conditional fragments*.
# A union has no shared fields, so every field sits inside a type condition.
{
  allPersons {
    ... on Child {
      name
      school
    }
    ... on Adult {
      name
      work
    }
  }
}

Tooling and Ecosystem

GraphQL allows clients to ask a server for information about its schema. GraphQL calls this introspection.

# Query the __schema meta-field to list all types
query {
  __schema {
    types {
      name
    }
  }
}

# Query a single type using the __type meta-field and ask for its name and description
{
  __type(name: "Author") {
    name
    description
  }
}

Security

Timeout

A timeout defends against large queries by stopping any query that runs for too long.

Pros:

  • Simple to implement.
  • Most strategies will still use a timeout as a final protection.

Cons:

  • Damage can already be done even when the timeout kicks in.
  • Sometimes hard to implement. Cutting connections after a certain time may result in strange behaviours.

Maximum Query Depth

By analyzing the query document’s abstract syntax tree (AST), a GraphQL server is able to reject or accept a request based on its depth.

Pros: Since the AST of the document is analyzed statically, a rejected query is never executed, so it adds no load to the GraphQL server.

Cons: Depth alone is often not enough to cover all abusive queries.
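
A depth check can be sketched on a simplified stand-in for the AST. The assumptions here: the query is already parsed into nested dicts (`{field: sub_selection}`), and `MAX_DEPTH`, `query_depth` and `check_depth` are names invented for the example.

```python
MAX_DEPTH = 2  # hypothetical limit chosen for the example


def query_depth(selection):
    """Depth of a selection given as nested dicts; a leaf field maps to {}."""
    if not selection:
        return 0
    return 1 + max(query_depth(sub) for sub in selection.values())


def check_depth(selection, max_depth=MAX_DEPTH):
    """Reject the query before execution if it nests too deeply."""
    depth = query_depth(selection)
    if depth > max_depth:
        raise ValueError(f"query depth {depth} exceeds maximum {max_depth}")
    return depth


# author { posts { title } } has three levels of fields.
deep_query = {"author": {"posts": {"title": {}}}}
shallow_query = {"allPersons": {"name": {}, "age": {}}}
```

`check_depth(shallow_query)` passes with a depth of 2, while `check_depth(deep_query)` raises before any resolver would run.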

Query Complexity

Define how complex each field is, and restrict queries to a maximum total complexity. A common default is to give each field a complexity of 1.

query {
  author(id: "abc") { # complexity: 1
    posts { # complexity: 1
      title # complexity: 1
    }
  }
}

A field's complexity can also depend on its arguments:

query {
  author(id: "abc") { # complexity: 1
    posts(first: 5) { # complexity: 5
      title # complexity: 1
    }
  }
}

Pros:

  • Covers more cases than a simple query depth.
  • Reject queries before executing them by statically analyzing the complexity.

Cons:

  • Hard to implement perfectly.
  • If complexity is estimated by developers, how do we keep it up to date? How do we find the costs in the first place?
  • Mutations are hard to estimate. What if they have a side effect that is hard to measure, like queuing a background job?
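
The per-field costs in the two annotated queries above can be summed with a short static walk. This sketch follows the same simple additive rule used in those annotations (a field costs its `first` argument if present, otherwise 1); the nested-dict query format and the function name `complexity` are assumptions, and real servers often use richer cost rules.

```python
def complexity(selection):
    """Sum field costs over a selection given as {field: (args, sub_selection)}."""
    total = 0
    for field, (args, sub_selection) in selection.items():
        # A field costs 1 by default, or `first` when that argument is given.
        total += args.get("first", 1)
        if sub_selection:
            total += complexity(sub_selection)
    return total


plain_query = {
    "author": ({"id": "abc"}, {"posts": ({}, {"title": ({}, None)})}),
}
paged_query = {
    "author": ({"id": "abc"}, {"posts": ({"first": 5}, {"title": ({}, None)})}),
}
```

Here `complexity(plain_query)` is 1 + 1 + 1 and `complexity(paged_query)` is 1 + 5 + 1; a server would compare that number against its maximum before executing anything.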

Throttling

In most APIs, a simple throttle is used to stop clients from requesting resources too often.

  • Throttling Based on Server Time

    • A good estimate of how expensive a query is, is the server time it needs to complete. We can use this heuristic to throttle queries.
    • Throttling based on time works well for GraphQL: complex queries consume more time, so they can be called less often, while small queries are fast to compute and can be called more often.
  • Throttling Based on Query Complexity

    • We can define a maximum cost (bucket size) that a client may use per unit of time.
    • The GitHub public API actually uses this approach.
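
The bucket idea can be sketched as a refillable budget of complexity points per client. This is a generic illustration, not a description of how any specific API implements it: the class name `ComplexityBucket`, the refill rule and the numbers are assumptions, and the current time is passed in explicitly to keep the example deterministic.

```python
class ComplexityBucket:
    """Per-client budget of complexity points that refills over time."""

    def __init__(self, size, refill_per_second):
        self.size = size
        self.refill_per_second = refill_per_second
        self.tokens = float(size)  # start with a full bucket
        self.updated_at = 0.0

    def allow(self, cost, now):
        """Spend `cost` points if available; `now` is a timestamp in seconds."""
        elapsed = now - self.updated_at
        self.tokens = min(self.size, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if cost > self.tokens:
            return False  # over budget: throttle this query
        self.tokens -= cost
        return True


bucket = ComplexityBucket(size=10, refill_per_second=1)
first = bucket.allow(7, now=0.0)   # 10 points available, 3 left afterwards
second = bucket.allow(7, now=1.0)  # only 4 points available, rejected
third = bucket.allow(7, now=5.0)   # refilled to 8 points, accepted
```

An expensive query drains the budget quickly and has to wait for a refill, while cheap queries can keep flowing.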

References